vtk-m

mirror of https://gitlab.kitware.com/vtk/vtk-m synced 2024-09-08 13:23:51 +00:00

Author	SHA1	Message	Date
Sujin Philip	8c242cef91	Switch from faux to true virtuals	2017-11-06 15:25:29 -05:00
Robert Maynard	a6eecbe9ac	ExecutionArrayInterface now can hint at how allocated memory will be used. Certain backends desire the ability to mark allocations as being used for reading versus writing to improve performance.	2017-11-02 10:12:57 -04:00
Matthew Letter	24d0e7766e	Merge remote-tracking branch 'remotes/origin/master' into cmake_refactor	2017-10-31 16:57:41 -06:00
Sujin Philip	5842da4921	Remove ArrayHandle CopyInto Fixes #170	2017-10-27 17:28:59 -04:00
Robert Maynard	ed8f4111ef	Update all the code to work with CMake 3.3 Obviously this does mean that CUDA is not supported with 3.3.	2017-10-27 15:30:14 -04:00
Robert Maynard	56c7362258	A thought on what CMake 3.9 would mean to VTK-m.	2017-10-27 15:29:51 -04:00
Li-Ta Lo	3acd7c37a1	Merge topic 'pointlocator' ed3a64a5 Coding style improvment 7fa800b7 Update TestingPointLocatorUniformGrid.h f1974cab Update TestingPointLocatorUniformGrid.h 508882fa PointLocatorUniformGrid Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !973	2017-10-25 10:42:07 -04:00
Allison Vacanti	5a99dd761b	Only use cuda hints for CUDA 8.0+.	2017-10-24 11:55:07 -04:00
Li-Ta Lo	508882fa21	PointLocatorUniformGrid Provide an accelerated neareast neighbor search of points in the dataset using a one layer uniform grid.	2017-10-19 11:44:36 -06:00
Allison Vacanti	1018d981a0	Check for overlap in CopySubRange. Some parallel copy implementations will not handle this sanely.	2017-10-11 16:52:32 -04:00
Sujin Philip	41679cb5f9	Add a CellLocator Implements a two-level uniform grid cell locator	2017-10-10 14:01:41 -04:00
Robert Maynard	f8f1adc962	Forward decleare DeviceAdapterAlgorithm correctly as a struct	2017-10-06 09:50:12 -04:00
Allison Vacanti	75f88b4c46	Add versioning to VTKM installed include/share dirs.	2017-10-02 11:39:10 -04:00
Robert Maynard	311618a15f	Enable highest level of warnings(W4) under MSVC This will make VTK-m warning level match the one used by VTK. This commit also resolves the first round of warnings that W4 exposes.	2017-09-22 13:04:28 -04:00
Kenneth Moreland	c3a3184d51	Update copyright for Sandia Sandia National Laboratories recently changed management from the Sandia Corporation to the National Technology & Engineering Solutions of Sandia, LLC (NTESS). The copyright statements need to be updated accordingly.	2017-09-20 15:33:44 -06:00
Allison Vacanti	00320e5dc0	Use vtkm::UInt64 for byte sizes. vtkm::Id could be just 32bits, which limits the number of values we would be able to store for large arrays.	2017-09-19 09:21:14 -04:00
Allison Vacanti	28ab480a40	Fix warnings on renar.	2017-09-18 15:33:02 -04:00
Robert Maynard	b9e69217ae	Merge topic 'typedef_to_using_round_4' f6863594 Convert VTK-m over to use 'using' instead of 'typedef' Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !885	2017-08-17 16:38:49 -04:00
Sujin Philip	72a6cf4a21	Change cuda calls to use the per-thread stream.	2017-08-17 11:03:02 -04:00
Robert Maynard	f68635941e	Convert VTK-m over to use 'using' instead of 'typedef'	2017-08-17 10:47:25 -04:00
Robert Maynard	89f439999a	Reduce the amount of typedef statements in DeviceAdapters By using the auto keyword and decltype we can reduce the number of complex typedefs that exist when writing device adapter algorithms. The goal being that it is easier for developers to see the actual algorithms being implemented, by reducing the amount of template 'noise'.	2017-08-16 14:23:21 -04:00
Allison Vacanti	326757b571	Remove ArrayHandleCuda. !861 (b0dba9a1) adds this functionality to basic ArrayHandles.	2017-08-10 15:21:52 -04:00
Allison Vacanti	0a828189ad	Reuse user-provided cuda device pointers when possible.	2017-08-03 17:21:13 -04:00
David C. Lonie	bd042ec567	Add CudaAllocator to encapsulate runtime managed memory logic. Unified memory is now used when we detect that the hardware supports it.	2017-07-31 09:08:27 -04:00
Robert Maynard	76532264c3	Correct signed to unsigned warnings in ArrayManagerExecutionCuda.	2017-07-20 10:42:09 -04:00
David C. Lonie	379c3a0fad	Use current device when allocating managed memory.	2017-07-13 12:55:22 -04:00
Li-Ta Lo	65910f139c	put return value back to ScanInclusivePortal	2017-07-10 12:11:51 -06:00
Li-Ta Lo	16b61d8697	Make ScanInclusiveByKey and ScanInclusiveByKey void functions. These two algorithms does not return meaningful return values. Generic interface and implementation are both void. Remove erronous return type and statement for CUDA backend.	2017-07-07 15:11:42 -06:00
Robert Maynard	09a08fea8d	Correct errors in TaskTuner found from the TestBuilds.	2017-07-07 08:40:03 -04:00
Robert Maynard	c11f29c093	Move the parameter sweeping code to a separate header. The parameter sweeping code is only enabled when tuning for new GPU's so we should move it to a separate header to make DeviceAdapterAlgorithmThrust easier to read.	2017-07-05 14:58:19 -04:00
David C. Lonie	b2c3e41645	Refactor array transfer logic for basic storage. The old templated array transfer mechanism generated a lot of code that ended up doing a simple, type-agnostic memcpy for most devices. This patch specialized array handles for basic storage and uses a fast-path array transfer implementation. This reduces the size of the vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).	2017-06-29 13:18:44 -04:00
Robert Maynard	cfda0593be	ArrayManagerExecutionThrustDevice stops generating casting warnings.	2017-06-16 08:50:32 -04:00
Robert Maynard	5dd346007b	Respect VTK-m convention of parameters all or nothing on a line clang-format BinPack settings have been disabled to make sure that the VTK-m style guideline is obeyed.	2017-05-26 13:53:28 -04:00
Robert Maynard	60a405ef65	Add TaskTiling1D/3D which use faux virtuals to reduce binary size. Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that we can re-use the same launching logic for all Worklets, instead of generating per worlet code. To keep the performance the same the TilingTask now is past a range of indices to work on, rather than a single index. Binary size reduction: WorkletTests_SERIAL old - 19MB WorkletTests_SERIAL new - 18MB WorkletTests_TBB old - 39MB WorkletTests_TBB new - 18MB libvtkAcceleratorsVTKm old - 48MB libvtkAcceleratorsVTKm new - 19MB	2017-05-25 11:00:01 -04:00
Kitware Robot	4ade5f5770	clang-format: apply to the entire tree	2017-05-25 07:51:37 -04:00
Kitware Robot	efbde1d54b	clang-format: sort include directives	2017-05-18 12:59:33 -04:00
Robert Maynard	022c36fa4f	Add vtkm::exec::TaskBase, and rename WorkletInvokeFunctor to TaskSingular Previously WorkletInvokeFunctor inherited from vtkm::exec::FunctorBase, which is also the base class for all users Worklets and for all functors based to DeviceAdapter::Schedule. This is done for a few reasons. The first is that we reduce the minimum size of user worklets. Previously the users worklet would hold a reference to the error message, and so would the wrapper class added when calling DeviceAdapter::Schedule. Now we only have the users worklet holding a reference. Second, by refactoring to have two base classes we can better improve the documentation on what responsibilities FunctorBase.h has, compared to TaskBase.	2017-05-02 16:38:43 -04:00
Sujin Philip	e9898cc5cf	Merge topic 'virtual-methods' 4049b5b2 Add ClipWithImplicitFunction Filter 82d02e46 Modify ImplicitFunctions to use Virtual Methods 968960c1 Add Virtual Methods Framework Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !750	2017-05-02 16:12:04 -04:00
Sujin Philip	82d02e46ef	Modify ImplicitFunctions to use Virtual Methods	2017-05-01 16:55:59 -04:00
Sujin Philip	968960c1a1	Add Virtual Methods Framework	2017-05-01 16:51:42 -04:00
Robert Maynard	80b9d74a23	Merge topic 'embed_more_into_vtkm_cont' ec6589d3 Only enable -fPIC on component static libraries when necessary. cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont. 355eea88 Get the vtkm cont cuda object to compile properly. 6ecc22bb First pass at compiling ArrayHandle into vtkm_cont. Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !715	2017-04-26 13:47:10 -04:00
Li-Ta Lo	0ba9784082	Merge topic 'scanbykey' 5c735a38 this should resovle all the type conversion warnings d8b02329 Merge branch 'scanbykey' of gitlab.kitware.com:ollielo/vtk-m into scanbykey 58ef7c8d one more attemp to get the data type right 22d0e355 attempt to fix warning on type conversion ded4583a attempt to fix warning on type conversion 897b2f0f add tests for 1, 2 and ARRAY_SIZE elements for both ScanInclusiveByKey and ScanExclusiveByKey c05a2c32 Merge branch 'scanbykey' of gitlab.kitware.com:ollielo/vtk-m into scanbykey 0e97fcb9 handle the case of 0 or 1 element in the input for ScanExclusiveByKey ... Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !746	2017-04-25 15:54:25 -04:00
Li-Ta Lo	897b2f0f63	add tests for 1, 2 and ARRAY_SIZE elements for both ScanInclusiveByKey and ScanExclusiveByKey	2017-04-19 13:38:28 -06:00
Li-Ta Lo	a205f21043	make ScanExclusiveByKey return void, rearrange parameter ordering	2017-04-17 16:11:02 -06:00
Li-Ta Lo	7023266585	add both generic and Thrust ScanExclusiveByKey	2017-04-17 15:03:49 -06:00
Li-Ta Lo	e77f9fac6a	add CUDA implementation of ScanInclusiveByKey using Thrust library	2017-04-14 11:25:25 -06:00
David C. Lonie	4807b3c472	Silence warnings about unavoidable weak vtables. - Exception classes cannot be exported due to MSVC's design decisions. See http://stackoverflow.com/questions/24511376. We must leave these classes as header only and silence the warnings. - TransferResource in BufferState.h must remain a header-only class since there is no vtkm_interop library to compile the class into. - The VTKDataSetReader hierarchy must similarly remain header-only since there is no vtkm_io library. - The OptionParser Action classes are part of a header-only utility and cannot be easily compiled into a library. -	2017-04-13 14:06:33 -04:00
David C. Lonie	cbfe5fddd9	Fix up various issues with ArrayHandles in vtkm_cont.	2017-04-05 15:45:11 -07:00
Robert Maynard	355eea887c	Get the vtkm cont cuda object to compile properly.	2017-04-05 15:45:10 -07:00
David C. Lonie	6ecc22bb8c	First pass at compiling ArrayHandle into vtkm_cont.	2017-04-05 15:45:01 -07:00
Li-Ta Lo	2bdc0be5ca	add cuda calls for memory advise as per Tom Fogel	2017-03-14 14:19:01 -06:00
Li-Ta Lo - 194699	6ce8a0135a	Merge branch 'master' into unified-memory	2017-03-09 14:54:03 -07:00
Li-Ta Lo - 194699	b470175f98	new unified memory effort with the new Thrust device	2017-03-09 14:51:45 -07:00
Sujin Philip	9eddce6c99	Rename StreamCompact to CopyIf Plus, removes the version that uses one array as both input and stencil.	2017-03-06 11:08:27 -05:00
Sujin Philip	8c4bbc39ad	Use C++11 =delete keyword	2017-02-24 09:39:22 -05:00
Sujin Philip	a88807fd7e	Catch all exceptions by reference	2017-02-23 13:25:01 -05:00
David Lonie	a0016456cc	Merge topic 'default_device_to_public_header' 7a41621d Move default device selection out of private headers. Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !693	2017-02-17 08:26:01 -05:00
David C. Lonie	7a41621d82	Move default device selection out of private headers. This will make the librarification of vtk-m easier as we tread that path. Refs #120.	2017-02-16 13:40:35 -05:00
Tom Fogal	31e20859d4	Fix typo.	2017-02-14 10:29:00 -08:00
Li-Ta Lo - 194699	835073dae2	clean up with custom allocator	2017-02-13 11:45:17 -07:00
David C. Lonie	f601e38ba8	Simplify exception hierarchy. Remove the ErrorControl class such that all subclasses now inherit from error. Renamed all exception classes via s/ErrorControl/Error/. See issue #57.	2017-02-07 15:42:38 -05:00
Robert Maynard	6e4e07b378	Merge topic 'all_array_portals_have_set' 629271bc Make sure all ArrayPortals have a Set method. Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !676	2017-02-01 09:08:51 -05:00
Kenneth Moreland	629271bceb	Make sure all ArrayPortals have a Set method. The current design for ArrayPortalVirtual makes it a requirement for all array portals (that it wraps) to have Set defined. Thus, make sure Set is defined for all ArrayPortal. Where Set is invalid, an assert is thrown if something calls it at runtime.	2017-01-31 15:46:39 -05:00
David C. Lonie	575d74d143	Manage cuda device memory with cudaMalloc instead of thrust::vector.	2017-01-30 15:36:37 -05:00
Christopher Sewell	82c40a6374	First support for unified memory	2017-01-18 11:43:49 -07:00
Robert Maynard	bd1ff7a5ac	Allow vtkm errors to properly work with shared libraries.	2017-01-16 09:17:38 -05:00
Robert Maynard	d8d6fd1741	StorageTags are now always exported to resolve future dynamic_cast issues. Class that need to be passed across dynamic library boundaries such as DynamicArrayHandle need to be properly export. One of 'tricks' of this is that templated classes such as PolymorphicArrayHandleContainer need the Type and Storage types to have public visibility. This makes sure that all vtkm storage tags have public visibility so in the future we can transfer dynamic array handles across libraries.	2017-01-16 09:17:38 -05:00
Kenneth Moreland	7ec8c03489	Remove warning about constant conditional in VTKM_ASSUME It is common practice to swallow a semicolon in a macro using a do/ while(false) loop that only executes once. However, the Visual Studio 2013 compiler was stupidly warning about having a constant expression for the loop. Get around the problem by creating our own swallow semicolon macros that disable this warning as necessary.	2017-01-10 10:40:17 -07:00
Kenneth Moreland	98c8cb8657	Do not attempt to execute CUDA kernels with no blocks I noticed that when I attempted to execute a CUDA kernel with 0 blocks, it generated a CUDA error. The error did not really matter since nothing was really supposed to run anyway. However, once we are careful about checking CUDA errors, it will cause test failures and likely other misdiagnoses.	2016-12-15 11:33:48 -07:00
Kenneth Moreland	55c159d6f0	Check error codes from CUDA functions Most functions in the CUDA runtime API return an error code that must be checked to determine whether the operation completed successfully. Most operations in VTK-m just called the function and assumed it completed correctly, which could lead to further errors. This change wraps most CUDA calls in a VTKM_CUDA_CALL macro that checks the error code and throws an exception if the call fails.	2016-12-14 10:43:44 -07:00
Robert Maynard	b97b4cc7ef	Allow thrust::reduce to work when iterator and initial value types differ. natively thrust::reduce is unable to handle the use case where the iterator type and the initial value/return value are two different types. We use array handle cast to work around this problem when we detect this usecase.	2016-11-25 13:11:19 -05:00
Robert Maynard	2cfc9743e3	Reduce can support reduce to a T type that isn't the arrayhandles T type. This has been done so that operations such as computing the Min/Max of an array can be done in a single reduce step.	2016-11-25 11:40:46 -05:00
Kenneth Moreland	fdaccc22db	Remove exports for header-only functions/methods Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and EXEC_CONT.) Remove the inline from these macros so that they can be applied to everything, including implementations in a library. Because inline is not declared in these modifies, you have to add the keyword to functions and methods where the implementation is not inlined in the class.	2016-11-15 22:22:13 -07:00
Christopher Sewell	1ebf0c17b6	Attempt 10 to resolve Windows compiler warning with streaming storage	2016-10-21 13:10:49 -06:00
Christopher Sewell	05975a2325	Attempt 3 to resolve Windows compiler warning with streaming storage	2016-10-20 10:32:30 -06:00
Christopher Sewell	c6e15c1240	Merge remote-tracking branch 'upstream/master' into StreamingArray	2016-10-07 18:10:29 -06:00
Robert Maynard	0f58d6fc54	Add vtkm/cont/serial directory for the serial backend.	2016-09-28 14:22:53 -04:00
Robert Maynard	f7a9bbe0e7	All cuda / tbb unit tests follow the same naming pattern.	2016-09-28 14:22:53 -04:00
Christopher Sewell	d92f39df12	Merge branch 'master' into StreamingArray	2016-09-15 17:54:59 -06:00
Christopher Sewell	610f96a831	Adding streaming inclusive scan	2016-09-15 17:46:09 -06:00
Robert Maynard	f81c42b9b4	Replace NULL with nullptr where applicable.	2016-09-01 09:38:25 -04:00
Robert Maynard	51e50d2933	Add DeviceAdapter::CopySubRange to all device adapters. This allows callers to copy a subsection of an array into another array, without clearing the contents of the destination array if a resize is required.	2016-08-24 15:42:51 -04:00
Christopher Sewell	767a5f6792	Clean up for streaming dispatcher	2016-08-03 19:47:12 -06:00
Christopher Sewell	504e6f55df	Working streaming dispatcher	2016-08-03 18:48:31 -06:00
Christopher Sewell	ced9fd32db	Improving streaming dispatcher	2016-08-03 15:34:43 -06:00
Christopher Sewell	2caf81a4af	First commit for ArrayHandleStreaming	2016-08-02 16:54:12 -06:00
Robert Maynard	45ada6b55a	Rework ArrayHandleCuda to make it stop generate warnings	2016-07-20 12:41:13 -04:00
Robert Maynard	4ca6ce2ad6	nvcc doesn't have troubles with boost shared_ptr optimizations	2016-07-19 13:26:23 -04:00
Kenneth Moreland	51a35cb4fe	Fix warnings about type conversions	2016-06-27 07:50:15 -06:00
Kenneth Moreland	fd29c81bde	Fix warnings about redefined macros The build automatically sets some macros when building CUDA files. Some of the CUDA sources were setting the same macros, which was causing warnings. Change the code to be more careful about setting preprocessor macros.	2016-06-27 07:50:13 -06:00
Kenneth Moreland	7ff20c9230	Fix includes for CUDA builds The CMake CUDA build targets do not respect the target_include_directories (yet?). Instead, add the necessary includes to cuda_include_directories().	2016-06-22 12:53:23 -06:00
Kenneth Moreland	b5415169e2	Change Field and related methods to use Range and Bounds First, be more explicit when we mean a range of values in a field or a spacial bounds. Use the Range and Bounds structs in Field and CoordinateSystem to make all of this more clear (and reduce a bit of code as well).	2016-05-29 18:49:36 -06:00
Robert Maynard	e5c3f9c42d	Solve reduce by key bugs with cuda 7.5 + maxwell hardware. The concern is now all architectures are doing a hardware sync on reduce_by_key. This isn't a super serious concern, but it is a downside.	2016-05-12 13:24:59 -04:00
Matt Larsen	eeae5c1352	Adding fast path for radix sort sort_by_key	2016-05-03 08:53:42 -07:00
Kenneth Moreland	cc497e6a1b	Remove cont/Assert.h and exec/Assert.h These asserts are consolidated into the unified Assert.h. Also made some minor edits to add asserts where appropriate and a little bit of reconfiguring as found.	2016-04-20 15:41:14 -06:00
Matt Larsen	e5c4aa3f78	Fixing cuda index error	2016-03-08 12:41:11 -08:00
Matt Larsen	249cce352b	Adding type restrictions to serial atomics	2016-03-08 10:39:23 -08:00
Matt Larsen	40b6db7eee	Inserted missing ,	2016-03-08 09:51:50 -08:00
Matt Larsen	3b46706e1f	Adding compare and swap and removing unsigned atomics	2016-03-08 09:41:02 -08:00
Matt Larsen	e8b08f2e00	Merge branch 'master' into feature/atomics	2016-03-04 08:03:33 -08:00
Robert Maynard	2f98cdf717	Resolves Issue #56 : ChooseCudaDevice functions are in the proper namespace.	2016-02-26 13:51:28 -05:00
Matt Larsen	2baac9cd8b	initial commit of atomic adds	2016-02-10 07:51:31 -08:00
Kenneth Moreland	3f446ad261	Add ErrorControlCuda for better CUDA error checking. Add lots of checks to CUDA calls in the timer to try to identify any problems that might be showing up on the dashboard. Also adding some print statements around the sleep function in the device adapter testing. For some reason the problem just went away with them.	2016-01-12 15:19:54 -07:00
Kenneth Moreland	f582729803	Synchronize the CUDA timer on both the start and end events Previously, the timer for CUDA devices only called cudaEventSynchronize at the end event when asking for the elapsed time. This, however, could allow time to pass from when the timer was reset to when the start event happened that was not recorded in the timer. This added synchronization should make sure that all time spent in CUDA is recorded.	2016-01-12 15:13:56 -07:00
Robert Maynard	c70da5fc23	Extend vtkm::DeviceAdapterTraits to include a unique numeric identifier. Previously each device adapter only had a unique string name. This was not the best when it came to developing data structures to track the status of a given device at runtime. This adds in a unique numeric identifier to each device adapter. This will allow classes to easily create bitmasks / lookup tables for the validity of devices.	2015-12-16 11:18:52 -05:00
Robert Maynard	a7127f0fc3	Adding vtkm::cont::RuntimeDeviceInformation. The RuntimeDeviceInformation class allows developers to check if a given device is supported on a machine at runtime. This allows developers to properly check for CUDA support before running any worklets.	2015-12-15 17:25:27 -05:00
Kenneth Moreland	1a538ca196	Merge branch 'scatter-worklets' into 'master' Scatter in worklets Add the functionality to perform a scatter operation from input to output in a worklet invocation. This allows you to, for example, specify a variable amount of outputs generated for each input. See merge request !221	2015-11-11 13:09:47 -05:00
Robert Maynard	b3687c6f3c	Workaround inclusive_scan issues in thrust 1.8.X for complex value types. The original workaround for inclusive_scan bugs in thrust 1.8 only solved the issue for basic arithmetic types such as int, float, double. Now we go one step further and fix the problem for all types. The solution is to provide a proper implementation of destructive_accumulate_n and make sure it exists before any includes of thrust occur.	2015-11-09 17:14:30 -05:00
Kenneth Moreland	f7789f0ed7	Fix issue with const types in Thrust array management Previously, there was a declaration ConstArrayPortalFromThrust<const T> in ArrayManagerExecutionThrustDevice. This proved problematic because values read from the array in the worklet were typed as const T rather than simply T. Any Vec or Matrix built from that type would then fail because they are not meant to work with a const value (which means they have to be set on construction and never changed. Instead, declare ConstArrayPortalFromThrust<T> and internally set all the Thrust pointers to have type const T. Also declare other thrust pointers used as method parameters to have const T rather than T. This should work as conversion from T to const T should be fine, but not the other way around.	2015-11-06 18:05:21 -07:00
Patricia Kroll Fasel - 090207	4757c0ae9e	Merge branch 'master' of gitlab.kitware.com:Fasel/vtk-m into cell_to_point	2015-11-04 13:28:08 -07:00
Patricia Kroll Fasel - 090207	a5f1f823ae	Set default device to cuda in unit test of DataSetExplicit to bypass compiler errors in CellSetExplicit.	2015-11-04 10:16:44 -07:00
Patricia Kroll Fasel - 090207	480f0bd416	Merge branch 'master' of gitlab.kitware.com:Fasel/vtk-m into cell_to_point	2015-11-03 13:48:23 -07:00
Robert Maynard	1b30d6e6de	Update Cuda so that SumExclusiveScan supports fancy iterators.	2015-11-03 13:28:07 -05:00
Robert Maynard	97550d5e2d	Update Cuda so that UnaryPredictes work with fancy cuda array handles.	2015-11-03 13:28:07 -05:00
Patricia Kroll Fasel - 090207	02e16e7e25	Merge branch 'master' of gitlab.kitware.com:Fasel/vtk-m into cell_to_point	2015-11-03 11:15:26 -07:00
Sujin Philip	1b8fe17f1b	Fix for several warnings	2015-11-03 09:11:38 -05:00
Sujin Philip	fd244c4142	Fix errors and warnings caused by recent changes to device adapter tag logic	2015-11-03 09:11:38 -05:00
Robert Maynard	71cf2a7d92	Fix return statement in DeviceAdapterAlgorithmThrust.	2015-11-02 16:46:02 -05:00
Patricia Kroll Fasel - 090207	d167c75596	Merge branch 'master' of gitlab.kitware.com:Fasel/vtk-m into cell_to_point	2015-11-02 11:09:07 -07:00
Patricia Kroll Fasel - 090207	ed0ecf284d	Parallel CellToPoint initial code.	2015-10-30 13:59:36 -06:00
Robert Maynard	85d28667c2	Add a return statement to reduce to stop false positive warnings.	2015-10-30 14:55:13 -04:00
Kenneth Moreland	b861209a22	Fix nvcc warnings on MSVC There is a strange nvcc warning in CUDA 7.5 that sometimes happens on MSVC that causes it to emit a warning for an undefined method that is clearly defined. The CUDA development team is aware of the problem and is going to fix it, but these changes will work around the problem for now. Thanks to Tom Fogal from NVIDIA for these fixes.	2015-10-21 08:33:15 -06:00
Robert Maynard	8de216c088	Propagate vtkm::Id3 scheduling down to the ThreadIndex classes. This now allows for even more efficient construction of uniform point coordinates when running under the 3d scheduler, since we don't need to go from 3d index to flat index to 3d index, instead we stay in 3d index	2015-10-20 09:29:41 -04:00
T.J. Corona	b1665dcb32	Add an array handle for bare cuda device pointers. Array handles for cuda device pointers have been implemented. The data for these handles exists solely on the exec side (info such as length can be queried from the cont side).	2015-10-09 12:41:33 -04:00
Robert Maynard	f38673f618	Replace ErrorControlOutOfMemory with ErrorControlBadAllocation.	2015-10-01 14:25:28 -04:00
Robert Maynard	056f69bf96	Remove unused variable and conversion warnings from cuda code.	2015-09-21 14:17:25 -04:00
Sujin Philip	1d2657f360	Make cuda DeviceAdapter valid only when using nvcc Before it was valid even on a regular compiler, if cuda was available.	2015-09-18 12:21:11 -04:00
hschroot	20c1a04894	CopyInto function for ArrayHandles ArrayHandles in DAX have a CopyInto function which allows the user to copy an array handle's data into a compatible STL type iterator. Originally this was fairly straight forward to implement since array handles in DAX are templated on the DeviceAdapterTag. In contrast, VTKm array handles use a polymorphic ArrayHandleExecutionManager under the hood allowing a single array handle to interface with multiple devices at runtime. To achieve this virtual functions are used. This makes implementing the CopyInto function difficult since it is templated on the IteratorType and virtual functions cannot be templated. To work around this, I've implemented a concrete templated CopyInto function in the class derived from ArrayHandleExecutionManagerBase. In the ArrayHandle class, CopyInto dynamically casts the base class into the derived class, then calls the CopyInto function defined in the derived class. The drawback to this approach is that, should the user define their own class that inherits from ArrayHandleExectionManagerBase, they are not forced to implement the CopyInto function, unlike the other virtual functions.	2015-09-17 14:26:19 -04:00
Robert Maynard	fd68521066	Always install all device headers even when device isn't enabled. vtkm_declare_headers now is able to not test headers, by using the TESTABLE keyword.	2015-09-17 09:28:21 -04:00
Robert Maynard	4d635d642b	DeviceAdapter Tags now always exist, and contain if the device is valid. Previously it was really hard to verify if a device adapter was valid. Since you would have to check for the existence of the tag. Now the tag always exists, but instead you query the traits of the DeviceAdapter to see if it is a valid adapter. This makes compiling with multiple backends alot easier.	2015-09-17 09:28:21 -04:00
Robert Maynard	6272fdcc54	Correct multiple signature compile issue.	2015-09-08 09:39:57 -04:00
Robert Maynard	aa7f5c34b9	Resolves Issue #42 . Now all thrust API calls are in try/catch blocks.	2015-09-07 12:20:31 -04:00
Robert Maynard	72450e87f3	Make thrust use fast paths when doing sort and scan. By introducing our own custom thrust execution policy we can make sure to hit the fastest code paths in thrust for the sort operation. This makes sure that for UInt32,Int32, and Float32 we use the radix sort from thrust which offers a 2x to 3x speed improvement over the merge sort implementation. Secondly by telling thrust that our BinaryOperators are commutative we make sure that we get the fastest code paths when executing Inclusive and Exclusive Scan Benchmark 'Radix Sort on 1048576 random values vtkm::Int32' results: median = 0.0117049s median abs dev = 0.00324614s mean = 0.0167615s std dev = 0.00786269s min = 0.00845875s max = 0.0389063s Benchmark 'Radix Sort on 1048576 random values vtkm::Float32' results: median = 0.0234463s median abs dev = 0.000317249s mean = 0.021452s std dev = 0.00470307s min = 0.011255s max = 0.0250643s Benchmark 'Merge Sort on 1048576 random values vtkm::Int32' results: median = 0.0310486s median abs dev = 0.000182129s mean = 0.0286914s std dev = 0.00634102s min = 0.0116225s max = 0.0317379s Benchmark 'Merge Sort on 1048576 random values vtkm::Float32' results: median = 0.0310617s median abs dev = 0.000193583s mean = 0.0295779s std dev = 0.00491531s min = 0.0147257s max = 0.032307s	2015-09-03 16:00:37 -04:00
Robert Maynard	8422108f28	Cuda copy from host to device can't use the cuda execution policy.	2015-09-02 09:53:00 -04:00
Robert Maynard	efc9f0c5cf	All occurrences of thrust invocation uses an execution policy.	2015-09-01 19:32:49 -04:00
Sujin Philip	514ac54e59	Add custom operator and initial value support to ExclusiveScan	2015-08-28 09:56:04 -04:00
Robert Maynard	619103b202	Update cuda ScanExclusive to handle dereferencing device only arrays.	2015-08-26 11:51:02 -04:00
Robert Maynard	157d8efee4	Workaround thrust 1.8 inclusive scan issue. Starting in thrust 1.8 the implementation of scan inclusive inside thrust became highly optimized by using parallel task groups. This new implementation has a bug that only exists when using custom binary operators, large size arrays, release mode, and no debugger or mem-checker attached. While I have submitted the issue to thrust, we need to be able to work around the existing issue. The solution I have chosen is to mark all vtkm::exec::cuda::interal::WrappedBinaryOperators as being commutative as far as thrust is concerened. To make sure we don't get any unexpected behavior I have also had to create WrappedBinaryPredicate so that we don't mark any predicate as commutative.	2015-08-17 10:39:14 -04:00
Robert Maynard	ab59e34a2f	Rename pragma header guard so it makes sense for tbb and thrust. Boost is not the only thirdparty that we are supressing warnings for, so make the name more generic.	2015-08-13 09:04:23 -04:00
Robert Maynard	8204db2f6a	Use VTKM_BOOST_PRE_INCLUDE around thrust headers too.	2015-08-13 08:26:41 -04:00
Robert Maynard	bae6ff7f55	Merge branch 'introduce_binary_and_unary_operators' into 'master' Introduce binary and unary operators See merge request !94	2015-08-06 15:14:28 -04:00
Robert Maynard	d3fd571ef2	Add vtkm/UnaryPredicates header. Currently includes the following predicates: - IsDefaultConstructor - NotDefaultConstructor - LogicalNot	2015-07-30 13:12:59 -04:00
Robert Maynard	d9fd702b1c	Make detecting if we are cuda 3+ gpu running cuda 2 code faster. The original implementing tried to run 2^31 kernels and detect a launch failure to determine this use-case. The issue with this approach is that on a cuda 3+ gpu, this would take multiple seconds and cause the gpu to terminate the kernel when opengl was also loaded.	2015-07-28 17:04:24 -04:00
Robert Maynard	19aa6b8d62	Update the 3d scheduling benchmark code to use the new 1d scheduler	2015-07-23 16:31:41 -04:00
Robert Maynard	d0d11640ea	Cuda now can schedule worklets that require more than 2B instances.	2015-07-23 16:31:28 -04:00
Sujin Philip	c7d3d0df5c	Merge branch 'add-compute-bounds' into 'master' Add compute bounds to Fields See merge request !88	2015-07-22 14:26:48 -04:00
Sujin Philip	91b191bf83	Add compute bounds to Fields	2015-07-22 12:17:33 -04:00
Robert Maynard	780f3fea29	Merge branch 'correct_cuda_scheduling_over_8m' into 'master' Correct cuda scheduling over 8m See merge request !91	2015-07-20 15:01:05 -04:00
Robert Maynard	9f16669e3c	Increase the robustness of 3d scheduling when X dim is very small.	2015-07-20 13:27:58 -04:00
Robert Maynard	7a21a08c46	Handle 1D scheduling with over 65k blocks on SM2.X arch. The initial implementation forgot about the fact that SM2.X architectures can only handle 65k blocks. Now we gracefully handle when compiling for SM2.X.	2015-07-20 13:26:16 -04:00
Sujin Philip	1a9e8d1e3d	Initial support for generating documentation using Doxygen	2015-07-17 15:35:59 -04:00
Robert Maynard	e74ded809a	Defer more thrust iterator deduction logic to ArrayPortalToIterators.	2015-07-14 10:11:12 -04:00
Robert Maynard	4ba1f7c853	Remove the need for any portal to define an IteratorType.	2015-07-13 17:16:27 -04:00
Kenneth Moreland	4fc3626712	Fix compiler directives for icc The Intel icc compiler tries to pretend it is gcc, but it sometimes behaves differently. Add more explicit checks for what compiler is being used.	2015-07-06 10:35:06 -06:00
Robert Maynard	d12375450f	Make ChooseCudaDevice only consider SM3 and above equal.	2015-06-30 08:18:16 -04:00
Robert Maynard	6af949f488	Fix warnings in ChooseCudaDevice by moving performance metric to double.	2015-06-30 08:16:42 -04:00
Kenneth Moreland	3b9bb5ffeb	Merge branch 'math' into 'master' Math Adds a set of portable math functions, most of which are "expected" to be available and are typically useful. They include things like trigonometry (sin, cos, tan, etc.), floating point information (nan, finite, etc), and min/max. See merge request !58	2015-06-27 20:29:49 -04:00
Kenneth Moreland	4a8d69ca66	Add CUDA math test Also fix some issues that caused the compile to fail when trying to run some of the math functions on a CUDA device. In particular, CUDA is picky about using a global const on a device when the const type is not one of the basic C types.	2015-06-25 13:54:24 -06:00
Robert Maynard	47f9874132	Extend the array handle testing to be done on a per device basis. Now we test ArrayHandle with all the vtkm types to make sure all the transfer logic / access logic works for all T types.	2015-06-25 08:21:49 -04:00
Robert Maynard	6b70893b96	Merge branch 'clarify_unary_and_binary_operator_names' into 'master' Clarify unary and binary operator names Fixes the naming issues brought up in Issue #6. See merge request !54	2015-06-23 15:49:03 -04:00
Robert Maynard	459552d795	Update the name of the binary functor for UpperBounds	2015-06-23 09:19:09 -04:00
Robert Maynard	3cbac03810	Update the name of the binary functor Unique	2015-06-23 08:58:53 -04:00
Robert Maynard	e6cadef457	Update the name of the unary functor for StreamCompact.	2015-06-23 08:56:18 -04:00
Robert Maynard	7ee3084868	Update the name of the binary functor for Sort.	2015-06-23 08:36:42 -04:00
Robert Maynard	604289c1ce	Update the name of the binary functor for ScanInclusive	2015-06-23 08:26:04 -04:00
Robert Maynard	8ba0dacc65	Update the name of the binary functor for Reduce.	2015-06-22 11:55:02 -04:00
Robert Maynard	0901e71574	Update the name of the binary comparison for LowerBounds.	2015-06-22 11:49:11 -04:00
Robert Maynard	2a66dec110	Correct a bug in the logic of determining the block size.	2015-06-22 08:51:25 -04:00
Robert Maynard	fa78610ea1	Remove inclusion of iostream header.	2015-06-19 13:03:56 -04:00
Robert Maynard	6c3f669398	Remove dead code for converting zip handles to iterators for cuda. This code isn't needed anymore as the new iterator facade handles this use case properly.	2015-06-18 13:53:02 -04:00
Robert Maynard	9d350fb023	Correct undefined behavior that was causing scan test failures. We need call PrepareForInput on the input argument before invoking a function. The order of execution of parameters of a function is undefined, so we need to make sure input is called before output, or else in-place use case breaks.	2015-06-17 15:35:27 -04:00
Robert Maynard	2ce14c7760	DeviceAdapter's now properly init all value types. This is needed when use vtkm::Vec, as it doesn't zero the memory it contains.	2015-06-16 08:35:04 -04:00
Robert Maynard	2c91cdfa3b	Update cuda/thrust backend scan algorithms to work with vec types.	2015-06-16 08:28:31 -04:00
Chun-Ming Chen	360279c29a	A more general Copy that allows copying into a diff. array type. Testing codes are added.	2015-06-12 12:42:43 -04:00
Robert Maynard	2a2159b1e1	Generalize the support for zip handles inside the cuda backend. Instead of having a single specialization for sort and zip handles, we know handle any fancy handles being passed to the cuda device adapter. This was done by reworking how we represent fancy iterators inside thrust, and instead of using a transform iterator + counting iterator we just use a iterator_facade.	2015-06-12 11:56:46 -04:00
Robert Maynard	d26cfcf5cc	Merge branch 'sort_inplace_zip_arrays' into 'master' Allow in-place sorting of ArrayHandleZip See merge request !29	2015-06-10 15:51:40 -04:00
Robert Maynard	c052400ffb	CUDA now converts ArrayHandleZip to ::thrust::zip iterator. This is required so that we can use ArrayHandleZip with Sort/Reduce and custom comparison operators. ArrayZip and PortalValue don't combine well together, when used with DeviceAlgorithm that has a custom operator. The custom operator is actually passed the PortalValue instead of the real values, and by that point we can't fix anything since we don't know what the original operator is.	2015-06-10 14:15:40 -04:00
Robert Maynard	549a7dd7c7	Adding support to cuda backend to write to a ZipHandle.	2015-06-10 14:15:40 -04:00
Robert Maynard	040d7e761e	Explicitly invoke all thrust algorithms using the cuda policy. This is required if we ever want to use thrust from another device adapter.	2015-06-10 09:41:06 -04:00
Robert Maynard	3a532160c9	cuda device adapter now support different value types for keys and values.	2015-06-01 17:19:33 -04:00
Robert Maynard	07970cf476	Correct all signed / unsigned and narrowing warnings ( 64bit to 32bit ).	2015-05-28 09:05:17 -04:00
Robert Maynard	d54aee7eb5	Merge branch 'fix_typo_in_copyright'	2015-05-21 10:32:08 -04:00
Robert Maynard	6b8e7822be	The Copyright statement now has all the periods in the correct location.	2015-05-21 10:30:11 -04:00
Robert Maynard	f2b47ffd4a	Add ReduceByKey to the DeviceAdapter.	2015-05-21 08:23:28 -04:00
Robert Maynard	be193542ac	Introduce StreamCompact with a custom stencil unary predicate.	2015-05-21 08:23:28 -04:00
Robert Maynard	e38caafe37	Adding Reduce with custom operator to the DeviceAdapterAlgorithm.	2015-05-14 15:16:49 -04:00
Robert Maynard	5d9f369d0c	Adding ScanInclusive with custom binary operator to DeviceAdapterAlgorithm.	2015-05-14 15:16:49 -04:00
Robert Maynard	9519737b3a	Adding Reduce to the DeviceAdapterAlgorithm.	2015-05-14 15:16:49 -04:00
Robert Maynard	fbee20fc03	Mark the local value for the result of GetNumberOfValues as const. We aren't modifying the value, so no reason to not have it as const.	2015-05-14 15:16:48 -04:00
Robert Maynard	59618b3a50	Merge branch 'sort_by_key'	2015-05-11 08:04:09 -04:00
Kenneth Moreland	66aa10aab7	Fix issues with warnings about data loss from type conversion Fix compile warnings that come up with the flags -Wconversion -Wno-sign-conversion This catches several instances (mostly in the testing framework) where types are implicitly converted. I expect these changes to fix some of the warnings we are seeing in MSVC. I was going to add these flags to the list of extra warning flags, but unfortunately the Thrust library has several warnings of these types, and I don't know a good way to turn on the warnings for our code but turn them off for Thrust.	2015-05-07 16:34:06 -06:00
Robert Maynard	6564d7af1c	Enable SortByKey Test on the Device Adapter.	2015-05-06 09:27:59 -04:00
Robert Maynard	c2f2c166c0	Test all Fancy ArrayHandles and Array Portals on every device adapter. Fancy array handles being Counting, Implicit, Permutation, Transform, and Zip.	2015-05-05 14:25:48 -04:00
Kenneth Moreland	d15074153d	Remove comments about textures from ArrayManagerExecutionThrustDevice Per a discussion with Robert Maynard, we no longer have plans to bind textures or use texture objects. Instead, we are now using the __ldg command to load from global memory inside the execution portal, which gives us most of the benefits of textures, but doesn't incur any bookkeeping.	2015-05-04 14:18:00 -06:00
Kenneth Moreland	e14fc427f0	Change storage references to storage pointers. Previously ArrayTransfer and ArrayManagerExecution received a reference to a Storage class in their constructor and held the reference as an ivar. In retrospect, this is just asking for trouble. First, it is way too easy to pass by value when you mean to pass by reference. Second, if there ever is a bug where the Storage goes out of scope before the classes holding a reference, it is that much harder to debug.	2015-04-30 21:29:58 -06:00
Kenneth Moreland	ec0adf8b16	Change interface of ArrayTransfer to be more like ArrayHandle. This includes changing methods like LoadDataForInput to PrepareForInput. It also changed the interface a bit to save a reference to the storage object. (Maybe it would be better to save a pointer?) These changes also extend up to the ArrayManagerExecution class, so it can effect device adapter implementations.	2015-04-30 21:07:36 -06:00
Kenneth Moreland	c5b831b726	Remove LoadDataForInput that takes portals. This API change effects both ArrayTransfer and ArrayManagerExecution. This is in preparation for a future change to make the API more consistent with ArrayHandle.	2015-04-28 11:30:30 -04:00
Robert Maynard	30c5ee9cf4	Remove SetThrustForCuda as we explicitly use the thrust cuda system. The only reason SetThrustForCuda was ever needed was that we used ::thrust::device_vector instead of ::thrust::system::cuda::vector.	2015-04-23 09:23:41 -04:00
Robert Maynard	c806348c94	Remove debug code from cuda schedule algorithm.	2015-04-16 08:56:08 -04:00
Robert Maynard	3a10a63e8c	make sure the 3d block size makes sense.	2015-04-08 15:57:35 -04:00
Robert Maynard	fdac208acb	use cuda scheduling versus thrust scheduling.	2015-04-03 10:18:05 -04:00
Robert Maynard	d340a573d9	don't clear ::thrust::vectors as that is not our intended behavior.	2015-03-23 09:01:42 -04:00
Robert Maynard	86dc8f1d38	Move back to thrust::cuda::vector to properly handle allocating uint8's Our approach of using the underlying allocator inside thrust was a bad approach, for some reason it fails to properly allocate uint8's or int8's on the correct boundaries. I expect that this logic is somewhere else in the code and instead we should use thrust::system::cuda::vector which does this properly.	2015-03-10 10:10:03 -04:00
Robert Maynard	63b1f03187	Simplify the implementation of loading through textures. We don't need this super complicated system for texture loading.	2015-03-09 16:37:45 -04:00
Robert Maynard	9b49973621	Use __ldg instead of texture object.	2015-03-05 18:31:44 -05:00
Robert Maynard	3c8ce36666	Properly deallocate cuda memory when we are done with it.	2015-01-28 15:54:45 -05:00
Robert Maynard	46df484ca7	Redesign the way we implement when to choose texture portals. The logic to select texture portals is easier to understand now.	2015-01-20 13:58:28 -05:00
Robert Maynard	e26040282b	Properly map TexturePortals to Iterators. Also at the same time disable the Texture support by default.	2015-01-20 13:58:28 -05:00
Robert Maynard	04bc41cad3	Fix multiple issue in the cuda array handle unit tests. The namespaces need to be different for each test, or else only the first implementation of the function will be used for all tests that call that function. Also updated the test to verify that we can count starting from a non zero number.	2015-01-20 09:03:24 -05:00
Robert Maynard	1b5c5a6ce5	Add in initial support for texture binding of input arrays.	2014-12-19 13:47:28 -05:00
Robert Maynard	a509dae909	Use pinned memory for error reporting on the cuda backend. This reduces the amount of explicit cuda Device to Host memory copies.	2014-12-19 13:47:28 -05:00
Robert Maynard	d9270e408d	Adding a cuda device adapter to vtkm. Porting the dax device adapter over to vtkm. Unlike the dax version, doesn't use the thrust::device_vector, but instead uses thrust::system calls so that we can support multiple thrust based backends. Also this has Texture Memory support for input array handles. Some more work will need to be done to ArrayHandle so that everything works when using an ArrayHandle inplace with texture memory bindings.	2014-12-19 13:47:28 -05:00

... 3 4 5 6 7 ...

412 Commits