vtk-m2

Author	SHA1	Message	Date
Sujin Philip	1212081de1	Support deferred freeing for CUDA memory Calls to 'cudaFree' block execution on all CUDA devices. Reduce the number of times this happens by having a deferred free mechanism that frees a pool of pointers together when a threshold is reached. Especially helpful during virtual object transfers that require a few small allocations and frees.	2018-08-16 12:05:36 -04:00
Allison Vacanti	f6da092146	Use __CUDA_ARCH__ instead of __CUDACC__ to guard device-only code. __CUDACC__ is defined when compiling host code under nvcc, while __CUDA_ARCH__ is only defined for device code.	2018-08-09 11:57:05 -04:00
Allison Vacanti	2c079b96dd	Make AtomicArrays work on CUDA 8. CUDA 8.0 is erroring out in the cuda AtomicArray implementation: https://open.cdash.org/viewBuildError.php?buildid=5489156 This patch fixes the error. See comments in source for more info.	2018-08-08 15:26:32 -04:00
Haocheng LIU	ce9cd8072a	Use std::call_once to construct singletons By using `call_once` from C++11, we can simplify the logic in code where we are querying same value variables from multiple threads.	2018-08-06 16:36:03 -04:00
Sujin Philip	259d670ab5	Merge topic 'cuda-per-thread-streams-2' 06dee259f Minimize cuda synchronizations Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1288	2018-07-25 15:07:39 -04:00
Robert Maynard	e031e64967	ExecutionArrayInterfaceBasic<T> explicitly construct DeviceAdapterId objects Rather than implicitly presume the `VTKM_DEVICE_ADAPTER_` macros can convert to DeviceAdapterId.	2018-07-25 12:04:30 -04:00
Robert Maynard	86b9ab9969	Refactor ExecutionArrayInterfaceBasic to use inheriting constructors	2018-07-25 12:03:48 -04:00
Robert Maynard	4240111dd8	Make sure VirtualObjectHandle tests include RuntimeDeviceTracker	2018-07-18 10:37:46 -04:00
Robert Maynard	8077b031a8	Merge topic 'uncomment_cuda_range_test' 1e478bbe6 Re-enable UnitTestCudaComputeRange Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !1321	2018-07-17 13:28:05 -04:00
Robert Maynard	1e478bbe63	Re-enable UnitTestCudaComputeRange	2018-07-17 11:43:19 -04:00
Robert Maynard	bf49575e00	Remove unneeded typeinfo includes	2018-07-17 11:41:53 -04:00
Robert Maynard	64958b014b	VTK-m now supports passing pointers when invoking worklets. The original design of invoke and the transport infrastructure relied on the implementation behavior of vtkm::cont types such as ArrayHandle that used an internal shared_ptr to manage state. This allowed passing by value instead of passing by non-const ref when needing to transfer information to the device. As VTK-m adds support for classes that use virtuals, the ability to pass by base pointer type allows us to invoke worklets using a base type without the risk of type slicing. Additionally, by moving over to a non-const ref Invocation we can update all transports that have 'output' to now be by ref and therefore support types that can't be copied while being 'more' correct.	2018-07-06 14:27:36 -04:00
Robert Maynard	9238cedcab	Merge topic 'ice_nvcc_on_renar' 5ced0da8f Try to ice the ubuntu 17.10 + cuda 9.1 compiler Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1305	2018-07-05 11:36:16 -04:00
Robert Maynard	5ced0da8f5	Try to ice the ubuntu 17.10 + cuda 9.1 compiler	2018-07-05 09:14:52 -04:00
Robert Maynard	e5090e1289	Make sure the PointLocatorUniform uses the correct runtime device	2018-07-03 17:42:57 -04:00
Sujin Philip	06dee259f7	Minimize cuda synchronizations 1. Have a per-thread pinned array for cuda errors 2. Check for errors before scheduling new tasks and at explicit sync points 3. Remove explicit synchronizations from most places Addresses part 2 of #168	2018-07-03 14:19:06 -04:00
ayenpure	e2dccee099	Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into spatialsearch	2018-06-30 11:56:33 -06:00
Kenneth Moreland	4459ab9174	Merge branch 'master' into 'pointlocator-general-interface' # Conflicts: # vtkm/cont/PointLocatorUniformGrid.h	2018-06-28 12:51:08 -04:00
Kenneth Moreland	439beaaed9	Make point locator tests have consistent devices	2018-06-27 10:37:59 +02:00
Allison Vacanti	a8d8b3670d	Suppress host/device warnings on CUDA atomics.	2018-06-25 14:53:53 -04:00
David Thompson	d8cf1f7b51	Merge topic 'geometry-squashed' 880d8a989 Add `vtkm/Geometry.h` and test it. Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Robert Maynard <robert.maynard@kitware.com> Merge-request: !1262	2018-06-20 14:15:50 -04:00
David Thompson	880d8a989e	Add `vtkm/Geometry.h` and test it. This commit adds several geometric constructs to vtk-m in the `vtkm/Geometry.h` header. They may be used from both the execution and control environments. We also add methods to perform projection and Gram-Schmidt orthonormalization to `vtkm/VectorAnalysis.h`. See `docs/changelog/geometry.md` included in this commit for more information.	2018-06-20 11:58:14 -04:00
ayenpure	d8e8078099	Fixing the typos with ScanExclusiveByKey - Fixed the typo - Moved the test to vtkm/worklet/testing as vtkm/cont/testing does not execute with CUDA	2018-06-15 16:39:00 -07:00
Robert Maynard	8276e35cf4	Mark classes that should not be derived from as final.	2018-06-15 10:49:59 -04:00
Robert Maynard	82cdae0025	VTK-m waits for cuda streams to finish before host access Previously it was possible for VTK-m to access memory from the host before the computations in a stream finished.	2018-06-01 10:28:55 -04:00
Robert Maynard	9c3547bc7c	VTK-m cuda runtime now handles no cuda runtime properly Previously it would throw an uncaught exception and crash.	2018-05-31 10:07:37 -04:00
Allison Vacanti	1f6a662c0a	Merge DevAdaptAlgoThrust --> DevAdaptAlgoCuda.	2018-05-29 14:07:29 -04:00
Allison Vacanti	be0c6a17a9	Move DevAdaptAtomicArrayImplementation to its own file.	2018-05-29 14:07:29 -04:00
Allison Vacanti	3af9f66083	Merge ArrayManagerExecutionThrustDevice into AMECuda.	2018-05-29 14:07:29 -04:00
Robert Maynard	4a520b7bdd	Merge topic 'pascal_managed_memory_copy_non_blocking' e0b6e698 copying cpu memory to pascal managed memory now works consistently. Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !1211	2018-05-18 15:15:17 -04:00
Robert Maynard	e0b6e69878	copying cpu memory to pascal managed memory now works consistently. When copying small arrays from cpu memory to pascal memory we would see subsequent kernels fail as the memory transfer hadn't finished. This is a bug as each stream should act like a FIFO queue. So for now when encountering this use case we explicitly synchronize after the memcpy.	2018-05-16 17:56:50 -04:00
Robert Maynard	1c5feeb185	Make sure all device specific tests use the intended device. This means that we not only setup the runtime device tracker to force the intended device, it also means making sure the default device is the error device.	2018-05-16 08:21:16 -04:00
Robert Maynard	e28244f345	Re-implement DeviceAdapterRuntimeDetector to avoid ODR violations. The previous implementation of DeviceAdapterRuntimeDetector caused multiple differing definitions of the same class to exist and was causing the runtime device tracker to report CUDA as disabled when it actually was enabled. The ODR violation was caused by having a default implementation for DeviceAdapterRuntimeDetector and a specific specialization for CUDA. If a library had both CUDA and C++ sources it would pick up both implementations and would have undefined behavior. In general it would think the CUDA backend was disabled. To avoid this kind of situation in the future I have reworked VTK-m so that each device adapter must implement DeviceAdapterRuntimeDetector for that device.	2018-05-15 13:08:34 -04:00
Robert Maynard	571556d984	CUDA's RuntimeDeviceTracker and Timer are now built as part of vtkm_cont This is done not only to reduce the amount of code that users need to generate but also to reduce the number of errors when using the RuntimeDeviceTracker. If the runtime device tracker is initially used in a library by a C++ file it will never properly detect the CUDA backend. By moving the code into vtkm_cont we can make sure this problem doesn't occur.	2018-05-10 10:57:06 -04:00
Robert Maynard	364b366ab3	Correct signed/unsigned cast warnings from DeviceAdapterAlgorithmThrust Found with CUDA 7.5	2018-05-08 15:29:11 -04:00
Robert Maynard	c9ba80ad93	Replace uint with vtkm::Id in DeviceAdapterAlgorithmThrust The usage of uint was causing problems with CUDA + MSVC2015 as the type was not defined. Instead we use vtkm::Id as that was the expected type to be passed to the task	2018-05-02 09:55:56 -04:00
Robert Maynard	b56894dd09	Move VTK-m Cuda backend over to a grid-stride iteration pattern. This allows for easier host side logic when determining grid and block sizes, and allows for a smaller library size by moving some logic into compiled in functions.	2018-04-30 17:29:26 -04:00
Robert Maynard	b7e6371842	Correct issues found by enabling more CUDA warnings.	2018-04-23 14:27:53 -04:00
Matt Larsen	715141737f	Merge topic 'typos' efdf8543 Misc. Typos Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Matt Larsen <mlarsen@cs.uoregon.edu> Merge-request: !1113	2018-04-06 18:04:46 -04:00
Robert Maynard	84311a2453	Merge branch 'master' into cmake_refactor	2018-04-05 10:18:36 -04:00
Robert Maynard	c123796949	VTK-m ArrayHandle can now take ownership of a user allocated memory location Previously memory that was allocated outside of VTK-m was impossible to transfer to VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors to support a Storage object that is being moved, we can clearly express that the ArrayHandle now owns memory it didn't allocate. Here is an example of how this is done: ```cpp T* buffer = new T[100]; auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); }; vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic> storage(buffer, 100, user_free_function); vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage)); ```	2018-04-04 11:28:25 -04:00
Robert Maynard	707970f492	VTK-m StorageBasic is now able to give/take ownership of user allocated memory. This fixes the three following issues with StorageBasic. 1. Memory that was allocated by VTK-m and Stolen by the user needed the proper free function called which is generally StorageBasicAllocator::deallocate. But that was hard for the user to hold onto. So now we provide a function pointer to the correct free function. 2. Memory that was allocated outside of VTK-m was impossible to transfer to VTK-m as we didn't know how to free it. This is now resolved by allowing the user to specify a free function to be called on release. 3. When the CUDA backend allocates memory for an ArrayHandle that has no control representation, and the location we are running on supports concurrent managed access we want to specify that cuda managed memory as also the host memory. This requires that StorageBasic be able to call an arbitrary new delete function which is chosen at runtime.	2018-04-04 11:27:57 -04:00
Robert Maynard	8808b41fbd	Merge branch 'master' into vtk-m-cmake_refactor	2018-03-29 22:51:26 -04:00
Robert Maynard	944bc3c0d6	Introduce vtkm::cont::ColorTable replacing vtkm::rendering::ColorTable The new and improved vtkm::cont::ColorTable provides a more feature complete color table implementation that is modeled after vtkDiscretizableColorTransferFunction. This class therefore supports different color spaces ( rgb, lab, hsv, diverging ) and supports execution across all device adapters.	2018-03-28 16:11:23 -04:00
luz.paz	efdf854306	Misc. Typos Found via `codespell` and `grep`	2018-03-28 09:45:07 -04:00
Robert Maynard	2bfbf0a902	Transfer of virtuals to the CUDA device now properly uses streams This way when multiple threads are using VTK-m they won't all block while one transfers a class with virtuals to the device.	2018-03-20 17:04:41 -04:00
Robert Maynard	6202d8d22d	CudaAllocator guards all CUDA 8.0+ calls behind ifdef's.	2018-02-26 16:37:57 -05:00
Robert Maynard	e630ac5aa4	Merge branch 'master' into vtk-m-cmake_refactor	2018-02-23 14:52:00 -05:00
Robert Maynard	705528bf17	vtk-m ArrayHandle + basic storage has an optimized PrepareForDevice method By hard coding the PrepareForDevice to know about all the different VTK-m devices, we can have a single base class do the execution allocation, and not have that logic repeated in each child class.	2018-02-16 10:00:28 -05:00
Robert Maynard	22f9ae3d24	vtk-m ArrayHandle + basic holds control data by StorageBasicBase By making the array handle hold the control side data by the parent storage class we remove significant code generation.	2018-02-16 09:59:20 -05:00
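
The deferred-free mechanism described in commit 1212081de1 (queue pointers, then release the whole pool once a threshold is reached) can be illustrated with a minimal sketch. This is not VTK-m's implementation: the class name `DeferredFreePool`, its methods, and the injectable free function are all hypothetical, and a plain function pointer stands in for `cudaFree` so the example builds without a CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical illustration only; not the class used in VTK-m.
// Pointers handed to Free() are queued and released together once
// `threshold` of them have accumulated, so an expensive, blocking
// free call (cudaFree in the commit) runs in batches.
class DeferredFreePool
{
public:
  using FreeFunction = void (*)(void*);

  DeferredFreePool(std::size_t threshold, FreeFunction freeFn)
    : Threshold(threshold)
    , FreeFn(freeFn)
  {
    assert(threshold > 0 && freeFn != nullptr);
  }

  // Release anything still queued when the pool goes away.
  ~DeferredFreePool() { this->Flush(); }

  // Queue a pointer. Returns true if this call triggered a batch free.
  bool Free(void* ptr)
  {
    std::vector<void*> batch;
    {
      std::lock_guard<std::mutex> lock(this->Mutex);
      this->Pending.push_back(ptr);
      if (this->Pending.size() < this->Threshold)
      {
        return false;
      }
      batch.swap(this->Pending); // take the whole pool, leave it empty
    }
    // Free outside the lock so other threads can keep queueing.
    for (void* p : batch)
    {
      this->FreeFn(p);
    }
    return true;
  }

  // Free everything currently queued, regardless of the threshold.
  void Flush()
  {
    std::vector<void*> batch;
    {
      std::lock_guard<std::mutex> lock(this->Mutex);
      batch.swap(this->Pending);
    }
    for (void* p : batch)
    {
      this->FreeFn(p);
    }
  }

  std::size_t PendingCount() const
  {
    std::lock_guard<std::mutex> lock(this->Mutex);
    return this->Pending.size();
  }

private:
  std::size_t Threshold;
  FreeFunction FreeFn;
  mutable std::mutex Mutex;
  std::vector<void*> Pending;
};
```

With a threshold of 3, the first two calls to `Free` only queue their pointers; the third call releases all three in one batch. In a CUDA build the free function would presumably wrap `cudaFree`, which is an assumption here rather than something the commit message spells out.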


277 Commits