vtk-m

mirror of https://gitlab.kitware.com/vtk/vtk-m synced 2024-09-08 13:23:51 +00:00

Author	SHA1	Message	Date
Sujin Philip	452f61e290	Add Kokkos backend	2020-08-12 13:55:24 -04:00
Kenneth Moreland	18b5be92d6	Fix issue with CUDA and ArrayHandleMultiplexer When you try to call the `Reduce` operation in the CUDA device adapter with a sufficiently complex iterator type, you get a compile error that says `error: cannot pass an argument with a user-provided copy-constructor to a device-side kernel launch`. This appears to be a bug in either nvcc or Thrust. I believe it is related to the following reported issues: * https://github.com/thrust/thrust/issues/928 * https://github.com/thrust/thrust/issues/1044 Work around this problem by making a special condition for calling `Reduce` with an `ArrayHandleMultiplexer` that calls the generic algorithm in `DeviceAdapterAlgorithmGeneral` instead of the algorithm in Thrust.	2020-07-06 13:51:36 -06:00
Kenneth Moreland	a47fd42bc1	Pin user provided memory in ArrayHandle Often when a user gives memory to an `ArrayHandle`, she wants data to be written into the memory given to be used elsewhere. Previously, the `Buffer` objects would delete the given buffer as soon as a write buffer was created elsewhere. That was a problem if a user wants VTK-m to write results right into a given buffer. Instead, when a user provides memory, "pin" that memory so that the `ArrayHandle` never deletes it.	2020-06-25 14:02:46 -06:00
Kenneth Moreland	56bec1dd7b	Replace basic ArrayHandle implementation to use Buffers This encapsulates a lot of the required memory management into the Buffer object and related code. Many now unneeded classes were deleted.	2020-06-25 14:02:26 -06:00
Kenneth Moreland	8f7b0d18be	Add Buffer class The buffer class encapsulates the movement of raw C arrays between host and devices. The `Buffer` class itself is not associated with any device. Instead, `Buffer` is used in conjunction with a new templated class named `DeviceAdapterMemoryManager` that can allocate data on a given device and transfer data as necessary. `DeviceAdapterMemoryManager` will eventually replace the more complicated device adapter classes that manage data on a device. The code in `DeviceAdapterMemoryManager` is actually enclosed in virtual methods. This allows us to limit the number of classes that need to be compiled for a device. Rather, the implementation of `DeviceAdapterMemoryManager` is compiled once with whatever compiler is necessary, and then the `RuntimeDeviceInformation` is used to get the correct object instance.	2020-06-25 14:01:39 -06:00
Kenneth Moreland	4f9fa08fa1	Remove ArrayHandleStreaming capabilities The `ArrayHandleStreaming` class stems from an old research project experimenting with bringing data from an `ArrayHandle` in parts and overlapping device transfer and execution. It works, but only in very limited contexts. Thus, it is not actually used today. Plus, the feature requires global indexing to be permuted throughout the worklet dispatching classes of VTK-m for no further reason. Because it is not really used, there are other more promising approaches on the horizon, and it makes further scheduling improvements difficult, we are removing this functionality.	2020-03-24 15:01:56 -06:00
Robert Maynard	8377806778	Merge topic 'introduce_mapfield_3d_scheduling' 1f1688483 Initial infrastructure to allow WorkletMapField to have 3D scheduling Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1938	2020-02-27 08:02:52 -05:00
Kenneth Moreland	ec34cb56c4	Use new ways to get array portal in control environment Also fix deadlocks that occur when portals are not destroyed in time.	2020-02-26 13:10:46 -07:00
Robert Maynard	1f1688483e	Initial infrastructure to allow WorkletMapField to have 3D scheduling	2020-02-25 15:23:41 -05:00
Kenneth Moreland	3671cbe168	Fix token issues with CUDA	2020-02-25 09:39:30 -07:00
Kenneth Moreland	ad0a53af71	Convert execution preparation to use tokens Marked the old versions of PrepareFor* that do not use tokens as deprecated and moved all of the code to use the new versions that require a token. This makes the scope of the execution object more explicit so that it will be kept while in use and can potentially be reclaimed afterward.	2020-02-25 09:39:19 -07:00
Kenneth Moreland	76ce9c87f0	Support using Token calling PrepareForExecution in ExecutionObject The old version of ExecutionObject (that only takes a device) is still supported, but you will get a deprecated warning if that is what is defined. Supporting this also included sending vtkm::cont::Token through the vtkm::cont::arg::Transport mechanism, which was a change that propagated through a lot of code.	2020-02-25 07:41:39 -07:00
Allison Vacanti	46b7155bdb	Add 64-bit CUDA atomic store.	2020-01-08 10:58:51 -05:00
Allison Vacanti	539f6e5ad7	Port benchmarking framework to Google Benchmark.	2020-01-08 10:58:51 -05:00
Allison Vacanti	44c4f0838f	Add vtkm/Algorithms.h header with device-friendly binary search algorithms.	2019-12-20 12:35:10 -05:00
Allison Vacanti	813f5a422f	Fixup custom portal iterator logic. The convenience functions `ArrayPortalToIteratorBegin()` and `ArrayPortalToIteratorEnd()` wouldn't detect specializations of `ArrayPortalToIterators<PortalType>` since the specializations aren't visible when the `Begin`/`End` functions are declared. Since the CUDA iterators rely on a specialization, the convenience functions would not compile on CUDA. Now, instead of specializing `ArrayPortalToIterators` to provide custom iterators for a particular portal, the portal may advertise custom iterators by defining `IteratorType`, `GetIteratorBegin()`, and `GetIteratorEnd()`. `ArrayPortalToIterators` will detect such portals and automatically switch to using the specialized portals. This eliminates the need for the specializations to be visible to the convenience functions and allows them to be usable on CUDA.	2019-12-17 15:39:51 -05:00
Kenneth Moreland	92db376236	Convert uses of ListTagBase to List	2019-12-06 15:37:46 -07:00
Kenneth Moreland	5ab0b5bb1d	Access ArrayHandle internals in a critical section Repeat the changes of the previous commit with the specialized ArrayHandle for basic storage.	2019-11-20 14:42:58 -07:00
Robert Maynard	5c56ff945f	Label tests which exercise a given Device Adapter This allows developers an easy way to run all OpenMP tests	2019-09-13 15:52:40 -04:00
Allison Vacanti	b9affb7edc	Disable copy for RAII helper.	2019-09-09 17:59:38 -04:00
Allison Vacanti	ea0bbfeefc	Increase CUDA stack size for ParticleAdvection worklets. Sometimes the CUDA runtime would not allocate sufficient stack space for the particle advection code to run. This issue was exposed by !1737 -- for some reason, once those changes to unrelated filters/worklets are added to VTK, CUDA allocates less stack and the following tests would fail: UnitTestLagrangianFilterCUDA UnitTestLagrangianStructuresFilterCUDA UnitTestStreamlineFilterCUDA UnitTestStreamSurfaceFilterCUDA These were fixed by increasing the stack size in the particle advection worklet Run(...) methods. An RAII helper has been added that will restore the previous stack size in case an exception is thrown, and the KDTree code has been updated to use this helper when it adjusts the CUDA stack allocation.	2019-09-09 16:06:23 -04:00
Allison Vacanti	884616788a	Simplify and extend AtomicArray implementation. - Use AtomicInterface to implement device-specific atomic operations. - Remove DeviceAdapterAtomicArrayImplementations. - Extend supported atomic types to include unsigned 32/64-bit ints. - Add a static_assert to check that AtomicArray type is supported. - Add documentation for AtomicArrayExecutionObject, including a CAS example. - Add a `T Get(idx)` method to AtomicArrayExecutionObject that does an atomic load, and update existing CAS usage to use this instead of `Add(idx, 0)`.	2019-08-23 15:40:37 -04:00
Allison Vacanti	0e728c8000	Update atomic interfaces to support Add/CAS for UInt32/64. These will be used for the AtomicArray implementation.	2019-08-23 15:40:37 -04:00
Allison Vacanti	112024dae2	Fix CUDA shfl usage. There was a bug in the implementations of CountSetBits and BitFieldToUnorderedSet.	2019-08-01 10:57:57 -04:00
Kenneth Moreland	5e23853521	Create ArrayHandleMultiplexer	2019-07-22 08:36:28 -06:00
Mark Kim	8dbb1c4de3	Merge branch 'master' of gitlab.kitware.com:m-kim/vtk-m into advdatamodel	2019-06-26 19:37:47 -04:00
Allison Vacanti	920ef9b3b9	Merge topic 'bit_algorithms' f370857c1 Add CountSetBits and Fill device algorithms. Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Robert Maynard <robert.maynard@kitware.com> Merge-request: !1696	2019-06-25 15:42:16 -04:00
Allison Vacanti	f370857c15	Add CountSetBits and Fill device algorithms.	2019-06-25 11:30:39 -04:00
Mark Kim	699b57191f	Merge branch 'master' of gitlab.kitware.com:m-kim/vtk-m into advdatamodel	2019-06-25 10:36:47 -04:00
Mark Kim	cffd3873fc	Merge branch 'advdatamodel'	2019-06-20 22:20:44 -04:00
Mark Kim	6e1d3a84f0	First Extrude commit. how did any of this work? match other CellSet file layouts. ??? compile in CUDA. unit tests. also only serial. make error message accurate Well, this compiles and works now. Did it ever? use CellShapeTagGeneric UnitTest matches previous changes. whoops Fix linking problems. Need the same interface as other ThreadIndices. add filter test okay, let's try duplicating CellSetStructure. okay inching... change to wedge in CellSetListTag Means changing these to support it. switch back to wedge from generic compiles and runs remove ExtrudedType need vtkm_worklet vtkm_worklet needs to be included fix segment count for wedge specialization need to actually save the index for the other constructor. specialize on Explicit clean up warning angled brackets not quotes. formatting	2019-06-20 22:17:24 -04:00
Robert Maynard	1ea386222e	cuda copy functions don't launch on length zero arrays	2019-06-20 16:54:23 -04:00
Robert Maynard	8aaf922aa4	Introduce a log level that details kernel launch parameters	2019-06-18 15:01:07 -04:00
Kenneth Moreland	f11702ae92	Fix for rogue definition of PASCAL macro	2019-06-05 10:09:49 -06:00
Robert Maynard	4020f51988	RuntimeDeviceTracker can't be copied and is only accessible via reference. As the RuntimeDeviceTracker is a per thread construct we now make it explicit that you can only get a reference to the per-thread version and can't copy it.	2019-05-20 11:43:05 -04:00
Robert Maynard	d1ce4a0bca	Fix the default launch sizes for Tesla hardware. The 8x8x8 is a better launch strategy for most VTK-m kernels. The current problem is that a couple of VTK-m kernels use a high number of registers and this number of threads combines to require too many registers. What we should do in the longer run is have more controls over kernel launches on a per kernel basis. This will require VTK-m to extract the number of registers being used by each kernel	2019-05-06 16:12:15 -04:00
Robert Maynard	770912f991	Correct compiler issues found with GCC 4.8.5 + CUDA 9.2 on summit	2019-05-02 10:27:48 -04:00
Robert Maynard	065d117838	Testing Device Adapter now uses ArrayHandle for all device transfers The consistent API for control to execution memory transfers is the ArrayHandle class. Previously the tests would verify memory transfer by calling the ArrayManagerExecution class directly. This is problematic as the class isn't used by ArrayHandle<T, StorageBasic>.	2019-04-30 13:50:08 -04:00
Robert Maynard	63c931e639	Correct location of ThrustPatches which clang formatter moved	2019-04-23 15:02:58 -04:00
Robert Maynard	ff687016ee	For VTK-m libs all includes of DeviceAdapterTagCuda happen from cuda files It is very easy to cause ODR violations with DeviceAdapterTagCuda. If you include that header from a C++ file and a CUDA file inside the same program we get an ODR violation. The reason is that the C++ version will say the tag is invalid, and the CUDA version will say the tag is valid. The solution to this is that any compilation unit that includes DeviceAdapterTagCuda from a version of VTK-m that has CUDA enabled must be invoked by the cuda compiler.	2019-04-22 10:39:54 -04:00
nadavi	fbcea82e78	consolidate the license statement	2019-04-17 10:57:13 -06:00
Robert Maynard	6c5c197a37	Merge topic 'support_cuda_scheduling_parameters_via_runtime' 047b64651 VTK-m now provides better scheduling parameters controls Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1643	2019-04-17 10:04:19 -04:00
Robert Maynard	047b646517	VTK-m now provides better scheduling parameters controls VTK-m now offers a more GPU aware set of defaults for kernel scheduling. When VTK-m first launches a kernel we do system introspection and determine which GPUs are on the machine and then match this information to a preset table of values. The implementation is designed in a way that allows for VTK-m to offer both specific presets for a given GPU ( V100 ) or for an entire generation of cards ( Pascal ). Currently VTK-m offers preset tables for the following GPUs: - Tesla V100 - Tesla P100 If the hardware doesn't match a specific GPU card we then try to find the nearest known hardware generation and use those defaults. Currently we offer defaults for - Older than Pascal Hardware - Pascal Hardware - Volta+ Hardware Some users have workloads that don't align with the defaults provided by VTK-m. When that is the case, it is possible to override the defaults by binding a custom function to `vtkm::cont::cuda::InitScheduleParameters`, as shown below: ```cpp ScheduleParameters CustomScheduleValues(char const* name, int major, int minor, int multiProcessorCount, int maxThreadsPerMultiProcessor, int maxThreadsPerBlock) { ScheduleParameters params { 64 * multiProcessorCount, //1d blocks 64, //1d threads per block 64 * multiProcessorCount, //2d blocks { 8, 8, 1 }, //2d threads per block 64 * multiProcessorCount, //3d blocks { 4, 4, 4 } }; //3d threads per block return params; } vtkm::cont::cuda::InitScheduleParameters(&CustomScheduleValues); ```	2019-04-17 08:32:16 -04:00
Robert Maynard	ff30684c8e	Removes the default device macros from VTK-m Fixes #116	2019-04-15 08:15:36 -04:00
Robert Maynard	a5dbe1ece3	Merge topic 'bitfields' 661fb64de AtomicInterfaceControl functions are marked with VTKM_SUPPRESS_EXEC_WARNINGS 0c70f9b9a Add BitFieldIn/Out/InOut worklet signature tags. a66510e81 Add ArrayHandleBitField, a boolean-valued AH backed by a BitField. 56cc5c3d3 Add support for BitFields. d01b97382 Allow VTKM_SUPPRESS_EXEC_WARNINGS to be used inside macros. 2f2ca9370 Add bit operations FindFirstSetBit and CountSetBits to Math.h. Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !1629	2019-04-11 12:32:03 -04:00
Allison Vacanti	56cc5c3d3a	Add support for BitFields. BitFields are: - Stored in memory using a contiguous buffer of bits. - Accessible via portals, a la ArrayHandle. - Portals operate on individual bits or words. - Operations may be atomic for safe use from concurrent kernels. The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle containing the indices of all set bits, in no particular order. The new AtomicInterface classes provide an abstraction into bitwise atomic operations across control and execution environments and are used to implement the BitPortals.	2019-04-11 08:27:17 -04:00
Robert Maynard	89ec4aae2f	Reduction on CUDA handles different input and output types better When reducing an input type that differs from the output type you need to write a custom binary operator that also implements how to do the unary transformation.	2019-04-10 14:44:44 -04:00
Robert Maynard	1d20ae4f7b	Move DeviceAdapterTag to vtkm/cont	2019-04-04 11:58:51 -04:00
Robert Maynard	7f612502ac	Merge topic 'remove_unneeded_cont_exec_markup' f1056affa Move select functions to host only to remove host/device suppressions 4f2156dfa Thrust detail::aligned_reinterpret_cast doesn't warn now f4840618c Make sure ThrustPatches is included before thrust. b2bbd66e6 Merge branch 'upstream-taotuple' into update_taoo 4ec6fc812 taotuple 2019-04-03 (8e70fa8a) Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Allison Vacanti <allison.vacanti@kitware.com> Merge-request: !1607	2019-04-04 09:32:04 -04:00
Sujin Philip	c6bead8388	Rename CellLocatorTwoLevelUniformGrid to CellLocatorUniformBins Also make it a concrete sub-class of vtkm::cont::CellLocator Fixes issue #251	2019-04-03 10:21:56 -04:00
Robert Maynard	f4840618cf	Make sure ThrustPatches is included before thrust.	2019-04-03 08:51:05 -04:00
Robert Maynard	b9e0e541b8	VTK-m once again uses consistent include style	2019-03-28 14:12:08 -04:00
Robert Maynard	256e0c3c11	Merge topic 'rename_to_GetRuntimeDeviceTracker' ae11e115a RuntimeDeviceTracker: Remove `Global` from names Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1592	2019-03-24 08:17:02 -04:00
Robert Maynard	ae11e115a0	RuntimeDeviceTracker: Remove `Global` from names	2019-03-22 08:53:26 -07:00
Robert Maynard	6cdf6cb672	Less aggressive defaults for VTK-m compared to summit. Since we don't have per system checks currently built into vtk-m we can't use the tuned values for Summit, as they don't run on all our hardware.	2019-03-20 09:30:34 -07:00
Robert Maynard	3879479185	Improve VTK-m cuda scheduling based on Summit scaling study When benchmarking the VTK-m algorithms on Summit I discovered that our scheduling choices aren't optimal for the hardware. This is a short term fix where we select good numbers for Summit, and in the future make the defaults controllable by the calling program and/or environment variables. Performance numbers can be found at: https://gitlab.kitware.com/snippets/755	2019-03-20 09:30:34 -07:00
Kenneth Moreland	4d9ce24888	Synchronize CUDA timer when stopping it Previously, when Stop was called on a Cuda timer, it would record a stop event but it would not synchronize it at that time. Instead, the synchronize was only called when GetElapsedTime was called. The problem is that the time of the event is only marked when synchronize is called. Thus, if the event completed before GetElapsedTime was called, it would record the time from when the event actually happened to the time when GetElapsedTime was called as part of the elapsed time, which is incorrect. Fix the problem by synchronizing when Stop is called. Although this makes the Timer more invasive, generally using the Timer can cause synchronization to happen. This behavior is consistent with the Timer implementation for other devices.	2019-02-28 15:08:32 -07:00
Kenneth Moreland	85265a9c84	Add const correctness to Timer It should be possible to query a vtkm::cont::Timer without modifying it. As such, its query functions (such as Stopped and GetElapsedTime) should be const.	2019-02-28 15:08:16 -07:00
Haocheng LIU	0696ae135e	Merge topic 'asynchronize-timer' 415252c66 Introduce asynchronous and device independent timer Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Haocheng LIU <haocheng.liu@kitware.com> Acked-by: Robert Maynard <robert.maynard@kitware.com> Merge-request: !1530	2019-02-05 12:02:59 -05:00
Haocheng LIU	415252c662	Introduce asynchronous and device independent timer The timer class now is asynchronous and device independent. It uses a similar API to vtkOpenGLRenderTimer, with Start(), Stop(), Reset(), Ready(), and GetElapsedTime() functions. For convenience and backward compatibility, each Start() call will call Reset() internally, and each GetElapsedTime() call will call Stop() if it hasn't been called yet. Basically it can be used in two modes: * Create a Timer without any device info. vtkm::cont::Timer time; * It would enable timers for all enabled devices on the machine. Users can get a specific elapsed time by passing a device id into the GetElapsedTime function. If no device is provided, it would pick the maximum of all timer results - the logic behind this decision is that if cuda is disabled, openmp, serial and tbb roughly give the same results; if cuda is enabled it's safe to return the maximum elapsed time since users are more interested in the device execution time rather than the kernel launch time. The Ready function can be handy here to query the status of the timer. * Create a Timer with a device id. vtkm::cont::Timer time((vtkm::cont::DeviceAdapterTagCuda())); * It works as the old timer that times for a specific device id.	2019-02-05 12:01:56 -05:00
Robert Maynard	d0a70946b8	Simplify the DeviceAdapterRuntimeDetectorCuda to not do a kernel launch. The kernel launch component of the runtime device adapter is fairly pointless. If the hardware supports CUDA we should expect that VTK-m has the correct kernel versions. Plus in the original version if the CUDA device was being used and the kernel launch returns cudaErrorDevicesUnavailable it was never possible to restore CUDA support. Now what happens is that the runtime tracker is marked as failed, but the calling code can always go back and try the device again.	2019-02-04 13:27:20 -05:00
Robert Maynard	5508d17c31	Merge topic 'correct_broken_install' 24e71d251 VTK-m yet again has properly installed headers. Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1525	2019-01-24 14:59:41 -05:00
Robert Maynard	24e71d251b	VTK-m yet again has properly installed headers. Fixes the install issues mentioned in #342	2019-01-24 14:26:40 -05:00
Allison Vacanti	03fc7b66d0	Add VTKM_CUDA_DEVICE_PASS preprocessing definition. This is only set while compiling device code, and is useful for code that needs different implementations on devices (e.g. they call CUDA device intrinsics, etc).	2019-01-24 11:23:45 -05:00
Robert Maynard	d6f66d17a3	Testing run methods now take argc/argv to init logging/runtime device `vtkm::cont::testing` now initializes with logging enabled and support for device being passed on the command line, `vtkm::testing` only enables logging.	2019-01-17 13:16:27 -06:00
Robert Maynard	4ec5bae02d	Remove VTK-m TestBuild infrastructure The purpose of the TestBuild infrastructure was to confirm that VTK-m didn't have any lexical issues when it was a pure header only project. As we now move to have more compiled components the need for this form of testing is mitigated. Combined with the issue of TestBuilds causing MSVC issues, we should just remove this infrastructure.	2019-01-16 10:04:33 -06:00
Kenneth Moreland	2e426ad547	Run the update-control-signature-tags.sh script	2019-01-11 12:23:10 -07:00
Robert Maynard	628dce822e	Merge topic 'require_cmake38' f1e1a524e Require CMake 3.8 to build VTK-m. Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1514	2019-01-09 17:02:52 -05:00
Abhishek Yenpure	afd0409189	Merge topic 'code_sprint_locator_fixes' 9b56d41fe Fixing Rectilinear Grid Cell Locator 10e9d47dc Removing std::out print statement from test 34c7b57d8 Merge branch 'code_sprint_locator_fixes' of gitlab.kitware.com:ayenpure/vtk-m into code_sprint_locator_fixes 62ee1a2c8 Updates to the Cell Locators 7eb0de5b7 Merge branch 'code_sprint_locator_fixes' of gitlab.kitware.com:ayenpure/vtk-m into code_sprint_locator_fixes 866b0798d Resolving type warnings c062f2e26 Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into code_sprint_locator_fixes 797c83891 Adding default constructor and removing wrong comment ... Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1395	2019-01-09 16:23:17 -05:00
Robert Maynard	f1e1a524e9	Require CMake 3.8 to build VTK-m.	2019-01-09 16:01:22 -05:00
ayenpure	62ee1a2c8a	Updates to the Cell Locators - Adding updates to uniform grid cell locator - adding OpenMP test, updating copyrights - Adding rectilinear grid cell locator - adding unit tests for serial, tbb, OpenMP, and cuda - Updating CMakeLists to honor the alphabetical ordering	2019-01-06 17:18:23 -08:00
Robert Maynard	718caaaeac	CudaAllocator allows managed memory to be explicitly disabled	2018-12-28 11:30:29 -05:00
Robert Maynard	90bb23de6b	CudaAllocator::Initialize correctly uses managed memory when possible Previously the logic would always think managed memory wasn't supported	2018-12-20 17:21:55 -05:00
ayenpure	c062f2e26c	Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into code_sprint_locator_fixes	2018-12-03 07:44:31 -08:00
Allison Vacanti	16c4dde2ee	Merge topic 'cuda10_warning' 0e105eae6 cudaPointerAttributes::isManaged deprecated in CUDA 10. Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Robert Maynard <robert.maynard@kitware.com> Merge-request: !1430	2018-10-10 15:05:57 -04:00
Allison Vacanti	0e105eae6d	cudaPointerAttributes::isManaged deprecated in CUDA 10. Update code to support both the old and new way of checking this.	2018-10-10 13:51:56 -04:00
Allison Vacanti	bd337854ec	Initial implementation of general logging. Addresses #291.	2018-10-02 11:37:55 -04:00
Kenneth Moreland	98a0a20feb	Allow ArrayHandleTransform to work with ExecObject This change allows you to set a subclass of vtkm::cont::ExecutionObjectBase as a functor used in ArrayHandleTransform. This latter class will then detect that the functor is an ExecObject and will call PrepareForExecution with the appropriate device to get the actual Functor object. This change allows you to use virtual objects and other device dependent objects as functors for ArrayHandleTransform without knowing a priori what device the portal will be used on.	2018-09-05 13:11:04 -06:00
ayenpure	22ca8bce15	Fixing unit test	2018-08-30 10:19:00 -07:00
ayenpure	42e2bb7f9a	Updating files with copyrights	2018-08-29 19:46:49 -07:00
ayenpure	594d1934d4	Adding CellLocatorUniformGrid - Adding a cell locator to locate points in a uniform grid - Adding unit tests for the new cell locator	2018-08-29 19:30:07 -07:00
Kenneth Moreland	d879188de0	Make DispatcherBase invoke using a TryExecute Rather than force all dispatchers to be templated on a device adapter, instead use a TryExecute internally within the invoke to select a device adapter. Because this removes the need to declare a device when invoking a worklet, this commit also removes the need to declare a device in several other areas of the code.	2018-08-29 19:18:54 -07:00
Allison Vacanti	024a75821d	Make DeviceAdapterId constructor protected. This forces users to use a defined tag, since they shouldn't need to create their own.	2018-08-24 16:38:08 -04:00
Haocheng LIU	7d22132253	Merge topic 'allow-disabling/enabling-cuda-managed-memory' e34301eca Allow disabling/enabling of CUDA managed memory via an env variable Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Robert Maynard <robert.maynard@kitware.com> Merge-request: !1359	2018-08-17 13:14:02 -04:00
Haocheng LIU	e34301eca8	Allow disabling/enabling of CUDA managed memory via an env variable By setting the environment variable "VTKM_MANAGEDMEMO_DISABLED" to be 1, users are able to disable CUDA managed memory even though the hardware is capable of doing so.	2018-08-17 11:10:15 -04:00
Sujin Philip	1212081de1	Support deferred freeing for CUDA memory Calls to 'cudaFree' block execution on all cuda devices. Reduce the number of times this happens by having a deferred free mechanism that frees a pool of pointers together when a threshold is reached. Especially helpful during virtual object transfers that require a few small allocations and frees.	2018-08-16 12:05:36 -04:00
Allison Vacanti	f6da092146	Use __CUDA_ARCH__ instead of __CUDACC__ to guard device-only code. __CUDACC__ is defined when compiling host code under nvcc, while __CUDA_ARCH__ is only defined for device code.	2018-08-09 11:57:05 -04:00
Allison Vacanti	2c079b96dd	Make AtomicArrays work on CUDA 8. CUDA 8.0 is erroring out in the cuda AtomicArray implementation: https://open.cdash.org/viewBuildError.php?buildid=5489156 This patch fixes the error. See comments in source for more info.	2018-08-08 15:26:32 -04:00
Haocheng LIU	ce9cd8072a	Use std::call_once to construct singletons By using `call_once` from C++11, we can simplify the logic in code where we are querying same value variables from multiple threads.	2018-08-06 16:36:03 -04:00
Sujin Philip	259d670ab5	Merge topic 'cuda-per-thread-streams-2' 06dee259f Minimize cuda synchronizations Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1288	2018-07-25 15:07:39 -04:00
Robert Maynard	e031e64967	ExecutionArrayInterfaceBasic<T> explicitly construct DeviceAdapterId objects Rather than implicitly presume the `VTKM_DEVICE_ADAPTER_` macros can convert to DeviceAdapterId.	2018-07-25 12:04:30 -04:00
Robert Maynard	86b9ab9969	Refactor ExecutionArrayInterfaceBasic to use inheriting constructors	2018-07-25 12:03:48 -04:00
Robert Maynard	4240111dd8	Make sure VirtualObjectHandle tests include RuntimeDeviceTracker	2018-07-18 10:37:46 -04:00
Robert Maynard	8077b031a8	Merge topic 'uncomment_cuda_range_test' 1e478bbe6 Re-enable UnitTestCudaComputeRange Acked-by: Kitware Robot <kwrobot@kitware.com> Merge-request: !1321	2018-07-17 13:28:05 -04:00
Robert Maynard	1e478bbe63	Re-enable UnitTestCudaComputeRange	2018-07-17 11:43:19 -04:00
Robert Maynard	bf49575e00	Remove unneeded typeinfo includes	2018-07-17 11:41:53 -04:00
Robert Maynard	64958b014b	VTK-m now supports passing pointers when invoking worklets. The original design of invoke and the transport infrastructure relied on the implementation behavior of vtkm::cont types such as ArrayHandle that used an internal shared_ptr to manage state. This allowed passing by value instead of passing by non-const ref when needing to transfer information to the device. As VTK-m adds support for classes that use virtuals the ability to pass by base pointer type allows for us to invoke worklets using a base type without the risk of type slicing. Additionally, by moving over to a non-const ref Invocation we can update all transports that have 'output' to now be by ref and therefore support types that can't be copied while being 'more' correct.	2018-07-06 14:27:36 -04:00
Robert Maynard	9238cedcab	Merge topic 'ice_nvcc_on_renar' 5ced0da8f Try to ice the ubuntu 17.10 + cuda 9.1 compiler Acked-by: Kitware Robot <kwrobot@kitware.com> Acked-by: Kenneth Moreland <kmorel@sandia.gov> Merge-request: !1305	2018-07-05 11:36:16 -04:00
Robert Maynard	5ced0da8f5	Try to ice the ubuntu 17.10 + cuda 9.1 compiler	2018-07-05 09:14:52 -04:00
Robert Maynard	e5090e1289	Make sure the PointLocatorUniform uses the correct runtime device	2018-07-03 17:42:57 -04:00


412 Commits