Commit Graph

1612 Commits

Author SHA1 Message Date
Haocheng LIU
7d22132253 Merge topic 'allow-disabling/enabling-cuda-managed-memory'
e34301eca Allow disabling/enabling of CUDA managed memory via an env variable

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1359
2018-08-17 13:14:02 -04:00
Haocheng LIU
e34301eca8 Allow disabling/enabling of CUDA managed memory via an env variable
By setting the environment variable "VTKM_MANAGEDMEMO_DISABLED" to 1,
users can disable CUDA managed memory even when the hardware
supports it.
2018-08-17 11:10:15 -04:00
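
A minimal sketch of how such an opt-out could be read, assuming the variable is checked with `std::getenv` and compared against "1". The helper names `ManagedMemoryDisabledByEnv` and `UseManagedMemory` are hypothetical and are not taken from the VTK-m sources.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not VTK-m's actual code): returns true when the
// environment variable named in the commit message is set to "1".
bool ManagedMemoryDisabledByEnv()
{
  const char* value = std::getenv("VTKM_MANAGEDMEMO_DISABLED");
  return value != nullptr && std::string(value) == "1";
}

// Managed memory is used only when the hardware supports it and the user
// has not opted out through the environment.
bool UseManagedMemory(bool hardwareSupportsManagedMemory)
{
  return hardwareSupportsManagedMemory && !ManagedMemoryDisabledByEnv();
}
```

With this shape, running a program as `VTKM_MANAGEDMEMO_DISABLED=1 ./app` would turn managed memory off without a rebuild.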
Sujin Philip
1212081de1 Support deferred freeing for CUDA memory
Calls to 'cudaFree' block execution on all cuda devices. Reduce the number of
times this happens by having a deferred free mechanism that frees a pool
of pointers together when a threshold is reached.

Especially helpful during virtual object transfers that require a few small
allocations and frees.
2018-08-16 12:05:36 -04:00
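
The deferred free mechanism described above can be sketched roughly as follows. This is an illustration only: the class name `DeferredFreePool` is hypothetical, and a caller-supplied `release` callback stands in for `cudaFree` so the sketch runs without a GPU.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch of a deferred-free pool. The release callback stands
// in for cudaFree; it is not VTK-m's implementation.
class DeferredFreePool
{
public:
  DeferredFreePool(std::size_t threshold, std::function<void(void*)> release)
    : Threshold(threshold)
    , Release(std::move(release))
  {
  }

  // Queue a pointer. Once the threshold is reached, release the whole
  // batch together so the blocking calls happen back to back.
  void Free(void* ptr)
  {
    std::vector<void*> batch;
    {
      std::lock_guard<std::mutex> lock(this->Mutex);
      this->Pending.push_back(ptr);
      if (this->Pending.size() < this->Threshold)
      {
        return; // keep deferring
      }
      batch.swap(this->Pending);
    }
    for (void* p : batch)
    {
      this->Release(p);
    }
  }

  // Number of pointers currently waiting to be released.
  std::size_t PendingCount() const
  {
    std::lock_guard<std::mutex> lock(this->Mutex);
    return this->Pending.size();
  }

private:
  std::size_t Threshold;
  std::function<void(void*)> Release;
  mutable std::mutex Mutex;
  std::vector<void*> Pending;
};
```

The batch is released outside the lock so that other threads can keep queueing pointers while the slow release calls run.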
Robert Maynard
20a62ae560 Merge topic 'use_better_runtime_device_representation'
28e0eb9da Replace FindDeviceAdapterTagAndCall with TryExecuteOnDevice

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1356
2018-08-14 14:59:36 -04:00
Allison Vacanti
f6da092146 Use __CUDA_ARCH__ instead of __CUDACC__ to guard device-only code.
__CUDACC__ is defined when compiling host code under nvcc, while
__CUDA_ARCH__ is only defined for device code.
2018-08-09 11:57:05 -04:00
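
A small sketch of the distinction, assuming the standard nvcc macros `__CUDACC__` (defined whenever nvcc is the compiler) and `__CUDA_ARCH__` (defined only during the device compilation pass). The macro `EXAMPLE_HOST_DEVICE` and the function `CompiledForDevice` are hypothetical names for illustration.

```cpp
// __CUDACC__ is defined for both the host and device passes of nvcc, so it
// is only suitable for deciding whether CUDA qualifiers are available.
#if defined(__CUDACC__)
#define EXAMPLE_HOST_DEVICE __host__ __device__
#else
#define EXAMPLE_HOST_DEVICE
#endif

// __CUDA_ARCH__ is defined only while device code is being compiled, so it
// is the right guard for code that must never run on the host.
EXAMPLE_HOST_DEVICE inline bool CompiledForDevice()
{
#if defined(__CUDA_ARCH__)
  return true; // device-only code path
#else
  return false; // host code path, including host code compiled by nvcc
#endif
}
```

Built with a plain host compiler, or in the host pass of nvcc, the function takes the second branch.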
Allison Vacanti
727ebee197 Merge topic 'cuda_array_handles_on_cuda8'
2c079b96d Make AtomicArrays work on CUDA 8.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1357
2018-08-09 10:34:10 -04:00
Robert Maynard
c332dbd0a1 Only add the rules to run openmp tests serially when testing is enabled 2018-08-08 15:46:32 -04:00
Allison Vacanti
2c079b96dd Make AtomicArrays work on CUDA 8.
CUDA 8.0 is erroring out in the cuda AtomicArray implementation:

https://open.cdash.org/viewBuildError.php?buildid=5489156

This patch fixes the error. See comments in source for more info.
2018-08-08 15:26:32 -04:00
Robert Maynard
28e0eb9da6 Replace FindDeviceAdapterTagAndCall with TryExecuteOnDevice
Also add a throwFailedRuntimeDeviceTransfer that throws a nicely
detailed message on why something couldn't be transferred to
the requested device adapter.
2018-08-08 14:53:28 -04:00
Robert Maynard
a3fe97709c Merge topic 'openmp_tests_run_serial'
48cc2f661 Make sure VTK-m runs all OpenMP tests serially.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1354
2018-08-08 13:10:15 -04:00
Robert Maynard
c4fa66aff4 Merge topic 'better_runtime_device_representation'
554bc3d36 At runtime TryExecute supports a specific deviceId to execute on.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1334
2018-08-08 12:41:32 -04:00
Robert Maynard
48cc2f661a Make sure VTK-m runs all OpenMP tests serially.
Fixes issue #276.
OpenMP tests, when run in parallel, exhibit negative scaling as we
have N OpenMP processes each spawning N threads. We speculate that
this causes excessive context switching and swapping and reduces
performance.
2018-08-08 10:01:18 -04:00
luz.paz
7f9b54a31a Misc. typos
Found via `codespell -q 3`
2018-08-07 17:50:41 -04:00
Robert Maynard
554bc3d369 At runtime TryExecute supports a specific deviceId to execute on.
Instead of always using the first enabled device, now TryExecute
can be told which device at runtime to use.
2018-08-07 17:22:18 -04:00
Haocheng LIU
282a2bf8f3 Add more unit tests for OpenMP DeviceAdapter 2018-08-07 11:32:21 -04:00
Haocheng LIU
ccc985748d Merge topic 'use-std-call_once-to-construct-singletons'
ce9cd8072 Use std::call_once to construct singletons

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1350
2018-08-06 17:11:19 -04:00
Haocheng LIU
ce9cd8072a Use std::call_once to construct singletons
By using `call_once` from C++11, we can simplify the logic in code
where multiple threads query variables that hold the same value.
2018-08-06 16:36:03 -04:00
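
As an illustration of the pattern, here is a minimal `std::call_once` singleton. The type `ExampleRegistry` and the accessor `GetExampleRegistry` are hypothetical and only show the shape of the technique, not the code touched by this commit.

```cpp
#include <memory>
#include <mutex>

// Hypothetical singleton payload.
struct ExampleRegistry
{
  int Value = 42;
};

// Every thread may call this concurrently; std::call_once guarantees the
// initializer runs exactly once and that all callers see the result.
ExampleRegistry& GetExampleRegistry()
{
  static std::once_flag flag;
  static std::unique_ptr<ExampleRegistry> instance;
  std::call_once(flag, []() { instance.reset(new ExampleRegistry); });
  return *instance;
}
```

Compared with a hand-written mutex and "is initialized" flag, the once flag keeps the initialization logic in a single call.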
Robert Maynard
3533975694 Remove usages of std::vector from OpenMP reduction algorithm
The OpenMP Device Reduction algorithm previously used a std::vector<T>
to store the reduction results of each thread. This caused problems
when T=bool as the types became a proxy type which isn't usable
with vtkm BinaryOperators.

Additionally by fixing this issue in the FunctorsOpenMP we
can remove a workaround in FunctorsGeneral that caused
compile failures when using complex BinaryOperators
such as MinAndMax.
2018-08-06 13:08:33 -04:00
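
The proxy-type problem can be reproduced in isolation. The sketch below assumes a plain heap array as the replacement storage; the commit does not say what the OpenMP code switched to, and `ExampleLogicalAnd` and `ReducePartials` are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// std::vector<bool> is a packed specialization: operator[] returns a proxy
// object rather than bool&, unlike every other element type.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy class");
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "other element types yield a real reference");

// Hypothetical binary operator that expects real bool values.
struct ExampleLogicalAnd
{
  bool operator()(const bool& a, const bool& b) const { return a && b; }
};

// Store one partial result per thread in a T[] array, which holds real T
// objects even when T is bool, then combine them with the operator.
template <typename T, typename BinaryOp>
T ReducePartials(std::size_t numThreads, T perThreadValue, T init, BinaryOp op)
{
  std::unique_ptr<T[]> partials(new T[numThreads]);
  for (std::size_t i = 0; i < numThreads; ++i)
  {
    partials[i] = perThreadValue; // stand-in for each thread's local result
  }
  T result = init;
  for (std::size_t i = 0; i < numThreads; ++i)
  {
    result = op(result, partials[i]);
  }
  return result;
}
```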
Haocheng LIU
1fcbca3eed Replace std::random_shuffle with std::shuffle
std::random_shuffle is deprecated in C++14 because it typically relies on
std::rand, which does not guarantee a uniform distribution, and because the
underlying algorithm is unspecified. Using std::shuffle can provide a
reliable result in a 64-bit version.
2018-08-02 12:15:58 -04:00
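
A short sketch of the replacement call, assuming a `std::mt19937_64` engine seeded by the caller. The function name `ShuffledIds` is hypothetical, and the engine choice is an assumption rather than something stated in the commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Returns the ids 0..count-1 in a pseudo-random order. std::shuffle takes an
// explicit engine, so the result is reproducible for a given seed.
std::vector<int> ShuffledIds(int count, std::uint64_t seed)
{
  std::vector<int> ids(static_cast<std::size_t>(count));
  std::iota(ids.begin(), ids.end(), 0);
  std::mt19937_64 engine(seed);
  std::shuffle(ids.begin(), ids.end(), engine);
  return ids;
}
```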
Haocheng LIU
c95db1fc78 Use thread_local in GetGlobalRuntimeDeviceTracker if possible
This reduces the cost of getting the per-thread runtime device tracker
and lowers the runtime overhead when a user constructs many
short-lived threads that use VTK-m.
2018-08-01 15:51:24 -04:00
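
The idea can be sketched with a function-local `thread_local` object. `ExampleTracker`, `ExampleTrackerConstructions`, and `GetThreadTracker` are hypothetical names; the real `GetGlobalRuntimeDeviceTracker` may be structured differently.

```cpp
#include <atomic>

// Hypothetical per-thread state.
struct ExampleTracker
{
  int DisabledDevices = 0;
};

// Counts how many trackers were constructed (one per calling thread).
std::atomic<int> ExampleTrackerConstructions{ 0 };

// Each thread gets its own tracker, created on that thread's first call.
// Later calls are a plain lookup with no lock and no map search.
ExampleTracker& GetThreadTracker()
{
  thread_local ExampleTracker tracker = []() {
    ExampleTrackerConstructions.fetch_add(1);
    return ExampleTracker{};
  }();
  return tracker;
}
```

Because the object lives in thread-local storage, it is destroyed automatically when its thread exits, which suits short-lived threads.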
Sujin Philip
259d670ab5 Merge topic 'cuda-per-thread-streams-2'
06dee259f Minimize cuda synchronizations

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1288
2018-07-25 15:07:39 -04:00
Robert Maynard
4ca4c17415 DeviceAdapterTagTestAlgorithmGeneral Id is positive value.
All valid devices must use a positive integer value as the
RuntimeTracker and VirtualObject consider all negative values
to be errors.
2018-07-25 14:09:00 -04:00
Robert Maynard
b51c773766 Allow ArrayHandleBasicImpl to work when we add new devices
Previously ArrayHandleBasicImpl had no support for OpenMP since
we forgot to update the implementation. This version will
work when adding new devices without any changes.
2018-07-25 12:57:27 -04:00
Robert Maynard
42af1d09c2 Merge topic 'ExecutionArrayInterfaceBasic_explicitly_constructs_DeviceAdapterIds'
e031e6496 ExecutionArrayInterfaceBasic<T> explicitly construct DeviceAdapterId objects
86b9ab996 Refactor ExecutionArrayInterfaceBasic to use inheriting constructors

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1330
2018-07-25 12:53:59 -04:00
Robert Maynard
24d3aa0428 Merge topic 'everyone_treat_deviceAdapterId_as_real_type'
14824bd42 Make sure people always treat DeviceAdapterId as a proper type

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1332
2018-07-25 12:05:05 -04:00
Robert Maynard
e031e64967 ExecutionArrayInterfaceBasic<T> explicitly construct DeviceAdapterId objects
Rather than implicitly presume the `VTKM_DEVICE_ADAPTER_` macros can
convert to DeviceAdapterId.
2018-07-25 12:04:30 -04:00
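
A sketch of what explicit construction buys, using hypothetical stand-ins: the `EXAMPLE_DEVICE_ADAPTER_*` macros and the `ExampleDeviceAdapterId` type below are illustrative and do not reproduce VTK-m's definitions.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical integer macros in the style of VTKM_DEVICE_ADAPTER_*.
#define EXAMPLE_DEVICE_ADAPTER_SERIAL 1
#define EXAMPLE_DEVICE_ADAPTER_CUDA 2

// Hypothetical id type. The explicit constructor prevents a bare integer
// from silently converting into an id.
struct ExampleDeviceAdapterId
{
  explicit constexpr ExampleDeviceAdapterId(std::int8_t id)
    : Value(id)
  {
  }

  constexpr std::int8_t GetValue() const { return this->Value; }

  constexpr bool operator==(ExampleDeviceAdapterId other) const
  {
    return this->Value == other.Value;
  }

private:
  std::int8_t Value;
};

static_assert(!std::is_convertible<int, ExampleDeviceAdapterId>::value,
              "integers must be wrapped explicitly");

// Call sites spell out the construction instead of passing the macro as is.
inline bool IsCuda(ExampleDeviceAdapterId id)
{
  return id == ExampleDeviceAdapterId(EXAMPLE_DEVICE_ADAPTER_CUDA);
}
```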
Robert Maynard
86b9ab9969 Refactor ExecutionArrayInterfaceBasic to use inheriting constructors 2018-07-25 12:03:48 -04:00
Robert Maynard
14824bd42e Make sure people always treat DeviceAdapterId as a proper type 2018-07-25 11:00:06 -04:00
Robert Maynard
36be8f97a1 DeviceAdapterAlgorithmOpenMP doesn't depend on the serial device.
It should be possible to build VTK-m without the serial device
adapter enabled, and therefore the OpenMP device shouldn't
rely on it.
2018-07-25 10:37:04 -04:00
Robert Maynard
f6b0c6a7a6 Merge topic 'remove_DeviceAdapterTagCheck'
f6789d9cf Remove DeviceAdapterTagCheck with DeviceAdapterTraits

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1329
2018-07-24 11:12:09 -04:00
Robert Maynard
f6789d9cfd Remove DeviceAdapterTagCheck with DeviceAdapterTraits
The DeviceAdapterTraits already records whether the device adapter is
valid, and therefore DeviceAdapterTagCheck is redundant.
2018-07-24 08:16:48 -04:00
Robert Maynard
d595abf907 WrappedBinaryOperator now supports std::vector<bool>::reference 2018-07-23 14:24:19 -04:00
Robert Maynard
8a44d0a5ae Merge topic 'vtkm_cont_less_device_sources'
d7660a556 vtkm_cont listed non-device sources as device-source

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1324
2018-07-19 15:57:02 -04:00
Robert Maynard
d3326a37a6 ReverseConnectivityBuilder now uses the new vtkm::cont::AtomicArray
Fixes Issue #270
2018-07-19 13:39:47 -04:00
Robert Maynard
d7660a556c vtkm_cont listed non-device sources as device-source
Clean up the device sources list in vtkm_cont to only contain
.cxx files that could invoke CUDA
2018-07-19 12:59:34 -04:00
Kenneth Moreland
b4bfb95131 Merge topic 'atomic-array-device-execution'
96ae94420 Simplified execution object creation for atomic array
0bd197af9 moved TwoLevelUniformGridExecutionObject to vtkm/exec/internal
6ce895be8 simplified how atomic arrays create execution objects
f1ee5b92a fix a rebase error
25d140361 fix bad rebase for wireframer
f892695f1 fixing some weird merging issue
9bb00ec66 moved the execution object for TwoLevelUniform grid to vtkm::exec
db1c9bfee Change the namespacing of atomic array
...

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1243
2018-07-18 18:08:05 -04:00
Robert Maynard
4240111dd8 Make sure VirtualObjectHandle tests include RuntimeDeviceTracker 2018-07-18 10:37:46 -04:00
Robert Maynard
8077b031a8 Merge topic 'uncomment_cuda_range_test'
1e478bbe6 Re-enable UnitTestCudaComputeRange

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1321
2018-07-17 13:28:05 -04:00
Robert Maynard
f331d6d686 Merge topic 'remove_unneeded_typeinfo_includes'
bf49575e0 Remove unneeded typeinfo includes

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1320
2018-07-17 13:27:46 -04:00
Robert Maynard
1e478bbe63 Re-enable UnitTestCudaComputeRange 2018-07-17 11:43:19 -04:00
Robert Maynard
bf49575e00 Remove unneeded typeinfo includes 2018-07-17 11:41:53 -04:00
Allison Vacanti
ef578bb2c7 Reduce computational overhead for reverse connectivity calc.
Benchmarking in VTK showed significant overhead in the computation
of the reverse connectivity calculation in
ConnectivityExplicitInternals::ComputeCellToPointConnectivity.

This patch adds a ReverseConnectivityBuilder that reduces the amount of
time and memory needed to build the table by using an atomic histogram
approach that avoids a costly radix SortByKey.

Key operations in the new helper class are templated to allow this
approach to be reused by VTK-specific cell array converters.
2018-07-13 14:15:06 -04:00
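
The atomic histogram approach can be sketched on the CPU as three passes: count, scan, fill. This is a serial illustration under assumed names (`ExampleReverseConnectivity`, `BuildReverseConnectivity`) and an assumed offsets-plus-connectivity layout; it is not the ReverseConnectivityBuilder code.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// For each point, the ids of the cells that use it.
struct ExampleReverseConnectivity
{
  std::vector<int> Offsets;      // size numPoints + 1
  std::vector<int> Connectivity; // cell ids grouped by point
};

// cellOffsets has numCells + 1 entries; cellConn lists point ids per cell.
ExampleReverseConnectivity BuildReverseConnectivity(
  const std::vector<int>& cellOffsets,
  const std::vector<int>& cellConn,
  int numPoints)
{
  const std::size_t nPoints = static_cast<std::size_t>(numPoints);
  std::vector<std::atomic<int>> counts(nPoints);
  for (std::atomic<int>& c : counts)
  {
    c.store(0);
  }

  // Pass 1: histogram of how many cells touch each point.
  for (int point : cellConn)
  {
    counts[static_cast<std::size_t>(point)].fetch_add(1);
  }

  // Pass 2: exclusive scan of the histogram gives each point's start offset.
  ExampleReverseConnectivity result;
  result.Offsets.assign(nPoints + 1, 0);
  for (std::size_t p = 0; p < nPoints; ++p)
  {
    result.Offsets[p + 1] = result.Offsets[p] + counts[p].load();
    counts[p].store(0); // reuse as a per-point write cursor
  }

  // Pass 3: each (cell, point) pair claims a slot with an atomic increment.
  // No sort is needed. Run in parallel, the order of cells within one point
  // would depend on scheduling; in this serial loop it follows cell order.
  result.Connectivity.resize(cellConn.size());
  for (std::size_t cell = 0; cell + 1 < cellOffsets.size(); ++cell)
  {
    for (int i = cellOffsets[cell]; i < cellOffsets[cell + 1]; ++i)
    {
      const std::size_t p =
        static_cast<std::size_t>(cellConn[static_cast<std::size_t>(i)]);
      const int slot = result.Offsets[p] + counts[p].fetch_add(1);
      result.Connectivity[static_cast<std::size_t>(slot)] =
        static_cast<int>(cell);
    }
  }
  return result;
}
```

For two triangles sharing an edge, cells (0, 1, 2) and (1, 2, 3), points 1 and 2 each map to both cells while points 0 and 3 map to one cell each.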
Kenneth Moreland
2dbc45ac08 Merge topic 'fix-cuda-warnings'
6d24343c5 Add exec to ArrayPortalFromIterators constructors
91df12305 Remove VTKM_EXEC modifiers from CPU devices

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1318
2018-07-12 13:18:25 -04:00
Robert Maynard
6dc06423d8 ColorTable can provide vtkm::exec::Colortable to a specific device
Previously it wasn't possible to get a color table transferred
to a specific device.
2018-07-12 10:28:18 -04:00
Kenneth Moreland
6d24343c51 Add exec to ArrayPortalFromIterators constructors
There is no real reason why you cannot construct an
ArrayPortalFromIterators on a device, so go ahead and let that happen.
(This removes some CUDA warnings about calling __host__ from
__device__.)
2018-07-12 08:09:22 -06:00
Kenneth Moreland
91df123055 Remove VTKM_EXEC modifiers from CPU devices
Having VTKM_EXEC on algorithms for CPU devices was problematic because
the algorithms were specific to the CPU, but during a CUDA compile it
would try to compile device code (for no reason, since it was never
called on a device).

Remove these modifiers on the principle that a device implementation knows
specifically what function modifiers to use and does not need the VTK-m
defined catch-alls.
2018-07-11 16:45:30 -06:00
Matthew Letter
96ae94420d Simplified execution object creation for atomic array
simplified the creation of the execution object in the transport tag of the atomic array.
2018-07-11 10:58:51 -06:00
Kenneth Moreland
abfc946f84 Merge topic 'exec-objects-as-alg-sort-compare'
f14021dd8 Shorten code for PrepareArgForExec
3b828608a Support ExecArg behavior in vtkm::cont::Algorithm methods

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1308
2018-07-10 19:01:40 -04:00
Kenneth Moreland
f14021dd84 Shorten code for PrepareArgForExec
By making is_base_of part of PrepareArgForExec, we can shorten not only
the C++ code but also the code that is generated by it.

Also, return && instead of by value when passing through the argument.

Changes thanks to Robert Maynard.
2018-07-10 13:48:20 -06:00
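
A rough sketch of the two points in this message: dispatching on `std::is_base_of`, and returning `T&&` for pass-through arguments. All names (`ExampleExecObjectBase`, `ExampleDevice`, `PrepareArg`, `ExampleLess`) are hypothetical and the real PrepareArgForExec may differ.

```cpp
#include <type_traits>
#include <utility>

struct ExampleExecObjectBase {}; // hypothetical marker base class
struct ExampleDevice {};         // hypothetical device tag

// True when the decayed argument type derives from the marker base.
template <typename T>
using IsExecObject =
  std::is_base_of<ExampleExecObjectBase, typename std::decay<T>::type>;

// Execution objects are converted to their device-side form.
template <typename T, typename Device>
auto PrepareArgImpl(T&& arg, Device device, std::true_type)
  -> decltype(arg.PrepareForExecution(device))
{
  return arg.PrepareForExecution(device);
}

// Everything else is forwarded as T&&, so no copy is made.
template <typename T, typename Device>
T&& PrepareArgImpl(T&& arg, Device, std::false_type)
{
  return std::forward<T>(arg);
}

// Single entry point: the is_base_of check lives here, so callers do not
// repeat it at every call site.
template <typename T, typename Device>
auto PrepareArg(T&& arg, Device device)
  -> decltype(PrepareArgImpl(std::forward<T>(arg), device, IsExecObject<T>{}))
{
  return PrepareArgImpl(std::forward<T>(arg), device, IsExecObject<T>{});
}

// Hypothetical execution object usable as a sort comparator.
struct ExampleLess : ExampleExecObjectBase
{
  struct Exec
  {
    bool operator()(int a, int b) const { return a < b; }
  };
  Exec PrepareForExecution(ExampleDevice) const { return Exec{}; }
};
```

Passing an `int` lvalue through `PrepareArg` yields a reference to the same object, while passing an `ExampleLess` yields its `Exec` functor.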
Matthew Letter
0bd197af96 moved TwoLevelUniformGridExecutionObject to vtkm/exec/internal
Also changed the namespacing to vtkm::exec::twolevelgrid after discussion with Rob
2018-07-09 16:28:09 -06:00