Commit Graph

385 Commits

Author SHA1 Message Date
Kenneth Moreland
163d591795 Add DEVICE_SOURCES to vtkm_unit_tests
The `vtkm_unit_tests` function in the CMake build now allows you to specify
which files need to be compiled with a device compiler using the
`DEVICE_SOURCES` argument. Previously, the only way to specify that unit
tests needed to be compiled with a device compiler was to use the
`ALL_BACKENDS` argument, which would automatically compile everything with
the device compiler as well as test the code on all backends.
`ALL_BACKENDS` is still supported, but it no longer changes the sources to
be compiled with the device compiler.
2022-07-08 06:28:51 -06:00
Cyrus Harrison
987a1fd8a2 fix include path 2021-11-19 12:15:53 -08:00
Nickolas Davis
9730de8074 remove cudaGetDevice calls, favor runtime device config 2021-09-02 09:17:36 -06:00
Nickolas Davis
adac415f15 implement cuda runtime device configuraton 2021-09-02 09:12:21 -06:00
nadavi
cd2a6c1385 implement kokkos runtime device configuration 2021-08-18 13:23:30 -06:00
Nickolas Davis
f949d8adea Merge topic 'device-resource-management'
fe3b82b99 implement return codes and protected virtual parsing of arguments
ce9c27bc0 Implement tests for RuntimeDeviceConfiguration helpers
408beefc0 Implement RuntimeDeviceConfiguration

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <morelandkd@ornl.gov>
Merge-request: !2513
2021-06-23 16:11:32 -04:00
nadavi
fe3b82b99c implement return codes and protected virtual parsing of arguments 2021-06-23 17:58:38 +00:00
nadavi
408beefc0a Implement RuntimeDeviceConfiguration 2021-06-17 17:56:38 +00:00
Ben Boeckel
4c7fe13a98 cmake: avoid adding testing directories if testing is disabled
Some testing directories have side effects such as installing headers or
compiling code that ultimately doesn't end up getting used.
2021-06-01 18:40:40 -04:00
Kenneth Moreland
8eed21d085 Do not declare headers for virtual classes that are removed 2021-04-28 15:28:06 -06:00
Kenneth Moreland
cb3bb43ff9 Completely deprecate virtual methods
Deprecate `VirtualObjectHandle` and all other classes that are used to
implement objects with virtual methods in the execution environment.

Additionally, the code is updated so that if the
`VTKm_NO_DEPRECATED_VIRTUAL` flag is set none of the code is compiled at
all. This opens us up to opportunities that do not work with virtual
methods such as backends that do not support virtual methods and dynamic
libraries for CUDA.
2021-04-28 07:28:32 -06:00
Kenneth Moreland
a7100c845a Do not assume CUDA reduce operator is unary
The `Reduce` algorithm is sometimes used to convert an input type to a
different output type. For example, you can compute the min and max at
the same time by making the output of the binary functor a pair of the
input type. However, for this to work with the CUDA algorithm, you have
to be able to also convert the input type to the output type. This was
previously done by treating the binary operator as also a unary
operator. That's fine for custom operators, but if you are using
something like `thrust::plus`, it has no unary operation. (Why would
it?)

So, detect whether the operator has a unary operation. If it does, use
it to cast from the input portal to the output type. If it does not,
just use `static_cast`. Thus, the operator only has to have the unary
operation if `static_cast` does not work.
2021-03-03 09:39:51 -07:00
Kenneth Moreland
34412ff298 Deprecate ArrayHandle::Shrink
This method has been subsumed by Allocate with vtkm::CopyFlag::On.
2021-02-01 08:07:40 -07:00
Kenneth Moreland
90050b96e4 Remove ArrayManagerExecution
This class was used indirectly by the old `ArrayHandle`, through
`ArrayHandleTransfer`, to move data to and from a device. This
functionality has been replaced in the new `ArrayHandle`s through the
`Buffer` class (which can be compiled into libraries rather than make
every translation unit compile their own template).

This commit removes `ArrayManagerExecution` and all the implementations
that the device adapters were required to make. None of this code was in
any use anymore.
2020-12-08 13:18:44 -07:00
Kenneth Moreland
b33c54bf61 Add ScheduleTask to performance log
When `DeviceAdapterAlgorithm::ScheduleTask` was called directly (i.e.
not through `Schedule`), nothing was added to the log. Adding
`VTKM_LOG_SCOPE` to these methods so that all scheduling is added to the
performance log.
2020-10-25 17:22:38 -06:00
Kenneth Moreland
b012c42beb Rename CellLocatorUniformBins to CellLocatorTwoLevel
The new name is more descriptive and prevents confusion with
CellLocatorUniformGrid.
2020-09-21 15:42:47 -06:00
Kenneth Moreland
c9b763aca9 Rename PointLocatorUniformGrid to PointLocatorSparseGrid
The new name reflects better what the underlying algorithm does. It also
helps prevent confusion about what types of data the locator is good
for. The old name suggested it only worked for structured grids, which
is not the case.
2020-09-21 15:42:41 -06:00
Kenneth Moreland
63ef84ed78 Optionally remove all use of ArrayHandleVirtual
As we remove more and more virtual methods from VTK-m, I expect several
users will be interested in completely removing them from the build for
several reasons.

1. They may be compiling for hardware that does not support virtual
methods.
2. They may need to compile for CUDA but need shared libraries.
3. It should go a bit faster.

To enable this, a CMake option named `VTKm_NO_DEPRECATED_VIRTUAL` is
added. It defaults to `OFF`. But when it is `ON`, none of the code that
both uses virtuals and is deprecated will be built.

Currently, only `ArrayHandleVirtual` is deprecated, so the rest of the
virtual classes will still be built. As we move forward, more will be
removed until all virtual method functionality is removed.
2020-09-04 22:52:45 -06:00
Kitware Robot
cf0cdcf7d1 clang-format: reformat the repository with clang-format-9 2020-08-24 14:01:08 -04:00
Robert Maynard
b48c19f251 Correct warnings found by using clang as the host compiler for cuda 2020-08-24 08:57:38 -04:00
Kenneth Moreland
13056b3af5 Deprecate AtomicInterfaceControl and AtomicInterfaceExecution
Now that we have the functions in `vtkm/Atomic.h`, we can deprecate (and
eventually remove) the more cumbersome classes `AtomicInterfaceControl`
and `AtomicInterfaceExecution`.

Also reversed the order of the `expected` and `desired` parameters of
`vtkm::AtomicCompareAndSwap`. I think the former order makes more sense
and matches more other implementations (such as `std::atomic` and the
GCC `__atomic` built ins). However, there are still some non-deprecated
classes with similar methods that cannot easily be switched. Thus, it's
better to be inconsistent with most other libraries and consistent with
ourself than to be inconsitent with ourself.
2020-08-20 13:40:44 -06:00
Kenneth Moreland
d3503bfaba Implement AtomicInterfaceControl/Execution with free functions
Now that we have atomic free functions (e.g. `vtkm::AtomicAdd()`), we no
longer need special implementations for control and each execution
device. (Well, technically we do have special implementations for each,
but they are handled with compiler directives in the free functions.)

Convert the old atomic interface classes (`AtomicInterfaceControl` and
`AtomicInterfaceExecution`) to use the new atomic free functions. This
will allow us to test the new atomic functions everywhere that atomics
are used in VTK-m.

Once verified, we can deprecate the old atomic interface classes.
2020-08-20 13:40:44 -06:00
Kenneth Moreland
ed41874cc8 Consolidate tests for base vtkm code that is device-specific
Some of the code in the base `vtkm` namespace is device specific. For
example, the functions in `Math.h` are customized for specific devices.
Thus, we want this code to be specially compiled and run on these
devices.

Previously, we made a header file and then added separate tests to each
device package. That was created before we had ways of running on any
device. Now, it is much easier to compile the test a single time for all
devices and use the `ALL_BACKENDS` feature of `vtkm_unit_tests` CMake
function to automatically create the test for all devices.
2020-08-18 14:30:25 -06:00
Sujin Philip
452f61e290 Add Kokkos backend 2020-08-12 13:55:24 -04:00
Kenneth Moreland
18b5be92d6 Fix issue with CUDA and ArrayHandleMultiplexer
When you try to call the `Reduce` operation in the CUDA device adapter
with a sufficently complex interator type, you get a compile error
that says `error: cannot pass an argument with a user-provided
copy-constructor to a device-side kernel launch`.

This appears to be a bug in either nvcc or Thrust. I believe it is
related to the following reported issues:

* https://github.com/thrust/thrust/issues/928
* https://github.com/thrust/thrust/issues/1044

Work around this problem by making a special condition for calling
`Reduce` with an `ArrayHandleMultiplexer` that calls the generic
algorithm in `DeviceAdapterAlgorithmGeneral` instead of the algorithm in
Thrust.
2020-07-06 13:51:36 -06:00
Kenneth Moreland
a47fd42bc1 Pin user provided memory in ArrayHandle
Often when a user gives memory to an `ArrayHandle`, she wants data to be
written into the memory given to be used elsewhere. Previously, the
`Buffer` objects would delete the given buffer as soon as a write buffer
was created elsewhere. That was a problem if a user wants VTK-m to write
results right into a given buffer.

Instead, when a user provides memory, "pin" that memory so that the
`ArrayHandle` never deletes it.
2020-06-25 14:02:46 -06:00
Kenneth Moreland
56bec1dd7b Replace basic ArrayHandle implementation to use Buffers
This encapsulates a lot of the required memory management into the
Buffer object and related code.

Many now unneeded classes were deleted.
2020-06-25 14:02:26 -06:00
Kenneth Moreland
8f7b0d18be Add Buffer class
The buffer class encapsulates the movement of raw C arrays between
host and devices.

The `Buffer` class itself is not associated with any device. Instead,
`Buffer` is used in conjunction with a new templated class named
`DeviceAdapterMemoryManager` that can allocate data on a given
device and transfer data as necessary. `DeviceAdapterMemoryManager`
will eventually replace the more complicated device adapter classes
that manage data on a device.

The code in `DeviceAdapterMemoryManager` is actually enclosed in
virtual methods. This allows us to limit the number of classes that
need to be compiled for a device. Rather, the implementation of
`DeviceAdapterMemoryManager` is compiled once with whatever compiler
is necessary, and then the `RuntimeDeviceInformation` is used to
get the correct object instance.
2020-06-25 14:01:39 -06:00
Kenneth Moreland
4f9fa08fa1 Remove ArrayHandleStreaming capabilities
The `ArrayHandleStreaming` class stems from an old research project
experimenting with bringing data from an `ArrayHandle` in parts and
overlapping device transfer and execution. It works, but only in very
limited contexts. Thus, it is not actually used today. Plus, the feature
requires global indexing to be permutated throughout the worklet
dispatching classes of VTK-m for no further reason.

Because it is not really used, there are other more promising approaches
on the horizon, and it makes further scheduling improvements difficult,
we are removing this functionality.
2020-03-24 15:01:56 -06:00
Robert Maynard
8377806778 Merge topic 'introduce_mapfield_3d_scheduling'
1f1688483 Initial infrastructure to allow WorkletMapField to have 3D scheduling

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1938
2020-02-27 08:02:52 -05:00
Kenneth Moreland
ec34cb56c4 Use new ways to get array portal in control environment
Also fix deadlocks that occur when portals are not destroyed
in time.
2020-02-26 13:10:46 -07:00
Robert Maynard
1f1688483e Initial infrastructure to allow WorkletMapField to have 3D scheduling 2020-02-25 15:23:41 -05:00
Kenneth Moreland
3671cbe168 Fix token issues with CUDA 2020-02-25 09:39:30 -07:00
Kenneth Moreland
ad0a53af71 Convert execution preparation to use tokens
Marked the old versions of PrepareFor* that do not use tokens as
deprecated and moved all of the code to use the new versions that
require a token. This makes the scope of the execution object more
explicit so that it will be kept while in use and can potentially be
reclaimed afterward.
2020-02-25 09:39:19 -07:00
Kenneth Moreland
76ce9c87f0 Support using Token calling PrepareForExecution in ExecutionObject
The old version of ExecutionObject (that only takes a device) is still
supported, but you will get a deprecated warning if that is what is
defined.

Supporing this also included sending vtkm::cont::Token through the
vtkm::cont::arg::Transport mechanism, which was a change that propogated
through a lot of code.
2020-02-25 07:41:39 -07:00
Allison Vacanti
46b7155bdb Add 64-bit CUDA atomic store. 2020-01-08 10:58:51 -05:00
Allison Vacanti
539f6e5ad7 Port benchmarking framework to Google Benchmark. 2020-01-08 10:58:51 -05:00
Allison Vacanti
44c4f0838f Add vtkm/Algorithms.h header with device-friendly binary search algorithms. 2019-12-20 12:35:10 -05:00
Allison Vacanti
813f5a422f Fixup custom portal iterator logic.
The convenience functions `ArrayPortalToIteratorBegin()` and
`ArrayPortalToIteratorEnd()` wouldn't detect specializations of
`ArrayPortalToIterators<PortalType>` since the specializations aren't
visible when the `Begin`/`End` functions are declared.

Since the CUDA iterators rely on a specialization, the convenience
functions would not compile on CUDA.

Now, instead of specializing `ArrayPortalToIterators` to provide custom
iterators for a particular portal, the portal may advertise custom
iterators by defining `IteratorType`, `GetIteratorBegin()`, and
`GetIteratorEnd()`. `ArrayPortalToIterators` will detect such portals
and automatically switch to using the specialized portals.

This eliminates the need for the specializations to be visible to the
convenience functions and allows them to be usable on CUDA.
2019-12-17 15:39:51 -05:00
Kenneth Moreland
92db376236 Convert uses of ListTagBase to List 2019-12-06 15:37:46 -07:00
Kenneth Moreland
5ab0b5bb1d Access ArrayHandle internals in a critical section
Repeat the changes of the previous commit with the specialized
ArrayHandle for basic storage.
2019-11-20 14:42:58 -07:00
Robert Maynard
5c56ff945f Label tests which exercise a given Device Adapter
This allows developers an easy way to run all OpenMP tests
2019-09-13 15:52:40 -04:00
Allison Vacanti
b9affb7edc Disable copy for RAII helper. 2019-09-09 17:59:38 -04:00
Allison Vacanti
ea0bbfeefc Increase CUDA stack size for ParticleAdvection worklets.
Sometimes the CUDA runtime would not allocate sufficient stack
space for the particle advection code to run. This issue was exposed by
!1737 -- for some reason, once those changes to unrelated filters/worklets
are added to VTK, CUDA allocates less stack and the following tests would
fail:

UnitTestLagrangianFilterCUDA
UnitTestLagrangianStructuresFilterCUDA
UnitTestStreamlineFilterCUDA
UnitTestStreamSurfaceFilterCUDA

These were fixed by increasing the stack size in the particle advection
worklet Run(...) methods.

An RAII helper has been added that will restore the previous stack size
in case an exception is thrown, and the KDTree code has been updated
to use this helper when it adjusts the CUDA stack allocation.
2019-09-09 16:06:23 -04:00
Allison Vacanti
884616788a Simplify and extend AtomicArray implementation.
- Use AtomicInterface to implement device-specific atomic operations.
- Remove DeviceAdapterAtomicArrayImplementations.
- Extend supported atomic types to include unsigned 32/64-bit ints.
- Add a static_assert to check that AtomicArray type is supported.
- Add documentation for AtomicArrayExecutionObject, including a CAS
  example.
- Add a `T Get(idx)` method to AtomicArrayExecutionObject that does
  an atomic load, and update existing CAS usage to use this instead
  of `Add(idx, 0)`.
2019-08-23 15:40:37 -04:00
Allison Vacanti
0e728c8000 Update atomic interfaces to support Add/CAS for UInt32/64.
These will be used for the AtomicArray implementation.
2019-08-23 15:40:37 -04:00
Allison Vacanti
112024dae2 Fix CUDA shfl usage.
There was a bug in the implementations of CountSetBits and
BitFieldToUnorderedSet.
2019-08-01 10:57:57 -04:00
Kenneth Moreland
5e23853521 Create ArrayHandleMultiplexer 2019-07-22 08:36:28 -06:00
Mark Kim
8dbb1c4de3 Merge branch 'master' of gitlab.kitware.com:m-kim/vtk-m into advdatamodel 2019-06-26 19:37:47 -04:00
Allison Vacanti
920ef9b3b9 Merge topic 'bit_algorithms'
f370857c1 Add CountSetBits and Fill device algorithms.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1696
2019-06-25 15:42:16 -04:00