b48c19f25 Correct warnings found by using clang as the host compiler for cuda
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !2232
6cbcb9f5d Fix behavior of Cuda AtomicLoad with SequentiallyConsistent
7573d4ed5 Fix compiler warnings
147dd24d0 Remove ARM intrinsics in MSVC
2229c22f4 Avoid invalid Kokkos atomic calls
3b147878f Always use our implementation of Cuda atomics
9e6fe8fb6 Add memory order semantics to atomic functions
d2ac4b860 Be more careful in casting with Atomic functions
13056b3af Deprecate AtomicInterfaceControl and AtomicInterfaceExecution
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2223
According to Allie Vacanti, a sequentially consistent load requires a
full fence on Cuda hardware to be conforming.
Also improved the documentation of `MemoryOrder` based on Allie's
suggestion.
Also removed the `Consume` memory order based on Allie's suggestion. It
is tricky to use correctly, and most implementations just regress to the
safer `Acquired` behavior anyway.
Previously, if Kokkos was enabled we always used Kokkos atomics.
However, if a user, for some reason, compiled with a version of Kokkos
that was _not_ compiled for Cuda and we turned on our own Cuda device
adapter, that would cause a problem. The old code assumed Kokkos would
create the Cuda version of the atomics, but it would not. Thus, there
would be no atomics for Cuda.
Resolved this problem by switching the order in which we try to define
atomics. Use our version of atomics whenever compiling for a Cuda
device. Otherwise, try to compile the Kokkos version (and if that is not
available, use ours as before.
To ensure correctness of an atomic function, it is often necessary to
force a compiler to order operations correctly. However, doing so may
slow things down, so it is helpful to relax these constraints if they
are not necessary. Add the ability to select the correct memory order
constraints in the atomic functions.
Previously, the atomic functions accepted any type as its operand and
then cast that to the storage type. That could cause some rather
unexpected behavior. For example, casting a floating point number to an
integer might not give you the behavior as expected. So that behavior as
been removed and now the operand has to match the pointer.
However, all the currently supported atomics are unsigned, and there are
many reasons that it might be easier to use signed as operands. (For
example, C literals are signed.) Thus, a second condition that allows
the sign to be swapped has been added so that you don't get annoying
signed/unsigned conversion warnings.
Now that we have the functions in `vtkm/Atomic.h`, we can deprecate (and
eventually remove) the more cumbersome classes `AtomicInterfaceControl`
and `AtomicInterfaceExecution`.
Also reversed the order of the `expected` and `desired` parameters of
`vtkm::AtomicCompareAndSwap`. I think the former order makes more sense
and matches more other implementations (such as `std::atomic` and the
GCC `__atomic` built ins). However, there are still some non-deprecated
classes with similar methods that cannot easily be switched. Thus, it's
better to be inconsistent with most other libraries and consistent with
ourself than to be inconsitent with ourself.
If the Kokkos device is enabled, we use the Kokkos atomic functions for
implementation since Kokkos has already ported all of these to each
device.
If Kokkos is compiled with CUDA, then the functions are marked with
`__device__` and `__host__`. Makes sense right?
Unless we are trying to use the atomic functions on host code compiled
only for the host. In that case, modifiers like `__device__` will cause
compiler errors.
So we want to disable Kokkos from using `__device__`, but only if CUDA
is enabled and we are not using the CUDA compiler. Had to hack things up
to get that to work.
Now that we have atomic free functions (e.g. `vtkm::AtomicAdd()`), we no
longer need special implementations for control and each execution
device. (Well, technically we do have special implementations for each,
but they are handled with compiler directives in the free functions.)
Convert the old atomic interface classes (`AtomicInterfaceControl` and
`AtomicInterfaceExecution`) to use the new atomic free functions. This
will allow us to test the new atomic functions everywhere that atomics
are used in VTK-m.
Once verified, we can deprecate the old atomic interface classes.
Previously, all atomic functions were stored in classes named
`AtomicInterfaceControl` and `AtomicInterfaceExecution`, which required
you to know at compile time which device was using the methods. That in
turn means that anything using an atomic needed to be templated on the
device it is running on.
That can be a big hassle (and is problematic for some code structure).
Instead, these methods are moved to free functions in the `vtkm`
namespace. These functions operate like those in `Math.h`. Using
compiler directives, an appropriate version of the function is compiled
for the current device the compiler is using.
There is a limitation in Windows builds using VS2019 where libraries cannot be
bigger than 4GiB. This is normally not an issue but in `VTKm` due to its strong
template usage libraries can reach that size.
The `VTKm` filter library is can easily reach that size and it will halt the
build
This MR tries to avoid reaching those sizes for now by splitting the filter
library into four smaller libraries.
The proposal scheme is:
It splits vtkm-filter into:
- vtkm-common, Classes that are dependencies of other filter libs.
- vtkm-contour, Contour class and its instantiations.
- vtkm-contour, Gradient class and its instantiations.
- vtkm-extra, Classes other than Contour or Gradient that are
not dependencies.
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
ed41874cc Consolidate tests for base vtkm code that is device-specific
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2219
d13a08b6a Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into mpiStreamlines2
530c4504d Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into mpiStreamlines2
6bfb91aac try to fix/debug the windows builds.
8b1520fdb address link errors in windows and gcc4.8 dashboard compile errors
5fa93a64f fix build errors
ec796445c Remove some print statements.
9bf49101c Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into mpiStreamlines2
4722dc67b Code review cleanup.
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !2210
Some of the code in the base `vtkm` namespace is device specific. For
example, the functions in `Math.h` are customized for specific devices.
Thus, we want this code to be specially compiled and run on these
devices.
Previously, we made a header file and then added separate tests to each
device package. That was created before we had ways of running on any
device. Now, it is much easier to compile the test a single time for all
devices and use the `ALL_BACKENDS` feature of `vtkm_unit_tests` CMake
function to automatically create the test for all devices.
4e72eb043 Don't apply pyexpander fix on Windows.
959db40aa Find expander.py on the syspath, do not call it from a Python interpreter.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !2214