The original design of invoke and the transport infrastructure
relied on the implementation behavior of vtkm::cont types
such as ArrayHandle, which use an internal shared_ptr to manage
state. This allowed passing by value instead of passing by
non-const ref when needing to transfer information to the device.
As VTK-m adds support for classes that use virtual methods, the
ability to pass by base pointer type lets us invoke worklets
using a base type without the risk of type slicing.
Additionally, by moving over to a non-const ref Invocation we
can update all transports that have 'output' to now be
by ref, and therefore support types that can't be copied while
being 'more' correct.
The invocation parameters need to be non-const as we want to
be able to call non-const methods like `PrepareForOutput` on them
from a transport function.
The original implementation abused the fact that everything
could be copied by value and still work properly. But
once we start introducing virtual classes, copying a base type
by value can cause type slicing.
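The slicing problem described above can be illustrated with a minimal
sketch. The classes and functions here (LocatorBase, UniformGridLocator,
NameByValue, NameByRef) are hypothetical stand-ins and are not VTK-m types;
they only show why a by-value parameter loses the derived type while a
non-const ref keeps it and still allows non-const calls.

```cpp
#include <string>

// Hypothetical base class with virtual methods (not a VTK-m class).
struct LocatorBase
{
  virtual ~LocatorBase() = default;
  virtual std::string Name() const { return "base"; }
  // Non-const method, standing in for something like PrepareForOutput.
  virtual void PrepareForOutput() { this->Prepared = true; }
  bool Prepared = false;
};

// Hypothetical derived class that overrides the virtual method.
struct UniformGridLocator : LocatorBase
{
  std::string Name() const override { return "uniform-grid"; }
};

// Pass by value: the argument is copy-constructed as a LocatorBase,
// so the derived part and its overrides are sliced away.
std::string NameByValue(LocatorBase locator)
{
  return locator.Name();
}

// Pass by non-const ref: no copy is made, the dynamic type is preserved,
// and non-const methods act on the caller's object.
std::string NameByRef(LocatorBase& locator)
{
  locator.PrepareForOutput();
  return locator.Name();
}
```

Calling NameByValue with a UniformGridLocator reports the base name, while
NameByRef reports the derived name and leaves the caller's object marked
as prepared.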
e5090e128 Make sure the PointLocatorUniform uses the correct runtime device
38e0e4c33 Mark PointLocatorUniformGrid constructors as host only
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1306
b47b1f9ae Allow NDHistogram to take custom type
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1299
1e6c30b7a Make pair construction rvalue-friendly.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1302
There was an error in TestingPointLocatorUniformGrid in which it was
creating arrays of type vtkm::Float32 and passing them to a worklet that
expected vtkm::FloatDefault. This is corrected.
- For the BoundingIntervalHierarchy, CUDA had failures when using a
  .cxx file to implement the virtual methods
- Moving the contents to the .hxx file after discussing with Rob
  over email
- Still need to work on the .cxx implementation after merge
Error: Throwing an exception in CUDA code.
Fix: Change method throwing exception to VTKM_CONT.
New warning: host/device warning in taotuple.
Fix: Mark up additional taotuple methods with suppressions.
This also updates our taotuple checkout to match upstream master.
4459ab917 Merge branch 'master' into 'pointlocator-general-interface'
51fd4a117 Fix warning about __host__/__device__ on default constructor
6f75cd008 Fix crash in CUDA compiler
439beaaed Make point locator tests have consistent devices
33f1f2dd1 Make sure all source files are listed in CMake
367ca3e21 Correct error of grabbing reference of stack variable
693c8ea8c Update PointLocatorUniformGrid.h
d8ff2ba78 Update PointLocator.h to add a newline at the end of file.
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Li-Ta Lo <ollie@lanl.gov>
Merge-request: !1292
80b12e325 Merge branch 'coordSysFilter' of https://gitlab.kitware.com/dpugmire/vtk-m into coordSysFilter
ab5eeab18 Fixes for making the filters non-templated.
27dade145 Fixes for coordinate systems w/ help from Sujin.
db5ded3a6 Add files for coord sys transform filters.
17087a26a Filter for coordinate system transform.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1284
- Reducing the stack allocation for CUDA for the BIH unit test
- Adding changes from Ken's review
- Suppress ptxas stack size warning for BoundingIntervalHierarchy
The vtkm::exec::PointLocatorUniformGrid has a default constructor. It
was "helpfully" declared as VTKM_EXEC_CONT, but apparently that is the
wrong thing to do for constructors that are set to default.
Previously, when PointLocatorUniformGrid.h was compiled by the CUDA
compiler, the compiler would crash. Apparently the ptxas part of
the compiler goes into a crazy recursion and runs out of
stack space. This appears to be a long-standing bug in CUDA
(been there for multiple releases) without a clear reason why it
sometimes rears its ugly head. (See for example
https://devtalk.nvidia.com/default/topic/1028825/cuda-programming-and-performance/-ptxas-died-with-status-0xc00000fd-stack_overflow-/)
The problem appears to be triggered by a doubly or triply nested
loop over a box of values to check in the uniform array. This
change appears to fix the problem by converting the nested loops
to a single for loop with some index magic to recover the 3D indices.
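As a rough illustration of that kind of index conversion, the sketch below
walks an inclusive box of cells with a single loop. It is plain C++ with
hypothetical names (FlatToIJK, VisitBox) and assumes i varies fastest; it is
not the code in PointLocatorUniformGrid.h.

```cpp
#include <array>
#include <vector>

// Convert a flat index in [0, nx*ny*nz) into (i, j, k), with i varying fastest.
std::array<int, 3> FlatToIJK(int flat, int nx, int ny)
{
  std::array<int, 3> ijk = { flat % nx, (flat / nx) % ny, flat / (nx * ny) };
  return ijk;
}

// Visit every cell of the inclusive box [lo, hi] using one for loop
// instead of three nested loops. Returns the cells in visiting order.
std::vector<std::array<int, 3>> VisitBox(const std::array<int, 3>& lo,
                                         const std::array<int, 3>& hi)
{
  const int nx = hi[0] - lo[0] + 1;
  const int ny = hi[1] - lo[1] + 1;
  const int nz = hi[2] - lo[2] + 1;

  std::vector<std::array<int, 3>> visited;
  for (int flat = 0; flat < nx * ny * nz; ++flat)
  {
    // Recover the 3D offset inside the box, then shift by the box origin.
    const std::array<int, 3> offset = FlatToIJK(flat, nx, ny);
    const std::array<int, 3> cell = { lo[0] + offset[0],
                                      lo[1] + offset[1],
                                      lo[2] + offset[2] };
    visited.push_back(cell);
  }
  return visited;
}
```

The visiting order matches a triply nested loop with k outermost and i
innermost, so the change only affects how the loop is expressed, not which
cells are checked.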