- Reducing the stack allocation for CUDA for the BIH unit test
- Adding changes from Ken's review
- Suppress ptxas stack size warning for BoundingIntervalHierarchy
- Changing the name PrepareForExecutionOnDevice to PrepareForExecutionImpl
- Adding changes suggested by Ollie and Ken to return the execution object
from PrepareForExecutionImpl using VirtualObjectHandle
- Updating PrepareForExecutionFunctor
- Adding API files
- Adding back Manish's BoundingIntervalHierarchy search structure
- Updating CMakeLists.txt to accomodate these changes
- Adding the old test file from Manish - won't build for now
-Changing the existing CellLocator.h to CellLocatorHelper.h,
it's used by CellLocatorTwoLevelUniformGrid.h
-Changing unit tests and worklets that use CellLocator.h to use CellLocatorHelper.h
This fix addresses issue #110. Test functions in TestingDeviceAdapter have
been updated to test large arrays through random indices using
std::uniform_int_distribution rather than testing every 100th or 50th value.
1c5feeb1 Make sure all device specific tests use the intended device.
30205be8 Make sure PointLocatorUniformGrid always uses the provided device adapter
74df09fb Remove unneeded trailing ; from PointLocatorUniformGrid
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1209
c3c8d0fd Refactor ArrayHandleCompositeVector to simplify usage and impl.
ec88b7dd Markup array portals for use in the exec env.
3159b376 Make Swizzle and ExtractComponent array parameters runtime vars.
19344707 Add vtkm_taotuple to build system.
d23c4452 Merge branch 'upstream-taotuple' into tmp
b3b14de7 taotuple 2018-05-15 (e3de8c97)
b00f6c1c Add taocpp/tuple as a thirdparty package.
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1199
When copying small arrays from cpu memory to pascal memory we would
see subsequent kernels fail as the memory transfer hadn't finished.
This is a bug as each stream should act like a FIFO queue. So
for now when encountering this use case we explicitly synchronize
after the memcpy.
- Use tao::tuple instead of FunctionInterface to hold array/portal
collections.
- Type signatures are simplified. Now just use:
- ArrayHandleCompositeVector<ArrayT1, ArrayT2, ...>
- make_ArrayHandleCompositeVector(array1, array2, ...)
instead of relying on helper structs to determine types.
- No longer support component selection from an input array. All
input arrays must have the same ValueType (See ArrayHandleSwizzle
and ArrayHandleExtractComponent as the replacements for these
usecases.
This means that we not only setup the runtime device tracker
to force the intended device, it also means making sure
the default device is the error device.
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.
The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.
To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
While making changes to how execution objects work, we had agreed to
name the base object ExecutionObjectBase instead of its original name of
ExecutionObjectFactoryBase. Somehow that change did not make it through.
571556d9 CUDA's RuntimeDeviceTracker and Timer are now built as part of vtkm_cont
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1196
This is done to not only reduce the amount of code that users need
to generate but to reduce the amount of errors when using
the RuntimeDeviceTracker. If the runtime device tracker is initially
used in a library by a c++ file it will never properly detect the
cuda backend. By moving the code into vtkm_cont we can make sure
this problem doesn't occur.
The logic inside DataSetBuilderUniform has some tricky logic but for all
cases the dimensions are correct. We just initialize the variable to make
the compiler stop warning.