There was an error in TestingPointLocatorUniformGrid in which it was
creating arrays of type vtkm::Float32 and passing them to a worklet that
expected vtkm::FloatDefault. This is corrected.
-For the BoundingIntervalHierarchy CUDA had failures with using
.cxx file to implement the virtual methods
-Moving the contents to the .hxx file after discussing with Rob
over email
-Need to still work on the .cxx implementation after merge
Error: Throwing an exception in CUDA code.
Fix: Change method throwing exception to VTKM_CONT.
New warning: host/device warning in taotuple.
Fix: Markup additional taotuple methods with suppressions.
This also updates our taotuple checkout to match upstream master.
- Reducing the stack allocation for CUDA for the BIH unit test
- Adding changes from Ken's review
- Suppress ptxas stack size warning for BoundingIntervalHierarchy
Previously when PointLocatorUniformGrid.h was compiled by the CUDA
compiler, the compiler would crash. Apparently during the ptxas
part of the compiler goes into a crazy recursion and runs out of
stack space. This appears to be a long-standing bug in CUDA
(been there for multiple releases) without a clear reason why it
sometimes rears its ugly head. (See for example
https://devtalk.nvidia.com/default/topic/1028825/cuda-programming-and-performance/-ptxas-died-with-status-0xc00000fd-stack_overflow-/)
The problem appears to be when having a doubly or triply nested
loop over a box of values to check in the uniform array. This
appears to fix the problem by converting that to a single for
loop with some index magic to convert that to 3D indices.
- Changing the name PrepareForExecutionOnDevice to PrepareForExecutionImpl
- Adding changes suggested by Ollie and Ken to return the execution object
from PrepareForExecutionImpl using VirtualObjectHandle
- Updating PrepareForExecutionFunctor
Previously, to query whether an ArrayHandle was writable with
IsWriteableArrayHandle, you had to specify a device adapter. The idea
was that it would look at the portal used for that device adapter.
Instead, check the control pointer, which should give the same
indication without having to have a separate check for every type of
device.
880d8a989 Add `vtkm/Geometry.h` and test it.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1262
This commit adds several geometric constructs to vtk-m
in the `vtkm/Geometry.h` header. They may be used from
both the execution and control environments.
We also add methods to perform projection and Gram-Schmidt
orthonormalization to `vtkm/VectorAnalysis.h`.
See `docs/changelog/geometry.md` included in this commit
for more information.
Found via `codespell` and `grep`
more typos
includes source typo change and a typo that needs further review
follow-up typos
Follow-up typos
Revert a commit
Previously, it was not possible to call Allocate or Shrink on an
implicit storage. The reason for this is that the implicit storage does
not represent any real memory and any attempt to modify it is wrong.
However, there are some rare cases where ArrayHandle will attempt to
"allocate" the storage even when behaving in a read-only manner. The use
case this is being created for is when an ArrayHandleImplicit first
calls ReleaseResources and then calls ReleaseResourcesExecution (or
anything else that tries to get a control-side portal). In this case,
the ReleaseResources makes the control side portal invalid and the
ReleaseResourcesExecution attempts to make it valid again by allocating
the storage to size 0. This change solves the problem by allowing the
implicit storage to be "allocated" to something smaller than originally
created.
157894436 Enable Vec construction with intializer_list of single value
79afe2a16 Simplify make_ArrayHandleSwizzle
ae8d994d2 Add support for initializer lists in Vec
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1239
- Adding API files
- Adding back Manish's BoundingIntervalHierarchy search structure
- Updating CMakeLists.txt to accomodate these changes
- Adding the old test file from Manish - won't build for now
-Changing the existing CellLocator.h to CellLocatorHelper.h,
it's used by CellLocatorTwoLevelUniformGrid.h
-Changing unit tests and worklets that use CellLocator.h to use CellLocatorHelper.h
183bcf109 Add initial version of an OpenMP backend.
7b5ad3e80 Expand device scheduler test to check for overlap.
e621b6ba3 Generalize the TBB radix sort implementation.
d60278434 Specialize swap for ArrayPortalValueReference types.
761f8986f Cache inputs to SSI::Unique benchmark.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1099
9c3547bc7 VTK-m cuda runtime now handles no cuda runtime properly
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1242
This fix addresses issue #110. Test functions in TestingDeviceAdapter have
been updated to test large arrays through random indices using
std::uniform_int_distribution rather than testing every 100th or 50th value.
1c5feeb1 Make sure all device specific tests use the intended device.
30205be8 Make sure PointLocatorUniformGrid always uses the provided device adapter
74df09fb Remove unneeded trailing ; from PointLocatorUniformGrid
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1209
c3c8d0fd Refactor ArrayHandleCompositeVector to simplify usage and impl.
ec88b7dd Markup array portals for use in the exec env.
3159b376 Make Swizzle and ExtractComponent array parameters runtime vars.
19344707 Add vtkm_taotuple to build system.
d23c4452 Merge branch 'upstream-taotuple' into tmp
b3b14de7 taotuple 2018-05-15 (e3de8c97)
b00f6c1c Add taocpp/tuple as a thirdparty package.
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1199
When copying small arrays from cpu memory to pascal memory we would
see subsequent kernels fail as the memory transfer hadn't finished.
This is a bug as each stream should act like a FIFO queue. So
for now when encountering this use case we explicitly synchronize
after the memcpy.
- Use tao::tuple instead of FunctionInterface to hold array/portal
collections.
- Type signatures are simplified. Now just use:
- ArrayHandleCompositeVector<ArrayT1, ArrayT2, ...>
- make_ArrayHandleCompositeVector(array1, array2, ...)
instead of relying on helper structs to determine types.
- No longer support component selection from an input array. All
input arrays must have the same ValueType (See ArrayHandleSwizzle
and ArrayHandleExtractComponent as the replacements for these
usecases.
This means that we not only setup the runtime device tracker
to force the intended device, it also means making sure
the default device is the error device.
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.
The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.
To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
While making changes to how execution objects work, we had agreed to
name the base object ExecutionObjectBase instead of its original name of
ExecutionObjectFactoryBase. Somehow that change did not make it through.
571556d9 CUDA's RuntimeDeviceTracker and Timer are now built as part of vtkm_cont
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1196
This is done to not only reduce the amount of code that users need
to generate but to reduce the amount of errors when using
the RuntimeDeviceTracker. If the runtime device tracker is initially
used in a library by a c++ file it will never properly detect the
cuda backend. By moving the code into vtkm_cont we can make sure
this problem doesn't occur.
The logic inside DataSetBuilderUniform has some tricky logic but for all
cases the dimensions are correct. We just initialize the variable to make
the compiler stop warning.
Rather than require all ExecutionObjectFactoryBase classes to declare a
templated ExecObjectType type, get the type of the execution object
directly from the result of the PrepareForExecution method.
before it was wall prepareforexecution(device) creating an execution object but invoke calls this function now through the transport tag so we just need to give it the factory object not the execution object
updating the transport execution object test to use the factory to create an execution object based on the templated device and chaged the trasport tag to call the prepareForExectuion(Device) method to create the execution object.
In order to make the change from the current way execution obejcts are utilized to the new proposed executionObjectFactory process type checks now has to look for the new execution object factory class to check against.
Modify PointLocatorUniform::Build to return an ExecutionObjec, Locator.
The Locator can the be passed to Worklets for finding neareast neighbor
point by calling the FindNeareastPoint method.
f025c218 Suppress conversion warning inside Cell Interpolate code
822d4c61 Marching Cubes test doesn't abuse fall through case statements
dac7ab98 Correct a bad memcpy in ColorTable that gcc 7 found
c6726644 Reformat some test code to stop gcc 7.3 from segfaulting.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Utkarsh Ayachit <utkarsh.ayachit@kitware.com>
Merge-request: !1175
Somehow deducing the parameters to a static function was causing
gcc to segfault and crash. Changed around the code to use a free
function and everything works properly
The usage of uint was causing problems with CUDA + MSVC2015 as
type was not defined. Instead we use vtkm::Id as that was the expect
type to be passed to the task
This allows for easier host side logic when determining grid and block
sizes, and allows for a smaller library side by moving some logic
into compiled in functions.
Previously, when a Worklet needed a scatter, the scatter object was
stored in the Worklet object. That was problematic because that means
the Scatter, which is a control object, was shoved into the execution
environment.
To prevent that, move the Scatter into the Dispatcher object. The
worklet still declares a ScatterType alias, but no longer has a
GetScatter method. Instead, the Dispatcher now takes a Scatter object in
its constructor. If using the default scatter (ScatterIdentity), the
default constructor is used. If using another type of Scatter that
requires data to set up its state, then the caller of the worklet needs
to provide that to the dispatcher. For convenience, worklets are
encouraged to have a MakeScatter method to help construct a proper
scatter object.
Adding API to DataSet to `CopyStructure` from another dataset. This
copies the cellsets and coordinate systems while leaving the fields
unchanged.
CreateResult no longer copies all input fields to the output dataset
created.
Furthermore, if a Filter subclass doesn't provide `MapFieldOntoOutput`,
then the default implementation simply copies only the selected fields
to the output dataset.
Minimizing the need to have complex serialization code in vtkm/cont.
This is first step in moving DIY serialization code out of vtkm/cont. We
need to move that to filter so we can leverage policy correctly
serialize fields.
`vtkm_unit_tests` now supports an MPI option that can be used to add
test that run with MPI. Adding `UnitTestFieldRangeGlobalCompute` to test
global ranges for fields.
Removing another API that need not be on MultiBlock. There's generally
no need for apps to know this. If needed, we can add `...Compute`
function. This removes another API on MultiBlock that could trigger
parallel communication/synchronization.
Removing MultiBlock::GetGlobalRange API to keep things consistent with
DataSet API. Instead, one should use `FieldRangeCompute` or
`FieldRangeGlobalCompute` as appropriate.
c1237969 VTK-m ArrayHandle can now take ownership of a user allocated memory location
707970f4 VTK-m StorageBasic is now able to give/take ownership of user allocated memory.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1137
Previously memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors
to support a Storage object that is being moved, we can clearly express that
the ArrayHandle now owns memory it didn't allocate.
Here is an example of how this is done:
```cpp
T* buffer = new T[100];
auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); };
vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic>
storage(buffer, 100, user_free_function);
vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage));
```
This fixes the three following issues with StorageBasic.
1. Memory that was allocated by VTK-m and Stolen by the user needed the
proper free function called which is generally StorageBasicAllocator::deallocate.
But that was hard for the user to hold onto. So now we provide a function
pointer to the correct free function.
2. Memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. This is now resolved by allowing the
user to specify a free function to be called on release.
3. When the CUDA backend allocates memory for an ArrayHandle that has no
control representation, and the location we are running on supports concurrent
managed access we want to specify that cuda managed memory as also the host memory.
This requires that StorageBasic be able to call an arbitrary new delete function
which is chosen at runtime.