59c545c15 add test and fix cuda long compiling issue
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Abhishek Yenpure <abhi.yenpure@kitware.com>
Merge-request: !3188
The `DeviceAdapter` provides an abstract interface to the accelerator
devices worklets and other algorithms run on. As such, the programmer has
less control about how the device launches each worklet. Each device
adapter has its own configuration parameters and other ways to attempt to
optimize how things are run, but these are always a universal set of
options that are applied to everything run on the device. There is no way
to specify launch parameters for a particular worklet.
To provide this information, VTK-m now supports `Hint`s to the device
adapter. The `DeviceAdapterAlgorithm::Schedule` method takes a templated
argument that is of the type `HintList`. This object contains a template
list of `Hint` types that provide suggestions on how to launch the parallel
execution. The device adapter will pick out hints that pertain to it and
adjust its launching accordingly.
These are called hints rather than, say, directives, because they don't
force the device adapter to do anything. The device adapter is free to
ignore any (and all) hints. The point is that the device adapter can take
into account the information to try to optimize for itself.
A provided hint can be tied to specific device adapters. In this way, an
worklet can further optimize itself. If multiple hints match a device
adapter, the last one in the list will be selected.
The `Worklet` base now has an internal type named `Hints` that points to a
`HintList` that is applied when the worklet is scheduled. Derived worklet
classes can provide hints by simply defining their own `Hints` type.
Previously, if you called `Fill` on an `ArrayHandleStride`, you would get
an exception that said the feature was not supported. It turns out that
filling values is very useful in situations where, for example, you need to
initialize an array when processing an unknown type (and thus dealing with
extracted components).
This implementation of `Fill` first attempts to call `Fill` on the
contained array. This only works if the stride is set to 1. If this does
not work, then the code leverages the precompiled `ArrayCopy`. It does this
by first creating a new `ArrayHandleStride` containing the fill value and a
modulo of 1 so that value is constantly repeated. It then reconstructs an
`ArrayHandleStride` for itself with a modified size and offset to match the
start and end indices.
Referencing the `ArrayCopy` was tricky because it kept creating circular
dependencies among `ArrayHandleStride`, `ArrayExtractComponent`, and
`UnknownArrayHandle`. These dependencies were broken by having
`ArrayHandleStride` directly use the internal `ArrayCopyUnknown` function
and to use a forward declaration of `UnknownArrayHandle` rather than
including its header.
23fbedf54 Order the VTK scalar types to favor used types
053237e90 Favor base types used by VTK-m when making new basic array
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Li-Ta Lo <ollie@lanl.gov>
Merge-request: !3171
Reorder the `VTKScalarTypes` to put the repeated types at the bottom.
This will avoid using types like `long` that are essentially the same as
other types (e.g. `long long`) but will still be considered different.
This causes problems when checking against some VTK types (such as
`vtkIdType`).
When calling `NewInstanceBasic` on an `UnknownArrayHandle`, all C base
types will be tried. In some cases, multiple type will match the same
array. When this happens, favor types used by VTK-m (e.g. `long long`
over `long`).
nvcc doesn't like the couple of places that we have Impl classes hiding behind a
private declaration. So for VTKM_CUDA only, ifdef the private declaration out
`UnknownArrayHandle` allows you to create a new instance of a compatible
array so that when receiving an array of unknown type, a place to put the
output can be created. However, these methods only worked if the number of
components in each value could be determined statically at compile time.
However, there are some special `ArrayHandle`s that can define the number
of components at runtime. In this case, the `ArrayHandle` would throw an
exception if `NewInstanceBasic` or `NewInstanceFloatBasic` was called.
Although rare, this condition could happen when, for example, an array was
extracted from an `UnknownArrayHandle` with `ExtractArrayFromComponents` or
with `CastAndCallWithExtractedArray` and then the resulting array was
passed to a function with arrays passed with `UnknownArrayHandle` such as
`ArrayCopy`.
Getting the number of components (or the number of flattened components)
from an `ArrayHandle` is usually trivial. However, if the `ArrayHandle` is
special in that the number of components is specified at runtime, then it
becomes much more difficult to determine.
Getting the number of components is most important when extracting
component arrays (or reconstructions using component arrays) with
`UnknownArrayHandle`. Previously, `UnknownArrayHandle` used a hack to get
the number of components, which mostly worked but broke down when wrapping
a runtime array inside another array such as `ArrayHandleView`.
To prevent this issue, the ability to get the number of components has been
added to `ArrayHandle` proper. All `Storage` objects for `ArrayHandle`s now
need a method named `GetNumberOfComponentsFlat`. The implementation of this
method is usually trivial. The `ArrayHandle` template now also provides a
`GetNumberOfComponentsFlat` method that gets this information from the
`Storage`. This provides an easy access point for the `UnknownArrayHandle`
to pull this information.
Previously, `ArrayHandleConstant` did not really support component
extraction. Instead, it let a fallback operation create a full array on
the CPU.
Component extraction is now quite efficient for `ArrayHandleConstant`. It
creates a basic `ArrayHandle` with one entry and sets a modulo on the
`ArrayHandleStride` to access that value for all indices.
This resolves a compiler ambiguity I hit during development.
In my case, I created an `ArrayHandleDecorator` with one of the arrays being
an `ArrayHandleTransform`. The ambiguity occurs in function
`DecoratorStorageTraits<...>::BuffersToArray`, here an `ArrayHandleTransform`
is constructed using buffers (`std::vector<vtkm::cont::internal::Buffer>`)
This constructor is not defined for `ArrayHandleTransform`, but it's defined for
its superclass (`vtkm::cont::ArrayHandle`). `ArrayHandleTransform` does have a
non-explicit constructor that takes the array to be transformed and the
transform-functor as parameters, where the later has a default value.
This produces the following ambiguous options for the compiler:
1. Create a "to-be-transformed" ArrayHandle instance using the buffers, call
the `ArrayHandleTransform` constructor with this array with the defaulted
functor parameter.
2. Create the superclass instance using the buffer and call the
`ArrayHandleTransform` constructor that takes the superclass.
In this situation, option 2 is the correct choice.
The ambiguity is resolved by marking the constructors that take
buffers as explicit. These constructors are also added for the
derived classess via the `VTK_M_ARRAY_HANDLE_SUBCLASS_IMPL` macro.
The file `ArrayRangeCompute.cxx` was taking a long time to compile with
some device compilers. This is because it precompiles the range computation
for many types of array structures. It thus compiled the same operation
many times over.
The new implementation compiles just as many cases. However, the
compilation is split into many different translation units using the
instantiations feature of VTK-m's configuration. Although this rarely
reduces the overall CPU time spent during compiling, it prevents parallel
compiles from waiting for this one build to complete. It also avoids
potential issues with compilers running out of resources as it tries to
build a monolithic file.
The structured connectivity classes are templated on two tags to
determine what 2 incident topological elements are being accessed. Back
in the day, these were called the "from" elements and "to" elements, as
taken from VTK filter names like `PointDataToCellData`. However, these
names were found to be very confusion, and after much debate they have
been renamed to the visit element type and the incident element type.
Meaning that a worklet is "visiting" elements of a particular type (such
as visiting each cell) and can access "incident" elements of a
particular type (such as the points incident on the cell).
I found a few methods converting flat and logical indices using the old,
confusing from/to convention. This changes them to the new convention.
This commit adds the flag VTKm_ENABLE_GPU_MPI which when enable it
will use GPU AWARE MPI.
- This will only work with GPUs and MPI implementation that supports GPU
AWARE MPI calls.
- Enabling VTKm_ENABLE_GPU_MPI without MPI/GPU support might results in
errors when running VTK-m with DIY/MPI.
- Only the following tests can run with this feature if enabled:
- UnitTestSerializationDataSet
- UnitTestSerializationArrayHandle
The internal array copy has an optimization to use the device the array
exists on to do the copy. However, if that device is disabled the copy
would fail. This problem has been fixed.
The legacy VTK file reader previously only supported a specific set of Vec
lengths (i.e., 1, 2, 3, 4, 6, and 9). This is because a basic array
handle has to have the vec length compiled in. However, the new
`ArrayHandleRuntimeVec` feature is capable of reading in any vec-length
and can be leveraged to read in arbitrarily sized vectors in field
arrays.
5bdd3c7bc Move ArrayHandleRuntimeVec metadata to a separate class
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Li-Ta Lo <ollie@lanl.gov>
Merge-request: !3062
Originally, the metadata structure used by the `ArrayHandleRuntimeVec`
storage was a nested class of its `Storage`. But the `Storage` is
templated on the component type, and the metadata object is the same
regardless. In addition to the typical minor issue of having the
compiler create several identical classes, this caused problems when
pulling arrays from equivalent but technically different C types (for
example, `long` is the same as either `int` or `long long`).
There was an error in `operator-=` for `IteratorFromArrayPortal` that went
by unnoticed. The operator is fixed and regression tests for the operators
has been added.
`UnknownArrayHandle` is supposed to treat `ArrayHandleRuntimeVec` the
same as `ArrayHandleBasic`. However, the `NewInstance` methods were
failing because they need custom handling of the vec size.
b0715e14a ArrayHandle::StorageType should be public
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <morelandkd@ornl.gov>
Merge-request: !3038