Add changelog for extract component
This commit is contained in:
parent
03c3f9e178
commit
8dfd019423
226
docs/changelog/array-extract-component.md
Normal file
226
docs/changelog/array-extract-component.md
Normal file
@ -0,0 +1,226 @@
|
||||
# Extract component arrays from unknown arrays
|
||||
|
||||
One of the problems with the data structures of VTK-m is that non-templated
|
||||
classes like `DataSet`, `Field`, and `UnknownArrayHandle` (formally
|
||||
`VariantArrayHandle`) internally hold an `ArrayHandle` of a particular type
|
||||
that has to be cast to the correct task before it can be reasonably used.
|
||||
That in turn is problematic because the list of possible `ArrayHandle`
|
||||
types is very long.
|
||||
|
||||
At one time we were trying to compensate for this by using
|
||||
`ArrayHandleVirtual`. However, for technical reasons this class is
|
||||
infeasible for every use case of VTK-m and has been deprecated. Also, this
|
||||
was only a partial solution since using it still required different code
|
||||
paths for, say, handling values of `vtkm::Float32` and `vtkm::Vec3f_32`
|
||||
even though both are essentially arrays of 32-bit floats.
|
||||
|
||||
The extract component feature compensates for this problem by allowing you
|
||||
to extract the components from an `ArrayHandle`. This feature allows you to
|
||||
create a single code path to handle `ArrayHandle`s containing scalars or
|
||||
vectors of any size. Furthermore, when you extract a component from an
|
||||
array, the storage gets normalized so that one code path covers all storage
|
||||
types.
|
||||
|
||||
## `ArrayExtractComponent`
|
||||
|
||||
The basic enabling feature is a new function named `ArrayExtractComponent`.
|
||||
This function takes takes an `ArrayHandle` and an index to a component. It
|
||||
then returns an `ArrayHandleStride` holding the selected component of each
|
||||
entry in the original array.
|
||||
|
||||
We will get to the structure of `ArrayHandleStride` later. But the
|
||||
important part is that `ArrayHandleStride` does _not_ depend on the storage
|
||||
type of the original `ArrayHandle`. That means whether you extract a
|
||||
component from `ArrayHandleBasic`, `ArrayHandleSOA`,
|
||||
`ArrayHandleCartesianProduct`, or any other type, you get back the same
|
||||
`ArrayHandleStride`. Likewise, regardless of whether the input
|
||||
`ArrayHandle` has a `ValueType` of `FloatDefault`, `Vec2f`, `Vec3f`, or any
|
||||
other `Vec` of a default float, you get the same `ArrayHandleStride`. Thus,
|
||||
you can see how this feature can dramatically reduce code paths if used
|
||||
correctly.
|
||||
|
||||
It should be noted that `ArrayExtractComponent` will (logically) flatten
|
||||
the `ValueType` before extracting the component. Thus, nested `Vec`s such
|
||||
as `Vec<Vec3f, 3>` will be treated as a `Vec<FloatDefault, 9>`. The
|
||||
intention is so that the extracted component will always be a basic C type.
|
||||
For the purposes of this document when we refer to the "component type", we
|
||||
really mean the base component type.
|
||||
|
||||
Different `ArrayHandle` implementations provide their own implementations
|
||||
for `ArrayExtractComponent` so that the component can be extracted without
|
||||
deep copying all the data. We will visit how `ArrayHandleStride` can
|
||||
represent different data layouts later, but first let's go into the main
|
||||
use case.
|
||||
|
||||
## Extract components from `UnknownArrayHandle`
|
||||
|
||||
The principle use case for `ArrayExtractComponent` is to get an
|
||||
`ArrayHandle` from an unknown array handle without iterating over _every_
|
||||
possible type. (Rather, we iterate over a smaller set of types.) To
|
||||
facilitate this, an `ExtractComponent` method has been added to
|
||||
`UnknownArrayHandle`.
|
||||
|
||||
To use `UnknownArrayHandle::ExtractComponent`, you must give it the
|
||||
component type. You can check for the correct component type by using the
|
||||
`IsBaseComponentType` method. The method will then return an
|
||||
`ArrayHandleStride` for the component type specified.
|
||||
|
||||
### Example
|
||||
|
||||
As an example, let's say you have a worklet, `FooWorklet`, that does some
|
||||
per component operation on an array. Furthermore, let's say that you want
|
||||
to implement a function that, to the best of your ability, can apply
|
||||
`FooWorklet` on an array of any type. This function should be pre-compiled
|
||||
into a library so it doesn't have to be compiled over and over again.
|
||||
(`MapFieldPermutation` and `MapFieldMergeAverage` are real and important
|
||||
examples that have this behavior.)
|
||||
|
||||
Without the extract component feature, the implementation might look
|
||||
something like this (many practical details left out):
|
||||
|
||||
``` cpp
|
||||
struct ApplyFooFunctor
|
||||
{
|
||||
template <typename ArrayType>
|
||||
void operator()(const ArrayType& input, vtkm::cont::UnknownArrayHandle& output) const
|
||||
{
|
||||
ArrayType outputArray;
|
||||
vtkm::cont::Invoke invoke;
|
||||
invoke(FooWorklet{}, input, outputArray);
|
||||
output = outputArray;
|
||||
}
|
||||
};
|
||||
|
||||
vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
|
||||
{
|
||||
vtkm::cont::UnknownArrayHandle output;
|
||||
input.CastAndCallForTypes<vtkm::TypeListAll, VTKM_DEFAULT_STORAGE_LIST_TAG>(
|
||||
ApplyFooFunctor{}, output);
|
||||
return output;
|
||||
}
|
||||
```
|
||||
|
||||
Take a look specifically at the `CastAndCallForTypes` call near the bottom
|
||||
of this example. It calls for all types in `vtkm::TypeListAll`, which is
|
||||
about 40 instances. Then, it needs to be called for any type in the desired
|
||||
storage list. This could include basic arrays, SOA arrays, and lots of
|
||||
other specialized types. It would be expected for this code to generate
|
||||
over 100 paths for `ApplyFooFunctor`. This in turn contains a worklet
|
||||
invoke, which is not a small amount of code.
|
||||
|
||||
Now consider how we can use the `ExtractComponent` feature to reduce the
|
||||
code paths:
|
||||
|
||||
``` cpp
|
||||
struct ApplyFooFunctor
|
||||
{
|
||||
template <typename T>
|
||||
void operator()(T,
|
||||
const vtkm::cont::UnknownArrayHandle& input,
|
||||
cont vtkm::cont::UnknownArrayHandle& output) const
|
||||
{
|
||||
if (!input.IsBasicComponentType<T>()) { return; }
|
||||
VTKM_ASSERT(output.IsBasicComponentType<T>());
|
||||
|
||||
vtkm::cont::Invoke invoke;
|
||||
invoke(FooWorklet{}, input.ExtractComponent<T>(), output.ExtractComponent<T>());
|
||||
}
|
||||
};
|
||||
|
||||
vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
|
||||
{
|
||||
vtkm::cont::UnknownArrayHandle output = input.NewInstanceBasic();
|
||||
output.Allocate(input.GetNumberOfValues());
|
||||
vtkm::cont::ListForEach(ApplyFooFunctor{}, vtkm::TypeListScalarAll{}, input, output);
|
||||
return output;
|
||||
}
|
||||
```
|
||||
|
||||
The number of lines of code is about the same, but take a look at the
|
||||
`ListForEach` (which replaces the `CastAndCallForTypes`). This calling code
|
||||
takes `TypeListScalarAll` instead of `TypeListAll`, which reduces the
|
||||
instances created from around 40 to 13 (every basic C type). It is also no
|
||||
longer dependent on the storage, so these 13 instances are it. As an
|
||||
example of potential compile savings, changing the implementation of the
|
||||
`MapFieldMergePermutation` and `MapFieldMergeAverage` functions in this way
|
||||
reduced the filters_common library (on Mac, Debug build) by 24 MB (over a
|
||||
third of the total size).
|
||||
|
||||
Another great advantage of this approach is that even though it takes less
|
||||
time to compile and generates less code, it actually covers more cases.
|
||||
Have an array containg values of `Vec<short, 13>`? No problem. The values
|
||||
were actually stored in an `ArrayHandleReverse`? It will still work.
|
||||
|
||||
## `ArrayHandleStride`
|
||||
|
||||
This functionality is made possible with the new `ArrayHandleStride`. This
|
||||
array behaves much like `ArrayHandleBasic`, except that it contains an
|
||||
_offset_ parameter to specify where in the buffer array to start reading
|
||||
and a _stride_ parameter to specify how many entries to skip for each
|
||||
successive entry. `ArrayHandleStride` also has optional parameters
|
||||
`divisor` and `modulo` that allow indices to be repeated at regular
|
||||
intervals.
|
||||
|
||||
Here are how `ArrayHandleStride` extracts components from several common
|
||||
arrays. For each of these examples, we assume that the `ValueType` of the
|
||||
array is `Vec<T, N>`. They are each extracting _component_.
|
||||
|
||||
### Extracting from `ArrayHandleBasic`
|
||||
|
||||
When extracting from an `ArrayHandleBasic`, we just need to start at the
|
||||
proper component and skip the length of the `Vec`.
|
||||
|
||||
* _offset_: _component_
|
||||
* _stride_: `N`
|
||||
|
||||
### Extracting from `ArrayHandleSOA`
|
||||
|
||||
Since each component is held in a separate array, they are densly packed.
|
||||
Each component could be represented by `ArrayHandleBasic`, but of course we
|
||||
use `ArrayHandleStride` to keep the type consistent.
|
||||
|
||||
* _offset_: 0
|
||||
* _stride_: 1
|
||||
|
||||
### Extracting from `ArrayHandleCartesianProduct`
|
||||
|
||||
This array is the basic reason for implementing the _divisor_ and _modulo_
|
||||
parameters. Each of the 3 components have different parameters, which are
|
||||
the following (given that _dims_[3] captures the size of the 3 arrays for
|
||||
each dimension).
|
||||
|
||||
* _offset_: 0
|
||||
* _stride_: 1
|
||||
* case _component_ == 0
|
||||
* _divisor_: _ignored_
|
||||
* _modulo_: _dims_[0]
|
||||
* case _component_ == 1
|
||||
* _divisor_: _dims_[0]
|
||||
* _modulo_: _dims_[1]
|
||||
* case _component_ == 2
|
||||
* _divisor_: _dims_[0]
|
||||
* _modulo_: _ignored_
|
||||
|
||||
### Extracting from `ArrayHandleUniformPointCoordinates`
|
||||
|
||||
This array cannot be represented directly because it is fully implicit.
|
||||
However, it can be trivially converted to `ArrayHandleCartesianProduct` in
|
||||
typically very little memory. (In fact, EAVL always represented uniform
|
||||
point coordinates by explicitly storing a Cartesian product.) Thus, for
|
||||
very little overhead the `ArrayHandleStride` can be created.
|
||||
|
||||
## Runtime overhead of extracting components
|
||||
|
||||
These benefits come at a cost, but not a large one. The "biggest" cost is
|
||||
the small cost of computing index arithmetic for each access into
|
||||
`ArrayHandleStride`. To make this as efficient as possible, there are
|
||||
conditions that skip over the modulo and divide steps if they are not
|
||||
necessary. (Integer modulo and divide tend to take much longer than
|
||||
addition and multiplication.) It is for this reason that we probably do not
|
||||
want to use this method all the time.
|
||||
|
||||
Another cost is the fact that not every `ArrayHandle` can be represented by
|
||||
`ArrayHandleStride` directly without copying. If you ask to extract a
|
||||
component that cannot be directly represented, it will be copied into a
|
||||
basic array, which is not great. To make matters worse, for technical
|
||||
reasons this copy happens on the host rather than the device.
|
Loading…
Reference in New Issue
Block a user