mirror of
https://gitlab.kitware.com/vtk/vtk-m
synced 2024-09-19 18:45:43 +00:00
19c1b99f18
The major changes to VTK-m from 1.5.1 can be found in: docs/changelog/1.6.0/release-notes.md Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
2841 lines
112 KiB
Markdown
2841 lines
112 KiB
Markdown
VTK-m 1.6 Release Notes
|
|
=======================
|
|
|
|
# Table of Contents
|
|
|
|
1. [Core](#Core)
|
|
- Add Kokkos backend
|
|
- Deprecate `DataSetFieldAdd`
|
|
- Move VTK file readers and writers into vtkm_io
|
|
- Remove VTKDataSetWriter::WriteDataSet just_points parameter
|
|
- Added VecFlat class
|
|
- Add a vtkm::Tuple class
|
|
- DataSet now only allows unique field names
|
|
- Result DataSet of coordinate transform has its CoordinateSystem changed
|
|
- `vtkm::cont::internal::Buffer` now can have ownership transferred
|
|
- Configurable default types
|
|
2. [ArrayHandle](#ArrayHandle)
|
|
- Shorter fancy array handle classnames
|
|
- `ReadPortal().Get(idx)`
|
|
- Precompiled `ArrayCopy` for `UnknownArrayHandle`
|
|
- Create `ArrayHandleOffsetsToNumComponents`
|
|
- Recombine extracted component arrays from unknown arrays
|
|
- UnknownArrayHandle and UncertainArrayHandle for runtime-determined types
|
|
- Support `ArrayHandleSOA` as a "default" array
|
|
- Removed old `ArrayHandle` transfer mechanism
|
|
- Order asynchronous `ArrayHandle` access
|
|
- Improvements to moving data into ArrayHandle
|
|
- Deprecate ArrayHandleVirtualCoordinates
|
|
- ArrayHandleDecorator Allocate and Shrink Support
|
|
- Portals may advertise custom iterators
|
|
- Redesign of ArrayHandle to access data using typeless buffers
|
|
- `ArrayRangeCompute` works on any array type without compiling device code
|
|
- Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal
|
|
- Extract component arrays from unknown arrays
|
|
- `ArrayHandleGroupVecVariable` holds now one more offset.
|
|
3. [Control Environment](#Control-Environment)
|
|
- Algorithms for Control and Execution Environments
|
|
4. [Execution Environment](#Execution-Environment)
|
|
- Scope ExecObjects with Tokens
|
|
- Masks and Scatters Supported for 3D Scheduling
|
|
- Virtual methods in execution environment deprecated
|
|
- Deprecate Execute with policy
|
|
5. [Worklets and Filters](#Worklets-and-Filters)
|
|
- Enable setting invalid value in probe filter
|
|
- Avoid raising errors when operating on cells
|
|
- Add atomic free functions
|
|
- Flying Edges
|
|
- Filters specify their own field types
|
|
6. [Build](#Build)
|
|
- Disable asserts for CUDA architecture builds
|
|
- Disable asserts for HIP architecture builds
|
|
- Add VTKM_DEPRECATED macro
|
|
7. [Other](#Other)
|
|
- Porting layer for future std features
|
|
- Removed OpenGL Rendering Classes
|
|
- Reorganization of `io` directory
|
|
- Implemented PNG/PPM image Readers/Writers
|
|
- Updated Benchmark Framework
|
|
- Provide scripts to build Gitlab-ci workers locally
|
|
- Replaced `vtkm::ListTag` with `vtkm::List`
|
|
- Add `ListTagRemoveIf`
|
|
- Write uniform and rectilinear grids to legacy VTK files
|
|
8. [References](#References)
|
|
|
|
# Core
|
|
|
|
## Add Kokkos backend
|
|
|
|
Adds a new device backend `Kokkos` which uses the kokkos library for parallelism.
|
|
User must provide the kokkos build and Vtk-m will use the default configured execution
|
|
space.
|
|
|
|
## Deprecate `DataSetFieldAdd`
|
|
|
|
The class `vtkm::cont::DataSetFieldAdd` is now deprecated.
|
|
Its methods, `AddPointField` and `AddCellField` have been moved to member functions
|
|
of `vtkm::cont::DataSet`, which simplifies many calls.
|
|
|
|
For example, the following code
|
|
|
|
```cpp
|
|
vtkm::cont::DataSetFieldAdd fieldAdder;
|
|
fieldAdder.AddCellField(dataSet, "cellvar", values);
|
|
```
|
|
|
|
would now be
|
|
|
|
```cpp
|
|
dataSet.AddCellField("cellvar", values);
|
|
```
|
|
|
|
## Move VTK file readers and writers into vtkm_io
|
|
|
|
The legacy VTK file reader and writer were created back when VTK-m was a
|
|
header-only library. Things have changed and we now compile quite a bit of
|
|
code into libraries. At this point, there is no reason why the VTK file
|
|
reader/writer should be any different.
|
|
|
|
Thus, `VTKDataSetReader`, `VTKDataSetWriter`, and several supporting
|
|
classes are now compiled into the `vtkm_io` library. Also similarly updated
|
|
`BOVDataSetReader` for good measure.
|
|
|
|
As a side effect, code using VTK-m will need to link to `vtkm_io` if they
|
|
are using any readers or writers.
|
|
|
|
|
|
## Remove VTKDataSetWriter::WriteDataSet just_points parameter
|
|
|
|
In the method `VTKDataSetWriter::WriteDataSet`, `just_points` parameter has been
|
|
removed due to lack of usage.
|
|
|
|
The purpose of `just_points` was to allow exporting only the points of a
|
|
DataSet without its cell data.
|
|
|
|
|
|
## Added VecFlat class
|
|
|
|
`vtkm::VecFlat` is a wrapper around a `Vec`-like class that may be a nested
|
|
series of vectors. For example, if you run a gradient operation on a vector
|
|
field, you are probably going to get a `Vec` of `Vec`s that looks something
|
|
like `vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 3>`. That is fine, but what if
|
|
you want to treat the result simply as a `Vec` of size 9?
|
|
|
|
The `VecFlat` wrapper class allows you to do this. Simply place the nested
|
|
`Vec` as an argument to `VecFlat` and it will behave as a flat `Vec` class.
|
|
(In fact, `VecFlat` is a subclass of `Vec`.) The `VecFlat` class can be
|
|
copied to and from the nested `Vec` it is wrapping.
|
|
|
|
There is a `vtkm::make_VecFlat` convenience function that takes an object
|
|
and returns a `vtkm::VecFlat` wrapped around it.
|
|
|
|
`VecFlat` works with any `Vec`-like object as well as scalar values.
|
|
However, any type used with `VecFlat` must have `VecTraits` defined and the
|
|
number of components must be static (i.e. known at compile time).
|
|
|
|
|
|
## Add a vtkm::[Tuple](Tuple) class
|
|
|
|
This change added a `vtkm::Tuple` class that is very similar in nature to
|
|
`std::tuple`. This should replace our use of tao tuple.
|
|
|
|
The motivation for this change was some recent attempts at removing objects
|
|
like `Invocation` and `FunctionInterface`. I expected these changes to
|
|
speed up the build, but in fact they ended up slowing down the build. I
|
|
believe the problem was that these required packing variable parameters
|
|
into a tuple. I was using the tao `tuple` class, but it seemed to slow down
|
|
the compile. (That is, compiling tao's `tuple` seemed much slower than
|
|
compiling the equivalent `FunctionInterface` class.)
|
|
|
|
The implementation of `vtkm::Tuple` is using `pyexpander` to build lots of
|
|
simple template cases for the object (with a backup implementation for even
|
|
longer argument lists). I believe the compiler is better and parsing
|
|
through thousands of lines of simple templates than to employ clever MPL to
|
|
build general templates.
|
|
|
|
### Usage
|
|
|
|
The `vtkm::Tuple` class is defined in the `vtkm::Tuple.h` header file. A
|
|
`Tuple` is designed to behave much like a `std::tuple` with some minor
|
|
syntax differences to fit VTK-m coding standards.
|
|
|
|
A tuple is declared with a list of template argument types.
|
|
|
|
``` cpp
|
|
vtkm::Tuple<vtkm::Id, vtkm::Vec3f, vtkm::cont::ArrayHandle<vtkm::Float32>> myTuple;
|
|
```
|
|
|
|
If given no arguments, a `vtkm::Tuple` will default-construct its contained
|
|
objects. A `vtkm::Tuple` can also be constructed with the initial values of
|
|
all contained objects.
|
|
|
|
``` cpp
|
|
vtkm::Tuple<vtkm::Id, vtkm::Vec3f, vtkm::cont::ArrayHandle<vtkm::Float32>>
|
|
myTuple(0, vtkm::Vec3f(0, 1, 2), array);
|
|
```
|
|
|
|
For convenience there is a `vtkm::MakeTuple` function that takes arguments
|
|
and packs them into a `Tuple` of the appropriate type. (There is also a
|
|
`vtkm::make_tuple` alias to the function to match the `std` version.)
|
|
|
|
``` cpp
|
|
auto myTuple = vtkm::MakeTuple(0, vtkm::Vec3f(0, 1, 2), array);
|
|
```
|
|
|
|
Data is retrieved from a `Tuple` by using the `vtkm::Get` method. The `Get`
|
|
method is templated on the index to get the value from. The index is of
|
|
type `vtkm::IdComponent`. (There is also a `vtkm::get` that uses a
|
|
`std::size_t` as the index type as an alias to the function to match the
|
|
`std` version.)
|
|
|
|
``` cpp
|
|
vtkm::Id a = vtkm::Get<0>(myTuple);
|
|
vtkm::Vec3f b = vtkm::Get<1>(myTuple);
|
|
vtkm::cont::ArrayHandle<vtkm::Float32> c = vtkm::Get<2>(myTuple);
|
|
```
|
|
|
|
Likewise `vtkm::TupleSize` and `vtkm::TupleElement` (and their aliases
|
|
`vtkm::Tuple_size`, `vtkm::tuple_element`, and `vtkm::tuple_element_t`) are
|
|
provided.
|
|
|
|
### Extended Functionality
|
|
|
|
The `vtkm::Tuple` class contains some functionality beyond that of
|
|
`std::tuple` to cover some common use cases in VTK-m that are tricky to
|
|
implement. In particular, these methods allow you to use a `Tuple` as you
|
|
would commonly use parameter packs. This allows you to stash parameter
|
|
packs in a `Tuple` and then get them back out again.
|
|
|
|
#### For Each
|
|
|
|
`vtkm::Tuple::ForEach()` is a method that takes a function or functor and
|
|
calls it for each of the items in the tuple. Nothing is returned from
|
|
`ForEach` and any return value from the function is ignored.
|
|
|
|
`ForEach` can be used to check the validity of each item.
|
|
|
|
``` cpp
|
|
void CheckPositive(vtkm::Float64 x)
|
|
{
|
|
if (x < 0)
|
|
{
|
|
throw vtkm::cont::ErrorBadValue("Values need to be positive.");
|
|
}
|
|
}
|
|
|
|
// ...
|
|
|
|
vtkm::Tuple<vtkm::Float64, vtkm::Float64, vtkm::Float64> tuple(
|
|
CreateValue1(), CreateValue2(), CreateValue3());
|
|
|
|
// Will throw an error if any of the values are negative.
|
|
tuple.ForEach(CheckPositive);
|
|
```
|
|
|
|
`ForEach` can also be used to aggregate values.
|
|
|
|
``` cpp
|
|
struct SumFunctor
|
|
{
|
|
vtkm::Float64 Sum = 0;
|
|
|
|
template <typename T>
|
|
void operator()(const T& x)
|
|
{
|
|
this->Sum = this->Sum + static_cast<vtkm::Float64>(x);
|
|
}
|
|
};
|
|
|
|
// ...
|
|
|
|
vtkm::Tuple<vtkm::Float32, vtkm::Float64, vtkm::Id> tuple(
|
|
CreateValue1(), CreateValue2(), CreateValue3());
|
|
|
|
SumFunctor sum;
|
|
tuple.ForEach(sum);
|
|
vtkm::Float64 average = sum.Sum / 3;
|
|
```
|
|
|
|
#### Transform
|
|
|
|
`vtkm::Tuple::Transform` is a method that builds a new `Tuple` by calling a
|
|
function or functor on each of the items. The return value is placed in the
|
|
corresponding part of the resulting `Tuple`, and the type is automatically
|
|
created from the return type of the function.
|
|
|
|
``` cpp
|
|
struct GetReadPortalFunctor
|
|
{
|
|
template <typename Array>
|
|
typename Array::ReadPortal operator()(const Array& array) const
|
|
{
|
|
VTKM_IS_ARRAY_HANDLE(Array);
|
|
return array.ReadPortal();
|
|
}
|
|
};
|
|
|
|
// ...
|
|
|
|
auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
|
|
|
|
auto portalTuple = arrayTuple.Transform(GetReadPortalFunctor{});
|
|
```
|
|
|
|
#### Apply
|
|
|
|
`vtkm::Tuple::Apply` is a method that calls a function or functor using the
|
|
objects in the `Tuple` as the arguments. If the function returns a value,
|
|
that value is returned from `Apply`.
|
|
|
|
``` cpp
|
|
struct AddArraysFunctor
|
|
{
|
|
template <typename Array1, typename Array2, typename Array3>
|
|
vtkm::Id operator()(Array1 inArray1, Array2 inArray2, Array3 outArray) const
|
|
{
|
|
VTKM_IS_ARRAY_HANDLE(Array1);
|
|
VTKM_IS_ARRAY_HANDLE(Array2);
|
|
VTKM_IS_ARRAY_HANDLE(Array3);
|
|
|
|
vtkm::Id length = inArray1.GetNumberOfValues();
|
|
VTKM_ASSERT(inArray2.GetNumberOfValues() == length);
|
|
outArray.Allocate(length);
|
|
|
|
auto inPortal1 = inArray1.ReadPortal();
|
|
auto inPortal2 = inArray2.ReadPortal();
|
|
auto outPortal = outArray.WritePortal();
|
|
for (vtkm::Id index = 0; index < length; ++index)
|
|
{
|
|
outPortal.Set(index, inPortal1.Get(index) + inPortal2.Get(index));
|
|
}
|
|
|
|
return length;
|
|
}
|
|
};
|
|
|
|
// ...
|
|
|
|
auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
|
|
|
|
vtkm::Id arrayLength = arrayTuple.Apply(AddArraysFunctor{});
|
|
```
|
|
|
|
If additional arguments are given to `Apply`, they are also passed to the
|
|
function (before the objects in the `Tuple`). This is helpful for passing
|
|
state to the function. (This feature is not available in either `ForEach`
|
|
or `Transform` for technical implementation reasons.)
|
|
|
|
``` cpp
|
|
struct ScanArrayLengthFunctor
|
|
{
|
|
template <std::size_t N, typename Array, typename... Remaining>
|
|
std::array<vtkm::Id, N + 1 + sizeof...(Remaining)>
|
|
operator()(const std::array<vtkm::Id, N>& partialResult,
|
|
const Array& nextArray,
|
|
const Remaining&... remainingArrays) const
|
|
{
|
|
std::array<vtkm::Id, N + 1> nextResult;
|
|
std::copy(partialResult.begin(), partialResult.end(), nextResult.begin());
|
|
nextResult[N] = nextResult[N - 1] + nextArray.GetNumberOfValues();
|
|
return (*this)(nextResult, remainingArray);
|
|
}
|
|
|
|
template <std::size_t N>
|
|
std::array<vtkm::Id, N> operator()(const std::array<vtkm::Id, N>& result) const
|
|
{
|
|
return result;
|
|
}
|
|
};
|
|
|
|
// ...
|
|
|
|
auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
|
|
|
|
std::array<vtkm::Id, 4> =
|
|
arrayTuple.Apply(ScanArrayLengthFunctor{}, std::array<vtkm::Id, 1>{ 0 });
|
|
```
|
|
|
|
## DataSet now only allows unique field names
|
|
|
|
When you add a `vtkm::cont::Field` to a `vtkm::cont::DataSet`, it now
|
|
requires every `Field` to have a unique name. When you attempt to add a
|
|
`Field` to a `DataSet` that already has a `Field` of the same name and
|
|
association, the old `Field` is removed and replaced with the new `Field`.
|
|
|
|
You are allowed, however, to have two `Field`s with the same name but
|
|
different associations. For example, you could have a point `Field` named
|
|
"normals" and also have a cell `Field` named "normals" in the same
|
|
`DataSet`.
|
|
|
|
This new behavior matches how VTK's data sets manage fields.
|
|
|
|
The old behavior allowed you to add multiple `Field`s with the same name,
|
|
but it would be unclear which one you would get if you asked for a `Field`
|
|
by name.
|
|
|
|
|
|
## Result DataSet of coordinate transform has its CoordinateSystem changed
|
|
|
|
When you run one of the coordinate transform filters,
|
|
`CylindricalCoordinateTransform` or `SphericalCoordinateTransform`, the
|
|
transform coordiantes are placed as the first `CoordinateSystem` in the
|
|
returned `DataSet`. This means that after running this filter, the data
|
|
will be moved to this new coordinate space.
|
|
|
|
Previously, the result of these filters was just placed in a named `Field`
|
|
of the output. This caused some confusion because the filter did not seem
|
|
to have any effect (unless you knew to modify the output data). Not using
|
|
the result as the coordinate system seems like a dubious use case (and not
|
|
hard to work around), so this is much better behavior.
|
|
|
|
|
|
## `vtkm::cont::internal::Buffer` now can have ownership transferred
|
|
|
|
Memory once transferred to `Buffer` always had to be managed by VTK-m. This is problematic
|
|
for applications that needed VTK-m to allocate memory, but have the memory ownership
|
|
be longer than VTK-m.
|
|
|
|
`Buffer::TakeHostBufferOwnership` allows for easy transfer ownership of memory out of VTK-m.
|
|
When taking ownership of an VTK-m buffer you are provided the following information:
|
|
|
|
- Memory: A `void*` pointer to the array
|
|
- Container: A `void*` pointer used to free the memory. This is necessary to support cases such as allocations transferred into VTK-m from a `std::vector`.
|
|
- Delete: The function to call to actually delete the transferred memory
|
|
- Reallocate: The function to call to re-allocate the transferred memory. This will throw an exception if users try
|
|
to reallocate a buffer that was 'view' only
|
|
- Size: The size in number of elements of the array
|
|
|
|
|
|
To properly steal memory from VTK-m you do the following:
|
|
```cpp
|
|
vtkm::cont::ArrayHandle<T> arrayHandle;
|
|
|
|
...
|
|
|
|
auto stolen = arrayHandle.GetBuffers()->TakeHostBufferOwnership();
|
|
|
|
...
|
|
|
|
stolen.Delete(stolen.Container);
|
|
```
|
|
|
|
|
|
## Configurable default types
|
|
|
|
Because VTK-m compiles efficient code for accelerator architectures, it
|
|
often has to compile for static types. This means that dynamic types often
|
|
have to be determined at runtime and converted to static types. This is the
|
|
reason for the `CastAndCall` architecture in VTK-m.
|
|
|
|
For this `CastAndCall` to work, there has to be a finite set of static
|
|
types to try at runtime. If you don't compile in the types you need, you
|
|
will get runtime errors. However, the more types you compile in, the longer
|
|
the compile time and executable size. Thus, getting the types right is
|
|
important.
|
|
|
|
The "right" types to use can change depending on the application using
|
|
VTK-m. For example, when VTK links in VTK-m, it needs to support lots of
|
|
types and can sacrifice the compile times to do so. However, if using VTK-m
|
|
in situ with a fortran simulation, space and time are critical and you
|
|
might only need to worry about double SoA arrays.
|
|
|
|
Thus, it is important to customize what types VTK-m uses based on the
|
|
application. This leads to the oxymoronic phrase of configuring the default
|
|
types used by VTK-m.
|
|
|
|
This is being implemented by providing VTK-m with a header file that
|
|
defines the default types. The header file provided to VTK-m should define
|
|
one or more of the following preprocessor macros:
|
|
|
|
* `VTKM_DEFAULT_TYPE_LIST` - a `vtkm::List` of value types for fields that
|
|
filters should directly operate on (where applicable).
|
|
* `VTKM_DEFAULT_STORAGE_LIST` - a `vtkm::List` of storage tags for fields
|
|
that filters should directly operate on.
|
|
* `VTKM_DEFAULT_CELL_SET_LIST_STRUCTURED` - a `vtkm::List` of
|
|
`vtkm::cont::CellSet` types that filters should operate on as a
|
|
strutured cell set.
|
|
* `VTKM_DEFAULT_CELL_SET_LIST_UNSTRUCTURED` - a `vtkm::List` of
|
|
`vtkm::cont::CellSet` types that filters should operate on as an
|
|
unstrutured cell set.
|
|
* `VTKM_DEFAULT_CELL_SET_LIST` - a `vtkm::List` of `vtkm::cont::CellSet`
|
|
types that filters should operate on (where applicable). The default of
|
|
`vtkm::ListAppend<VTKM_DEFAULT_CELL_SET_LIST_STRUCTURED, VTKM_DEFAULT_CELL_SET_LIST>`
|
|
is usually correct.
|
|
|
|
If any of these macros are not defined, a default version will be defined.
|
|
(This is the same default used if no header file is provided.)
|
|
|
|
This header file is provided to the build by setting the
|
|
`VTKm_DEFAULT_TYPES_HEADER` CMake variable. `VTKm_DEFAULT_TYPES_HEADER`
|
|
points to the file, which will be configured and copied to VTK-m's build
|
|
directory.
|
|
|
|
For convenience, header files can be added to the VTK_m source directory
|
|
(conventionally under vtkm/cont/internal). If this is the case, an advanced
|
|
CMake option should be added to select the provided header file.
|
|
|
|
|
|
|
|
# ArrayHandle
|
|
|
|
## Shorter fancy array handle classnames
|
|
|
|
Many of the fancy `ArrayHandle`s use the generic builders like
|
|
`ArrayHandleTransform` and `ArrayHandleImplicit` for their implementation.
|
|
Such is fine, but because they use functors and other such generic items to
|
|
template their `Storage`, you can end up with very verbose classnames. This
|
|
is an issue for humans trying to discern classnames. It can also be an
|
|
issue for compilers that end up with very long resolved classnames that
|
|
might get truncated if they extend past what was expected.
|
|
|
|
The fix was for these classes to declare their own `Storage` tag and then
|
|
implement their `Storage` and `ArrayTransport` classes as trivial
|
|
subclasses of the generic `ArrayHandleImplicit` or `ArrayHandleTransport`.
|
|
|
|
As an added bonus, a lot of this shortening also means that storage that
|
|
relies on other array handles now are just typed by the storage of the
|
|
decorated type, not the array itself. This should make the types a little
|
|
more robust.
|
|
|
|
Here is a list of classes that were updated.
|
|
|
|
### `ArrayHandleCast<TargetT, vtkm::cont::ArrayHandle<SourceT, SourceStorage>>`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagTransform<
|
|
vtkm::cont::ArrayHandle<SourceT, SourceStorage>,
|
|
vtkm::cont::internal::Cast<TargetT, SourceT>,
|
|
vtkm::cont::internal::Cast<SourceT, TargetT>>
|
|
```
|
|
|
|
New Storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagCast<SourceT, SourceStorage>
|
|
```
|
|
|
|
(Developer's note: Implementing this change to `ArrayHandleCast` was a much bigger PITA than expected.)
|
|
|
|
### `ArrayHandleCartesianProduct<AH1, AH2, AH3>`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagCartesianProduct<
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag1,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag2,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag3>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagCartesianProduct<StorageTag1, StorageTag2, StorageTag3>
|
|
```
|
|
|
|
### `ArrayHandleCompositeVector<AH1, AH2, ...>`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagCompositeVector<
|
|
tao::tuple<
|
|
vtkm::cont::ArrayHandle<ValueType, StorageType1>,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageType2>,
|
|
...
|
|
>
|
|
>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagCompositeVec<StorageType1, StorageType2>
|
|
```
|
|
|
|
### `ArrayHandleConcatinate`
|
|
|
|
First an example with two simple types.
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagConcatenate<
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag2>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagConcatenate<StorageTag1, StorageTag2>
|
|
```
|
|
|
|
Now a more specific example taken from the unit test of a concatination of a concatination.
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagConcatenate<
|
|
vtkm::cont::ArrayHandleConcatenate<
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag2>>,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag3>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagConcatenate<
|
|
vtkm::cont::StorageTagConcatenate<StorageTag1, StorageTag2>, StorageTag3>
|
|
```
|
|
|
|
### `ArrayHandleConstant`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagImplicit<
|
|
vtkm::cont::detail::ArrayPortalImplicit<
|
|
vtkm::cont::detail::ConstantFunctor<ValueType>>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagConstant
|
|
```
|
|
|
|
### `ArrayHandleCounting`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagImplicit<vtkm::cont::internal::ArrayPortalCounting<ValueType>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagCounting
|
|
```
|
|
|
|
### `ArrayHandleGroupVec`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagGroupVec<
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag>, N>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagGroupVec<StorageTag, N>
|
|
```
|
|
|
|
### `ArrayHandleGroupVecVariable`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagGroupVecVariable<
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
|
|
vtkm::cont::ArrayHandle<vtkm::Id, StorageTag2>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagGroupVecVariable<StorageTag1, StorageTag2>
|
|
```
|
|
|
|
### `ArrayHandleIndex`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagImplicit<
|
|
vtkm::cont::detail::ArrayPortalImplicit<vtkm::cont::detail::IndexFunctor>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagIndex
|
|
```
|
|
|
|
### `ArrayHandlePermutation`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagPermutation<
|
|
vtkm::cont::ArrayHandle<vtkm::Id, StorageTag1>,
|
|
vtkm::cont::ArrayHandle<ValueType, StorageTag2>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagPermutation<StorageTag1, StorageTag2>
|
|
```
|
|
|
|
### `ArrayHandleReverse`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagReverse<vtkm::cont::ArrayHandle<ValueType, vtkm::cont::StorageTag>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagReverse<StorageTag>
|
|
```
|
|
|
|
### `ArrayHandleUniformPointCoordinates`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagImplicit<vtkm::internal::ArrayPortalUniformPointCoordinates>
|
|
```
|
|
|
|
New Storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagUniformPoints
|
|
```
|
|
|
|
### `ArrayHandleView`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagView<vtkm::cont::ArrayHandle<ValueType, StorageTag>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
'vtkm::cont::StorageTagView<StorageTag>
|
|
```
|
|
|
|
|
|
### `ArrayPortalZip`
|
|
|
|
Old storage:
|
|
``` cpp
|
|
vtkm::cont::internal::StorageTagZip<
|
|
vtkm::cont::ArrayHandle<ValueType1, StorageTag1>,
|
|
vtkm::cont::ArrayHandle<ValueType2, StorageTag2>>
|
|
```
|
|
|
|
New storage:
|
|
``` cpp
|
|
vtkm::cont::StorageTagZip<StorageTag1, StorageTag2>
|
|
```
|
|
|
|
|
|
## `ReadPortal().Get(idx)`
|
|
|
|
Calling `ReadPortal()` in a tight loop is an antipattern.
|
|
A call to `ReadPortal()` causes the array to be copied back to the control environment,
|
|
and hence code like
|
|
|
|
```cpp
|
|
for (vtkm::Id i = 0; i < array.GetNumberOfValues(); ++i) {
|
|
vtkm::FloatDefault x = array.ReadPortal().Get(i);
|
|
}
|
|
```
|
|
|
|
is a quadratic-scaling loop.
|
|
|
|
We have remove *almost* all internal uses of the `ReadPortal().Get` antipattern,
|
|
with the exception of 4 API calls into which the pattern is baked in:
|
|
`CellSetExplicit::GetCellShape`, `CellSetPermutation::GetNumberOfPointsInCell`, `CellSetPermutation::GetCellShape`, and `CellSetPermutation::GetCellPointIds`.
|
|
We expect these will need to be deprecated in the future.
|
|
|
|
## Precompiled `ArrayCopy` for `UnknownArrayHandle`
|
|
|
|
Previously, in order to copy an `UnknownArrayHandle`, you had to specify
|
|
some subset of types and then specially compile a copy for each potential
|
|
type. With the new ability to extract a component from an
|
|
`UnknownArrayHandle`, it is now feasible to precompile copying an
|
|
`UnknownArrayHandle` to another array. This greatly reduces the overhead of
|
|
using `ArrayCopy` to copy `UnknownArrayHandle`s while simultaneously
|
|
increasing the likelihood that the copy will be successful.
|
|
|
|
## Create `ArrayHandleOffsetsToNumComponents`
|
|
|
|
`ArrayHandleOffsetsToNumComponents` is a fancy array that takes an array of
|
|
offsets and converts it to an array of the number of components for each
|
|
packed entry.
|
|
|
|
It is common in VTK-m to pack small vectors of variable sizes into a single
|
|
contiguous array. For example, cells in an explicit cell set can each have
|
|
a different amount of vertices (triangles = 3, quads = 4, tetra = 4, hexa =
|
|
8, etc.). Generally, to access items in this list, you need an array of
|
|
components in each entry and the offset for each entry. However, if you
|
|
have just the array of offsets in sorted order, you can easily derive the
|
|
number of components for each entry by subtracting adjacent entries. This
|
|
works best if the offsets array has a size that is one more than the number
|
|
of packed vectors with the first entry set to 0 and the last entry set to
|
|
the total size of the packed array (the offset to the end).
|
|
|
|
When packing data of this nature, it is common to start with an array that
|
|
is the number of components. You can convert that to an offsets array using
|
|
the `vtkm::cont::ConvertNumComponentsToOffsets` function. This will create
|
|
an offsets array with one extra entry as previously described. You can then
|
|
throw out the original number of components array and use the offsets with
|
|
`ArrayHandleOffsetsToNumComponents` to represent both the offsets and num
|
|
components while storing only one array.
|
|
|
|
This replaces the use of `ArrayHandleDecorator` in `CellSetExplicit`.
|
|
The two implementation should do the same thing, but the new
|
|
`ArrayHandleOffsetsToNumComponents` should be less complex for
|
|
compilers.
|
|
|
|
## Recombine extracted component arrays from unknown arrays
|
|
|
|
Building on the recent capability to [extract component arrays from unknown
|
|
arrays](array-extract-component.md), there is now also the ability to
|
|
recombine these extracted arrays to a single `ArrayHandle`. It might seem
|
|
counterintuitive to break an `ArrayHandle` into component arrays and then
|
|
combine the component arrays back into a single `ArrayHandle`, but this is
|
|
a very handy way to run algorithms without knowing the exact `ArrayHandle`
|
|
type.
|
|
|
|
Recall that when extracting a component array from an `UnknownArrayHandle`
|
|
you only need to know the base component of the value type of the contained
|
|
`ArrayHandle`. That makes extracting a component array independent from
|
|
either the size of any `Vec` value type and any storage type.
|
|
|
|
The added `UnknownArrayHandle::ExtractArrayFromComponents` method allows
|
|
you to use the functionality to transform the unknown array handle to a
|
|
form of `ArrayHandle` that depends only on this base component type. This
|
|
method internally uses a new `ArrayHandleRecombineVec` class, but this
|
|
class is mostly intended for internal use by this class.
|
|
|
|
As an added convenience, `UnknownArrayHandle` now also provides the
|
|
`CastAndCallWithExtractedArray` method. This method works like other
|
|
`CastAndCall`s except that it uses the `ExtractArrayFromComponents` feature
|
|
to allow you to handle most `ArrayHandle` types with few template
|
|
instances.
|
|
|
|
|
|
## UnknownArrayHandle and UncertainArrayHandle for runtime-determined types
|
|
|
|
Two new classes have been added to VTK-m: `UnknownArrayHandle` and
|
|
`UncertainArrayHandle`. These classes serve the same purpose as the set of
|
|
`VariantArrayHandle` classes and will replace them.
|
|
|
|
Motivated mostly by the desire to move away from `ArrayHandleVirtual`, we
|
|
have multiple reasons to completely refactor the `VariantArrayHandle`
|
|
class. These include changing the implementation, some behavior, and even
|
|
the name.
|
|
|
|
### Motivation
|
|
|
|
We have several reasons that have accumulated to revisit the implementation
|
|
of `VariantArrayHandle`.
|
|
|
|
#### Move away from `ArrayHandleVirtual`
|
|
|
|
The current implementation of `VariantArrayHandle` internally stores the
|
|
array wrapped in an `ArrayHandleVirtual`. That makes sense since you might
|
|
as well consolidate the hierarchy of virtual objects into one.
|
|
|
|
Except `ArrayHandleVirtual` is being deprecated, so it no longer makes
|
|
sense to use that internally.
|
|
|
|
So we will transition the class back to managing the data as typeless on
|
|
its own. We will consider using function pointers rather than actual
|
|
virtual functions because compilers can be slow in creating lots of virtual
|
|
subclasses.
|
|
|
|
#### Reintroduce storage tag lists
|
|
|
|
The original implementation of `VariantArrayHandle` (which at the time was
|
|
called `DynamicArrayHandle`) actually had two type lists: one for the array
|
|
value type and one for the storage type. The storage type list was removed
|
|
soon after `ArrayHandleVirtual` was introduced because whatever the type of
|
|
array it could be access as `ArrayHandleVirtual`.
|
|
|
|
However, with `ArrayHandleVirtual` being deprecated, this feature is no
|
|
longer possible. We are in need again for the list of storage types to try.
|
|
Thus, we need to reintroduce this template argument to
|
|
`VariantArrayHandle`.
|
|
|
|
#### More clear name
|
|
|
|
The name of this class has always been unsatisfactory. The first name,
|
|
`DynamicArrayHandle`, makes it sound like the data is always changing. The
|
|
second name, `VariantArrayHandle`, makes it sound like an array that holds
|
|
a value type that can vary (like an `std::variant`).
|
|
|
|
We can use a more clear name that expresses better that it is holding an
|
|
`ArrayHandle` of an _unknown_ type.
|
|
|
|
#### Take advantage of default types for less templating
|
|
|
|
Once upon a time everything in VTK-m was templated header library. Things
|
|
have changed quite a bit since then. The most recent development is the
|
|
ability to select the "default types" with CMake configuration that allows
|
|
you to select a global set of types you care about during compilation. This
|
|
is so units like filters can be compiled into a library with all types we
|
|
care about, and we don't have to constantly recompile units.
|
|
|
|
This means that we are becoming less concerned about maintaining type lists
|
|
everywhere. Often we can drop the type list and pass data across libraries.
|
|
|
|
With that in mind, it makes less sense for `VariantArrayHandle` to actually
|
|
be a `using` alias for `VariantArrayHandleBase<VTKM_DEFAULT_TYPE_LIST>`.
|
|
|
|
In response, we can revert the is-a relationship between the two. Have a
|
|
completely typeless version as the base class and have a second version
|
|
templated version to express when the type of the array has been partially
|
|
narrowed down to given type lists.
|
|
|
|
### New Name and Structure
|
|
|
|
The ultimate purpose of this class is to store an `ArrayHandle` where the
|
|
value and storage types are unknown. Thus, an appropriate name for the
|
|
class is `UnknownArrayHandle`.
|
|
|
|
`UnknownArrayHandle` is _not_ templated. It simply stores an `ArrayHandle`
|
|
in a typeless (`void *`) buffer. It does, however, contain many templated
|
|
methods that allow you to query whether the contained array matches given
|
|
types, to cast to given types, and to cast and call to a given functor
|
|
(from either given type lists or default lists).
|
|
|
|
Rather than have a virtual class structure to manage the typeless array,
|
|
the new management will use function pointers. This has shown to sometimes
|
|
improve compile times and generate less code.
|
|
|
|
Sometimes it is the case that the set of potential types can be narrowed. In
|
|
this case, the array ceases to be unknown and becomes _uncertain_. Thus,
|
|
the companion class to `UnknownArrayHandle` is `UncertainArrayHandle`.
|
|
|
|
`UncertainArrayHandle` has two template parameters: a list of potential
|
|
value types and a list of potential storage types. The behavior of
|
|
`UncertainArrayHandle` matches that of `UnknownArrayHandle` (and might
|
|
inherit from it). However, for `CastAndCall` operations, it will use the
|
|
type lists defined in its template parameters.
|
|
|
|
### Serializing UnknownArrayHandle
|
|
|
|
Because `UnknownArrayHandle` is not templated, it contains some
|
|
opportunities to compile things into the `vtkm_cont` library. Templated
|
|
methods like `CastAndCall` cannot be, but the specializations of DIY's
|
|
serialize can be.
|
|
|
|
And since it only has to be compiled once into a library, we can spend some
|
|
extra time compiling for more types. We don't have to restrict ourselves to
|
|
`VTKM_DEFAULT_TYPE_LIST`. We can compile for vtkm::TypeListTagAll.
|
|
|
|
|
|
Support `ArrayHandleSOA` as a "default" array
|
|
|
|
Many programs, particularly simulations, store fields of vectors in
|
|
separate arrays for each component. This maps to the storage of
|
|
`ArrayHandleSOA`. The VTK-m code tends to prefer the AOS storage (which is
|
|
what is implemented in `ArrayHandleBasic`, and the behavior of which is
|
|
inherited from VTK). VTK-m should better support adding `ArrayHandleSOA` as
|
|
one of the types.
|
|
|
|
We now have a set of default types for Ascent that uses SOA as one of the
|
|
basic types.
|
|
|
|
Part of this change includes an intentional feature regression of
|
|
`ArrayHandleSOA` to only support value types of `Vec`. Previously, scalar
|
|
types were supported. However, the behavior of `ArrayHandleSOA` is exactly
|
|
the same as `ArrayHandleBasic`, except a lot more template code has to be
|
|
generated. That itself is not a huge deal, but because you have 2 types
|
|
that essentially do the same thing, a lot of template code in VTK-m would
|
|
unwind to create two separate code paths that do the same thing with the
|
|
same data. To avoid creating those code paths, we simply make any use of
|
|
`ArrayHandleSOA` without a `Vec` value invalid. This will prevent VTK-m
|
|
from creating those code paths.
|
|
|
|
|
|
## Removed old `ArrayHandle` transfer mechanism
|
|
|
|
Deleted the default implementation of `ArrayTransfer`. `ArrayTransfer` is
|
|
used with the old `ArrayHandle` style to move data between host and device.
|
|
The new version of `ArrayHandle` does not use `ArrayTransfer` at all
|
|
because this functionality is wrapped in `Buffer` (where it can exist in a
|
|
precompiled library).
|
|
|
|
Once all the old `ArrayHandle` classes are gone, this class will be removed
|
|
completely. Although all the remaining `ArrayHandle` classes provide their
|
|
own versions of `ArrayTransfer`, they still need the prototype to be
|
|
defined to specialize. Thus, the guts of the default `ArrayTransfer` are
|
|
removed and replaced with a compile error if you try to compile it.
|
|
|
|
Also removed `ArrayManagerExecution`. This class was used indirectly by the
|
|
old `ArrayHandle`, through `ArrayHandleTransfer`, to move data to and from
|
|
a device. This functionality has been replaced in the new `ArrayHandle`s
|
|
through the `Buffer` class (which can be compiled into libraries rather
|
|
than make every translation unit compile their own template).
|
|
|
|
|
|
## Order asynchronous `ArrayHandle` access
|
|
|
|
The recent feature of [tokens that scope access to
|
|
`ArrayHandle`s](scoping-tokens.md) allows multiple threads to use the same
|
|
`ArrayHandle`s without read/write hazards. The intent is twofold. First, it
|
|
allows two separate threads in the control environment to independently
|
|
schedule tasks. Second, it allows us to move toward scheduling worklets and
|
|
other algorithms asynchronously.
|
|
|
|
However, there was a flaw with the original implementation. Once requests
|
|
to an `ArrayHandle` get queued up, they are resolved in arbitrary order.
|
|
This might mean that things run in surprising and incorrect order.
|
|
|
|
### Problematic use case
|
|
|
|
To demonstrate the flaw in the original implementation, let us consider a
|
|
future scenario where when you invoke a worklet (on OpenMP or TBB), the
|
|
call to invoke returns immediately and the actual work is scheduled
|
|
asynchronously. Now let us say we have a sequence of 3 worklets we wish to
|
|
run: `Worklet1`, `Worklet2`, and `Worklet3`. One of `Worklet1`'s parameters
|
|
is a `FieldOut` that creates an intermediate `ArrayHandle` that we will
|
|
simply call `array`. `Worklet2` is given `array` as a `FieldInOut` to
|
|
modify its values. Finally, `Worklet3` is given `array` as a `FieldIn`. It
|
|
is clear that for the computation to be correct, the three worklets _must_
|
|
execute in the correct order of `Worklet1`, `Worklet2`, and `Worklet3`.
|
|
|
|
The problem is that if `Worklet2` and `Worklet3` are both scheduled before
|
|
`Worklet1` finishes, the order they are executed could be arbitrary. Let us
|
|
say that `Worklet1` is invoked, and the invoke call returns before the
|
|
execution of `Worklet1` finishes.
|
|
|
|
The calling code immediately invokes `Worklet2`. Because `array` is already
|
|
locked by `Worklet1`, `Worklet2` does not execute right away. Instead, it
|
|
waits on a condition variable of `array` until it is free. But even though
|
|
the scheduling of `Worklet2` is blocked, the invoke returns because we are
|
|
scheduling asynchronously.
|
|
|
|
Likewise, the calling code then immediately calls invoke for `Worklet3`.
|
|
`Worklet3` similarly waits on the condition variable of `array` until it is
|
|
free.
|
|
|
|
Let us assume the likely event that both `Worklet2` and `Worklet3` get
|
|
scheduled before `Worklet1` finishes. When `Worklet1` then later does
|
|
finish, it's token relinquishes the lock on `array`, which wakes up the
|
|
threads waiting for access to `array`. However, there is no imposed order on
|
|
in what order the waiting threads will acquire the lock and run. (At least,
|
|
I'm not aware of anything imposing an order.) Thus, it is quite possible
|
|
that `Worklet3` will wake up first. It will see that `array` is no longer
|
|
locked (because `Worklet1` has released it and `Worklet2` has not had a
|
|
chance to claim it).
|
|
|
|
Oops. Now `Worklet3` is operating on `array` before `Worklet2` has had a
|
|
chance to put the correct values in it. The results will be wrong.
|
|
|
|
### Queuing requests
|
|
|
|
What we want is to impose the restriction that locks to an `ArrayHandle`
|
|
get resolved in the order that they are requested. In the previous example,
|
|
we have 3 requests on an array that happen in a known order. We want
|
|
control given to them in the same order.
|
|
|
|
To implement this, we need to impose another restriction on the
|
|
`condition_variable` when waiting to read or write. We want the lock to go
|
|
to the thread that first started waiting. To do this, we added an
|
|
internal queue of `Token`s to the `ArrayHandle`.
|
|
|
|
In `ArrayHandle::WaitToRead` and `ArrayHandle::WaitToWrite`, it first adds
|
|
its `Token` to the back of the queue before waiting on the condition
|
|
variable. In the `CanRead` and `CanWrite` methods, it checks this queue to
|
|
see if the provided `Token` is at the front. If not, then the lock is
|
|
denied and the thread must continue to wait.
|
|
|
|
### Early enqueuing
|
|
|
|
Another issue that can happen in the previous example is that as threads
|
|
are spawned for the 3 different worklets, they may actually start running
|
|
in an unexpected order. So the thread running `Worklet3` might actually
|
|
start before the other 2 and place itself in the queue first.
|
|
|
|
The solution is to add a method to `ArrayHandle` called `Enqueue`. This
|
|
method takes a `Token` object and adds that `Token` to the queue. However,
|
|
regardless of where the `Token` ends up on the queue, the method
|
|
immediately returns. It does not attempt to lock the `ArrayHandle`.
|
|
|
|
So now we can ensure that `Worklet1` properly locks `array` with this
|
|
sequence of events. First, the main thread calls `array.Enqueue`. Then a
|
|
thread is spawned to call `PrepareForOutput`.
|
|
|
|
Even if control returns to the calling code and it calls invoke for
|
|
`Worklet2` before this spawned thread starts, `Worklet2` cannot start
|
|
first. When `PrepareForInput` is called on `array`, it is queued after the
|
|
`Token` for `Worklet1`, even if `Worklet1` has not started waiting on the
|
|
`array`.
|
|
|
|
|
|
## Improvements to moving data into ArrayHandle
|
|
|
|
We have made several improvements to adding data into an `ArrayHandle`.
|
|
|
|
### Moving data from an `std::vector`
|
|
|
|
For numerous reasons, it is convenient to define data in a `std::vector`
|
|
and then wrap that into an `ArrayHandle`. There are two obvious ways to do
|
|
this. First, you could deep copy the data into an `ArrayHandle`, which has
|
|
obvious drawbacks. Second, you could take the pointer for the data in the
|
|
`std::vector` and use that as user-allocated memory in the `ArrayHandle`
|
|
without deep copying it. The problem with this shallow copy is that it is
|
|
unsafe. If the `std::vector` goes out of scope (or gets resized), then the
|
|
data the `ArrayHandle` is pointing to becomes unallocated, which will lead
|
|
to unpredictable behavior.
|
|
|
|
However, there is a third option. It is often the case that an
|
|
`std::vector` is filled and then becomes unused once it is converted to an
|
|
`ArrayHandle`. In this case, what we really want is to pass the data off to
|
|
the `ArrayHandle` so that the `ArrayHandle` is now managing the data and
|
|
not the `std::vector`.
|
|
|
|
C++11 has a mechanism to do this: move semantics. You can now pass
|
|
variables to functions as an "rvalue" (right-hand value). When something is
|
|
passed as an rvalue, it can pull state out of that variable and move it
|
|
somewhere else. `std::vector` implements this movement so that an rvalue
|
|
can be moved to another `std::vector` without actually copying the data.
|
|
`make_ArrayHandle` now also takes advantage of this feature to move rvalue
|
|
`std::vector`s.
|
|
|
|
There is a special form of `make_ArrayHandle` named `make_ArrayHandleMove`
|
|
that takes an rvalue. There is also a special overload of
|
|
`make_ArrayHandle` itself that handles an rvalue `vector`. (However, using
|
|
the explicit move version is better if you want to make sure the data is
|
|
actually moved.)
|
|
|
|
So if you create the `std::vector` in the call to `make_ArrayHandle`, then
|
|
the data only gets created once.
|
|
|
|
``` cpp
|
|
auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>{ 2, 6, 1, 7, 4, 3, 9 });
|
|
```
|
|
|
|
Note that there is now a better way to express an initializer list to
|
|
`ArrayHandle` documented below. But this form of `ArrayHandleMove` can be
|
|
particularly useful for initializing an array to all of a particular value.
|
|
For example, an easy way to initialize an array of 1000 elements all to 1
|
|
is
|
|
|
|
``` cpp
|
|
auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>(1000, 1));
|
|
```
|
|
|
|
You can also move the data from an already created `std::vector` by using
|
|
the `std::move` function to convert it to an rvalue. When you do this, the
|
|
`std::vector` becomes invalid after the call and any use will be undefined.
|
|
|
|
``` cpp
|
|
std::vector<vtkm::Id> vector;
|
|
// fill vector
|
|
|
|
auto array = vtkm::cont::make_ArrayHandleMove(std::move(vector));
|
|
```
|
|
|
|
### Make `ArrayHandle` from initalizer list
|
|
|
|
A common use case for using `std::vector` (particularly in our unit tests)
|
|
is to quickly add an initalizer list into an `ArrayHandle`. Repeating the
|
|
example from above:
|
|
|
|
``` cpp
|
|
auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>{ 2, 6, 1, 7, 4, 3, 9 });
|
|
```
|
|
|
|
However, creating the `std::vector` should be unnecessary. Why not be able
|
|
to create the `ArrayHandle` directly from an initializer list? Now you can
|
|
by simply passing an initializer list to `make_ArrayHandle`.
|
|
|
|
``` cpp
|
|
auto array = vtkm::cont::make_ArrayHandle({ 2, 6, 1, 7, 4, 3, 9 });
|
|
```
|
|
|
|
There is an issue here. The type here can be a little ambiguous (for
|
|
humans). In this case, `array` will be of type
|
|
`vtkm::cont::ArrayHandleBasic<int>`, since that is what an integer literal
|
|
defaults to. This could be a problem if, for example, you want to use
|
|
`array` as an array of `vtkm::Id`, which could be of type `vtkm::Int64`.
|
|
This is easily remedied by specifying the desired value type as a template
|
|
argument to `make_ArrayHandle`.
|
|
|
|
``` cpp
|
|
auto array = vtkm::cont::make_ArrayHandle<vtkm::Id>({ 2, 6, 1, 7, 4, 3, 9 });
|
|
```
|
|
|
|
### Deprecated `make_ArrayHandle` with default shallow copy
|
|
|
|
For historical reasons, passing an `std::vector` or a pointer to
|
|
`make_ArrayHandle` does a shallow copy (i.e. `CopyFlag` defaults to `Off`).
|
|
Although more efficient, this mode is inherintly unsafe, and making it the
|
|
default is asking for trouble.
|
|
|
|
To combat this, calling `make_ArrayHandle` without a copy flag is
|
|
deprecated. In this way, if you wish to do the faster but more unsafe
|
|
creation of an `ArrayHandle` you should explicitly express that.
|
|
|
|
This requried quite a few changes through the VTK-m source (particularly in
|
|
the tests).
|
|
|
|
### Similar changes to `Field`
|
|
|
|
`vtkm::cont::Field` has a `make_Field` helper function that is similar to
|
|
`make_ArrayHandle`. It also features the ability to create fields from
|
|
`std::vector`s and C arrays. It also likewise had the same unsafe behavior
|
|
by default of not copying from the source of the arrays.
|
|
|
|
That behavior has similarly been depreciated. You now have to specify a
|
|
copy flag.
|
|
|
|
The ability to construct a `Field` from an initializer list of values has
|
|
also been added.
|
|
|
|
|
|
## Deprecate ArrayHandleVirtualCoordinates
|
|
|
|
As we port VTK-m to more types of accelerator architectures, supporting
|
|
virtual methods is becoming more problematic. Thus, we are working to back
|
|
out of using virtual methods in the execution environment.
|
|
|
|
One of the most widespread users of virtual methods in the execution
|
|
environment is `ArrayHandleVirtual`. As a first step of deprecating this
|
|
class, we first deprecate the `ArrayHandleVirtualCoordinates` subclass.
|
|
|
|
Not surprisingly, `ArrayHandleVirtualCoordinates` is used directly by
|
|
`CoordinateSystem`. The biggest change necessary was that the `GetData`
|
|
method returned an `ArrayHandleVirtualCoordinates`, which obviously would
|
|
not work if that class is deprecated.
|
|
|
|
An oddness about this return type is that it is quite different from the
|
|
superclass's method of the same name. Rather, `Field` returns a
|
|
`VariantArrayHandle`. Since this had to be corrected anyway, it was decided
|
|
to change `CoordinateSystem`'s `GetData` to also return a
|
|
`VariantArrayHandle`, although its typelist is set to just `vtkm::Vec3f`.
|
|
|
|
To try to still support old code that expects the deprecated behavior of
|
|
returning an `ArrayHandleVirtualCoordinates`, `CoordinateSystem::GetData`
|
|
actually returns a "hidden" subclass of `VariantArrayHandle` that
|
|
automatically converts itself to an `ArrayHandleVirtualCoordinates`. (A
|
|
deprecation warning is given if this is done.)
|
|
|
|
This approach to support deprecated code is not perfect. The returned value
|
|
for `CoordinateSystem::GetData` can only be used as an `ArrayHandle` if a
|
|
method is directly called on it or if it is cast specifically to
|
|
`ArrayHandleVirtualCoordinates` or its superclass. For example, if passing
|
|
it to a method argument typed as `vtkm::cont::ArrayHandle<T, S>` where `T`
|
|
and `S` are template parameters, then the conversion will fail.
|
|
|
|
To continue to support ease of use, `CoordinateSystem` now has a method
|
|
named `GetDataAsMultiplexer` that returns the data as an
|
|
`ArrayHandleMultiplexer`. This can be employed to quickly use the
|
|
`CoordinateSystem` as an array without the overhead of a `CastAndCall`.
|
|
|
|
## ArrayHandleDecorator Allocate and Shrink Support
|
|
|
|
`ArrayHandleDecorator` can now be resized when given an appropriate
|
|
decorator implementation.
|
|
|
|
Since the mapping between the size of an `ArrayHandleDecorator` and its source
|
|
`ArrayHandle`s is not well defined, resize operations (such as `Shrink` and
|
|
`Allocate`) are not defined by default, and will throw an exception if called.
|
|
|
|
However, by implementing the methods `AllocateSourceArrays` and/or
|
|
`ShrinkSourceArrays` on the implementation class, resizing the decorator is
|
|
allowed. These methods are passed in a new size along with each of the
|
|
`ArrayHandleDecorator`'s source arrays, allowing developers to control how
|
|
the resize operation should affect the source arrays.
|
|
|
|
For example, the following decorator implementation can be used to create a
|
|
resizable `ArrayHandleDecorator` that is implemented using two arrays, which
|
|
are combined to produce values via the expression:
|
|
|
|
```
|
|
[decorator value i] = [source1 value i] * 10 + [source2 value i]
|
|
```
|
|
|
|
Implementation:
|
|
|
|
```c++
|
|
template <typename ValueType>
|
|
struct DecompositionDecorImpl
|
|
{
|
|
template <typename Portal1T, typename Portal2T>
|
|
struct Functor
|
|
{
|
|
Portal1T Portal1;
|
|
Portal2T Portal2;
|
|
|
|
VTKM_EXEC_CONT
|
|
ValueType operator()(vtkm::Id idx) const
|
|
{
|
|
return static_cast<ValueType>(this->Portal1.Get(idx) * 10 + this->Portal2.Get(idx));
|
|
}
|
|
};
|
|
|
|
template <typename Portal1T, typename Portal2T>
|
|
struct InverseFunctor
|
|
{
|
|
Portal1T Portal1;
|
|
Portal2T Portal2;
|
|
|
|
VTKM_EXEC_CONT
|
|
void operator()(vtkm::Id idx, const ValueType& val) const
|
|
{
|
|
this->Portal1.Set(idx, static_cast<ValueType>(std::floor(val / 10)));
|
|
this->Portal2.Set(idx, static_cast<ValueType>(std::fmod(val, 10)));
|
|
}
|
|
};
|
|
|
|
template <typename Portal1T, typename Portal2T>
|
|
VTKM_CONT Functor<typename std::decay<Portal1T>::type, typename std::decay<Portal2T>::type>
|
|
CreateFunctor(Portal1T&& p1, Portal2T&& p2) const
|
|
{
|
|
return { std::forward<Portal1T>(p1), std::forward<Portal2T>(p2) };
|
|
}
|
|
|
|
template <typename Portal1T, typename Portal2T>
|
|
VTKM_CONT InverseFunctor<typename std::decay<Portal1T>::type, typename std::decay<Portal2T>::type>
|
|
CreateInverseFunctor(Portal1T&& p1, Portal2T&& p2) const
|
|
{
|
|
return { std::forward<Portal1T>(p1), std::forward<Portal2T>(p2) };
|
|
}
|
|
|
|
// Resize methods:
|
|
template <typename Array1T, typename Array2T>
|
|
VTKM_CONT
|
|
void AllocateSourceArrays(vtkm::Id numVals, Array1T&& array1, Array2T&& array2) const
|
|
{
|
|
array1.Allocate(numVals);
|
|
array2.Allocate(numVals);
|
|
}
|
|
|
|
template <typename Array1T, typename Array2T>
|
|
VTKM_CONT
|
|
void ShrinkSourceArrays(vtkm::Id numVals, Array1T&& array1, Array2T&& array2) const
|
|
{
|
|
array1.Shrink(numVals);
|
|
array2.Shrink(numVals);
|
|
}
|
|
};
|
|
|
|
// Usage:
|
|
vtkm::cont::ArrayHandle<ValueType> a1;
|
|
vtkm::cont::ArrayHandle<ValueType> a2;
|
|
auto decor = vtkm::cont::make_ArrayHandleDecorator(0, DecompositionDecorImpl<ValueType>{}, a1, a2);
|
|
|
|
decor.Allocate(5);
|
|
{
|
|
auto decorPortal = decor.GetPortalControl();
|
|
decorPortal.Set(0, 13);
|
|
decorPortal.Set(1, 8);
|
|
decorPortal.Set(2, 43);
|
|
decorPortal.Set(3, 92);
|
|
decorPortal.Set(4, 117);
|
|
}
|
|
|
|
// a1: { 1, 0, 4, 9, 11 }
|
|
// a2: { 3, 8, 3, 2, 7 }
|
|
// decor: { 13, 8, 43, 92, 117 }
|
|
|
|
decor.Shrink(3);
|
|
|
|
// a1: { 1, 0, 4 }
|
|
// a2: { 3, 8, 3 }
|
|
// decor: { 13, 8, 43 }
|
|
|
|
```
|
|
|
|
|
|
## Portals may advertise custom iterators
|
|
|
|
The `ArrayPortalToIterator` utilities are used to produce STL-style iterators
|
|
from vtk-m's `ArrayHandle` portals. By default, a facade class is constructed
|
|
around the portal API, adapting it to an iterator interface.
|
|
|
|
However, some portals use iterators internally, or may be able to construct a
|
|
lightweight iterator easily. For these, it is preferable to directly use the
|
|
specialized iterators instead of going through the generic facade. A portal may
|
|
now declare the following optional API to advertise that it has custom
|
|
iterators:
|
|
|
|
```
|
|
struct MyPortal
|
|
{
|
|
using IteratorType = ...; // alias to the portal's specialized iterator type
|
|
IteratorType GetIteratorBegin(); // Return the begin iterator
|
|
IteratorType GetIteratorEnd(); // Return the end iterator
|
|
|
|
// ...rest of ArrayPortal API...
|
|
};
|
|
```
|
|
|
|
If these members are present, `ArrayPortalToIterators` will forward the portal's
|
|
specialized iterators instead of constructing a facade. This works when using
|
|
the `ArrayPortalToIterators` class directly, and also with the
|
|
`ArrayPortalToIteratorBegin` and `ArrayPortalToIteratorEnd` convenience
|
|
functions.
|
|
|
|
|
|
## Redesign of ArrayHandle to access data using typeless buffers
|
|
|
|
The original implementation of `ArrayHandle` is meant to be very generic.
|
|
To define an `ArrayHandle`, you actually create a `Storage` class that
|
|
maintains the data and provides portals to access it (on the host). Because
|
|
the `Storage` can provide any type of data structure it wants, you also
|
|
need to define an `ArrayTransfer` that describes how to move the
|
|
`ArrayHandle` to and from a device. It also has to be repeated for every
|
|
translation unit that uses them.
|
|
|
|
This is a very powerful mechanism. However, one of the major problems with
|
|
this approach is that every `ArrayHandle` type needs to have a separate
|
|
compile path for every value type crossed with every device. Because of
|
|
this limitation, the `ArrayHandle` for the basic storage has a special
|
|
implementation that manages the actual data allocation and movement as
|
|
`void *` arrays. In this way all the data management can be compiled once
|
|
and put into the `vtkm_cont` library. This has dramatically improved the
|
|
VTK-m compile time.
|
|
|
|
This new design replicates the basic `ArrayHandle`'s success to all other
|
|
storage types. The basic idea is to make the implementation of
|
|
`ArrayHandle` storage slightly less generic. Instead of requiring it to
|
|
manage the data it stores, it instead just builds `ArrayPortal`s from
|
|
`void` pointers that it is given. The management of `void` pointers can be
|
|
done in non-templated classes that are compiled into a library.
|
|
|
|
This initial implementation does not convert all `ArrayHandle`s to avoid
|
|
making non-backward compatible changes before the next minor revision of
|
|
VTK-m. In particular, it would be particularly difficult to convert
|
|
`ArrayHandleVirtual`. It could be done, but it would be a lot of work for a
|
|
class that will likely be removed.
|
|
|
|
### Buffer
|
|
|
|
Key to these changes is the introduction of a
|
|
`vtkm::cont::internal::Buffer` object. As the name implies, the `Buffer`
|
|
object manages a single block of bytes. `Buffer` is agnostic to the type of
|
|
data being stored. It only knows the length of the buffer in bytes. It is
|
|
responsible for allocating space on the host and any devices as necessary
|
|
and for transferring data among them. (Since `Buffer` knows nothing about
|
|
the type of data, a precondition of VTK-m would be that the host and all
|
|
devices have to have the same endian.)
|
|
|
|
The idea of the `Buffer` object is similar in nature to the existing
|
|
`vtkm::cont::internal::ExecutionArrayInterfaceBasicBase` except that it
|
|
will manage a buffer of data among the control and all devices rather than
|
|
in one device through a templated subclass.
|
|
|
|
As explained below, `ArrayHandle` holds some fixed number of `Buffer`
|
|
objects. (The number can be zero for implicit `ArrayHandle`s.) Because all
|
|
the interaction with the devices happen through `Buffer`, it will no longer
|
|
be necessary to compile any reference to `ArrayHandle` for devices (e.g.
|
|
you won't have to use nvcc just because the code links `ArrayHandle.h`).
|
|
|
|
### Storage
|
|
|
|
The `vtkm::cont::internal::Storage` class changes dramatically. Although an
|
|
instance will be kept, the intention is for `Storage` itself to be a
|
|
stateless object. It will manage its data through `Buffer` objects provided
|
|
from the `ArrayHandle`.
|
|
|
|
That said, it is possible for `Storage` to have some state. For example,
|
|
the `Storage` for `ArrayHandleImplicit` must hold on to the instance of the
|
|
portal used to manage the state.
|
|
|
|
|
|
### ArrayTransport
|
|
|
|
The `vtkm::cont::internal::ArrayTransfer` class will be removed completely.
|
|
All data transfers will be handled internally with the `Buffer` object
|
|
|
|
### Portals
|
|
|
|
A big change for this design is that the type of a portal for an
|
|
`ArrayHandle` will be the same for all devices and the host. Thus, we no
|
|
longer need specialized versions of portals for each device. We only have
|
|
one portal type. And since they are constructed from `void *` pointers, one
|
|
method can create them all.
|
|
|
|
|
|
### Advantages
|
|
|
|
The `ArrayHandle` interface should not change significantly for external
|
|
uses, but this redesign offers several advantages.
|
|
|
|
#### Faster Compiles
|
|
|
|
Because the memory management is contained in a non-templated `Buffer`
|
|
class, it can be compiled once in a library and used by all template
|
|
instances of `ArrayHandle`. It should have similar compile advantages to
|
|
our current specialization of the basic `ArrayHandle`, but applied to all
|
|
types of `ArrayHandle`s.
|
|
|
|
#### Fewer Templates
|
|
|
|
Hand-in-hand with faster compiles, the new design should require fewer
|
|
templates and template instances. We have immediately gotten rid of
|
|
`ArrayTransport`. `Storage` is also much shorter. Because all
|
|
`ArrayPortal`s are the same for every device and the host, we need many
|
|
fewer versions of those classes. In the device adapter, we can probably
|
|
collapse the three `ArrayManagerExecution` classes into a single, much
|
|
simpler class that does simple memory allocation and copy.
|
|
|
|
#### Fewer files need to be compiled for CUDA
|
|
|
|
Including `ArrayHandle.h` no longer adds code that compiles for a device.
|
|
Thus, we should no longer need to compile for a specific device adapter
|
|
just because we access an `ArrayHandle`. This should make it much easier to
|
|
achieve our goal of a "firewall". That is, code that just calls VTK-m
|
|
filters does not need to support all its compilers and flags.
|
|
|
|
#### Simpler ArrayHandle specialization
|
|
|
|
The newer code should simplify the implementation of special `ArrayHandle`s
|
|
a bit. You need only implement an `ArrayPortal` that operates on one or
|
|
more `void *` arrays and a simple `Storage` class.
|
|
|
|
#### Out of band memory sharing
|
|
|
|
With the current version of `ArrayHandle`, if you want to take data from
|
|
one `ArrayHandle` you pretty much have to create a special template to wrap
|
|
another `ArrayHandle` around that. With this new design, it is possible to
|
|
take data from one `ArrayHandle` and give it to another `ArrayHandle` of a
|
|
completely different type. You can't do this willy-nilly since different
|
|
`ArrayHandle` types will interpret buffers differently. But there can be
|
|
some special important use cases.
|
|
|
|
One such case could be an `ArrayHandle` that provides strided access to a
|
|
buffer. (Let's call it `ArrayHandleStride`.) The idea is that it interprets
|
|
the buffer as an array for a particular type (like a basic `ArrayHandle`)
|
|
but also defines a stride, skip, and repeat so that given an index it looks
|
|
up the value `((index / skip) % repeat) * stride`. The point is that it can
|
|
take an AoS array of tuples and represent an array of one of the
|
|
components.
|
|
|
|
The point would be that if you had a `VariantArrayHandle` or `Field`, you
|
|
could pull out an array of one of the components as an `ArrayHandleStride`.
|
|
An `ArrayHandleStride<vtkm::Float32>` could be used to represent that data
|
|
that comes from any basic `ArrayHandle` with `vtkm::Float32` or a
|
|
`vtkm::Vec` of that type. It could also represent data from an
|
|
`ArrayHandleCartesianProduct` and `ArrayHandleSoA`. We could even represent
|
|
an `ArrayHandleUniformPointCoordinates` by just making a small array. This
|
|
allows us to statically access a whole bunch of potential array storage
|
|
classes with a single type.
|
|
|
|
#### Potentially faster device transfers
|
|
|
|
There is currently a fast-path for basic `ArrayHandle`s that does a block
|
|
cuda memcpy between host and device. But for other `ArrayHandle`s that do
|
|
not defer their `ArrayTransfer` to a sub-array, the transfer first has to
|
|
copy the data into a known buffer.
|
|
|
|
Because this new design stores all data in `Buffer` objects, any of these
|
|
can be easily and efficiently copied between devices.
|
|
|
|
### Disadvantages
|
|
|
|
This new design gives up some features of the original `ArrayHandle` design.
|
|
|
|
#### Can only interface data that can be represented in a fixed number of buffers
|
|
|
|
Because the original `ArrayHandle` design required the `Storage` to
|
|
completely manage the data, it could represent it in any way possible. In
|
|
this redesign, the data need to be stored in some fixed number of memory
|
|
buffers.
|
|
|
|
This is a pretty open requirement. I suspect most data formats will be
|
|
storable in this. The user's guide has an example of data stored in a
|
|
`std::deque` that will not be representable. But that is probably not a
|
|
particularly practical example.
|
|
|
|
#### VTK-m would only be able to support hosts and devices with the same endian
|
|
|
|
Because data are transferred as `void *` blocks of memory, there is no way
|
|
to correct words if the endian on the two devices does not agree. As far as
|
|
I know, there should be no issues with the proposed ECP machines.
|
|
|
|
If endian becomes an issue, it might be possible to specify a word length
|
|
in the `Buffer`. That would assume that all numbers stored in the `Buffer`
|
|
have the same word length.
|
|
|
|
#### ArrayPortals must be completely recompiled in each translation unit
|
|
|
|
We can declare that an `ArrayHandle` does not need to include the device
|
|
adapter header files in part because it no longer needs specialized
|
|
`ArrayPortal`s for each device. However, that means that a translation unit
|
|
compiled with the host compiler (say gcc) will produce different code for
|
|
the `ArrayPortal`s than those with the device compiler (say nvcc). This
|
|
could lead to numerous linking problems.
|
|
|
|
To get around these issues, we will probably have to enforce no exporting
|
|
of any of the `ArrayPotal` symbols and force them all to be recompiled for
|
|
each translation unit. This will serve to increase the compile times a bit.
|
|
We will probably also still encounter linking errors as there would be no
|
|
way to enforce this requirement.
|
|
|
|
#### Cannot have specialized portals for the control environment
|
|
|
|
Because the new design unifies `ArrayPortal` types across control and
|
|
execution environments, it is no longer possible to have a special version
|
|
for the control environment to manage resources. This will require removing
|
|
some recent behavior of control portals such as with MR !1988.
|
|
|
|
|
|
## `ArrayRangeCompute` works on any array type without compiling device code
|
|
|
|
Originally, `ArrayRangeCompute` required you to know specifically the
|
|
`ArrayHandle` type (value type and storage type) and to compile using any
|
|
device compiler. The method is changed to include only overloads that have
|
|
precompiled versions of `ArrayRangeCompute`.
|
|
|
|
Additionally, an `ArrayRangeCompute` overload that takes an
|
|
`UnknownArrayHandle` has been added. In addition to allowing you to compute
|
|
the range of arrays of unknown types, this implementation of
|
|
`ArrayRangeCompute` serves as a fallback for `ArrayHandle` types that are
|
|
not otherwise explicitly supported.
|
|
|
|
If you really want to make sure that you compute the range directly on an
|
|
`ArrayHandle` of a particular type, you can include
|
|
`ArrayRangeComputeTemplate.h`, which contains a templated overload of
|
|
`ArrayRangeCompute` that directly computes the range of an `ArrayHandle`.
|
|
Including this header requires compiling for device code.
|
|
|
|
|
|
## Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal
|
|
|
|
ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal were added to provide
|
|
an efficient way to generate pseudo random numbers in parallel. They are based on the
|
|
Philox parallel pseudo random number generator. ArrayHandleRandomUniformBits provides
|
|
64-bits random bits in the whole range of UInt64 as its content while
|
|
ArrayHandleRandomUniformReal provides random Float64 in the range of [0, 1). User can
|
|
either provide a seed in the form of Vec<vtkm::Uint32, 1> or use the default random
|
|
source provided by the C++ standard library. Both of the ArrayHandles are lazy evaluated
|
|
as other Fancy ArrayHandles such that they only have O(1) memory overhead. They are
|
|
stateless and functional and does not change once constructed. To generate a new set of
|
|
random numbers, for example as part of a iterative algorithm, a new ArrayHandle
|
|
needs to be constructed in each iteration. See the user's guide for more detail and
|
|
examples.
|
|
|
|
|
|
## Extract component arrays from unknown arrays
|
|
|
|
One of the problems with the data structures of VTK-m is that non-templated
|
|
classes like `DataSet`, `Field`, and `UnknownArrayHandle` (formally
|
|
`VariantArrayHandle`) internally hold an `ArrayHandle` of a particular type
|
|
that has to be cast to the correct task before it can be reasonably used.
|
|
That in turn is problematic because the list of possible `ArrayHandle`
|
|
types is very long.
|
|
|
|
At one time we were trying to compensate for this by using
|
|
`ArrayHandleVirtual`. However, for technical reasons this class is
|
|
infeasible for every use case of VTK-m and has been deprecated. Also, this
|
|
was only a partial solution since using it still required different code
|
|
paths for, say, handling values of `vtkm::Float32` and `vtkm::Vec3f_32`
|
|
even though both are essentially arrays of 32-bit floats.
|
|
|
|
The extract component feature compensates for this problem by allowing you
|
|
to extract the components from an `ArrayHandle`. This feature allows you to
|
|
create a single code path to handle `ArrayHandle`s containing scalars or
|
|
vectors of any size. Furthermore, when you extract a component from an
|
|
array, the storage gets normalized so that one code path covers all storage
|
|
types.
|
|
|
|
### `ArrayExtractComponent`
|
|
|
|
The basic enabling feature is a new function named `ArrayExtractComponent`.
|
|
This function takes takes an `ArrayHandle` and an index to a component. It
|
|
then returns an `ArrayHandleStride` holding the selected component of each
|
|
entry in the original array.
|
|
|
|
We will get to the structure of `ArrayHandleStride` later. But the
|
|
important part is that `ArrayHandleStride` does _not_ depend on the storage
|
|
type of the original `ArrayHandle`. That means whether you extract a
|
|
component from `ArrayHandleBasic`, `ArrayHandleSOA`,
|
|
`ArrayHandleCartesianProduct`, or any other type, you get back the same
|
|
`ArrayHandleStride`. Likewise, regardless of whether the input
|
|
`ArrayHandle` has a `ValueType` of `FloatDefault`, `Vec2f`, `Vec3f`, or any
|
|
other `Vec` of a default float, you get the same `ArrayHandleStride`. Thus,
|
|
you can see how this feature can dramatically reduce code paths if used
|
|
correctly.
|
|
|
|
It should be noted that `ArrayExtractComponent` will (logically) flatten
|
|
the `ValueType` before extracting the component. Thus, nested `Vec`s such
|
|
as `Vec<Vec3f, 3>` will be treated as a `Vec<FloatDefault, 9>`. The
|
|
intention is so that the extracted component will always be a basic C type.
|
|
For the purposes of this document when we refer to the "component type", we
|
|
really mean the base component type.
|
|
|
|
Different `ArrayHandle` implementations provide their own implementations
|
|
for `ArrayExtractComponent` so that the component can be extracted without
|
|
deep copying all the data. We will visit how `ArrayHandleStride` can
|
|
represent different data layouts later, but first let's go into the main
|
|
use case.
|
|
|
|
### Extract components from `UnknownArrayHandle`
|
|
|
|
The principle use case for `ArrayExtractComponent` is to get an
|
|
`ArrayHandle` from an unknown array handle without iterating over _every_
|
|
possible type. (Rather, we iterate over a smaller set of types.) To
|
|
facilitate this, an `ExtractComponent` method has been added to
|
|
`UnknownArrayHandle`.
|
|
|
|
To use `UnknownArrayHandle::ExtractComponent`, you must give it the
|
|
component type. You can check for the correct component type by using the
|
|
`IsBaseComponentType` method. The method will then return an
|
|
`ArrayHandleStride` for the component type specified.
|
|
|
|
#### Example
|
|
|
|
As an example, let's say you have a worklet, `FooWorklet`, that does some
|
|
per component operation on an array. Furthermore, let's say that you want
|
|
to implement a function that, to the best of your ability, can apply
|
|
`FooWorklet` on an array of any type. This function should be pre-compiled
|
|
into a library so it doesn't have to be compiled over and over again.
|
|
(`MapFieldPermutation` and `MapFieldMergeAverage` are real and important
|
|
examples that have this behavior.)
|
|
|
|
Without the extract component feature, the implementation might look
|
|
something like this (many practical details left out):
|
|
|
|
``` cpp
|
|
struct ApplyFooFunctor
|
|
{
|
|
template <typename ArrayType>
|
|
void operator()(const ArrayType& input, vtkm::cont::UnknownArrayHandle& output) const
|
|
{
|
|
ArrayType outputArray;
|
|
vtkm::cont::Invoke invoke;
|
|
invoke(FooWorklet{}, input, outputArray);
|
|
output = outputArray;
|
|
}
|
|
};
|
|
|
|
vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
|
|
{
|
|
vtkm::cont::UnknownArrayHandle output;
|
|
input.CastAndCallForTypes<vtkm::TypeListAll, VTKM_DEFAULT_STORAGE_LIST_TAG>(
|
|
ApplyFooFunctor{}, output);
|
|
return output;
|
|
}
|
|
```
|
|
|
|
Take a look specifically at the `CastAndCallForTypes` call near the bottom
|
|
of this example. It calls for all types in `vtkm::TypeListAll`, which is
|
|
about 40 instances. Then, it needs to be called for any type in the desired
|
|
storage list. This could include basic arrays, SOA arrays, and lots of
|
|
other specialized types. It would be expected for this code to generate
|
|
over 100 paths for `ApplyFooFunctor`. This in turn contains a worklet
|
|
invoke, which is not a small amount of code.
|
|
|
|
Now consider how we can use the `ExtractComponent` feature to reduce the
|
|
code paths:
|
|
|
|
``` cpp
|
|
struct ApplyFooFunctor
|
|
{
|
|
template <typename T>
|
|
void operator()(T,
|
|
const vtkm::cont::UnknownArrayHandle& input,
|
|
cont vtkm::cont::UnknownArrayHandle& output) const
|
|
{
|
|
if (!input.IsBasicComponentType<T>()) { return; }
|
|
VTKM_ASSERT(output.IsBasicComponentType<T>());
|
|
|
|
vtkm::cont::Invoke invoke;
|
|
invoke(FooWorklet{}, input.ExtractComponent<T>(), output.ExtractComponent<T>());
|
|
}
|
|
};
|
|
|
|
vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
|
|
{
|
|
vtkm::cont::UnknownArrayHandle output = input.NewInstanceBasic();
|
|
output.Allocate(input.GetNumberOfValues());
|
|
vtkm::cont::ListForEach(ApplyFooFunctor{}, vtkm::TypeListScalarAll{}, input, output);
|
|
return output;
|
|
}
|
|
```
|
|
|
|
The number of lines of code is about the same, but take a look at the
|
|
`ListForEach` (which replaces the `CastAndCallForTypes`). This calling code
|
|
takes `TypeListScalarAll` instead of `TypeListAll`, which reduces the
|
|
instances created from around 40 to 13 (every basic C type). It is also no
|
|
longer dependent on the storage, so these 13 instances are it. As an
|
|
example of potential compile savings, changing the implementation of the
|
|
`MapFieldMergePermutation` and `MapFieldMergeAverage` functions in this way
|
|
reduced the filters_common library (on Mac, Debug build) by 24 MB (over a
|
|
third of the total size).
|
|
|
|
Another great advantage of this approach is that even though it takes less
|
|
time to compile and generates less code, it actually covers more cases.
|
|
Have an array containg values of `Vec<short, 13>`? No problem. The values
|
|
were actually stored in an `ArrayHandleReverse`? It will still work.
|
|
|
|
### `ArrayHandleStride`
|
|
|
|
This functionality is made possible with the new `ArrayHandleStride`. This
|
|
array behaves much like `ArrayHandleBasic`, except that it contains an
|
|
_offset_ parameter to specify where in the buffer array to start reading
|
|
and a _stride_ parameter to specify how many entries to skip for each
|
|
successive entry. `ArrayHandleStride` also has optional parameters
|
|
`divisor` and `modulo` that allow indices to be repeated at regular
|
|
intervals.
|
|
|
|
Here are how `ArrayHandleStride` extracts components from several common
|
|
arrays. For each of these examples, we assume that the `ValueType` of the
|
|
array is `Vec<T, N>`. They are each extracting _component_.
|
|
|
|
#### Extracting from `ArrayHandleBasic`
|
|
|
|
When extracting from an `ArrayHandleBasic`, we just need to start at the
|
|
proper component and skip the length of the `Vec`.
|
|
|
|
* _offset_: _component_
|
|
* _stride_: `N`
|
|
|
|
#### Extracting from `ArrayHandleSOA`
|
|
|
|
Since each component is held in a separate array, they are densly packed.
|
|
Each component could be represented by `ArrayHandleBasic`, but of course we
|
|
use `ArrayHandleStride` to keep the type consistent.
|
|
|
|
* _offset_: 0
|
|
* _stride_: 1
|
|
|
|
#### Extracting from `ArrayHandleCartesianProduct`
|
|
|
|
This array is the basic reason for implementing the _divisor_ and _modulo_
|
|
parameters. Each of the 3 components have different parameters, which are
|
|
the following (given that _dims_[3] captures the size of the 3 arrays for
|
|
each dimension).
|
|
|
|
* _offset_: 0
|
|
* _stride_: 1
|
|
* case _component_ == 0
|
|
* _divisor_: _ignored_
|
|
* _modulo_: _dims_[0]
|
|
* case _component_ == 1
|
|
* _divisor_: _dims_[0]
|
|
* _modulo_: _dims_[1]
|
|
* case _component_ == 2
|
|
* _divisor_: _dims_[0]
|
|
* _modulo_: _ignored_
|
|
|
|
#### Extracting from `ArrayHandleUniformPointCoordinates`
|
|
|
|
This array cannot be represented directly because it is fully implicit.
|
|
However, it can be trivially converted to `ArrayHandleCartesianProduct` in
|
|
typically very little memory. (In fact, EAVL always represented uniform
|
|
point coordinates by explicitly storing a Cartesian product.) Thus, for
|
|
very little overhead the `ArrayHandleStride` can be created.
|
|
|
|
### Runtime overhead of extracting components
|
|
|
|
These benefits come at a cost, but not a large one. The "biggest" cost is
|
|
the small cost of computing index arithmetic for each access into
|
|
`ArrayHandleStride`. To make this as efficient as possible, there are
|
|
conditions that skip over the modulo and divide steps if they are not
|
|
necessary. (Integer modulo and divide tend to take much longer than
|
|
addition and multiplication.) It is for this reason that we probably do not
|
|
want to use this method all the time.
|
|
|
|
Another cost is the fact that not every `ArrayHandle` can be represented by
|
|
`ArrayHandleStride` directly without copying. If you ask to extract a
|
|
component that cannot be directly represented, it will be copied into a
|
|
basic array, which is not great. To make matters worse, for technical
|
|
reasons this copy happens on the host rather than the device.
|
|
|
|
|
|
## `ArrayHandleGroupVecVariable` holds now one more offset.
|
|
|
|
This change affects the usage of both `ConvertNumComponentsToOffsets` and
|
|
`make_ArrayHandleGroupVecVariable`.
|
|
|
|
The reason of this change is to remove a branch in
|
|
`ArrayHandleGroupVecVariable::Get` which is used to avoid an array overflow,
|
|
this in theory would increases the performance since at the CPU level it will
|
|
remove penalties due to wrong branch predictions.
|
|
|
|
The change affects `ConvertNumComponentsToOffsets` by both:
|
|
|
|
1. Increasing the numbers of elements in `offsetsArray` (its second parameter)
|
|
by one.
|
|
|
|
2. Setting `sourceArraySize` as the sum of all the elements plus the new one
|
|
in `offsetsArray`
|
|
|
|
Note that not every specialization of `ConvertNumComponentsToOffsets` does
|
|
return `offsetsArray`. Thus, some of them would not be affected.
|
|
|
|
Similarly, this change affects `make_ArrayHandleGroupVecVariable` since it
|
|
expects its second parameter (offsetsArray) to be one element bigger than
|
|
before.
|
|
|
|
# Control Environment
|
|
|
|
## Algorithms for Control and Execution Environments
|
|
|
|
The `<vtkm/Algorithms.h>` header has been added to provide common STL-style
|
|
generic algorithms that are suitable for use in both the control and execution
|
|
environments. This is necessary as the STL algorithms in the `<algorithm>`
|
|
header are not marked up for use in execution environments such as CUDA.
|
|
|
|
In addition to the markup, these algorithms have convenience overloads to
|
|
support ArrayPortals directly, simplifying their usage with VTK-m data
|
|
structures.
|
|
|
|
Currently, three related algorithms are provided: `LowerBounds`, `UpperBounds`,
|
|
and `BinarySearch`. `BinarySearch` differs from the STL `std::binary_search`
|
|
algorithm in that it returns an iterator (or index) to a matching element,
|
|
rather than just a boolean indicating whether a or not key is present.
|
|
|
|
The new algorithm signatures are:
|
|
|
|
```c++
|
|
namespace vtkm
|
|
{
|
|
|
|
template <typename IterT, typename T, typename Comp>
|
|
VTKM_EXEC_CONT
|
|
IterT BinarySearch(IterT first, IterT last, const T& val, Comp comp);
|
|
|
|
template <typename IterT, typename T>
|
|
VTKM_EXEC_CONT
|
|
IterT BinarySearch(IterT first, IterT last, const T& val);
|
|
|
|
template <typename PortalT, typename T, typename Comp>
|
|
VTKM_EXEC_CONT
|
|
vtkm::Id BinarySearch(const PortalT& portal, const T& val, Comp comp);
|
|
|
|
template <typename PortalT, typename T>
|
|
VTKM_EXEC_CONT
|
|
vtkm::Id BinarySearch(const PortalT& portal, const T& val);
|
|
|
|
template <typename IterT, typename T, typename Comp>
|
|
VTKM_EXEC_CONT
|
|
IterT LowerBound(IterT first, IterT last, const T& val, Comp comp);
|
|
|
|
template <typename IterT, typename T>
|
|
VTKM_EXEC_CONT
|
|
IterT LowerBound(IterT first, IterT last, const T& val);
|
|
|
|
template <typename PortalT, typename T, typename Comp>
|
|
VTKM_EXEC_CONT
|
|
vtkm::Id LowerBound(const PortalT& portal, const T& val, Comp comp);
|
|
|
|
template <typename PortalT, typename T>
|
|
VTKM_EXEC_CONT
|
|
vtkm::Id LowerBound(const PortalT& portal, const T& val);
|
|
|
|
template <typename IterT, typename T, typename Comp>
|
|
VTKM_EXEC_CONT
|
|
IterT UpperBound(IterT first, IterT last, const T& val, Comp comp);
|
|
|
|
template <typename IterT, typename T>
|
|
VTKM_EXEC_CONT
|
|
IterT UpperBound(IterT first, IterT last, const T& val);
|
|
|
|
template <typename PortalT, typename T, typename Comp>
|
|
VTKM_EXEC_CONT
|
|
vtkm::Id UpperBound(const PortalT& portal, const T& val, Comp comp);
|
|
|
|
template <typename PortalT, typename T>
|
|
VTKM_EXEC_CONT
|
|
vtkm::Id UpperBound(const PortalT& portal, const T& val);
|
|
|
|
}
|
|
```
|
|
|
|
# Execution Environment
|
|
|
|
## Scope ExecObjects with Tokens
|
|
|
|
When VTK-m's `ArrayHandle` was originally designed, it was assumed that the
|
|
control environment would run on a single thread. However, multiple users
|
|
have expressed realistic use cases in which they would like to control
|
|
VTK-m from multiple threads (for example, to control multiple devices).
|
|
Consequently, it is important that VTK-m's control classes work correctly
|
|
when used simultaneously from multiple threads.
|
|
|
|
The original `PrepareFor*` methods of `ArrayHandle` returned an object to
|
|
be used in the execution environment on a particular device that pointed to
|
|
data in the array. The pointer to the data was contingent on the state of
|
|
the `ArrayHandle` not changing. The assumption was that the calling code
|
|
would immediately use the returned execution environment object and would
|
|
not further change the `ArrayHandle` until done with the execution
|
|
environment object.
|
|
|
|
This assumption is broken if multiple threads are running in the control
|
|
environment. For example, if one thread has called `PrepareForInput` to get
|
|
an execution array portal, the portal or its data could become invalid if
|
|
another thread calls `PrepareForOutput` on the same array. Initially one
|
|
would think that a well designed program should not share `ArrayHandle`s in
|
|
this way, but there are good reasons to need to do so. For example, when
|
|
using `vtkm::cont::PartitionedDataSet` where multiple partitions share a
|
|
coordinate system (very common), it becomes unsafe to work on multiple
|
|
blocks in parallel on different devices.
|
|
|
|
What we really want is the code to be able to specify more explicitly when
|
|
the execution object is in use. Ideally, the execution object itself would
|
|
maintain the resources it is using. However, that will not work in this
|
|
case since the object has to pass from control to execution environment and
|
|
back. The resource allocation will break when the object is passed to an
|
|
offloaded device and back.
|
|
|
|
Because we cannot use the object itself to manage its own resources, we use
|
|
a proxy object we are calling a `Token`. The `Token` object manages the
|
|
scope of the return execution object. As long as the `Token` is still in
|
|
scope, the execution object will remain valid. When the `Token` is
|
|
destroyed (or `DetachFromAll` is called on it), then the execution object
|
|
is no longer protected.
|
|
|
|
When a `Token` is attached to an `ArrayHandle` to protect an execution
|
|
object, it's read or write mode is recorded. Multiple `Token`s can be
|
|
attached to read the `ArrayHandle` at the same time. However, only one
|
|
`Token` can be used to write to the `ArrayHandle`.
|
|
|
|
### Basic `ArrayHandle` use
|
|
|
|
The basic use of the `PrepareFor*` methods of `ArrayHandle` remain the
|
|
same. The only difference is the addition of a `Token` parameter.
|
|
|
|
``` cpp
|
|
template <typename Device>
|
|
void LowLevelArray(vtkm::cont::ArrayHandle<vtkm::Float32> array, Device)
|
|
{
|
|
vtkm::cont::Token token;
|
|
auto portal = array.PrepareForOutput(ARRAY_SIZE, Device{}, token);
|
|
// At this point, array is locked from anyone else from reading or modifying
|
|
vtkm::cont::DeviceAdapterAlgorithm<Device>::Schedule(MyKernel(portal), ARRAY_SIZE);
|
|
|
|
// When the function finishes, token goes out of scope and array opens up
|
|
// for other uses.
|
|
}
|
|
```
|
|
|
|
### Execution objects
|
|
|
|
To make sure that execution objects are scoped correctly, many changes
|
|
needed to be made to propagate a `Token` reference from the top of the
|
|
scope to where the execution object is actually made. The most noticeable
|
|
place for this was for implementations of
|
|
`vtkm::cont::ExecutionObjectBase`. Most implementations of
|
|
`ExecutionObjectBase` create an object that requires data from an
|
|
`ArrayHandle`.
|
|
|
|
Previously, a subclass of `ExecutionObjectBase` was expected to have a
|
|
method named `PrepareForExecution` that had a single argument: the device
|
|
tag (or id) to make an object for. Now, subclasses of `ExecutionObjectBase`
|
|
should have a `PrepareForExecution` that takes two arguments: the device
|
|
and a `Token` to use for scoping the execution object.
|
|
|
|
``` cpp
|
|
struct MyExecObject : vtkm::cont::ExecutionObjectBase
|
|
{
|
|
vtkm::cont::ArrayHandle<vtkm::Float32> Array;
|
|
|
|
template <typename Device>
|
|
VTKM_CONT
|
|
MyExec<Device> PrepareForExecution(Device device, vtkm::cont::Token& token)
|
|
{
|
|
MyExec<Device> object;
|
|
object.Portal = this->Array.PrepareForInput(device, token);
|
|
return object;
|
|
}
|
|
};
|
|
```
|
|
|
|
It actually still works to use the old style of `PrepareForExecution`.
|
|
However, you will get a deprecation warning (on supported compilers) when
|
|
you try to use it.
|
|
|
|
### Invoke and Dispatcher
|
|
|
|
The `Dispatcher` classes now internally define a `Token` object during the
|
|
call to `Invoke`. (Likewise, `Invoker` will have a `Token` defined during
|
|
its invoke.) This internal `Token` is used when preparing `ArrayHandle`s
|
|
and `ExecutionObject`s for the execution environment. (Details in the next
|
|
section on how that works.)
|
|
|
|
Because the invoke uses a `Token` to protect its arguments, it will block
|
|
the execution of other worklets attempting to access arrays in a way that
|
|
could cause read-write hazards. In the following example, the second
|
|
worklet will not be able to execute until the first worklet finishes.
|
|
|
|
``` cpp
|
|
vtkm::cont::Invoker invoke;
|
|
invoke(Worklet1{}, input, intermediate);
|
|
invoke(Worklet2{}, intermediate, output); // Will not execute until Worklet1 finishes.
|
|
```
|
|
|
|
That said, invocations _can_ share arrays if their use will not cause
|
|
read-write hazards. In particular, two invocations can both use the same
|
|
array if they are both strictly reading from it. In the following example,
|
|
both worklets can potentially execute at the same time.
|
|
|
|
``` cpp
|
|
vtkm::cont::Invoker invoke;
|
|
invoke(Worklet1{}, input, output1);
|
|
invoke(Worklet2{}, input, output2); // Will not block
|
|
```
|
|
|
|
The same `Token` is used for all arguments to the `Worklet`. This deatil is
|
|
important to prevent deadlocks if the same object is used in more than one
|
|
`Worklet` parameter. As a simple example, if a `Worklet` has a control
|
|
signature like
|
|
|
|
``` cpp
|
|
using ControlSignature = void(FieldIn, FieldOut);
|
|
```
|
|
|
|
it should continue to work to use the same array as both fields.
|
|
|
|
``` cpp
|
|
vtkm::cont::Invoker invoke;
|
|
invoke(Worklet1{}, array, array);
|
|
```
|
|
|
|
### Transport
|
|
|
|
The dispatch mechanism of worklets internally uses
|
|
`vtkm::cont::arg::Transport` objects to automatically move data from the
|
|
control environment to the execution environment. These `Transport` object
|
|
now take a `Token` when doing the transportation. This all happens under
|
|
the covers for most users.
|
|
|
|
### Control Portals
|
|
|
|
The `GetPortalConstControl` and `GetPortalControl` methods have been
|
|
deprecated. Instead, the methods `ReadPortal` and `WritePortal` should be
|
|
used. The calling signature is the same as their predecessors, but the
|
|
returned portal contains a reference back to the original `ArrayHandle`.
|
|
The reference keeps track of whether the memory allocation has changed.
|
|
|
|
If the `ArrayHandle` is changed while the `ArrayPortal` still exists,
|
|
nothing will happen immediately. However, if the portal is subsequently
|
|
accessed (i.e. `Set` or `Get` is called on it), then a fatal error will be
|
|
reported to the log.
|
|
|
|
### Deadlocks
|
|
|
|
Now that portal objects from `ArrayHandle`s have finite scope (as opposed
|
|
to able to be immediately invalidated), the scopes have the ability to
|
|
cause operations to block. This can cause issues if the `ArrayHandle` is
|
|
attempted to be used by multiple `Token`s at once.
|
|
|
|
The following is a contrived example of causing a deadlock.
|
|
|
|
``` cpp
|
|
vtkm::cont::Token token1;
|
|
auto portal1 = array.PrepareForInPlace(Device{}, token1);
|
|
|
|
vtkm::cont::Token token2;
|
|
auto portal2 = array.PrepareForInput(Device{}, token2);
|
|
```
|
|
|
|
The last line will deadlock as `PrepareForInput` waits for `token1` to
|
|
detach, which will never happen. To prevent this from happening, if you use
|
|
the same `Token` on the array, it will always allow the action. Thus, the
|
|
following will work fine.
|
|
|
|
``` cpp
|
|
vtkm::cont::Token token;
|
|
|
|
auto portal1 = array.PrepareForInPlace(Device{}, token);
|
|
auto portal2 = array.PrepareForInput(Device{}, token);
|
|
```
|
|
|
|
This prevents deadlock during the invocation of a worklet (so long as no
|
|
intermediate object tries to create its own `Token`, which would be bad
|
|
practice).
|
|
|
|
Deadlocks are more likely when actually running multiple threads in the
|
|
control environment, but still pretty unlikely. One way it can occur is if
|
|
you have one (or more) worklet that has two output fields. You then try to
|
|
run the worklet(s) simultaneously on multiple threads. It could be that one
|
|
thread locks the first output array and the other thread locks the second
|
|
output array.
|
|
|
|
However, having multiple threads trying to write to the same output arrays
|
|
at the same time without its own coordination is probably a bad idea in itself.
|
|
|
|
|
|
## Masks and Scatters Supported for 3D Scheduling
|
|
|
|
Previous to this change worklets that wanted to use non-default
|
|
`vtkm::worklet::Mask` or `vtkm::worklet::Scatter` wouldn't work when scheduled
|
|
to run across `vtkm::cont::CellSetStructured` or other `InputDomains` that
|
|
supported 3D scheduling.
|
|
|
|
This restriction was an inadvertent limitation of the VTK-m worklet scheduling
|
|
algorithm. Lifting the restriction and providing sufficient information has
|
|
been achieved in a manner that shouldn't degrade performance of any existing
|
|
worklets.
|
|
|
|
|
|
## Virtual methods in execution environment deprecated
|
|
|
|
The use of classes with any virtual methods in the execution environment is
|
|
deprecated. Although we had code to correctly build virtual methods on some
|
|
devices such as CUDA, this feature was not universally supported on all
|
|
programming models we wish to support. Plus, the implementation of virtual
|
|
methods is not hugely convenient on CUDA because the virtual methods could
|
|
not be embedded in a library. To get around virtual methods declared in
|
|
different libraries, all builds had to be static, and a special linking
|
|
step to pull in possible virtual method implementations was required.
|
|
|
|
For these reasons, VTK-m is no longer relying on virtual methods. (Other
|
|
approaches like multiplexers are used instead.) The code will be officially
|
|
removed in version 2.0. It is still supported in a deprecated sense (you
|
|
should get a warning). However, if you want to build without virtual
|
|
methods, you can set the `VTKm_NO_DEPRECATED_VIRTUAL` CMake flag, and they
|
|
will not be compiled.
|
|
|
|
|
|
## Deprecate Execute with policy
|
|
|
|
The version of `Filter::Execute` that takes a policy as an argument is now
|
|
deprecated. Filters are now able to specify their own fields and types,
|
|
which is often why you want to customize the policy for an execution. The
|
|
other reason is that you are compiling VTK-m into some other source that
|
|
uses a particular types of storage. However, there is now a mechanism in
|
|
the CMake configuration to allow you to provide a header that customizes
|
|
the "default" types used in filters. This is a much more convenient way to
|
|
compile filters for specific types.
|
|
|
|
One thing that filters were not able to do was to customize what cell sets
|
|
they allowed using. This allows filters to self-select what types of cell
|
|
sets they support (beyond simply just structured or unstructured). To
|
|
support this, the lists `SupportedCellSets`, `SupportedStructuredCellSets`,
|
|
and `SupportedUnstructuredCellSets` have been added to `Filter`. When you
|
|
apply a policy to a cell set, you now have to also provide the filter.
|
|
|
|
# Worklets and Filters
|
|
|
|
## Enable setting invalid value in probe filter
|
|
|
|
Initially, the probe filter would simply not set a value if a sample was
|
|
outside the input `DataSet`. This is not great as the memory could be
|
|
left uninitalized and lead to unpredictable results. The testing
|
|
compared these invalid results to 0, which seemed to work but is
|
|
probably unstable.
|
|
|
|
This was partially fixed by a previous change that consolidated to
|
|
mapping of cell data with a general routine that permuted data. However,
|
|
the fix did not extend to point data in the input, and it was not
|
|
possible to specify a particular invalid value.
|
|
|
|
This change specifically updates the probe filter so that invalid values
|
|
are set to a user-specified value.
|
|
|
|
|
|
## Avoid raising errors when operating on cells
|
|
|
|
Cell operations like interpolate and finding parametric coordinates can
|
|
fail under certain conditions. The previous behavior was to call
|
|
`RaiseError` on the worklet. By design, this would cause the worklet
|
|
execution to fail. However, that makes the worklet unstable for a conditin
|
|
that might be relatively common in data. For example, you wouldn't want a
|
|
large streamline worklet to fail just because one cell was not found
|
|
correctly.
|
|
|
|
To work around this, many of the cell operations in the execution
|
|
environment have been changed to return an error code rather than raise an
|
|
error in the worklet.
|
|
|
|
### Error Codes
|
|
|
|
To support cell operations efficiently returning errors, a new enum named
|
|
`vtkm::ErrorCode` is available. This is the current implementation of
|
|
`ErrorCode`.
|
|
|
|
``` cpp
|
|
enum class ErrorCode
|
|
{
|
|
Success,
|
|
InvalidShapeId,
|
|
InvalidNumberOfPoints,
|
|
WrongShapeIdForTagType,
|
|
InvalidPointId,
|
|
InvalidEdgeId,
|
|
InvalidFaceId,
|
|
SolutionDidNotConverge,
|
|
MatrixFactorizationFailed,
|
|
DegenerateCellDetected,
|
|
MalformedCellDetected,
|
|
OperationOnEmptyCell,
|
|
CellNotFound,
|
|
|
|
UnknownError
|
|
};
|
|
```
|
|
|
|
A convenience function named `ErrorString` is provided to make it easy to
|
|
convert the `ErrorCode` to a descriptive string that can be placed in an
|
|
error.
|
|
|
|
### New Calling Specification
|
|
|
|
Previously, most execution environment functions took as an argument the
|
|
worklet calling the function. This made it possible to call `RaiseError` on
|
|
the worklet. The result of the operation was typically returned. For
|
|
example, here is how the _old_ version of interpolate was called.
|
|
|
|
``` cpp
|
|
FieldType interpolatedValue =
|
|
vtkm::exec::CellInterpolate(fieldValues, pcoord, shape, worklet);
|
|
```
|
|
|
|
The worklet is now no longer passed to the function. It is no longer needed
|
|
because an error is never directly raised. Instead, an `ErrorCode` is
|
|
returned from the function. Because the `ErrorCode` is returned, the
|
|
computed result of the function is returned by passing in a reference to a
|
|
variable. This is usually placed as the last argument (where the worklet
|
|
used to be). here is the _new_ version of how interpolate is called.
|
|
|
|
``` cpp
|
|
FieldType interpolatedValue;
|
|
vtkm::ErrorCode result =
|
|
vtkm::exec::CellInterpolate(fieldValues, pcoord, shape, interpolatedValue);
|
|
```
|
|
|
|
The success of the operation can be determined by checking that the
|
|
returned `ErrorCode` is equal to `vtkm::ErrorCode::Success`.
|
|
|
|
|
|
## Add atomic free functions
|
|
|
|
Previously, all atomic functions were stored in classes named
|
|
`AtomicInterfaceControl` and `AtomicInterfaceExecution`, which required
|
|
you to know at compile time which device was using the methods. That in
|
|
turn means that anything using an atomic needed to be templated on the
|
|
device it is running on.
|
|
|
|
That can be a big hassle (and is problematic for some code structure).
|
|
Instead, these methods are moved to free functions in the `vtkm`
|
|
namespace. These functions operate like those in `Math.h`. Using
|
|
compiler directives, an appropriate version of the function is compiled
|
|
for the current device the compiler is using.
|
|
|
|
## Flying Edges
|
|
|
|
Added the flying edges contouring algorithm to VTK-m. This algorithm only
|
|
works on structured grids, but operates much faster than the traditional
|
|
Marching Cubes algorithm.
|
|
|
|
The speed of VTK-m's flying edges is comprable to VTK's running on the same
|
|
CPUs. VTK-m's implementation also works well on CUDA hardware.
|
|
|
|
The Flying Edges algorithm was introduced in this paper:
|
|
|
|
Schroeder, W.; Maynard, R. & Geveci, B.
|
|
"Flying edges: A high-performance scalable isocontouring algorithm."
|
|
Large Data Analysis and Visualization (LDAV), 2015.
|
|
DOI 10.1109/LDAV.2015.7348069
|
|
|
|
|
|
## Filters specify their own field types
|
|
|
|
Previously, the policy specified which field types the filter should
|
|
operate on. The filter could remove some types, but it was not able to
|
|
add any types.
|
|
|
|
This is backward. Instead, the filter should specify what types its
|
|
supports and the policy may cull out some of those.
|
|
|
|
# Build
|
|
|
|
## Disable asserts for CUDA architecture builds
|
|
|
|
`assert` is supported on recent CUDA cards, but compiling it appears to be
|
|
very slow. By default, the `VTKM_ASSERT` macro has been disabled whenever
|
|
compiling for a CUDA device (i.e. when `__CUDA_ARCH__` is defined).
|
|
|
|
Asserts for CUDA devices can be turned back on by turning the
|
|
`VTKm_NO_ASSERT_CUDA` CMake variable off. Turning this CMake variable off
|
|
will enable assertions in CUDA kernels unless there is another reason
|
|
turning off all asserts (such as a release build).
|
|
|
|
## Disable asserts for HIP architecture builds
|
|
|
|
`assert` is supported on recent HIP cards, but compiling it is very slow,
|
|
as it triggers the usage of `printf` which. Currently (ROCm 3.7) `printf`
|
|
has a severe performance penalty and should be avoided when possible.
|
|
By default, the `VTKM_ASSERT` macro has been disabled whenever compiling
|
|
for a HIP device via kokkos.
|
|
|
|
Asserts for HIP devices can be turned back on by turning the
|
|
`VTKm_NO_ASSERT_HIP` CMake variable off. Turning this CMake variable off
|
|
will enable assertions in HIP kernels unless there is another reason
|
|
turning off all asserts (such as a release build).
|
|
|
|
## Add VTKM_DEPRECATED macro
|
|
|
|
The `VTKM_DEPRECATED` macro allows us to remove (and usually replace)
|
|
features from VTK-m in minor releases while still following the conventions
|
|
of semantic versioning. The idea is that when we want to remove or replace
|
|
a feature, we first mark the old feature as deprecated. The old feature
|
|
will continue to work, but compilers that support it will start to issue a
|
|
warning that the use is deprecated and should stop being used. The
|
|
deprecated features should remain viable until at least the next major
|
|
version. At the next major version, deprecated features from the previous
|
|
version may be removed.
|
|
|
|
### Declaring things deprecated
|
|
|
|
Classes and methods are marked deprecated using the `VTKM_DEPRECATED`
|
|
macro. The first argument of `VTKM_DEPRECATED` should be set to the first
|
|
version in which the feature is deprecated. For example, if the last
|
|
released version of VTK-m was 1.5, and on the master branch a developer
|
|
wants to deprecate a class foo, then the `VTKM_DEPRECATED` release version
|
|
should be given as 1.6, which will be the next minor release of VTK-m. The
|
|
second argument of `VTKM_DEPRECATED`, which is optional but highly
|
|
encouraged, is a short message that should clue developers on how to update
|
|
their code to the new changes. For example, it could point to the
|
|
replacement class or method for the changed feature.
|
|
|
|
`VTKM_DEPRECATED` can be used to deprecate a class by adding it between the
|
|
`struct` or `class` keyword and the class name.
|
|
|
|
``` cpp
|
|
struct VTKM_DEPRECATED(1.6, "OldClass replaced with NewClass.") OldClass
|
|
{
|
|
};
|
|
```
|
|
|
|
Aliases can similarly be depreciated, except the `VTKM_DEPRECATED` macro
|
|
goes after the name in this case.
|
|
|
|
``` cpp
|
|
using OldAlias VTKM_DEPRECATED(1.6, "Use NewClass instead.") = NewClass;
|
|
```
|
|
|
|
Functions and methods are marked as deprecated by adding `VTKM_DEPRECATED`
|
|
as a modifier before the return value and any markup (VTKM_CONT, VTKM_EXEC, or VTKM_EXEC_CONT).
|
|
|
|
``` cpp
|
|
VTKM_DEPRECATED(1.6, "You must now specify a tolerance.") void ImportantMethod(double x)
|
|
VTKM_EXEC_CONT
|
|
{
|
|
this->ImportantMethod(x, 1e-6);
|
|
}
|
|
```
|
|
|
|
`enum`s can be deprecated like classes using similar syntax.
|
|
|
|
``` cpp
|
|
enum struct VTKM_DEPRECATED(1.7, "Use NewEnum instead.") OldEnum
|
|
{
|
|
OLD_VALUE
|
|
};
|
|
```
|
|
|
|
Individual items in an `enum` can also be marked as deprecated and
|
|
intermixed with regular items.
|
|
|
|
``` cpp
|
|
enum struct NewEnum
|
|
{
|
|
OLD_VALUE1 VTKM_DEPRECATED(1.7, "Use NEW_VALUE instead."),
|
|
NEW_VALUE,
|
|
OLD_VALUE2 VTKM_DEPRECATED(1.7) = 42
|
|
};
|
|
```
|
|
|
|
### Using deprecated items
|
|
|
|
Using deprecated items should work, but the compiler will give a warning.
|
|
That is the point. However, sometimes you need to legitimately use a
|
|
deprecated item without a warning. This is usually because you are
|
|
implementing another deprecated item or because you have a test for a
|
|
deprecated item (that can be easily removed with the deprecated bit). To
|
|
support this a pair of macros, `VTKM_DEPRECATED_SUPPRESS_BEGIN` and
|
|
`VTKM_DEPRECATED_SUPPRESS_END` are provided. Code that legitimately uses
|
|
deprecated items should be wrapped in these macros.
|
|
|
|
``` cpp
|
|
VTKM_DEPRECATED(1.6, "You must now specify both a value and tolerance.")
|
|
VTKM_EXEC_CONT
|
|
void ImportantMethod()
|
|
{
|
|
// It can be the case that to implement a deprecated method you need to
|
|
// use other deprecated features. To do that, just temporarily suppress
|
|
// those warnings.
|
|
VTKM_DEPRECATED_SUPPRESS_BEGIN
|
|
this->ImportantMethod(0.0);
|
|
VTKM_DEPRECATED_SUPPRESS_END
|
|
}
|
|
```
|
|
|
|
# Other
|
|
|
|
## Porting layer for future std features
|
|
|
|
Currently, VTK-m is using C++11. However, it is often useful to use
|
|
features in the `std` namespace that are defined for C++14 or later. We can
|
|
provide our own versions (sometimes), but it is preferable to use the
|
|
version provided by the compiler if available.
|
|
|
|
There were already some examples of defining portable versions of C++14 and
|
|
C++17 classes in a `vtkmstd` namespace, but these were sprinkled around the
|
|
source code.
|
|
|
|
There is now a top level `vtkmstd` directory and in it are header files
|
|
that provide portable versions of these future C++ classes. In each case,
|
|
preprocessor macros are used to select which version of the class to use.
|
|
|
|
|
|
## Removed OpenGL Rendering Classes
|
|
|
|
When the rendering library was first built, OpenGL was used to implement
|
|
the components (windows, mappers, annotation, etc.). However, as the native
|
|
ray casting became viable, the majority of the work has focused on using
|
|
that. Since then, the original OpenGL classes have been largely ignored.
|
|
|
|
It has for many months been determined that it is not work attempting to
|
|
maintain two different versions of the rendering libraries as features are
|
|
added and changed. Thus, the OpenGL classes have fallen out of date and did
|
|
not actually work.
|
|
|
|
These classes have finally been officially removed.
|
|
|
|
|
|
## Reorganization of `io` directory
|
|
|
|
The `vtkm/io` directory has been flattened.
|
|
Namely, the files in `vtkm/io/reader` and `vtkm/io/writer` have been moved up into `vtkm/io`,
|
|
with the associated changes in namespaces.
|
|
|
|
In addition, `vtkm/cont/EncodePNG.h` and `vtkm/cont/DecodePNG.h` have been moved to a more natural home in `vtkm/io`.
|
|
|
|
|
|
## Implemented PNG/PPM image Readers/Writers
|
|
|
|
The original implementation of writing image data was only performed as a
|
|
proxy through the Canvas rendering class. In order to implement true support
|
|
for image-based regression testing, this interface needed to be expanded upon
|
|
to support reading/writing arbitrary image data and storing it in a `vtkm::DataSet`.
|
|
Using the new `vtkm::io::PNGReader` and `vtkm::io::PPMReader` it is possible
|
|
to read data from files and Cavases directly and store them as a point field
|
|
in a 2D uniform `vtkm::DataSet`
|
|
|
|
```cpp
|
|
auto reader = vtkm::io::PNGReader();
|
|
auto imageDataSet = reader.ReadFromFile("read_image.png");
|
|
```
|
|
|
|
Similarly, the new `vtkm::io::PNGWriter` and `vtkm::io::PPMWriter` make it possible
|
|
to write out a 2D uniform `vtkm::DataSet` directly to a file.
|
|
|
|
```cpp
|
|
auto writer = vtkm::io::PNGWriter();
|
|
writer.WriteToFile("write_image.png", imageDataSet);
|
|
```
|
|
|
|
If canvas data is to be written out, the reader provides a method for converting
|
|
a canvas's data to a `vtkm::DataSet`.
|
|
|
|
```cpp
|
|
auto reader = vtkm::io::PNGReader();
|
|
auto dataSet = reader.CreateImageDataSet(canvas);
|
|
auto writer = vtkm::io::PNGWriter();
|
|
writer.WriteToFile("output.png", dataSet);
|
|
```
|
|
|
|
|
|
## Updated Benchmark Framework
|
|
|
|
The benchmarking framework has been updated to use Google Benchmark.
|
|
|
|
A benchmark is now a single function, which is passed to a macro:
|
|
|
|
```
|
|
void MyBenchmark(::benchmark::State& state)
|
|
{
|
|
MyClass someClass;
|
|
|
|
// Optional: Add a descriptive label with additional benchmark details:
|
|
state.SetLabel("Blah blah blah.");
|
|
|
|
// Must use a vtkm timer to properly capture eg. CUDA execution times.
|
|
vtkm::cont::Timer timer;
|
|
for (auto _ : state)
|
|
{
|
|
someClass.Reset();
|
|
|
|
timer.Start();
|
|
someClass.DoWork();
|
|
timer.Stop();
|
|
|
|
state.SetIterationTime(timer.GetElapsedTime());
|
|
}
|
|
|
|
// Optional: Report items and/or bytes processed per iteration in output:
|
|
state.SetItemsProcessed(state.iterations() * someClass.GetNumberOfItems());
|
|
state.SetBytesProcessed(state.iterations() * someClass.GetNumberOfBytes());
|
|
}
|
|
}
|
|
VTKM_BENCHMARK(MyBenchmark);
|
|
```
|
|
|
|
Google benchmark also makes it easy to implement parameter sweep benchmarks:
|
|
|
|
```
|
|
void MyParameterSweep(::benchmark::State& state)
|
|
{
|
|
// The current value in the sweep:
|
|
const vtkm::Id currentValue = state.range(0);
|
|
|
|
MyClass someClass;
|
|
someClass.SetSomeParameter(currentValue);
|
|
|
|
vtkm::cont::Timer timer;
|
|
for (auto _ : state)
|
|
{
|
|
someClass.Reset();
|
|
|
|
timer.Start();
|
|
someClass.DoWork();
|
|
timer.Stop();
|
|
|
|
state.SetIterationTime(timer.GetElapsedTime());
|
|
}
|
|
}
|
|
VTKM_BENCHMARK_OPTS(MyBenchmark, ->ArgName("Param")->Range(32, 1024 * 1024));
|
|
```
|
|
|
|
will generate and launch several benchmarks, exploring the parameter space of
|
|
`SetSomeParameter` between the values of 32 and (1024*1024). The chain of
|
|
functions calls in the second argument is applied to an instance of
|
|
::benchmark::internal::Benchmark. See Google Benchmark's documentation for
|
|
more details.
|
|
|
|
For more complex benchmark configurations, the VTKM_BENCHMARK_APPLY macro
|
|
accepts a function with the signature
|
|
`void Func(::benchmark::internal::Benchmark*)` that may be used to generate
|
|
more complex configurations.
|
|
|
|
To instantiate a templated benchmark across a list of types, the
|
|
VTKM_BENCHMARK_TEMPLATE* macros take a vtkm::List of types as an additional
|
|
parameter. The templated benchmark function will be instantiated and called
|
|
for each type in the list:
|
|
|
|
```
|
|
template <typename T>
|
|
void MyBenchmark(::benchmark::State& state)
|
|
{
|
|
MyClass<T> someClass;
|
|
|
|
// Must use a vtkm timer to properly capture eg. CUDA execution times.
|
|
vtkm::cont::Timer timer;
|
|
for (auto _ : state)
|
|
{
|
|
someClass.Reset();
|
|
|
|
timer.Start();
|
|
someClass.DoWork();
|
|
timer.Stop();
|
|
|
|
state.SetIterationTime(timer.GetElapsedTime());
|
|
}
|
|
}
|
|
}
|
|
VTKM_BENCHMARK_TEMPLATE(MyBenchmark, vtkm::List<vtkm::Float32, vtkm::Vec3f_32>);
|
|
```
|
|
|
|
The benchmarks are executed by calling the `VTKM_EXECUTE_BENCHMARKS(argc, argv)`
|
|
macro from `main`. There is also a `VTKM_EXECUTE_BENCHMARKS_PREAMBLE(argc, argv, some_string)`
|
|
macro that appends the contents of `some_string` to the Google Benchmark preamble.
|
|
|
|
If a benchmark is not compatible with some configuration, it may call
|
|
`state.SkipWithError("Error message");` on the `::benchmark::State` object and return. This is
|
|
useful, for instance in the filter tests when the input is not compatible with the filter.
|
|
|
|
When launching a benchmark executable, the following options are supported by Google Benchmark:
|
|
|
|
- `--benchmark_list_tests`: List all available tests.
|
|
- `--benchmark_filter="[regex]"`: Only run benchmark with names that match `[regex]`.
|
|
- `--benchmark_filter="-[regex]"`: Only run benchmark with names that DON'T match `[regex]`.
|
|
- `--benchmark_min_time=[float]`: Make sure each benchmark repetition gathers `[float]` seconds
|
|
of data.
|
|
- `--benchmark_repetitions=[int]`: Run each benchmark `[int]` times and report aggregate statistics
|
|
(mean, stdev, etc). A "repetition" refers to a single execution of the benchmark function, not
|
|
an "iteration", which is a loop of the `for(auto _:state){...}` section.
|
|
- `--benchmark_report_aggregates_only="true|false"`: If true, only the aggregate statistics are
|
|
reported (affects both console and file output). Requires `--benchmark_repetitions` to be useful.
|
|
- `--benchmark_display_aggregates_only="true|false"`: If true, only the aggregate statistics are
|
|
printed to the terminal. Any file output will still contain all repetition info.
|
|
- `--benchmark_format="console|json|csv"`: Specify terminal output format: human readable
|
|
(`console`) or `csv`/`json` formats.
|
|
- `--benchmark_out_format="console|json|csv"`: Specify file output format: human readable
|
|
(`console`) or `csv`/`json` formats.
|
|
- `--benchmark_out=[filename]`: Specify output file.
|
|
- `--benchmark_color="true|false"`: Toggle color output in terminal when using `console` output.
|
|
- `--benchmark_counters_tabular="true|false"`: Print counter information (e.g. bytes/sec, items/sec)
|
|
in the table, rather than appending them as a label.
|
|
|
|
For more information and examples of practical usage, take a look at the existing benchmarks in
|
|
vtk-m/benchmarking/.
|
|
|
|
|
|
## Provide scripts to build Gitlab-ci workers locally
|
|
|
|
To simplify reproducing docker based CI workers locally, VTK-m has python program that handles all the
|
|
work automatically for you.
|
|
|
|
The program is located in `[Utilities/CI/reproduce_ci_env.py ]` and requires python3 and pyyaml.
|
|
|
|
To use the program is really easy! The following two commands will create the `build:rhel8` gitlab-ci
|
|
worker as a docker image and setup a container just as how gitlab-ci would be before the actual
|
|
compilation of VTK-m. Instead of doing the compilation, instead you will be given an interactive shell.
|
|
|
|
```
|
|
./reproduce_ci_env.py create rhel8
|
|
./reproduce_ci_env.py run rhel8
|
|
```
|
|
|
|
To compile VTK-m from the the interactive shell you would do the following:
|
|
```
|
|
> src]## cd build/
|
|
> build]## cmake --build .
|
|
```
|
|
|
|
|
|
## Replaced `vtkm::ListTag` with `vtkm::List`
|
|
|
|
The original `vtkm::ListTag` was designed when we had to support compilers
|
|
that did not provide C++11's variadic templates. Thus, the design hides
|
|
type lists, which were complicated to support.
|
|
|
|
Now that we support C++11, variadic templates are trivial and we can easily
|
|
create templated type aliases with `using`. Thus, it is now simpler to deal
|
|
with a template that lists types directly.
|
|
|
|
Hence, `vtkm::ListTag` is deprecated and `vtkm::List` is now supported. The
|
|
main difference between the two is that whereas `vtkm::ListTag` allowed you
|
|
to create a list by subclassing another list, `vtkm::List` cannot be
|
|
subclassed. (Well, it can be subclassed, but the subclass ceases to be
|
|
considered a list.) Thus, where before you would declare a list like
|
|
|
|
``` cpp
|
|
struct MyList : vtkm::ListTagBase<Type1, Type2, Type3>
|
|
{
|
|
};
|
|
```
|
|
|
|
you now make an alias
|
|
|
|
``` cpp
|
|
using MyList = vtkm::List<Type1, Type2, Type3>;
|
|
```
|
|
|
|
If the compiler reports the `MyList` type in an error or warning, it
|
|
actually uses the fully qualified `vtkm::List<Type1, Type2, Type3>`.
|
|
Although this makes errors more verbose, it makes it easier to diagnose
|
|
problems because the types are explicitly listed.
|
|
|
|
The new `vtkm::List` comes with a list of utility templates to manipulate
|
|
lists that mostly mirrors those in `vtkm::ListTag`: `VTKM_IS_LIST`,
|
|
`ListApply`, `ListSize`, `ListAt`, `ListIndexOf`, `ListHas`, `ListAppend`,
|
|
`ListIntersect`, `ListTransform`, `ListRemoveIf`, and `ListCross`. All of
|
|
these utilities become `vtkm::List<>` types (where applicable), which makes
|
|
them more consistent than the old `vtkm::ListTag` versions.
|
|
|
|
Thus, if you have a declaration like
|
|
|
|
``` cpp
|
|
vtkm::ListAppend(vtkm::List<Type1a, Type2a>, vtkm::List<Type1b, Type2b>>
|
|
```
|
|
|
|
this gets changed automatically to
|
|
|
|
``` cpp
|
|
vtkm::List<Type1a, Type2a, Type1b, Type2b>
|
|
```
|
|
|
|
This is in contrast to the equivalent old version, which would create a new
|
|
type for `vtkm::ListTagAppend` in addition to the ultimate actual list it
|
|
constructs.
|
|
|
|
|
|
## Add `ListTagRemoveIf`
|
|
|
|
It is sometimes useful to remove types from `ListTag`s. This is especially
|
|
the case when combining lists of types together where some of the type
|
|
combinations may be invalid and should be removed. To handle this
|
|
situation, a new `ListTag` type is added: `ListTagRemoveIf`.
|
|
|
|
`ListTagRemoveIf` is a template structure that takes two arguments. The
|
|
first argument is another `ListTag` type to operate on. The second argument
|
|
is a template that acts as a predicate. The predicate takes a type and
|
|
declares a Boolean `value` that should be `true` if the type should be
|
|
removed and `false` if the type should remain.
|
|
|
|
Here is an example of using `ListTagRemoveIf` to get basic types that hold
|
|
only integral values.
|
|
|
|
``` cpp
|
|
template <typename T>
|
|
using IsRealValue =
|
|
std::is_same<
|
|
typename vtkm::TypeTraits<typename vtkm::VecTraits<T>::BaseComponentType>::NumericTag,
|
|
vtkm::TypeTraitsRealTag>;
|
|
|
|
using MixedTypes =
|
|
vtkm::ListTagBase<vtkm::Id, vtkm::FloatDefault, vtkm::Id3, vtkm::Vec3f>;
|
|
|
|
using IntegralTypes = vtkm::ListTagRemoveIf<MixedTypes, IsRealValue>;
|
|
// IntegralTypes now equivalent to vtkm::ListTagBase<vtkm::Id, vtkm::Id3>
|
|
```
|
|
|
|
|
|
## Write uniform and rectilinear grids to legacy VTK files
|
|
|
|
As a programming convenience, all `vtkm::cont::DataSet` written by
|
|
`vtkm::io::VTKDataSetWriter` were written as a structured grid. Although
|
|
technically correct, it changed the structure of the data. This meant that
|
|
if you wanted to capture data to run elsewhere, it would run as a different
|
|
data type. This was particularly frustrating if the data of that structure
|
|
was causing problems and you wanted to debug it.
|
|
|
|
Now, `VTKDataSetWriter` checks the type of the `CoordinateSystem` to
|
|
determine whether the data should be written out as `STRUCTURED_POINTS`
|
|
(i.e. a uniform grid), `RECTILINEAR_GRID`, or `STRUCTURED_GRID`
|
|
(curvilinear).
|
|
|
|
# References
|
|
|
|
| Feature | Merge Request |
|
|
| --------------------------------------------------------------------------| ------------------------ |
|
|
| Add Kokkos backend | Merge-request: !2164 |
|
|
| Extract component arrays from unknown arrays | Merge-request: !2354 |
|
|
| `ArrayHandleGroupVecVariable` holds now one more offset. | Merge-request: !1964 |
|
|
| Create `ArrayHandleOffsetsToNumComponents` | Merge-request: !2299 |
|
|
| Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal | Merge-request: !2116 |
|
|
| `ArrayRangeCompute` works on any array type without compiling device code | Merge-request: !2409 |
|
|
| Algorithms for Control and Execution Environments | Merge-request: !1920 |
|
|
| Redesign of ArrayHandle to access data using typeless buffers | Merge-request: !2347 |
|
|
| `vtkm::cont::internal::Buffer` now can have ownership transferred | Merge-request: !2200 |
|
|
| Provide scripts to build Gitlab-ci workers locally | Merge-request: !2030 |
|
|
| Configurable default types | Merge-request: !1997 |
|
|
| Result DataSet of coordinate transform has its CoordinateSystem changed | Merge-request: !2099 |
|
|
| Precompiled `ArrayCopy` for `UnknownArrayHandle` | Merge-request: !2396 |
|
|
| Disable asserts for CUDA architecture builds | Merge-request: !2157 |
|
|
| Portals may advertise custom iterators | Merge-request: !1929 |
|
|
| DataSet now only allows unique field names | Merge-request: !2099 |
|
|
| ArrayHandleDecorator Allocate and Shrink Support | Merge-request: !1933 |
|
|
| Deprecate ArrayHandleVirtualCoordinates | Merge-request: !2177 |
|
|
| Deprecate `DataSetFieldAdd` | Merge-request: !2106 |
|
|
| Deprecate Execute with policy | Merge-request: !2093 |
|
|
| Virtual methods in execution environment deprecated | Merge-request: !2256 |
|
|
| Add VTKM_DEPRECATED macro | Merge-request: !2266 |
|
|
| Filters specify their own field types | Merge-request: !2099 |
|
|
| Flying Edges | Merge-request: !2099 |
|
|
| Updated Benchmark Framework | Merge-request: !1936 |
|
|
| Disable asserts for HIP architecture builds | Merge-request: !2270 |
|
|
| Implemented PNG/PPM image Readers/Writers | Merge-request: !1967 |
|
|
| Reorganization of `io` directory | Merge-request: !2067 |
|
|
| Add `ListTagRemoveIf` | Merge-request: !1901 |
|
|
| Masks and Scatters Supported for 3D Scheduling | Merge-request: !1975 |
|
|
| Improvements to moving data into ArrayHandle | Merge-request: !2184 |
|
|
| Avoid raising errors when operating on cells | Merge-request: !2099 |
|
|
| Order asynchronous `ArrayHandle` access | Merge-request: !2130 |
|
|
| Enable setting invalid value in probe filter | Merge-request: !2122 |
|
|
| `ReadPortal().Get(idx)` | Merge-request: !2078 |
|
|
| Recombine extracted component arrays from unknown arrays | Merge-request: !2381 |
|
|
| Removed old `ArrayHandle` transfer mechanism | Merge-request: !2347 |
|
|
| Removed OpenGL Rendering Classes | Merge-request: !2099 |
|
|
| Scope ExecObjects with Tokens | Merge-request: !1988 |
|
|
| Shorter fancy array handle classnames | Merge-request: !1937 |
|
|
| Support `ArrayHandleSOA` as a "default" array | Merge-request: !2349 |
|
|
| Porting layer for future std features | Merge-request: !1977 |
|
|
| Add a vtkm::Tuple class | Merge-request: !1977 |
|
|
| UnknownArrayHandle and UncertainArrayHandle for runtime-determined types | Merge-request: !2202 |
|
|
| Added VecFlat class | Merge-request: !2354 |
|
|
| Remove VTKDataSetWriter::WriteDataSet just_points parameter | Merge-request: !2185 |
|
|
| Move VTK file readers and writers into vtkm_io | Merge-request: !2100 |
|
|
| Write uniform and rectilinear grids to legacy VTK files | Merge-request: !2173 |
|
|
| Add atomic free functions | Merge-request: !2223 |
|
|
| Replaced `vtkm::ListTag` with `vtkm::List` | Merge-request: !1918 |
|