# VTK-m 1.6 Release Notes
## Table of Contents

- Add Kokkos backend
- Deprecate `DataSetFieldAdd`
- Move VTK file readers and writers into `vtkm_io`
- Remove `VTKDataSetWriter::WriteDataSet` `just_points` parameter
- Added `VecFlat` class
- Add a `vtkm::Tuple` class
- `DataSet` now only allows unique field names
- Result `DataSet` of coordinate transform has its `CoordinateSystem` changed
- `vtkm::cont::internal::Buffer` now can have ownership transferred
- Configurable default types
- Shorter fancy array handle classnames
- `ReadPortal().Get(idx)`
- Precompiled `ArrayCopy` for `UnknownArrayHandle`
- Create `ArrayHandleOffsetsToNumComponents`
- Recombine extracted component arrays from unknown arrays
- `UnknownArrayHandle` and `UncertainArrayHandle` for runtime-determined types
- Support `ArrayHandleSOA` as a "default" array
- Removed old `ArrayHandle` transfer mechanism
- Order asynchronous `ArrayHandle` access
- Improvements to moving data into `ArrayHandle`
- Deprecate `ArrayHandleVirtualCoordinates`
- `ArrayHandleDecorator` Allocate and Shrink Support
- Portals may advertise custom iterators
- Redesign of `ArrayHandle` to access data using typeless buffers
- `ArrayRangeCompute` works on any array type without compiling device code
- Implemented `ArrayHandleRandomUniformBits` and `ArrayHandleRandomUniformReal`
- Extract component arrays from unknown arrays
- `ArrayHandleGroupVecVariable` now holds one more offset
- Algorithms for Control and Execution Environments
- Scope ExecObjects with `Token`s
- Masks and Scatters Supported for 3D Scheduling
- Virtual methods in execution environment deprecated
- Deprecate Execute with policy
- Enable setting invalid value in probe filter
- Avoid raising errors when operating on cells
- Add atomic free functions
- Flying Edges
- Filters specify their own field types
- Disable asserts for CUDA architecture builds
- Disable asserts for HIP architecture builds
- Add `VTKM_DEPRECATED` macro
- Porting layer for future `std` features
- Removed OpenGL Rendering Classes
- Reorganization of `io` directory
- Implemented PNG/PPM image Readers/Writers
- Updated Benchmark Framework
- Provide scripts to build Gitlab-ci workers locally
- Replaced `vtkm::ListTag` with `vtkm::List`
- Add `ListTagRemoveIf`
- Write uniform and rectilinear grids to legacy VTK files
## Core

### Add Kokkos backend

Adds a new device backend, Kokkos, which uses the Kokkos library for parallelism. The user must provide the Kokkos build, and VTK-m will use the default configured execution space.
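A typical configuration might look like the following. This is a sketch: the option name `VTKm_ENABLE_KOKKOS` follows VTK-m's CMake naming conventions, and the install path is a placeholder; check the build documentation for your version.

```sh
# Configure VTK-m with the Kokkos device backend enabled.
# Kokkos must already be built and discoverable via CMAKE_PREFIX_PATH.
cmake -S vtk-m -B build \
  -DVTKm_ENABLE_KOKKOS=ON \
  -DCMAKE_PREFIX_PATH=/path/to/kokkos/install
```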
### Deprecate DataSetFieldAdd

The class `vtkm::cont::DataSetFieldAdd` is now deprecated. Its methods, `AddPointField` and `AddCellField`, have been moved to member functions of `vtkm::cont::DataSet`, which simplifies many calls. For example, the following code

```cpp
vtkm::cont::DataSetFieldAdd fieldAdder;
fieldAdder.AddCellField(dataSet, "cellvar", values);
```

would now be

```cpp
dataSet.AddCellField("cellvar", values);
```
### Move VTK file readers and writers into vtkm_io

The legacy VTK file reader and writer were created back when VTK-m was a header-only library. Things have changed, and we now compile quite a bit of code into libraries. At this point, there is no reason why the VTK file reader/writer should be any different. Thus, `VTKDataSetReader`, `VTKDataSetWriter`, and several supporting classes are now compiled into the `vtkm_io` library. `BOVDataSetReader` was similarly updated for good measure.

As a side effect, code using VTK-m will need to link to `vtkm_io` if it uses any readers or writers.
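For CMake-based projects, that typically means adding the io target to the link line. A minimal sketch (the application target name `myapp` is illustrative):

```cmake
# Link the VTK-m io library so the reader/writer symbols resolve.
target_link_libraries(myapp PRIVATE vtkm_io)
```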
### Remove VTKDataSetWriter::WriteDataSet just_points parameter

In the method `VTKDataSetWriter::WriteDataSet`, the `just_points` parameter has been removed due to lack of usage. The purpose of `just_points` was to allow exporting only the points of a `DataSet` without its cell data.
### Added VecFlat class

`vtkm::VecFlat` is a wrapper around a `Vec`-like class that may be a nested series of vectors. For example, if you run a gradient operation on a vector field, you are probably going to get a `Vec` of `Vec`s that looks something like `vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 3>`. That is fine, but what if you want to treat the result simply as a `Vec` of size 9?

The `VecFlat` wrapper class allows you to do this. Simply place the nested `Vec` as an argument to `VecFlat` and it will behave as a flat `Vec` class. (In fact, `VecFlat` is a subclass of `Vec`.) The `VecFlat` class can be copied to and from the nested `Vec` it is wrapping.

There is a `vtkm::make_VecFlat` convenience function that takes an object and returns a `vtkm::VecFlat` wrapped around it.

`VecFlat` works with any `Vec`-like object as well as scalar values. However, any type used with `VecFlat` must have `VecTraits` defined and the number of components must be static (i.e., known at compile time).
### Add a vtkm::Tuple class

This change added a `vtkm::Tuple` class that is very similar in nature to `std::tuple`. This should replace our use of the tao tuple.

The motivation for this change was some recent attempts at removing objects like `Invocation` and `FunctionInterface`. I expected these changes to speed up the build, but in fact they ended up slowing down the build. I believe the problem was that these required packing variable parameters into a tuple. I was using the tao `tuple` class, but it seemed to slow down the compile. (That is, compiling tao's `tuple` seemed much slower than compiling the equivalent `FunctionInterface` class.)

The implementation of `vtkm::Tuple` uses pyexpander to build lots of simple template cases for the object (with a backup implementation for even longer argument lists). I believe the compiler is better at parsing through thousands of lines of simple templates than at employing clever MPL to build general templates.
#### Usage

The `vtkm::Tuple` class is defined in the `vtkm/Tuple.h` header file. A `Tuple` is designed to behave much like a `std::tuple` with some minor syntax differences to fit VTK-m coding standards.

A tuple is declared with a list of template argument types.

```cpp
vtkm::Tuple<vtkm::Id, vtkm::Vec3f, vtkm::cont::ArrayHandle<vtkm::Float32>> myTuple;
```

If given no arguments, a `vtkm::Tuple` will default-construct its contained objects. A `vtkm::Tuple` can also be constructed with the initial values of all contained objects.

```cpp
vtkm::Tuple<vtkm::Id, vtkm::Vec3f, vtkm::cont::ArrayHandle<vtkm::Float32>>
  myTuple(0, vtkm::Vec3f(0, 1, 2), array);
```

For convenience there is a `vtkm::MakeTuple` function that takes arguments and packs them into a `Tuple` of the appropriate type. (There is also a `vtkm::make_tuple` alias to the function to match the `std` version.)

```cpp
auto myTuple = vtkm::MakeTuple(0, vtkm::Vec3f(0, 1, 2), array);
```

Data is retrieved from a `Tuple` by using the `vtkm::Get` method. The `Get` method is templated on the index to get the value from. The index is of type `vtkm::IdComponent`. (There is also a `vtkm::get` that uses a `std::size_t` as the index type as an alias to the function to match the `std` version.)

```cpp
vtkm::Id a = vtkm::Get<0>(myTuple);
vtkm::Vec3f b = vtkm::Get<1>(myTuple);
vtkm::cont::ArrayHandle<vtkm::Float32> c = vtkm::Get<2>(myTuple);
```

Likewise, `vtkm::TupleSize` and `vtkm::TupleElement` (and their aliases `vtkm::tuple_size`, `vtkm::tuple_element`, and `vtkm::tuple_element_t`) are provided.
#### Extended Functionality

The `vtkm::Tuple` class contains some functionality beyond that of `std::tuple` to cover some common use cases in VTK-m that are tricky to implement. In particular, these methods allow you to use a `Tuple` as you would commonly use parameter packs. This allows you to stash parameter packs in a `Tuple` and then get them back out again.
#### For Each

`vtkm::Tuple::ForEach()` is a method that takes a function or functor and calls it for each of the items in the tuple. Nothing is returned from `ForEach`, and any return value from the function is ignored.

`ForEach` can be used to check the validity of each item.

```cpp
void CheckPositive(vtkm::Float64 x)
{
  if (x < 0)
  {
    throw vtkm::cont::ErrorBadValue("Values need to be positive.");
  }
}

// ...

vtkm::Tuple<vtkm::Float64, vtkm::Float64, vtkm::Float64> tuple(
  CreateValue1(), CreateValue2(), CreateValue3());
// Will throw an error if any of the values are negative.
tuple.ForEach(CheckPositive);
```

`ForEach` can also be used to aggregate values.

```cpp
struct SumFunctor
{
  vtkm::Float64 Sum = 0;

  template <typename T>
  void operator()(const T& x)
  {
    this->Sum = this->Sum + static_cast<vtkm::Float64>(x);
  }
};

// ...

vtkm::Tuple<vtkm::Float32, vtkm::Float64, vtkm::Id> tuple(
  CreateValue1(), CreateValue2(), CreateValue3());
SumFunctor sum;
tuple.ForEach(sum);
vtkm::Float64 average = sum.Sum / 3;
```
#### Transform

`vtkm::Tuple::Transform` is a method that builds a new `Tuple` by calling a function or functor on each of the items. The return value is placed in the corresponding part of the resulting `Tuple`, and the type is automatically created from the return type of the function.

```cpp
struct GetReadPortalFunctor
{
  template <typename Array>
  typename Array::ReadPortal operator()(const Array& array) const
  {
    VTKM_IS_ARRAY_HANDLE(Array);
    return array.ReadPortal();
  }
};

// ...

auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
auto portalTuple = arrayTuple.Transform(GetReadPortalFunctor{});
```
#### Apply

`vtkm::Tuple::Apply` is a method that calls a function or functor using the objects in the `Tuple` as the arguments. If the function returns a value, that value is returned from `Apply`.

```cpp
struct AddArraysFunctor
{
  template <typename Array1, typename Array2, typename Array3>
  vtkm::Id operator()(Array1 inArray1, Array2 inArray2, Array3 outArray) const
  {
    VTKM_IS_ARRAY_HANDLE(Array1);
    VTKM_IS_ARRAY_HANDLE(Array2);
    VTKM_IS_ARRAY_HANDLE(Array3);

    vtkm::Id length = inArray1.GetNumberOfValues();
    VTKM_ASSERT(inArray2.GetNumberOfValues() == length);
    outArray.Allocate(length);

    auto inPortal1 = inArray1.ReadPortal();
    auto inPortal2 = inArray2.ReadPortal();
    auto outPortal = outArray.WritePortal();
    for (vtkm::Id index = 0; index < length; ++index)
    {
      outPortal.Set(index, inPortal1.Get(index) + inPortal2.Get(index));
    }

    return length;
  }
};

// ...

auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
vtkm::Id arrayLength = arrayTuple.Apply(AddArraysFunctor{});
```

If additional arguments are given to `Apply`, they are also passed to the function (before the objects in the `Tuple`). This is helpful for passing state to the function. (This feature is not available in either `ForEach` or `Transform` for technical implementation reasons.)

```cpp
struct ScanArrayLengthFunctor
{
  template <std::size_t N, typename Array, typename... Remaining>
  std::array<vtkm::Id, N + 1 + sizeof...(Remaining)>
  operator()(const std::array<vtkm::Id, N>& partialResult,
             const Array& nextArray,
             const Remaining&... remainingArrays) const
  {
    std::array<vtkm::Id, N + 1> nextResult;
    std::copy(partialResult.begin(), partialResult.end(), nextResult.begin());
    nextResult[N] = nextResult[N - 1] + nextArray.GetNumberOfValues();
    return (*this)(nextResult, remainingArrays...);
  }

  template <std::size_t N>
  std::array<vtkm::Id, N> operator()(const std::array<vtkm::Id, N>& result) const
  {
    return result;
  }
};

// ...

auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
std::array<vtkm::Id, 4> scannedLengths =
  arrayTuple.Apply(ScanArrayLengthFunctor{}, std::array<vtkm::Id, 1>{ 0 });
```
### DataSet now only allows unique field names

When you add a `vtkm::cont::Field` to a `vtkm::cont::DataSet`, it now requires every `Field` to have a unique name. When you attempt to add a `Field` to a `DataSet` that already has a `Field` of the same name and association, the old `Field` is removed and replaced with the new `Field`.

You are allowed, however, to have two `Field`s with the same name but different associations. For example, you could have a point `Field` named "normals" and also have a cell `Field` named "normals" in the same `DataSet`.

This new behavior matches how VTK's data sets manage fields. The old behavior allowed you to add multiple `Field`s with the same name, but it would be unclear which one you would get if you asked for a `Field` by name.
### Result DataSet of coordinate transform has its CoordinateSystem changed

When you run one of the coordinate transform filters, `CylindricalCoordinateTransform` or `SphericalCoordinateTransform`, the transformed coordinates are placed as the first `CoordinateSystem` in the returned `DataSet`. This means that after running this filter, the data will be moved to this new coordinate space.

Previously, the result of these filters was just placed in a named `Field` of the output. This caused some confusion because the filter did not seem to have any effect (unless you knew to modify the output data). Not using the result as the coordinate system seems like a dubious use case (and is not hard to work around), so this is much better behavior.
### vtkm::cont::internal::Buffer now can have ownership transferred

Memory once transferred to `Buffer` always had to be managed by VTK-m. This is problematic for applications that need VTK-m to allocate memory but have the memory ownership outlive VTK-m.

`Buffer::TakeHostBufferOwnership` allows for easy transfer of memory ownership out of VTK-m. When taking ownership of a VTK-m buffer, you are provided the following information:

- Memory: A `void*` pointer to the array.
- Container: A `void*` pointer used to free the memory. This is necessary to support cases such as allocations transferred into VTK-m from a `std::vector`.
- Delete: The function to call to actually delete the transferred memory.
- Reallocate: The function to call to re-allocate the transferred memory. This will throw an exception if users try to reallocate a buffer that was 'view' only.
- Size: The size in number of elements of the array.

To properly steal memory from VTK-m, you do the following:

```cpp
vtkm::cont::ArrayHandle<T> arrayHandle;
// ...
auto stolen = arrayHandle.GetBuffers()->TakeHostBufferOwnership();
// ...
stolen.Delete(stolen.Container);
```
### Configurable default types

Because VTK-m compiles efficient code for accelerator architectures, it often has to compile for static types. This means that dynamic types often have to be determined at runtime and converted to static types. This is the reason for the `CastAndCall` architecture in VTK-m.

For this `CastAndCall` to work, there has to be a finite set of static types to try at runtime. If you don't compile in the types you need, you will get runtime errors. However, the more types you compile in, the longer the compile time and the larger the executable. Thus, getting the types right is important.

The "right" types to use can change depending on the application using VTK-m. For example, when VTK links in VTK-m, it needs to support lots of types and can sacrifice the compile times to do so. However, if using VTK-m in situ with a Fortran simulation, space and time are critical and you might only need to worry about double SoA arrays.

Thus, it is important to customize what types VTK-m uses based on the application. This leads to the oxymoronic phrase of configuring the default types used by VTK-m.

This is being implemented by providing VTK-m with a header file that defines the default types. The header file provided to VTK-m should define one or more of the following preprocessor macros:

- `VTKM_DEFAULT_TYPE_LIST` - a `vtkm::List` of value types for fields that filters should directly operate on (where applicable).
- `VTKM_DEFAULT_STORAGE_LIST` - a `vtkm::List` of storage tags for fields that filters should directly operate on.
- `VTKM_DEFAULT_CELL_SET_LIST_STRUCTURED` - a `vtkm::List` of `vtkm::cont::CellSet` types that filters should operate on as a structured cell set.
- `VTKM_DEFAULT_CELL_SET_LIST_UNSTRUCTURED` - a `vtkm::List` of `vtkm::cont::CellSet` types that filters should operate on as an unstructured cell set.
- `VTKM_DEFAULT_CELL_SET_LIST` - a `vtkm::List` of `vtkm::cont::CellSet` types that filters should operate on (where applicable). The default of `vtkm::ListAppend<VTKM_DEFAULT_CELL_SET_LIST_STRUCTURED, VTKM_DEFAULT_CELL_SET_LIST_UNSTRUCTURED>` is usually correct.

If any of these macros are not defined, a default version will be defined. (This is the same default used if no header file is provided.)

This header file is provided to the build by setting the `VTKm_DEFAULT_TYPES_HEADER` CMake variable. `VTKm_DEFAULT_TYPES_HEADER` points to the file, which will be configured and copied to VTK-m's build directory.

For convenience, header files can be added to the VTK-m source directory (conventionally under `vtkm/cont/internal`). If this is the case, an advanced CMake option should be added to select the provided header file.
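For illustration, a minimal header restricting filters to 32-bit float fields with basic and SOA storage might look like the following. This is a sketch, not an official example: the file name is hypothetical, and the exact list contents depend on your application.

```cpp
// my_vtkm_types.h -- hypothetical file passed via VTKm_DEFAULT_TYPES_HEADER.
#include <vtkm/List.h>
#include <vtkm/cont/ArrayHandleSOA.h>

#define VTKM_DEFAULT_TYPE_LIST \
  vtkm::List<vtkm::Float32, vtkm::Vec3f_32>
#define VTKM_DEFAULT_STORAGE_LIST \
  vtkm::List<vtkm::cont::StorageTagBasic, vtkm::cont::StorageTagSOA>
```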
## ArrayHandle

### Shorter fancy array handle classnames

Many of the fancy `ArrayHandle`s use the generic builders like `ArrayHandleTransform` and `ArrayHandleImplicit` for their implementation. That is fine, but because they use functors and other such generic items to template their `Storage`, you can end up with very verbose classnames. This is an issue for humans trying to discern classnames. It can also be an issue for compilers that end up with very long resolved classnames that might get truncated if they extend past what was expected.

The fix was for these classes to declare their own `Storage` tag and then implement their `Storage` and `ArrayTransfer` classes as trivial subclasses of those of the generic `ArrayHandleImplicit` or `ArrayHandleTransform`.

As an added bonus, a lot of this shortening also means that storage that relies on other array handles is now typed by the storage of the decorated type, not the array itself. This should make the types a little more robust.

Here is a list of classes that were updated.
#### `ArrayHandleCast<TargetT, vtkm::cont::ArrayHandle<SourceT, SourceStorage>>`

Old storage:

```cpp
vtkm::cont::internal::StorageTagTransform<
  vtkm::cont::ArrayHandle<SourceT, SourceStorage>,
  vtkm::cont::internal::Cast<TargetT, SourceT>,
  vtkm::cont::internal::Cast<SourceT, TargetT>>
```

New storage:

```cpp
vtkm::cont::StorageTagCast<SourceT, SourceStorage>
```

(Developer's note: Implementing this change to `ArrayHandleCast` was a much bigger PITA than expected.)

#### `ArrayHandleCartesianProduct<AH1, AH2, AH3>`

Old storage:

```cpp
vtkm::cont::internal::StorageTagCartesianProduct<
  vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag2>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag3>>
```

New storage:

```cpp
vtkm::cont::StorageTagCartesianProduct<StorageTag1, StorageTag2, StorageTag3>
```

#### `ArrayHandleCompositeVector<AH1, AH2, ...>`

Old storage:

```cpp
vtkm::cont::internal::StorageTagCompositeVector<
  tao::tuple<
    vtkm::cont::ArrayHandle<ValueType, StorageType1>,
    vtkm::cont::ArrayHandle<ValueType, StorageType2>,
    ...
  >
>
```

New storage:

```cpp
vtkm::cont::StorageTagCompositeVec<StorageType1, StorageType2>
```

#### `ArrayHandleConcatenate`

First an example with two simple types.

Old storage:

```cpp
vtkm::cont::StorageTagConcatenate<
  vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag2>>
```

New storage:

```cpp
vtkm::cont::StorageTagConcatenate<StorageTag1, StorageTag2>
```

Now a more specific example taken from the unit test of a concatenation of a concatenation.

Old storage:

```cpp
vtkm::cont::StorageTagConcatenate<
  vtkm::cont::ArrayHandleConcatenate<
    vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
    vtkm::cont::ArrayHandle<ValueType, StorageTag2>>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag3>>
```

New storage:

```cpp
vtkm::cont::StorageTagConcatenate<
  vtkm::cont::StorageTagConcatenate<StorageTag1, StorageTag2>, StorageTag3>
```

#### `ArrayHandleConstant`

Old storage:

```cpp
vtkm::cont::StorageTagImplicit<
  vtkm::cont::detail::ArrayPortalImplicit<
    vtkm::cont::detail::ConstantFunctor<ValueType>>>
```

New storage:

```cpp
vtkm::cont::StorageTagConstant
```

#### `ArrayHandleCounting`

Old storage:

```cpp
vtkm::cont::StorageTagImplicit<vtkm::cont::internal::ArrayPortalCounting<ValueType>>
```

New storage:

```cpp
vtkm::cont::StorageTagCounting
```

#### `ArrayHandleGroupVec`

Old storage:

```cpp
vtkm::cont::internal::StorageTagGroupVec<
  vtkm::cont::ArrayHandle<ValueType, StorageTag>, N>
```

New storage:

```cpp
vtkm::cont::StorageTagGroupVec<StorageTag, N>
```

#### `ArrayHandleGroupVecVariable`

Old storage:

```cpp
vtkm::cont::internal::StorageTagGroupVecVariable<
  vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
  vtkm::cont::ArrayHandle<vtkm::Id, StorageTag2>>
```

New storage:

```cpp
vtkm::cont::StorageTagGroupVecVariable<StorageTag1, StorageTag2>
```

#### `ArrayHandleIndex`

Old storage:

```cpp
vtkm::cont::StorageTagImplicit<
  vtkm::cont::detail::ArrayPortalImplicit<vtkm::cont::detail::IndexFunctor>>
```

New storage:

```cpp
vtkm::cont::StorageTagIndex
```

#### `ArrayHandlePermutation`

Old storage:

```cpp
vtkm::cont::internal::StorageTagPermutation<
  vtkm::cont::ArrayHandle<vtkm::Id, StorageTag1>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag2>>
```

New storage:

```cpp
vtkm::cont::StorageTagPermutation<StorageTag1, StorageTag2>
```

#### `ArrayHandleReverse`

Old storage:

```cpp
vtkm::cont::StorageTagReverse<vtkm::cont::ArrayHandle<ValueType, StorageTag>>
```

New storage:

```cpp
vtkm::cont::StorageTagReverse<StorageTag>
```

#### `ArrayHandleUniformPointCoordinates`

Old storage:

```cpp
vtkm::cont::StorageTagImplicit<vtkm::internal::ArrayPortalUniformPointCoordinates>
```

New storage:

```cpp
vtkm::cont::StorageTagUniformPoints
```

#### `ArrayHandleView`

Old storage:

```cpp
vtkm::cont::StorageTagView<vtkm::cont::ArrayHandle<ValueType, StorageTag>>
```

New storage:

```cpp
vtkm::cont::StorageTagView<StorageTag>
```

#### `ArrayHandleZip`

Old storage:

```cpp
vtkm::cont::internal::StorageTagZip<
  vtkm::cont::ArrayHandle<ValueType1, StorageTag1>,
  vtkm::cont::ArrayHandle<ValueType2, StorageTag2>>
```

New storage:

```cpp
vtkm::cont::StorageTagZip<StorageTag1, StorageTag2>
```
### ReadPortal().Get(idx)

Calling `ReadPortal()` in a tight loop is an antipattern. A call to `ReadPortal()` causes the array to be copied back to the control environment, and hence code like

```cpp
for (vtkm::Id i = 0; i < array.GetNumberOfValues(); ++i)
{
  vtkm::FloatDefault x = array.ReadPortal().Get(i);
}
```

is a quadratic-scaling loop.

We have removed almost all internal uses of the `ReadPortal().Get` antipattern, with the exception of four API calls into which the pattern is baked: `CellSetExplicit::GetCellShape`, `CellSetPermutation::GetNumberOfPointsInCell`, `CellSetPermutation::GetCellShape`, and `CellSetPermutation::GetCellPointIds`. We expect these will need to be deprecated in the future.
### Precompiled ArrayCopy for UnknownArrayHandle

Previously, in order to copy an `UnknownArrayHandle`, you had to specify some subset of types and then specially compile a copy for each potential type. With the new ability to extract a component from an `UnknownArrayHandle`, it is now feasible to precompile copying an `UnknownArrayHandle` to another array. This greatly reduces the overhead of using `ArrayCopy` to copy `UnknownArrayHandle`s while simultaneously increasing the likelihood that the copy will be successful.
### Create ArrayHandleOffsetsToNumComponents

`ArrayHandleOffsetsToNumComponents` is a fancy array that takes an array of offsets and converts it to an array of the number of components for each packed entry.

It is common in VTK-m to pack small vectors of variable sizes into a single contiguous array. For example, cells in an explicit cell set can each have a different number of vertices (triangles = 3, quads = 4, tetra = 4, hexa = 8, etc.). Generally, to access items in this list, you need an array of the number of components in each entry and the offset for each entry. However, if you have just the array of offsets in sorted order, you can easily derive the number of components for each entry by subtracting adjacent entries. This works best if the offsets array has a size that is one more than the number of packed vectors, with the first entry set to 0 and the last entry set to the total size of the packed array (the offset to the end).

When packing data of this nature, it is common to start with an array that is the number of components. You can convert that to an offsets array using the `vtkm::cont::ConvertNumComponentsToOffsets` function. This will create an offsets array with one extra entry as previously described. You can then throw out the original number of components array and use the offsets with `ArrayHandleOffsetsToNumComponents` to represent both the offsets and number of components while storing only one array.
This replaces the use of `ArrayHandleDecorator` in `CellSetExplicit`. The two implementations should do the same thing, but the new `ArrayHandleOffsetsToNumComponents` should be less complex for compilers.
### Recombine extracted component arrays from unknown arrays

Building on the recent capability to extract component arrays from unknown arrays, there is now also the ability to recombine these extracted arrays to a single `ArrayHandle`. It might seem counterintuitive to break an `ArrayHandle` into component arrays and then combine the component arrays back into a single `ArrayHandle`, but this is a very handy way to run algorithms without knowing the exact `ArrayHandle` type.

Recall that when extracting a component array from an `UnknownArrayHandle`, you only need to know the base component of the value type of the contained `ArrayHandle`. That makes extracting a component array independent from both the size of any `Vec` value type and any storage type.

The added `UnknownArrayHandle::ExtractArrayFromComponents` method allows you to use this functionality to transform the unknown array handle to a form of `ArrayHandle` that depends only on this base component type. This method internally uses a new `ArrayHandleRecombineVec` class, but that class is mostly intended for internal use by `UnknownArrayHandle`.

As an added convenience, `UnknownArrayHandle` now also provides the `CastAndCallWithExtractedArray` method. This method works like other `CastAndCall`s except that it uses the `ExtractArrayFromComponents` feature to allow you to handle most `ArrayHandle` types with few template instances.
### UnknownArrayHandle and UncertainArrayHandle for runtime-determined types

Two new classes have been added to VTK-m: `UnknownArrayHandle` and `UncertainArrayHandle`. These classes serve the same purpose as the set of `VariantArrayHandle` classes and will replace them.

Motivated mostly by the desire to move away from `ArrayHandleVirtual`, we have multiple reasons to completely refactor the `VariantArrayHandle` class. These include changing the implementation, some behavior, and even the name.
#### Motivation

We have several reasons that have accumulated to revisit the implementation of `VariantArrayHandle`.

##### Move away from ArrayHandleVirtual

The current implementation of `VariantArrayHandle` internally stores the array wrapped in an `ArrayHandleVirtual`. That makes sense since you might as well consolidate the hierarchy of virtual objects into one.

Except `ArrayHandleVirtual` is being deprecated, so it no longer makes sense to use it internally.

So we will transition the class back to managing the data as typeless on its own. We will consider using function pointers rather than actual virtual functions because compilers can be slow in creating lots of virtual subclasses.
##### Reintroduce storage tag lists

The original implementation of `VariantArrayHandle` (which at the time was called `DynamicArrayHandle`) actually had two type lists: one for the array value type and one for the storage type. The storage type list was removed soon after `ArrayHandleVirtual` was introduced because, whatever the type of the array, it could be accessed as an `ArrayHandleVirtual`.

However, with `ArrayHandleVirtual` being deprecated, this feature is no longer possible. We are again in need of the list of storage types to try. Thus, we need to reintroduce this template argument to `VariantArrayHandle`.
##### More clear name

The name of this class has always been unsatisfactory. The first name, `DynamicArrayHandle`, makes it sound like the data is always changing. The second name, `VariantArrayHandle`, makes it sound like an array that holds a value type that can vary (like a `std::variant`).

We can use a clearer name that expresses better that it is holding an `ArrayHandle` of an unknown type.
##### Take advantage of default types for less templating

Once upon a time, everything in VTK-m was a templated header library. Things have changed quite a bit since then. The most recent development is the ability to select the "default types" with a CMake configuration that allows you to select a global set of types you care about during compilation. This is so units like filters can be compiled into a library with all the types we care about, and we don't have to constantly recompile units.

This means that we are becoming less concerned about maintaining type lists everywhere. Often we can drop the type list and pass data across libraries.

With that in mind, it makes less sense for `VariantArrayHandle` to actually be a `using` alias for `VariantArrayHandleBase<VTKM_DEFAULT_TYPE_LIST>`. In response, we can revert the is-a relationship between the two: have a completely typeless version as the base class and a second, templated version to express when the type of the array has been partially narrowed down to given type lists.
#### New Name and Structure

The ultimate purpose of this class is to store an `ArrayHandle` where the value and storage types are unknown. Thus, an appropriate name for the class is `UnknownArrayHandle`.

`UnknownArrayHandle` is not templated. It simply stores an `ArrayHandle` in a typeless (`void*`) buffer. It does, however, contain many templated methods that allow you to query whether the contained array matches given types, to cast to given types, and to cast and call to a given functor (from either given type lists or default lists).

Rather than have a virtual class structure to manage the typeless array, the new management uses function pointers. This has been shown to sometimes improve compile times and generate less code.
Sometimes it is the case that the set of potential types can be narrowed. In this case, the array ceases to be unknown and becomes uncertain. Thus, the companion class to `UnknownArrayHandle` is `UncertainArrayHandle`.

`UncertainArrayHandle` has two template parameters: a list of potential value types and a list of potential storage types. The behavior of `UncertainArrayHandle` matches that of `UnknownArrayHandle` (and might inherit from it). However, for `CastAndCall` operations, it will use the type lists defined in its template parameters.
#### Serializing UnknownArrayHandle

Because `UnknownArrayHandle` is not templated, it provides some opportunities to compile things into the `vtkm_cont` library. Templated methods like `CastAndCall` cannot be, but the specializations of DIY's serialize can be.

And since it only has to be compiled once into a library, we can spend some extra time compiling for more types. We don't have to restrict ourselves to `VTKM_DEFAULT_TYPE_LIST`. We can compile for `vtkm::TypeListTagAll`.
### Support ArrayHandleSOA as a "default" array

Many programs, particularly simulations, store fields of vectors in separate arrays for each component. This maps to the storage of `ArrayHandleSOA`. The VTK-m code tends to prefer the AOS storage (which is what is implemented in `ArrayHandleBasic`, and the behavior of which is inherited from VTK). VTK-m should better support adding `ArrayHandleSOA` as one of the types.

We now have a set of default types for Ascent that uses SOA as one of the basic types.

Part of this change includes an intentional feature regression of `ArrayHandleSOA` to only support value types of `Vec`. Previously, scalar types were supported. However, for scalars the behavior of `ArrayHandleSOA` is exactly the same as `ArrayHandleBasic`, except a lot more template code has to be generated. That itself is not a huge deal, but because you have two types that essentially do the same thing, a lot of template code in VTK-m would unwind to create two separate code paths that do the same thing with the same data. To avoid creating those code paths, we simply make any use of `ArrayHandleSOA` without a `Vec` value invalid. This will prevent VTK-m from creating those code paths.
### Removed old ArrayHandle transfer mechanism

Deleted the default implementation of `ArrayTransfer`. `ArrayTransfer` is used with the old `ArrayHandle` style to move data between host and device. The new version of `ArrayHandle` does not use `ArrayTransfer` at all because this functionality is wrapped in `Buffer` (where it can exist in a precompiled library).

Once all the old `ArrayHandle` classes are gone, this class will be removed completely. Although all the remaining old `ArrayHandle` classes provide their own versions of `ArrayTransfer`, they still need the prototype to be defined to specialize. Thus, the guts of the default `ArrayTransfer` are removed and replaced with a compile error if you try to compile it.

Also removed `ArrayManagerExecution`. This class was used indirectly by the old `ArrayHandle`, through `ArrayTransfer`, to move data to and from a device. This functionality has been replaced in the new `ArrayHandle`s through the `Buffer` class (which can be compiled into libraries rather than making every translation unit compile its own template).
## Order asynchronous `ArrayHandle` access

The recent feature of tokens that scope access to `ArrayHandle`s allows
multiple threads to use the same `ArrayHandle`s without read/write hazards.
The intent is twofold. First, it allows two separate threads in the control
environment to independently schedule tasks. Second, it allows us to move
toward scheduling worklets and other algorithms asynchronously.

However, there was a flaw with the original implementation. Once requests
to an `ArrayHandle` get queued up, they are resolved in arbitrary order.
This might mean that things run in a surprising and incorrect order.
### Problematic use case

To demonstrate the flaw in the original implementation, let us consider a
future scenario where, when you invoke a worklet (on OpenMP or TBB), the
call to invoke returns immediately and the actual work is scheduled
asynchronously. Now let us say we have a sequence of 3 worklets we wish to
run: `Worklet1`, `Worklet2`, and `Worklet3`. One of `Worklet1`'s parameters
is a `FieldOut` that creates an intermediate `ArrayHandle` that we will
simply call `array`. `Worklet2` is given `array` as a `FieldInOut` to
modify its values. Finally, `Worklet3` is given `array` as a `FieldIn`. It
is clear that for the computation to be correct, the three worklets must
execute in the order `Worklet1`, `Worklet2`, `Worklet3`.

The problem is that if `Worklet2` and `Worklet3` are both scheduled before
`Worklet1` finishes, the order in which they execute could be arbitrary.
Let us say that `Worklet1` is invoked, and the invoke call returns before
the execution of `Worklet1` finishes.
The calling code immediately invokes `Worklet2`. Because `array` is already
locked by `Worklet1`, `Worklet2` does not execute right away. Instead, it
waits on a condition variable of `array` until it is free. But even though
the scheduling of `Worklet2` is blocked, the invoke returns because we are
scheduling asynchronously.

Likewise, the calling code then immediately calls invoke for `Worklet3`.
`Worklet3` similarly waits on the condition variable of `array` until it is
free.

Let us assume the likely event that both `Worklet2` and `Worklet3` get
scheduled before `Worklet1` finishes. When `Worklet1` later does finish,
its token relinquishes the lock on `array`, which wakes up the threads
waiting for access to `array`. However, there is no imposed order on which
waiting thread will acquire the lock and run first. (At least, I'm not
aware of anything imposing an order.) Thus, it is quite possible that
`Worklet3` will wake up first. It will see that `array` is no longer
locked (because `Worklet1` has released it and `Worklet2` has not had a
chance to claim it).

Oops. Now `Worklet3` is operating on `array` before `Worklet2` has had a
chance to put the correct values in it. The results will be wrong.
### Queuing requests

What we want is to impose the restriction that locks to an `ArrayHandle`
get resolved in the order that they are requested. In the previous example,
we have 3 requests on an array that happen in a known order. We want
control given to them in the same order.

To implement this, we need to impose another restriction on the
`condition_variable` when waiting to read or write. We want the lock to go
to the thread that first started waiting. To do this, we added an internal
queue of `Token`s to the `ArrayHandle`.

In `ArrayHandle::WaitToRead` and `ArrayHandle::WaitToWrite`, the `Token`
is first added to the back of the queue before waiting on the condition
variable. The `CanRead` and `CanWrite` methods check this queue to see if
the provided `Token` is at the front. If not, then the lock is denied and
the thread must continue to wait.
### Early enqueuing

Another issue that can happen in the previous example is that as threads
are spawned for the 3 different worklets, they may actually start running
in an unexpected order. So the thread running `Worklet3` might actually
start before the other 2 and place itself in the queue first.

The solution is to add a method to `ArrayHandle` called `Enqueue`. This
method takes a `Token` object and adds that `Token` to the queue. However,
regardless of where the `Token` ends up on the queue, the method
immediately returns. It does not attempt to lock the `ArrayHandle`.
So now we can ensure that `Worklet1` properly locks `array` with this
sequence of events. First, the main thread calls `array.Enqueue`. Then a
thread is spawned to call `PrepareForOutput`.

Even if control returns to the calling code and it calls invoke for
`Worklet2` before this spawned thread starts, `Worklet2` cannot start
first. When `PrepareForInput` is called on `array`, it is queued after the
`Token` for `Worklet1`, even if `Worklet1` has not started waiting on the
`array`.
## Improvements to moving data into ArrayHandle

We have made several improvements to adding data into an `ArrayHandle`.
### Moving data from an `std::vector`

For numerous reasons, it is convenient to define data in a `std::vector`
and then wrap that into an `ArrayHandle`. There are two obvious ways to do
this. First, you could deep copy the data into an `ArrayHandle`, which has
obvious drawbacks. Second, you could take the pointer for the data in the
`std::vector` and use that as user-allocated memory in the `ArrayHandle`
without deep copying it. The problem with this shallow copy is that it is
unsafe. If the `std::vector` goes out of scope (or gets resized), then the
data the `ArrayHandle` is pointing to becomes unallocated, which will lead
to unpredictable behavior.

However, there is a third option. It is often the case that an
`std::vector` is filled and then becomes unused once it is converted to an
`ArrayHandle`. In this case, what we really want is to pass the data off to
the `ArrayHandle` so that the `ArrayHandle` is now managing the data and
not the `std::vector`.
C++11 has a mechanism to do this: move semantics. You can now pass
variables to functions as an "rvalue" (right-hand value). When something is
passed as an rvalue, it can pull state out of that variable and move it
somewhere else. `std::vector` implements this movement so that an rvalue
can be moved to another `std::vector` without actually copying the data.
`make_ArrayHandle` now also takes advantage of this feature to move rvalue
`std::vector`s.

There is a special form of `make_ArrayHandle` named `make_ArrayHandleMove`
that takes an rvalue. There is also a special overload of
`make_ArrayHandle` itself that handles an rvalue `vector`. (However, using
the explicit move version is better if you want to make sure the data is
actually moved.)
So if you create the `std::vector` in the call to `make_ArrayHandle`, then
the data only gets created once.

```cpp
auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>{ 2, 6, 1, 7, 4, 3, 9 });
```

Note that there is now a better way to express an initializer list to
`ArrayHandle` documented below. But this form of `make_ArrayHandleMove`
can be particularly useful for initializing an array to all of a particular
value. For example, an easy way to initialize an array of 1000 elements all
to 1 is

```cpp
auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>(1000, 1));
```
You can also move the data from an already created `std::vector` by using
the `std::move` function to convert it to an rvalue. When you do this, the
`std::vector` becomes invalid after the call and any use will be undefined.

```cpp
std::vector<vtkm::Id> vector;
// fill vector

auto array = vtkm::cont::make_ArrayHandleMove(std::move(vector));
```
### Make `ArrayHandle` from initializer list

A common use case for using `std::vector` (particularly in our unit tests)
is to quickly add an initializer list into an `ArrayHandle`. Repeating the
example from above:

```cpp
auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>{ 2, 6, 1, 7, 4, 3, 9 });
```

However, creating the `std::vector` should be unnecessary. Why not be able
to create the `ArrayHandle` directly from an initializer list? Now you can
by simply passing an initializer list to `make_ArrayHandle`.

```cpp
auto array = vtkm::cont::make_ArrayHandle({ 2, 6, 1, 7, 4, 3, 9 });
```
There is an issue here. The type can be a little ambiguous (for humans). In
this case, `array` will be of type `vtkm::cont::ArrayHandleBasic<int>`,
since that is what an integer literal defaults to. This could be a problem
if, for example, you want to use `array` as an array of `vtkm::Id`, which
could be of type `vtkm::Int64`. This is easily remedied by specifying the
desired value type as a template argument to `make_ArrayHandle`.

```cpp
auto array = vtkm::cont::make_ArrayHandle<vtkm::Id>({ 2, 6, 1, 7, 4, 3, 9 });
```
### Deprecated `make_ArrayHandle` with default shallow copy

For historical reasons, passing an `std::vector` or a pointer to
`make_ArrayHandle` does a shallow copy (i.e. `CopyFlag` defaults to `Off`).
Although more efficient, this mode is inherently unsafe, and making it the
default is asking for trouble.

To combat this, calling `make_ArrayHandle` without a copy flag is
deprecated. In this way, if you wish to do the faster but more unsafe
creation of an `ArrayHandle`, you should explicitly express that. This
required quite a few changes through the VTK-m source (particularly in the
tests).
### Similar changes to `Field`

`vtkm::cont::Field` has a `make_Field` helper function that is similar to
`make_ArrayHandle`. It also features the ability to create fields from
`std::vector`s and C arrays. It likewise had the same unsafe default
behavior of not copying from the source of the arrays.

That behavior has similarly been deprecated. You now have to specify a copy
flag.

The ability to construct a `Field` from an initializer list of values has
also been added.
## Deprecate ArrayHandleVirtualCoordinates

As we port VTK-m to more types of accelerator architectures, supporting
virtual methods is becoming more problematic. Thus, we are working to back
out of using virtual methods in the execution environment.

One of the most widespread users of virtual methods in the execution
environment is `ArrayHandleVirtual`. As a first step of deprecating this
class, we first deprecate the `ArrayHandleVirtualCoordinates` subclass.

Not surprisingly, `ArrayHandleVirtualCoordinates` is used directly by
`CoordinateSystem`. The biggest change necessary was that the `GetData`
method returned an `ArrayHandleVirtualCoordinates`, which obviously would
not work if that class is deprecated.
An oddness about this return type is that it is quite different from the
superclass's method of the same name. Rather, `Field` returns a
`VariantArrayHandle`. Since this had to be corrected anyway, it was decided
to change `CoordinateSystem`'s `GetData` to also return a
`VariantArrayHandle`, although its typelist is set to just `vtkm::Vec3f`.

To try to still support old code that expects the deprecated behavior of
returning an `ArrayHandleVirtualCoordinates`, `CoordinateSystem::GetData`
actually returns a "hidden" subclass of `VariantArrayHandle` that
automatically converts itself to an `ArrayHandleVirtualCoordinates`. (A
deprecation warning is given if this is done.)
This approach to support deprecated code is not perfect. The returned value
for `CoordinateSystem::GetData` can only be used as an `ArrayHandle` if a
method is directly called on it or if it is cast specifically to
`ArrayHandleVirtualCoordinates` or its superclass. For example, if passing
it to a method argument typed as `vtkm::cont::ArrayHandle<T, S>` where `T`
and `S` are template parameters, then the conversion will fail.

To continue to support ease of use, `CoordinateSystem` now has a method
named `GetDataAsMultiplexer` that returns the data as an
`ArrayHandleMultiplexer`. This can be employed to quickly use the
`CoordinateSystem` as an array without the overhead of a `CastAndCall`.
## ArrayHandleDecorator Allocate and Shrink Support

`ArrayHandleDecorator` can now be resized when given an appropriate
decorator implementation.

Since the mapping between the size of an `ArrayHandleDecorator` and its
source `ArrayHandle`s is not well defined, resize operations (such as
`Shrink` and `Allocate`) are not defined by default and will throw an
exception if called.

However, by implementing the methods `AllocateSourceArrays` and/or
`ShrinkSourceArrays` on the implementation class, resizing the decorator is
allowed. These methods are passed a new size along with each of the
`ArrayHandleDecorator`'s source arrays, allowing developers to control how
the resize operation should affect the source arrays.

For example, the following decorator implementation can be used to create a
resizable `ArrayHandleDecorator` that is implemented using two arrays,
which are combined to produce values via the expression:

```
[decorator value i] = [source1 value i] * 10 + [source2 value i]
```

Implementation:
```cpp
template <typename ValueType>
struct DecompositionDecorImpl
{
  template <typename Portal1T, typename Portal2T>
  struct Functor
  {
    Portal1T Portal1;
    Portal2T Portal2;

    VTKM_EXEC_CONT
    ValueType operator()(vtkm::Id idx) const
    {
      return static_cast<ValueType>(this->Portal1.Get(idx) * 10 + this->Portal2.Get(idx));
    }
  };

  template <typename Portal1T, typename Portal2T>
  struct InverseFunctor
  {
    Portal1T Portal1;
    Portal2T Portal2;

    VTKM_EXEC_CONT
    void operator()(vtkm::Id idx, const ValueType& val) const
    {
      this->Portal1.Set(idx, static_cast<ValueType>(std::floor(val / 10)));
      this->Portal2.Set(idx, static_cast<ValueType>(std::fmod(val, 10)));
    }
  };

  template <typename Portal1T, typename Portal2T>
  VTKM_CONT Functor<typename std::decay<Portal1T>::type, typename std::decay<Portal2T>::type>
  CreateFunctor(Portal1T&& p1, Portal2T&& p2) const
  {
    return { std::forward<Portal1T>(p1), std::forward<Portal2T>(p2) };
  }

  template <typename Portal1T, typename Portal2T>
  VTKM_CONT InverseFunctor<typename std::decay<Portal1T>::type, typename std::decay<Portal2T>::type>
  CreateInverseFunctor(Portal1T&& p1, Portal2T&& p2) const
  {
    return { std::forward<Portal1T>(p1), std::forward<Portal2T>(p2) };
  }

  // Resize methods:
  template <typename Array1T, typename Array2T>
  VTKM_CONT void AllocateSourceArrays(vtkm::Id numVals, Array1T&& array1, Array2T&& array2) const
  {
    array1.Allocate(numVals);
    array2.Allocate(numVals);
  }

  template <typename Array1T, typename Array2T>
  VTKM_CONT void ShrinkSourceArrays(vtkm::Id numVals, Array1T&& array1, Array2T&& array2) const
  {
    array1.Shrink(numVals);
    array2.Shrink(numVals);
  }
};

// Usage:
vtkm::cont::ArrayHandle<ValueType> a1;
vtkm::cont::ArrayHandle<ValueType> a2;

auto decor = vtkm::cont::make_ArrayHandleDecorator(0, DecompositionDecorImpl<ValueType>{}, a1, a2);

decor.Allocate(5);

{
  auto decorPortal = decor.GetPortalControl();
  decorPortal.Set(0, 13);
  decorPortal.Set(1, 8);
  decorPortal.Set(2, 43);
  decorPortal.Set(3, 92);
  decorPortal.Set(4, 117);
}

// a1:    { 1, 0, 4, 9, 11 }
// a2:    { 3, 8, 3, 2, 7 }
// decor: { 13, 8, 43, 92, 117 }

decor.Shrink(3);

// a1:    { 1, 0, 4 }
// a2:    { 3, 8, 3 }
// decor: { 13, 8, 43 }
```
## Portals may advertise custom iterators

The `ArrayPortalToIterator` utilities are used to produce STL-style
iterators from VTK-m's `ArrayHandle` portals. By default, a facade class is
constructed around the portal API, adapting it to an iterator interface.

However, some portals use iterators internally, or may be able to construct
a lightweight iterator easily. For these, it is preferable to directly use
the specialized iterators instead of going through the generic facade. A
portal may now declare the following optional API to advertise that it has
custom iterators:
```cpp
struct MyPortal
{
  using IteratorType = ...;        // alias to the portal's specialized iterator type
  IteratorType GetIteratorBegin(); // return the begin iterator
  IteratorType GetIteratorEnd();   // return the end iterator

  // ...rest of ArrayPortal API...
};
```
If these members are present, `ArrayPortalToIterators` will forward the
portal's specialized iterators instead of constructing a facade. This works
when using the `ArrayPortalToIterators` class directly, and also with the
`ArrayPortalToIteratorBegin` and `ArrayPortalToIteratorEnd` convenience
functions.
## Redesign of ArrayHandle to access data using typeless buffers

The original implementation of `ArrayHandle` is meant to be very generic.
To define an `ArrayHandle`, you actually create a `Storage` class that
maintains the data and provides portals to access it (on the host). Because
the `Storage` can provide any type of data structure it wants, you also
need to define an `ArrayTransfer` that describes how to move the
`ArrayHandle` to and from a device. It also has to be repeated for every
translation unit that uses them.

This is a very powerful mechanism. However, one of the major problems with
this approach is that every `ArrayHandle` type needs to have a separate
compile path for every value type crossed with every device. Because of
this limitation, the `ArrayHandle` for the basic storage has a special
implementation that manages the actual data allocation and movement as
`void *` arrays. In this way all the data management can be compiled once
and put into the `vtkm_cont` library. This has dramatically improved the
VTK-m compile time.
This new design replicates the basic `ArrayHandle`'s success to all other
storage types. The basic idea is to make the implementation of
`ArrayHandle` storage slightly less generic. Instead of requiring it to
manage the data it stores, it instead just builds `ArrayPortal`s from
`void` pointers that it is given. The management of `void` pointers can be
done in non-templated classes that are compiled into a library.

This initial implementation does not convert all `ArrayHandle`s in order to
avoid making non-backward-compatible changes before the next minor revision
of VTK-m. In particular, it would be particularly difficult to convert
`ArrayHandleVirtual`. It could be done, but it would be a lot of work for a
class that will likely be removed.
### Buffer

Key to these changes is the introduction of a
`vtkm::cont::internal::Buffer` object. As the name implies, the `Buffer`
object manages a single block of bytes. `Buffer` is agnostic to the type of
data being stored. It only knows the length of the buffer in bytes. It is
responsible for allocating space on the host and any devices as necessary
and for transferring data among them. (Since `Buffer` knows nothing about
the type of data, a precondition of VTK-m would be that the host and all
devices have to have the same endianness.)
The idea of the `Buffer` object is similar in nature to the existing
`vtkm::cont::internal::ExecutionArrayInterfaceBasicBase` except that it
will manage a buffer of data among the control environment and all devices
rather than in one device through a templated subclass.
As explained below, `ArrayHandle` holds some fixed number of `Buffer`
objects. (The number can be zero for implicit `ArrayHandle`s.) Because all
the interaction with the devices happens through `Buffer`, it will no
longer be necessary to compile any reference to `ArrayHandle` for devices
(e.g. you won't have to use nvcc just because the code includes
`ArrayHandle.h`).
### Storage

The `vtkm::cont::internal::Storage` class changes dramatically. Although an
instance will be kept, the intention is for `Storage` itself to be a
stateless object. It will manage its data through `Buffer` objects provided
from the `ArrayHandle`.

That said, it is possible for `Storage` to have some state. For example,
the `Storage` for `ArrayHandleImplicit` must hold on to the instance of the
portal used to manage the state.
### ArrayTransfer

The `vtkm::cont::internal::ArrayTransfer` class will be removed completely.
All data transfers will be handled internally with the `Buffer` object.
### Portals

A big change for this design is that the type of a portal for an
`ArrayHandle` will be the same for all devices and the host. Thus, we no
longer need specialized versions of portals for each device. We only have
one portal type. And since they are constructed from `void *` pointers, one
method can create them all.
### Advantages

The `ArrayHandle` interface should not change significantly for external
uses, but this redesign offers several advantages.
#### Faster Compiles

Because the memory management is contained in a non-templated `Buffer`
class, it can be compiled once in a library and used by all template
instances of `ArrayHandle`. It should have similar compile advantages to
our current specialization of the basic `ArrayHandle`, but applied to all
types of `ArrayHandle`s.
#### Fewer Templates

Hand-in-hand with faster compiles, the new design should require fewer
templates and template instances. We have immediately gotten rid of
`ArrayTransfer`. `Storage` is also much shorter. Because all `ArrayPortal`s
are the same for every device and the host, we need many fewer versions of
those classes. In the device adapter, we can probably collapse the three
`ArrayManagerExecution` classes into a single, much simpler class that does
simple memory allocation and copy.
#### Fewer files need to be compiled for CUDA

Including `ArrayHandle.h` no longer adds code that compiles for a device.
Thus, we should no longer need to compile for a specific device adapter
just because we access an `ArrayHandle`. This should make it much easier to
achieve our goal of a "firewall". That is, code that just calls VTK-m
filters does not need to support all its compilers and flags.
#### Simpler ArrayHandle specialization

The newer code should simplify the implementation of special
`ArrayHandle`s a bit. You need only implement an `ArrayPortal` that
operates on one or more `void *` arrays and a simple `Storage` class.
#### Out of band memory sharing

With the current version of `ArrayHandle`, if you want to take data from
one `ArrayHandle` you pretty much have to create a special template to wrap
another `ArrayHandle` around that. With this new design, it is possible to
take data from one `ArrayHandle` and give it to another `ArrayHandle` of a
completely different type. You can't do this willy-nilly since different
`ArrayHandle` types will interpret buffers differently. But there can be
some special important use cases.
One such case could be an `ArrayHandle` that provides strided access to a
buffer. (Let's call it `ArrayHandleStride`.) The idea is that it interprets
the buffer as an array for a particular type (like a basic `ArrayHandle`)
but also defines a stride, skip, and repeat so that given an index it looks
up the value `((index / skip) % repeat) * stride`. The point is that it can
take an AoS array of tuples and represent an array of one of the
components.
The point would be that if you had a `VariantArrayHandle` or `Field`, you
could pull out an array of one of the components as an `ArrayHandleStride`.
An `ArrayHandleStride<vtkm::Float32>` could be used to represent data that
comes from any basic `ArrayHandle` with `vtkm::Float32` or a `vtkm::Vec` of
that type. It could also represent data from an
`ArrayHandleCartesianProduct` and `ArrayHandleSOA`. We could even represent
an `ArrayHandleUniformPointCoordinates` by just making a small array. This
allows us to statically access a whole bunch of potential array storage
classes with a single type.
#### Potentially faster device transfers

There is currently a fast path for basic `ArrayHandle`s that does a block
cuda memcpy between host and device. But for other `ArrayHandle`s that do
not defer their `ArrayTransfer` to a sub-array, the transfer first has to
copy the data into a known buffer.

Because this new design stores all data in `Buffer` objects, any of these
can be easily and efficiently copied between devices.
### Disadvantages

This new design gives up some features of the original `ArrayHandle`
design.

#### Can only interface data that can be represented in a fixed number of buffers

Because the original `ArrayHandle` design required the `Storage` to
completely manage the data, it could represent it in any way possible. In
this redesign, the data needs to be stored in some fixed number of memory
buffers.

This is a pretty open requirement. I suspect most data formats will be
storable in this. The user's guide has an example of data stored in a
`std::deque` that will not be representable. But that is probably not a
particularly practical example.
#### VTK-m would only be able to support hosts and devices with the same endianness

Because data are transferred as `void *` blocks of memory, there is no way
to correct the words if the endianness of the two devices does not agree.
As far as I know, there should be no issues with the proposed ECP machines.

If endianness becomes an issue, it might be possible to specify a word
length in the `Buffer`. That would assume that all numbers stored in the
`Buffer` have the same word length.
#### ArrayPortals must be completely recompiled in each translation unit

We can declare that an `ArrayHandle` does not need to include the device
adapter header files in part because it no longer needs specialized
`ArrayPortal`s for each device. However, that means that a translation unit
compiled with the host compiler (say gcc) will produce different code for
the `ArrayPortal`s than those with the device compiler (say nvcc). This
could lead to numerous linking problems.

To get around these issues, we will probably have to enforce no exporting
of any of the `ArrayPortal` symbols and force them all to be recompiled for
each translation unit. This will serve to increase the compile times a bit.
We will probably also still encounter linking errors, as there would be no
way to enforce this requirement.
#### Cannot have specialized portals for the control environment

Because the new design unifies `ArrayPortal` types across the control and
execution environments, it is no longer possible to have a special version
for the control environment to manage resources. This will require removing
some recent behavior of control portals such as with MR !1988.
## ArrayRangeCompute works on any array type without compiling device code

Originally, `ArrayRangeCompute` required you to know specifically the
`ArrayHandle` type (value type and storage type) and to compile using any
device compiler. The method is changed to include only overloads that have
precompiled versions of `ArrayRangeCompute`.

Additionally, an `ArrayRangeCompute` overload that takes an
`UnknownArrayHandle` has been added. In addition to allowing you to compute
the range of arrays of unknown types, this implementation of
`ArrayRangeCompute` serves as a fallback for `ArrayHandle` types that are
not otherwise explicitly supported.

If you really want to make sure that you compute the range directly on an
`ArrayHandle` of a particular type, you can include
`ArrayRangeComputeTemplate.h`, which contains a templated overload of
`ArrayRangeCompute` that directly computes the range of an `ArrayHandle`.
Including this header requires compiling for device code.
## Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal

`ArrayHandleRandomUniformBits` and `ArrayHandleRandomUniformReal` were
added to provide an efficient way to generate pseudo random numbers in
parallel. They are based on the Philox parallel pseudo random number
generator. `ArrayHandleRandomUniformBits` provides 64 random bits in the
whole range of `UInt64` as its content while `ArrayHandleRandomUniformReal`
provides random `Float64` in the range of [0, 1). Users can either provide
a seed in the form of `Vec<vtkm::UInt32, 1>` or use the default random
source provided by the C++ standard library.

Both of these `ArrayHandle`s are lazily evaluated like other fancy
`ArrayHandle`s, so they only have O(1) memory overhead. They are stateless
and functional and do not change once constructed. To generate a new set of
random numbers, for example as part of an iterative algorithm, a new
`ArrayHandle` needs to be constructed in each iteration. See the user's
guide for more detail and examples.
## Extract component arrays from unknown arrays

One of the problems with the data structures of VTK-m is that non-templated
classes like `DataSet`, `Field`, and `UnknownArrayHandle` (formerly
`VariantArrayHandle`) internally hold an `ArrayHandle` of a particular type
that has to be cast to the correct type before it can be reasonably used.
That in turn is problematic because the list of possible `ArrayHandle`
types is very long.
At one time we were trying to compensate for this by using
`ArrayHandleVirtual`. However, for technical reasons this class is
infeasible for every use case of VTK-m and has been deprecated. Also, this
was only a partial solution since using it still required different code
paths for, say, handling values of `vtkm::Float32` and `vtkm::Vec3f_32`
even though both are essentially arrays of 32-bit floats.

The extract component feature compensates for this problem by allowing you
to extract the components from an `ArrayHandle`. This feature allows you to
create a single code path to handle `ArrayHandle`s containing scalars or
vectors of any size. Furthermore, when you extract a component from an
array, the storage gets normalized so that one code path covers all storage
types.
### ArrayExtractComponent

The basic enabling feature is a new function named `ArrayExtractComponent`.
This function takes an `ArrayHandle` and an index to a component. It then
returns an `ArrayHandleStride` holding the selected component of each entry
in the original array.
We will get to the structure of `ArrayHandleStride` later. But the
important part is that `ArrayHandleStride` does not depend on the storage
type of the original `ArrayHandle`. That means whether you extract a
component from `ArrayHandleBasic`, `ArrayHandleSOA`,
`ArrayHandleCartesianProduct`, or any other type, you get back the same
`ArrayHandleStride`. Likewise, regardless of whether the input
`ArrayHandle` has a `ValueType` of `FloatDefault`, `Vec2f`, `Vec3f`, or any
other `Vec` of a default float, you get the same `ArrayHandleStride`. Thus,
you can see how this feature can dramatically reduce code paths if used
correctly.
It should be noted that `ArrayExtractComponent` will (logically) flatten
the `ValueType` before extracting the component. Thus, nested `Vec`s such
as `Vec<Vec3f, 3>` will be treated as a `Vec<FloatDefault, 9>`. The
intention is so that the extracted component will always be a basic C type.
For the purposes of this document, when we refer to the "component type",
we really mean the base component type.
Different `ArrayHandle` implementations provide their own implementations
for `ArrayExtractComponent` so that the component can be extracted without
deep copying all the data. We will visit how `ArrayHandleStride` can
represent different data layouts later, but first let's go into the main
use case.
### Extract components from UnknownArrayHandle

The principal use case for `ArrayExtractComponent` is to get an
`ArrayHandle` from an unknown array handle without iterating over every
possible type. (Rather, we iterate over a smaller set of types.) To
facilitate this, an `ExtractComponent` method has been added to
`UnknownArrayHandle`.

To use `UnknownArrayHandle::ExtractComponent`, you must give it the
component type. You can check for the correct component type by using the
`IsBaseComponentType` method. The method will then return an
`ArrayHandleStride` for the component type specified.
### Example

As an example, let's say you have a worklet, `FooWorklet`, that does some
per-component operation on an array. Furthermore, let's say that you want
to implement a function that, to the best of your ability, can apply
`FooWorklet` on an array of any type. This function should be pre-compiled
into a library so it doesn't have to be compiled over and over again.
(`MapFieldPermutation` and `MapFieldMergeAverage` are real and important
examples that have this behavior.)

Without the extract component feature, the implementation might look
something like this (many practical details left out):
```cpp
struct ApplyFooFunctor
{
  template <typename ArrayType>
  void operator()(const ArrayType& input, vtkm::cont::UnknownArrayHandle& output) const
  {
    ArrayType outputArray;
    vtkm::cont::Invoke invoke;
    invoke(FooWorklet{}, input, outputArray);
    output = outputArray;
  }
};

vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
{
  vtkm::cont::UnknownArrayHandle output;
  input.CastAndCallForTypes<vtkm::TypeListAll, VTKM_DEFAULT_STORAGE_LIST_TAG>(
    ApplyFooFunctor{}, output);
  return output;
}
```
Take a look specifically at the `CastAndCallForTypes` call near the bottom of this example. It calls for all types in `vtkm::TypeListAll`, which is about 40 instances. Then, it needs to be called for any type in the desired storage list. This could include basic arrays, SOA arrays, and lots of other specialized types. It would be expected for this code to generate over 100 paths for `ApplyFooFunctor`. This in turn contains a worklet invoke, which is not a small amount of code.
Now consider how we can use the `ExtractComponent` feature to reduce the code paths:
```cpp
struct ApplyFooFunctor
{
  template <typename T>
  void operator()(T,
                  const vtkm::cont::UnknownArrayHandle& input,
                  const vtkm::cont::UnknownArrayHandle& output) const
  {
    if (!input.IsBaseComponentType<T>()) { return; }
    VTKM_ASSERT(output.IsBaseComponentType<T>());
    vtkm::cont::Invoke invoke;
    invoke(FooWorklet{}, input.ExtractComponent<T>(), output.ExtractComponent<T>());
  }
};

vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
{
  vtkm::cont::UnknownArrayHandle output = input.NewInstanceBasic();
  output.Allocate(input.GetNumberOfValues());
  vtkm::cont::ListForEach(ApplyFooFunctor{}, vtkm::TypeListScalarAll{}, input, output);
  return output;
}
```
The number of lines of code is about the same, but take a look at the `ListForEach` (which replaces the `CastAndCallForTypes`). This calling code takes `TypeListScalarAll` instead of `TypeListAll`, which reduces the instances created from around 40 to 13 (every basic C type). It is also no longer dependent on the storage, so these 13 instances are it. As an example of the potential compile savings, changing the implementation of the `MapFieldPermutation` and `MapFieldMergeAverage` functions in this way reduced the filters_common library (on Mac, Debug build) by 24 MB (over a third of the total size).
Another great advantage of this approach is that even though it takes less time to compile and generates less code, it actually covers more cases. Have an array containing values of `Vec<short, 13>`? No problem. The values were actually stored in an `ArrayHandleReverse`? It will still work.
ArrayHandleStride
This functionality is made possible with the new `ArrayHandleStride`. This array behaves much like `ArrayHandleBasic`, except that it contains an offset parameter to specify where in the buffer array to start reading and a stride parameter to specify how many entries to skip for each successive entry. `ArrayHandleStride` also has optional parameters divisor and modulo that allow indices to be repeated at regular intervals.
Here is how `ArrayHandleStride` extracts components from several common arrays. For each of these examples, we assume that the `ValueType` of the array is `Vec<T, N>` and that component `component` is being extracted.
Extracting from ArrayHandleBasic
When extracting from an `ArrayHandleBasic`, we just need to start at the proper component and skip the length of the `Vec`.

- offset: `component`
- stride: `N`
Extracting from ArrayHandleSOA
Since each component is held in a separate array, the components are densely packed. Each component could be represented by `ArrayHandleBasic`, but of course we use `ArrayHandleStride` to keep the type consistent.

- offset: 0
- stride: 1
Extracting from ArrayHandleCartesianProduct
This array is the basic reason for implementing the divisor and modulo parameters. Each of the 3 components has different parameters, which are the following (given that `dims[3]` captures the sizes of the 3 arrays, one for each dimension).
- offset: 0
- stride: 1
- case component == 0
  - divisor: ignored
  - modulo: dims[0]
- case component == 1
  - divisor: dims[0]
  - modulo: dims[1]
- case component == 2
  - divisor: dims[0] * dims[1]
  - modulo: ignored
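The index arithmetic implied by these parameters can be sketched in plain C++ (an illustration only, not VTK-m's actual implementation; here a divisor of 1 and a modulo of 0 stand in for "ignored"):

```cpp
#include <cassert>
#include <cstddef>

// Sketch of ArrayHandleStride-style indexing: the buffer index for entry i
// is offset + stride * ((i / divisor) % modulo), where divisor == 1 and
// modulo == 0 mean the corresponding step is skipped.
std::size_t StrideIndex(std::size_t i,
                        std::size_t offset,
                        std::size_t stride,
                        std::size_t divisor = 1,
                        std::size_t modulo = 0)
{
  std::size_t index = i / divisor;
  if (modulo != 0)
  {
    index = index % modulo;
  }
  return offset + stride * index;
}
```

For example, with `dims = {4, 3, 2}`, entry 5 maps to `5 % 4 = 1` in the first coordinate array and `(5 / 4) % 3 = 1` in the second, matching the parameters listed above.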
Extracting from ArrayHandleUniformPointCoordinates
This array cannot be represented directly because it is fully implicit. However, it can be trivially converted to `ArrayHandleCartesianProduct` in typically very little memory. (In fact, EAVL always represented uniform point coordinates by explicitly storing a Cartesian product.) Thus, for very little overhead the `ArrayHandleStride` can be created.
Runtime overhead of extracting components
These benefits come at a cost, but not a large one. The "biggest" cost is the small cost of computing index arithmetic for each access into `ArrayHandleStride`. To make this as efficient as possible, there are conditions that skip over the modulo and divide steps if they are not necessary. (Integer modulo and divide tend to take much longer than addition and multiplication.) It is for this reason that we probably do not want to use this method all the time.
Another cost is the fact that not every `ArrayHandle` can be represented by `ArrayHandleStride` directly without copying. If you ask to extract a component that cannot be directly represented, it will be copied into a basic array, which is not great. To make matters worse, for technical reasons this copy happens on the host rather than the device.
ArrayHandleGroupVecVariable holds now one more offset.
This change affects the usage of both `ConvertNumComponentsToOffsets` and `make_ArrayHandleGroupVecVariable`. The reason for this change is to remove a branch in `ArrayHandleGroupVecVariable::Get` that was used to avoid an array overflow. In theory this increases performance, since at the CPU level it removes penalties due to wrong branch predictions.
The change affects `ConvertNumComponentsToOffsets` by both:

- Increasing the number of elements in `offsetsArray` (its second parameter) by one.
- Setting `sourceArraySize` as the sum of all the elements, plus the new one, in `offsetsArray`.

Note that not every specialization of `ConvertNumComponentsToOffsets` returns `offsetsArray`. Thus, some of them are not affected.
Similarly, this change affects `make_ArrayHandleGroupVecVariable` since it expects its second parameter (`offsetsArray`) to be one element bigger than before.
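The relationship between component counts and the new offsets array can be illustrated with a plain C++ sketch (a hypothetical helper, not the VTK-m implementation): the offsets are an exclusive prefix sum of the counts with one extra trailing entry, and that trailing entry equals the source array size.

```cpp
#include <cstddef>
#include <vector>

// Sketch: convert per-group component counts to offsets. The result holds
// counts.size() + 1 elements; the extra last entry equals the total number
// of components, so Get() no longer needs a bounds-checking branch.
std::vector<std::size_t> CountsToOffsets(const std::vector<std::size_t>& counts)
{
  std::vector<std::size_t> offsets(counts.size() + 1);
  offsets[0] = 0;
  for (std::size_t i = 0; i < counts.size(); ++i)
  {
    offsets[i + 1] = offsets[i] + counts[i];
  }
  return offsets;
}
```

Counts `{3, 1, 2}` produce offsets `{0, 3, 4, 6}`; group `i` spans `[offsets[i], offsets[i+1])`.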
Control Environment
Algorithms for Control and Execution Environments
The `<vtkm/Algorithms.h>` header has been added to provide common STL-style generic algorithms that are suitable for use in both the control and execution environments. This is necessary as the STL algorithms in the `<algorithm>` header are not marked up for use in execution environments such as CUDA. In addition to the markup, these algorithms have convenience overloads to support `ArrayPortal`s directly, simplifying their usage with VTK-m data structures.
Currently, three related algorithms are provided: `LowerBound`, `UpperBound`, and `BinarySearch`. `BinarySearch` differs from the STL `std::binary_search` algorithm in that it returns an iterator (or index) to a matching element, rather than just a boolean indicating whether or not a key is present.
The new algorithm signatures are:
```cpp
namespace vtkm
{

template <typename IterT, typename T, typename Comp>
VTKM_EXEC_CONT
IterT BinarySearch(IterT first, IterT last, const T& val, Comp comp);

template <typename IterT, typename T>
VTKM_EXEC_CONT
IterT BinarySearch(IterT first, IterT last, const T& val);

template <typename PortalT, typename T, typename Comp>
VTKM_EXEC_CONT
vtkm::Id BinarySearch(const PortalT& portal, const T& val, Comp comp);

template <typename PortalT, typename T>
VTKM_EXEC_CONT
vtkm::Id BinarySearch(const PortalT& portal, const T& val);

template <typename IterT, typename T, typename Comp>
VTKM_EXEC_CONT
IterT LowerBound(IterT first, IterT last, const T& val, Comp comp);

template <typename IterT, typename T>
VTKM_EXEC_CONT
IterT LowerBound(IterT first, IterT last, const T& val);

template <typename PortalT, typename T, typename Comp>
VTKM_EXEC_CONT
vtkm::Id LowerBound(const PortalT& portal, const T& val, Comp comp);

template <typename PortalT, typename T>
VTKM_EXEC_CONT
vtkm::Id LowerBound(const PortalT& portal, const T& val);

template <typename IterT, typename T, typename Comp>
VTKM_EXEC_CONT
IterT UpperBound(IterT first, IterT last, const T& val, Comp comp);

template <typename IterT, typename T>
VTKM_EXEC_CONT
IterT UpperBound(IterT first, IterT last, const T& val);

template <typename PortalT, typename T, typename Comp>
VTKM_EXEC_CONT
vtkm::Id UpperBound(const PortalT& portal, const T& val, Comp comp);

template <typename PortalT, typename T>
VTKM_EXEC_CONT
vtkm::Id UpperBound(const PortalT& portal, const T& val);

}
```
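The behavioral difference from `std::binary_search` can be shown with a standard C++ sketch (not the VTK-m implementation) that returns an index, with -1 signaling "not found":

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Sketch of BinarySearch semantics on an indexable container: return the
// index of a matching element, or -1 when the value is absent. Contrast
// with std::binary_search, which only returns true or false.
template <typename Container, typename T>
long long BinarySearchIndex(const Container& c, const T& val)
{
  auto first = std::begin(c);
  auto last = std::end(c);
  auto it = std::lower_bound(first, last, val);
  if (it != last && !(val < *it))
  {
    return static_cast<long long>(it - first);
  }
  return -1;
}
```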
Execution Environment
Scope ExecObjects with Tokens
When VTK-m's `ArrayHandle` was originally designed, it was assumed that the control environment would run on a single thread. However, multiple users have expressed realistic use cases in which they would like to control VTK-m from multiple threads (for example, to control multiple devices). Consequently, it is important that VTK-m's control classes work correctly when used simultaneously from multiple threads.
The original `PrepareFor*` methods of `ArrayHandle` returned an object to be used in the execution environment on a particular device that pointed to data in the array. The pointer to the data was contingent on the state of the `ArrayHandle` not changing. The assumption was that the calling code would immediately use the returned execution environment object and would not further change the `ArrayHandle` until done with the execution environment object.
This assumption is broken if multiple threads are running in the control environment. For example, if one thread has called `PrepareForInput` to get an execution array portal, the portal or its data could become invalid if another thread calls `PrepareForOutput` on the same array. Initially one would think that a well designed program should not share `ArrayHandle`s in this way, but there are good reasons to need to do so. For example, when using `vtkm::cont::PartitionedDataSet` where multiple partitions share a coordinate system (very common), it becomes unsafe to work on multiple blocks in parallel on different devices.
What we really want is the code to be able to specify more explicitly when the execution object is in use. Ideally, the execution object itself would maintain the resources it is using. However, that will not work in this case since the object has to pass from control to execution environment and back. The resource allocation will break when the object is passed to an offloaded device and back.
Because we cannot use the object itself to manage its own resources, we use a proxy object we are calling a `Token`. The `Token` object manages the scope of the returned execution object. As long as the `Token` is still in scope, the execution object will remain valid. When the `Token` is destroyed (or `DetachFromAll` is called on it), then the execution object is no longer protected.
When a `Token` is attached to an `ArrayHandle` to protect an execution object, its read or write mode is recorded. Multiple `Token`s can be attached to read the `ArrayHandle` at the same time. However, only one `Token` can be used to write to the `ArrayHandle`.
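These attach rules behave like a reader-writer lock: many readers, or one writer. A rough standard C++17 analogy (an illustration of the scoping idea only, not how `Token` is implemented):

```cpp
#include <mutex>
#include <shared_mutex>

std::shared_mutex arrayGuard; // stands in for an ArrayHandle's internal lock state

void ReadUse()
{
  // Like attaching a read-mode Token: any number of these can coexist.
  std::shared_lock<std::shared_mutex> readToken(arrayGuard);
  // ... read through a portal; protection ends when readToken leaves scope
}

void WriteUse()
{
  // Like attaching a write-mode Token: exclusive access.
  std::unique_lock<std::shared_mutex> writeToken(arrayGuard);
  // ... write through a portal
}
```

As with a `Token`, the protection is released automatically when the lock object goes out of scope.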
Basic ArrayHandle use
The basic use of the `PrepareFor*` methods of `ArrayHandle` remains the same. The only difference is the addition of a `Token` parameter.
```cpp
template <typename Device>
void LowLevelArray(vtkm::cont::ArrayHandle<vtkm::Float32> array, Device)
{
  vtkm::cont::Token token;
  auto portal = array.PrepareForOutput(ARRAY_SIZE, Device{}, token);
  // At this point, array is locked from anyone else reading or modifying it.
  vtkm::cont::DeviceAdapterAlgorithm<Device>::Schedule(MyKernel(portal), ARRAY_SIZE);
  // When the function finishes, token goes out of scope and array opens up
  // for other uses.
}
```
Execution objects
To make sure that execution objects are scoped correctly, many changes needed to be made to propagate a `Token` reference from the top of the scope to where the execution object is actually made. The most noticeable place for this was for implementations of `vtkm::cont::ExecutionObjectBase`. Most implementations of `ExecutionObjectBase` create an object that requires data from an `ArrayHandle`.
Previously, a subclass of `ExecutionObjectBase` was expected to have a method named `PrepareForExecution` that had a single argument: the device tag (or id) to make an object for. Now, subclasses of `ExecutionObjectBase` should have a `PrepareForExecution` that takes two arguments: the device and a `Token` to use for scoping the execution object.
```cpp
struct MyExecObject : vtkm::cont::ExecutionObjectBase
{
  vtkm::cont::ArrayHandle<vtkm::Float32> Array;

  template <typename Device>
  VTKM_CONT MyExec<Device> PrepareForExecution(Device device, vtkm::cont::Token& token)
  {
    MyExec<Device> object;
    object.Portal = this->Array.PrepareForInput(device, token);
    return object;
  }
};
```
It actually still works to use the old style of `PrepareForExecution`. However, you will get a deprecation warning (on supported compilers) when you try to use it.
Invoke and Dispatcher
The `Dispatcher` classes now internally define a `Token` object during the call to `Invoke`. (Likewise, `Invoker` will have a `Token` defined during its invoke.) This internal `Token` is used when preparing `ArrayHandle`s and `ExecutionObject`s for the execution environment. (Details in the next section on how that works.)
Because the invoke uses a `Token` to protect its arguments, it will block the execution of other worklets attempting to access arrays in a way that could cause read-write hazards. In the following example, the second worklet will not be able to execute until the first worklet finishes.
```cpp
vtkm::cont::Invoker invoke;

invoke(Worklet1{}, input, intermediate);
invoke(Worklet2{}, intermediate, output); // Will not execute until Worklet1 finishes.
```
That said, invocations can share arrays if their use will not cause read-write hazards. In particular, two invocations can both use the same array if they are both strictly reading from it. In the following example, both worklets can potentially execute at the same time.
```cpp
vtkm::cont::Invoker invoke;

invoke(Worklet1{}, input, output1);
invoke(Worklet2{}, input, output2); // Will not block.
```
The same `Token` is used for all arguments to the `Worklet`. This detail is important to prevent deadlocks if the same object is used in more than one `Worklet` parameter. As a simple example, if a `Worklet` has a control signature like
```cpp
using ControlSignature = void(FieldIn, FieldOut);
```
it should continue to work to use the same array as both fields.
```cpp
vtkm::cont::Invoker invoke;
invoke(Worklet1{}, array, array);
```
Transport
The dispatch mechanism of worklets internally uses `vtkm::cont::arg::Transport` objects to automatically move data from the control environment to the execution environment. These `Transport` objects now take a `Token` when doing the transportation. This all happens under the covers for most users.
Control Portals
The `GetPortalConstControl` and `GetPortalControl` methods have been deprecated. Instead, the methods `ReadPortal` and `WritePortal` should be used. The calling signature is the same as their predecessors, but the returned portal contains a reference back to the original `ArrayHandle`. The reference keeps track of whether the memory allocation has changed.

If the `ArrayHandle` is changed while the `ArrayPortal` still exists, nothing will happen immediately. However, if the portal is subsequently accessed (i.e. `Set` or `Get` is called on it), then a fatal error will be reported to the log.
Deadlocks
Now that portal objects from `ArrayHandle`s have finite scope (as opposed to being able to be immediately invalidated), the scopes have the ability to cause operations to block. This can cause issues if the `ArrayHandle` is attempted to be used by multiple `Token`s at once.
The following is a contrived example of causing a deadlock.
```cpp
vtkm::cont::Token token1;
auto portal1 = array.PrepareForInPlace(Device{}, token1);

vtkm::cont::Token token2;
auto portal2 = array.PrepareForInput(Device{}, token2);
```
The last line will deadlock as `PrepareForInput` waits for `token1` to detach, which will never happen. To prevent this from happening, if you use the same `Token` on the array, it will always allow the action. Thus, the following will work fine.
```cpp
vtkm::cont::Token token;

auto portal1 = array.PrepareForInPlace(Device{}, token);
auto portal2 = array.PrepareForInput(Device{}, token);
```
This prevents deadlock during the invocation of a worklet (so long as no intermediate object tries to create its own `Token`, which would be bad practice).
Deadlocks are more likely when actually running multiple threads in the control environment, but still pretty unlikely. One way it can occur is if you have one (or more) worklet that has two output fields. You then try to run the worklet(s) simultaneously on multiple threads. It could be that one thread locks the first output array and the other thread locks the second output array.
However, having multiple threads trying to write to the same output arrays at the same time without its own coordination is probably a bad idea in itself.
Masks and Scatters Supported for 3D Scheduling
Previous to this change, worklets that wanted to use a non-default `vtkm::worklet::Mask` or `vtkm::worklet::Scatter` wouldn't work when scheduled to run across `vtkm::cont::CellSetStructured` or other `InputDomain`s that supported 3D scheduling.
This restriction was an inadvertent limitation of the VTK-m worklet scheduling algorithm. Lifting the restriction and providing sufficient information has been achieved in a manner that shouldn't degrade performance of any existing worklets.
Virtual methods in execution environment deprecated
The use of classes with any virtual methods in the execution environment is deprecated. Although we had code to correctly build virtual methods on some devices such as CUDA, this feature was not universally supported on all programming models we wish to support. Plus, the implementation of virtual methods is not hugely convenient on CUDA because the virtual methods could not be embedded in a library. To get around virtual methods declared in different libraries, all builds had to be static, and a special linking step to pull in possible virtual method implementations was required.
For these reasons, VTK-m is no longer relying on virtual methods. (Other approaches like multiplexers are used instead.) The code will be officially removed in version 2.0. It is still supported in a deprecated sense (you should get a warning). However, if you want to build without virtual methods, you can set the `VTKm_NO_DEPRECATED_VIRTUAL` CMake flag, and they will not be compiled.
Deprecate Execute with policy
The version of `Filter::Execute` that takes a policy as an argument is now deprecated. Filters are now able to specify their own fields and types, which is often why you want to customize the policy for an execution. The other reason is that you are compiling VTK-m into some other source that uses particular types of storage. However, there is now a mechanism in the CMake configuration to allow you to provide a header that customizes the "default" types used in filters. This is a much more convenient way to compile filters for specific types.
One thing that filters were not able to do was to customize what cell sets they allowed using. This change allows filters to self-select what types of cell sets they support (beyond simply structured or unstructured). To support this, the lists `SupportedCellSets`, `SupportedStructuredCellSets`, and `SupportedUnstructuredCellSets` have been added to `Filter`. When you apply a policy to a cell set, you now have to also provide the filter.
Worklets and Filters
Enable setting invalid value in probe filter
Initially, the probe filter would simply not set a value if a sample was outside the input `DataSet`. This is not great as the memory could be left uninitialized and lead to unpredictable results. The testing compared these invalid results to 0, which seemed to work but is probably unstable.
This was partially fixed by a previous change that consolidated the mapping of cell data with a general routine that permutes data. However, the fix did not extend to point data in the input, and it was not possible to specify a particular invalid value.
This change specifically updates the probe filter so that invalid values are set to a user-specified value.
Avoid raising errors when operating on cells
Cell operations like interpolation and finding parametric coordinates can fail under certain conditions. The previous behavior was to call `RaiseError` on the worklet. By design, this would cause the worklet execution to fail. However, that makes the worklet unstable for a condition that might be relatively common in data. For example, you wouldn't want a large streamline worklet to fail just because one cell was not found correctly.
To work around this, many of the cell operations in the execution environment have been changed to return an error code rather than raise an error in the worklet.
Error Codes
To support cell operations efficiently returning errors, a new enum named `vtkm::ErrorCode` is available. This is the current implementation of `ErrorCode`.
```cpp
enum class ErrorCode
{
  Success,
  InvalidShapeId,
  InvalidNumberOfPoints,
  WrongShapeIdForTagType,
  InvalidPointId,
  InvalidEdgeId,
  InvalidFaceId,
  SolutionDidNotConverge,
  MatrixFactorizationFailed,
  DegenerateCellDetected,
  MalformedCellDetected,
  OperationOnEmptyCell,
  CellNotFound,
  UnknownError
};
```
A convenience function named `ErrorString` is provided to make it easy to convert the `ErrorCode` to a descriptive string that can be placed in an error.
New Calling Specification
Previously, most execution environment functions took as an argument the worklet calling the function. This made it possible to call `RaiseError` on the worklet. The result of the operation was typically returned. For example, here is how the old version of interpolate was called.

```cpp
FieldType interpolatedValue =
  vtkm::exec::CellInterpolate(fieldValues, pcoord, shape, worklet);
```
The worklet is now no longer passed to the function. It is no longer needed because an error is never directly raised. Instead, an `ErrorCode` is returned from the function. Because the `ErrorCode` is returned, the computed result of the function is returned by passing in a reference to a variable. This is usually placed as the last argument (where the worklet used to be). Here is the new version of how interpolate is called.

```cpp
FieldType interpolatedValue;
vtkm::ErrorCode result =
  vtkm::exec::CellInterpolate(fieldValues, pcoord, shape, interpolatedValue);
```
The success of the operation can be determined by checking that the returned `ErrorCode` is equal to `vtkm::ErrorCode::Success`.
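The shape of the new convention — status as the return value, computed result through a trailing reference parameter — is a common C++ pattern. A self-contained sketch with a hypothetical function (no VTK-m types, and `DegenerateCellDetected` reused only for flavor):

```cpp
enum class ErrorCode
{
  Success,
  DegenerateCellDetected
};

// Sketch of the convention: the error code is the return value and the
// computed result comes back through the last (reference) parameter.
ErrorCode SafeReciprocal(double x, double& result)
{
  if (x == 0.0)
  {
    return ErrorCode::DegenerateCellDetected;
  }
  result = 1.0 / x;
  return ErrorCode::Success;
}
```

The caller checks the returned code against `Success` before trusting the output parameter.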
Add atomic free functions
Previously, all atomic functions were stored in classes named `AtomicInterfaceControl` and `AtomicInterfaceExecution`, which required you to know at compile time which device was using the methods. That in turn meant that anything using an atomic needed to be templated on the device it is running on.
That can be a big hassle (and is problematic for some code structure). Instead, these methods are moved to free functions in the `vtkm` namespace. These functions operate like those in `Math.h`. Using compiler directives, an appropriate version of the function is compiled for the current device the compiler is using.
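The flavor of the change can be seen in a standard C++ analogy (a sketch using `std::atomic`, not VTK-m's implementation; VTK-m's free functions instead select a device-appropriate implementation with compiler directives):

```cpp
#include <atomic>

// Free-function style: the call site needs no device template parameter.
// Returns the value held before the addition, like a fetch-and-add.
inline int AtomicAddSketch(std::atomic<int>* addr, int value)
{
  return addr->fetch_add(value);
}
```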
Flying Edges
Added the flying edges contouring algorithm to VTK-m. This algorithm only works on structured grids, but operates much faster than the traditional Marching Cubes algorithm.
The speed of VTK-m's flying edges is comparable to VTK's running on the same CPUs. VTK-m's implementation also works well on CUDA hardware.
The Flying Edges algorithm was introduced in this paper:
Schroeder, W.; Maynard, R. & Geveci, B. "Flying edges: A high-performance scalable isocontouring algorithm." Large Data Analysis and Visualization (LDAV), 2015. DOI 10.1109/LDAV.2015.7348069
Filters specify their own field types
Previously, the policy specified which field types the filter should operate on. The filter could remove some types, but it was not able to add any types.
This is backward. Instead, the filter should specify what types it supports, and the policy may cull out some of those.
Build
Disable asserts for CUDA architecture builds
`assert` is supported on recent CUDA cards, but compiling it appears to be very slow. By default, the `VTKM_ASSERT` macro has been disabled whenever compiling for a CUDA device (i.e. when `__CUDA_ARCH__` is defined).
Asserts for CUDA devices can be turned back on by turning the `VTKm_NO_ASSERT_CUDA` CMake variable off. Turning this CMake variable off will enable assertions in CUDA kernels unless there is another reason for turning off all asserts (such as a release build).
Disable asserts for HIP architecture builds
`assert` is supported on recent HIP cards, but compiling it is very slow, as it triggers the usage of `printf`. Currently (ROCm 3.7) `printf` has a severe performance penalty and should be avoided when possible. By default, the `VTKM_ASSERT` macro has been disabled whenever compiling for a HIP device via Kokkos.
Asserts for HIP devices can be turned back on by turning the `VTKm_NO_ASSERT_HIP` CMake variable off. Turning this CMake variable off will enable assertions in HIP kernels unless there is another reason for turning off all asserts (such as a release build).
Add VTKM_DEPRECATED macro
The `VTKM_DEPRECATED` macro allows us to remove (and usually replace) features from VTK-m in minor releases while still following the conventions of semantic versioning. The idea is that when we want to remove or replace a feature, we first mark the old feature as deprecated. The old feature will continue to work, but compilers that support it will start to issue a warning that the use is deprecated and should stop being used. The deprecated features should remain viable until at least the next major version. At the next major version, deprecated features from the previous version may be removed.
Declaring things deprecated
Classes and methods are marked deprecated using the `VTKM_DEPRECATED` macro. The first argument of `VTKM_DEPRECATED` should be set to the first version in which the feature is deprecated. For example, if the last released version of VTK-m was 1.5, and on the master branch a developer wants to deprecate a class `foo`, then the `VTKM_DEPRECATED` release version should be given as 1.6, which will be the next minor release of VTK-m. The second argument of `VTKM_DEPRECATED`, which is optional but highly encouraged, is a short message that should clue developers in on how to update their code to the new changes. For example, it could point to the replacement class or method for the changed feature.
`VTKM_DEPRECATED` can be used to deprecate a class by adding it between the `struct` or `class` keyword and the class name.
```cpp
struct VTKM_DEPRECATED(1.6, "OldClass replaced with NewClass.") OldClass
{
};
```
Aliases can similarly be deprecated, except the `VTKM_DEPRECATED` macro goes after the name in this case.

```cpp
using OldAlias VTKM_DEPRECATED(1.6, "Use NewClass instead.") = NewClass;
```
Functions and methods are marked as deprecated by adding `VTKM_DEPRECATED` as a modifier before the return value and any markup (`VTKM_CONT`, `VTKM_EXEC`, or `VTKM_EXEC_CONT`).

```cpp
VTKM_DEPRECATED(1.6, "You must now specify a tolerance.")
VTKM_EXEC_CONT
void ImportantMethod(double x)
{
  this->ImportantMethod(x, 1e-6);
}
```
`enum`s can be deprecated like classes using similar syntax.

```cpp
enum struct VTKM_DEPRECATED(1.7, "Use NewEnum instead.") OldEnum
{
  OLD_VALUE
};
```
Individual items in an `enum` can also be marked as deprecated and intermixed with regular items.

```cpp
enum struct NewEnum
{
  OLD_VALUE1 VTKM_DEPRECATED(1.7, "Use NEW_VALUE instead."),
  NEW_VALUE,
  OLD_VALUE2 VTKM_DEPRECATED(1.7) = 42
};
```
Using deprecated items
Using deprecated items should work, but the compiler will give a warning. That is the point. However, sometimes you need to legitimately use a deprecated item without a warning. This is usually because you are implementing another deprecated item or because you have a test for a deprecated item (that can be easily removed with the deprecated bit). To support this, a pair of macros, `VTKM_DEPRECATED_SUPPRESS_BEGIN` and `VTKM_DEPRECATED_SUPPRESS_END`, are provided. Code that legitimately uses deprecated items should be wrapped in these macros.
```cpp
VTKM_DEPRECATED(1.6, "You must now specify both a value and tolerance.")
VTKM_EXEC_CONT
void ImportantMethod()
{
  // It can be the case that to implement a deprecated method you need to
  // use other deprecated features. To do that, just temporarily suppress
  // those warnings.
  VTKM_DEPRECATED_SUPPRESS_BEGIN
  this->ImportantMethod(0.0);
  VTKM_DEPRECATED_SUPPRESS_END
}
```
Other
Porting layer for future std features
Currently, VTK-m is using C++11. However, it is often useful to use features in the `std` namespace that are defined for C++14 or later. We can provide our own versions (sometimes), but it is preferable to use the version provided by the compiler if available.
There were already some examples of defining portable versions of C++14 and C++17 classes in a `vtkmstd` namespace, but these were sprinkled around the source code. There is now a top level `vtkmstd` directory, and in it are header files that provide portable versions of these future C++ classes. In each case, preprocessor macros are used to select which version of the class to use.
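A typical selection pattern looks like the following sketch (illustrative only; the names in the actual `vtkmstd` headers differ). Here `std::enable_if_t`, a C++14 feature, is used directly when the compiler provides it and re-implemented otherwise:

```cpp
#include <type_traits>

namespace vtkmstd_sketch
{

#if __cplusplus >= 201402L
// The compiler provides the C++14 alias; use it directly.
template <bool B, typename T = void>
using enable_if_t = std::enable_if_t<B, T>;
#else
// C++11 fallback with identical behavior.
template <bool B, typename T = void>
using enable_if_t = typename std::enable_if<B, T>::type;
#endif

} // namespace vtkmstd_sketch
```

Either way, client code includes one header and uses one name, regardless of the language level it is compiled with.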
Removed OpenGL Rendering Classes
When the rendering library was first built, OpenGL was used to implement the components (windows, mappers, annotation, etc.). However, as the native ray casting became viable, the majority of the work has focused on using that. Since then, the original OpenGL classes have been largely ignored.
It has for many months been determined that it is not worth attempting to maintain two different versions of the rendering libraries as features are added and changed. Thus, the OpenGL classes have fallen out of date and did not actually work.
These classes have finally been officially removed.
Reorganization of io directory
The `vtkm/io` directory has been flattened. Namely, the files in `vtkm/io/reader` and `vtkm/io/writer` have been moved up into `vtkm/io`, with the associated changes in namespaces.
In addition, `vtkm/cont/EncodePNG.h` and `vtkm/cont/DecodePNG.h` have been moved to a more natural home in `vtkm/io`.
Implemented PNG/PPM image Readers/Writers
The original implementation of writing image data was only performed as a proxy through the `Canvas` rendering class. In order to implement true support for image-based regression testing, this interface needed to be expanded to support reading/writing arbitrary image data and storing it in a `vtkm::DataSet`.
Using the new `vtkm::io::PNGReader` and `vtkm::io::PPMReader`, it is possible to read data from files and canvases directly and store it as a point field in a 2D uniform `vtkm::DataSet`.

```cpp
auto reader = vtkm::io::PNGReader();
auto imageDataSet = reader.ReadFromFile("read_image.png");
```
Similarly, the new `vtkm::io::PNGWriter` and `vtkm::io::PPMWriter` make it possible to write out a 2D uniform `vtkm::DataSet` directly to a file.

```cpp
auto writer = vtkm::io::PNGWriter();
writer.WriteToFile("write_image.png", imageDataSet);
```
If canvas data is to be written out, the reader provides a method for converting a canvas's data to a `vtkm::DataSet`.

```cpp
auto reader = vtkm::io::PNGReader();
auto dataSet = reader.CreateImageDataSet(canvas);
auto writer = vtkm::io::PNGWriter();
writer.WriteToFile("output.png", dataSet);
```
Updated Benchmark Framework
The benchmarking framework has been updated to use Google Benchmark.
A benchmark is now a single function, which is passed to a macro:
```cpp
void MyBenchmark(::benchmark::State& state)
{
  MyClass someClass;

  // Optional: Add a descriptive label with additional benchmark details:
  state.SetLabel("Blah blah blah.");

  // Must use a vtkm timer to properly capture eg. CUDA execution times.
  vtkm::cont::Timer timer;
  for (auto _ : state)
  {
    someClass.Reset();

    timer.Start();
    someClass.DoWork();
    timer.Stop();

    state.SetIterationTime(timer.GetElapsedTime());
  }

  // Optional: Report items and/or bytes processed per iteration in output:
  state.SetItemsProcessed(state.iterations() * someClass.GetNumberOfItems());
  state.SetBytesProcessed(state.iterations() * someClass.GetNumberOfBytes());
}
VTKM_BENCHMARK(MyBenchmark);
```
Google benchmark also makes it easy to implement parameter sweep benchmarks:
```cpp
void MyParameterSweep(::benchmark::State& state)
{
  // The current value in the sweep:
  const vtkm::Id currentValue = state.range(0);

  MyClass someClass;
  someClass.SetSomeParameter(currentValue);

  vtkm::cont::Timer timer;
  for (auto _ : state)
  {
    someClass.Reset();

    timer.Start();
    someClass.DoWork();
    timer.Stop();

    state.SetIterationTime(timer.GetElapsedTime());
  }
}
VTKM_BENCHMARK_OPTS(MyParameterSweep, ->ArgName("Param")->Range(32, 1024 * 1024));
```
will generate and launch several benchmarks, exploring the parameter space of
`SetSomeParameter` between the values of 32 and (1024*1024). The chain of
function calls in the second argument is applied to an instance of
`::benchmark::internal::Benchmark`. See Google Benchmark's documentation for
more details.
For more complex benchmark configurations, the `VTKM_BENCHMARK_APPLY` macro
accepts a function with the signature
`void Func(::benchmark::internal::Benchmark*)` that may be used to apply an
arbitrary configuration to the benchmark.
To instantiate a templated benchmark across a list of types, the `VTKM_BENCHMARK_TEMPLATE*` macros take a `vtkm::List` of types as an additional parameter. The templated benchmark function will be instantiated and called for each type in the list:
```cpp
template <typename T>
void MyBenchmark(::benchmark::State& state)
{
  MyClass<T> someClass;

  // Must use a vtkm timer to properly capture eg. CUDA execution times.
  vtkm::cont::Timer timer;
  for (auto _ : state)
  {
    someClass.Reset();

    timer.Start();
    someClass.DoWork();
    timer.Stop();

    state.SetIterationTime(timer.GetElapsedTime());
  }
}
VTKM_BENCHMARK_TEMPLATE(MyBenchmark, vtkm::List<vtkm::Float32, vtkm::Vec3f_32>);
```
The benchmarks are executed by calling the `VTKM_EXECUTE_BENCHMARKS(argc, argv)`
macro from `main`. There is also a
`VTKM_EXECUTE_BENCHMARKS_PREAMBLE(argc, argv, some_string)` macro that appends
the contents of `some_string` to the Google Benchmark preamble.
If a benchmark is not compatible with some configuration, it may call
`state.SkipWithError("Error message");` on the `::benchmark::State` object and
return. This is useful, for instance, in the filter tests when the input is not
compatible with the filter.
When launching a benchmark executable, the following options are supported by Google Benchmark:
- `--benchmark_list_tests`: List all available tests.
- `--benchmark_filter="[regex]"`: Only run benchmarks with names that match `[regex]`.
- `--benchmark_filter="-[regex]"`: Only run benchmarks with names that DON'T match `[regex]`.
- `--benchmark_min_time=[float]`: Make sure each benchmark repetition gathers `[float]` seconds of data.
- `--benchmark_repetitions=[int]`: Run each benchmark `[int]` times and report aggregate statistics (mean, stdev, etc). A "repetition" refers to a single execution of the benchmark function, not an "iteration", which is a loop of the `for(auto _ : state){...}` section.
- `--benchmark_report_aggregates_only="true|false"`: If true, only the aggregate statistics are reported (affects both console and file output). Requires `--benchmark_repetitions` to be useful.
- `--benchmark_display_aggregates_only="true|false"`: If true, only the aggregate statistics are printed to the terminal. Any file output will still contain all repetition info.
- `--benchmark_format="console|json|csv"`: Specify terminal output format: human readable (`console`) or `csv`/`json` formats.
- `--benchmark_out_format="console|json|csv"`: Specify file output format: human readable (`console`) or `csv`/`json` formats.
- `--benchmark_out=[filename]`: Specify output file.
- `--benchmark_color="true|false"`: Toggle color output in terminal when using `console` output.
- `--benchmark_counters_tabular="true|false"`: Print counter information (e.g. bytes/sec, items/sec) in the table, rather than appending them as a label.
For more information and examples of practical usage, take a look at the existing benchmarks in vtk-m/benchmarking/.
## Provide scripts to build Gitlab-ci workers locally
To simplify reproducing docker-based CI workers locally, VTK-m has a Python
program that handles all the work automatically for you.

The program is located in `Utilities/CI/reproduce_ci_env.py` and requires
Python 3 and pyyaml.

Using the program is straightforward. The following two commands will create
the `build:rhel8` gitlab-ci worker as a docker image and set up a container
just as gitlab-ci would before the actual compilation of VTK-m. Instead of
doing the compilation, you will be given an interactive shell.
```sh
./reproduce_ci_env.py create rhel8
./reproduce_ci_env.py run rhel8
```
To compile VTK-m from the interactive shell you would do the following:
```sh
> src]## cd build/
> build]## cmake --build .
```
## Replaced `vtkm::ListTag` with `vtkm::List`

The original `vtkm::ListTag` was designed when we had to support compilers
that did not provide C++11's variadic templates. Thus, the design hides
type lists, which were complicated to support.

Now that we support C++11, variadic templates are trivial and we can easily
create templated type aliases with `using`. Thus, it is now simpler to deal
with a template that lists types directly.
Hence, `vtkm::ListTag` is deprecated and `vtkm::List` is now supported. The
main difference between the two is that whereas `vtkm::ListTag` allowed you
to create a list by subclassing another list, `vtkm::List` cannot be
subclassed. (Well, it can be subclassed, but the subclass ceases to be
considered a list.) Thus, where before you would declare a list like

```cpp
struct MyList : vtkm::ListTagBase<Type1, Type2, Type3>
{
};
```
you now make an alias:

```cpp
using MyList = vtkm::List<Type1, Type2, Type3>;
```
If the compiler reports the `MyList` type in an error or warning, it
actually uses the fully qualified `vtkm::List<Type1, Type2, Type3>`.
Although this makes errors more verbose, it makes it easier to diagnose
problems because the types are explicitly listed.
The new `vtkm::List` comes with a list of utility templates to manipulate
lists that mostly mirrors those in `vtkm::ListTag`: `VTKM_IS_LIST`,
`ListApply`, `ListSize`, `ListAt`, `ListIndexOf`, `ListHas`, `ListAppend`,
`ListIntersect`, `ListTransform`, `ListRemoveIf`, and `ListCross`. All of
these utilities become `vtkm::List<>` types (where applicable), which makes
them more consistent than the old `vtkm::ListTag` versions.
Thus, if you have a declaration like

```cpp
vtkm::ListAppend<vtkm::List<Type1a, Type2a>, vtkm::List<Type1b, Type2b>>
```

this gets changed automatically to

```cpp
vtkm::List<Type1a, Type2a, Type1b, Type2b>
```

This is in contrast to the equivalent old version, which would create a new
type for `vtkm::ListTagAppend` in addition to the ultimate actual list it
constructs.
## Add `ListTagRemoveIf`
It is sometimes useful to remove types from `ListTag`s. This is especially
the case when combining lists of types together where some of the type
combinations may be invalid and should be removed. To handle this
situation, a new `ListTag` type is added: `ListTagRemoveIf`.

`ListTagRemoveIf` is a template structure that takes two arguments. The
first argument is another `ListTag` type to operate on. The second argument
is a template that acts as a predicate. The predicate takes a type and
declares a Boolean `value` that should be `true` if the type should be
removed and `false` if the type should remain.

Here is an example of using `ListTagRemoveIf` to get basic types that hold
only integral values.
```cpp
template <typename T>
using IsRealValue =
  std::is_same<
    typename vtkm::TypeTraits<typename vtkm::VecTraits<T>::BaseComponentType>::NumericTag,
    vtkm::TypeTraitsRealTag>;

using MixedTypes =
  vtkm::ListTagBase<vtkm::Id, vtkm::FloatDefault, vtkm::Id3, vtkm::Vec3f>;

using IntegralTypes = vtkm::ListTagRemoveIf<MixedTypes, IsRealValue>;
// IntegralTypes now equivalent to vtkm::ListTagBase<vtkm::Id, vtkm::Id3>
```
## Write uniform and rectilinear grids to legacy VTK files
As a programming convenience, every `vtkm::cont::DataSet` written by
`vtkm::io::VTKDataSetWriter` was written as a structured grid. Although
technically correct, it changed the structure of the data. This meant that
if you wanted to capture data to run elsewhere, it would run as a different
data type. This was particularly frustrating if the data of that structure
was causing problems and you wanted to debug it.
Now, `VTKDataSetWriter` checks the type of the `CoordinateSystem` to
determine whether the data should be written out as `STRUCTURED_POINTS`
(i.e. a uniform grid), `RECTILINEAR_GRID`, or `STRUCTURED_GRID`
(curvilinear).
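For example, a data set built with `vtkm::cont::DataSetBuilderUniform` now round-trips as a uniform grid. The following fragment is a sketch of that usage (file name and grid dimensions are arbitrary):

```cpp
#include <vtkm/cont/DataSetBuilderUniform.h>
#include <vtkm/io/VTKDataSetWriter.h>

// Build a small uniform grid; its coordinate system is a uniform point array.
vtkm::cont::DataSet dataSet =
  vtkm::cont::DataSetBuilderUniform::Create(vtkm::Id3(10, 10, 10));

// The writer inspects the coordinate system and, for this data set, emits
// DATASET STRUCTURED_POINTS rather than STRUCTURED_GRID.
vtkm::io::VTKDataSetWriter writer("uniform.vtk");
writer.WriteDataSet(dataSet);
```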
## References

| Feature | Merge Request |
|---------|---------------|
| Add Kokkos backend | Merge-request: !2164 |
| Extract component arrays from unknown arrays | Merge-request: !2354 |
| `ArrayHandleGroupVecVariable` holds now one more offset. | Merge-request: !1964 |
| Create `ArrayHandleOffsetsToNumComponents` | Merge-request: !2299 |
| Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal | Merge-request: !2116 |
| `ArrayRangeCompute` works on any array type without compiling device code | Merge-request: !2409 |
| Algorithms for Control and Execution Environments | Merge-request: !1920 |
| Redesign of ArrayHandle to access data using typeless buffers | Merge-request: !2347 |
| `vtkm::cont::internal::Buffer` now can have ownership transferred | Merge-request: !2200 |
| Provide scripts to build Gitlab-ci workers locally | Merge-request: !2030 |
| Configurable default types | Merge-request: !1997 |
| Result DataSet of coordinate transform has its CoordinateSystem changed | Merge-request: !2099 |
| Precompiled `ArrayCopy` for `UnknownArrayHandle` | Merge-request: !2396 |
| Disable asserts for CUDA architecture builds | Merge-request: !2157 |
| Portals may advertise custom iterators | Merge-request: !1929 |
| DataSet now only allows unique field names | Merge-request: !2099 |
| ArrayHandleDecorator Allocate and Shrink Support | Merge-request: !1933 |
| Deprecate ArrayHandleVirtualCoordinates | Merge-request: !2177 |
| Deprecate `DataSetFieldAdd` | Merge-request: !2106 |
| Deprecate Execute with policy | Merge-request: !2093 |
| Virtual methods in execution environment deprecated | Merge-request: !2256 |
| Add VTKM_DEPRECATED macro | Merge-request: !2266 |
| Filters specify their own field types | Merge-request: !2099 |
| Flying Edges | Merge-request: !2099 |
| Updated Benchmark Framework | Merge-request: !1936 |
| Disable asserts for HIP architecture builds | Merge-request: !2270 |
| Implemented PNG/PPM image Readers/Writers | Merge-request: !1967 |
| Reorganization of `io` directory | Merge-request: !2067 |
| Add `ListTagRemoveIf` | Merge-request: !1901 |
| Masks and Scatters Supported for 3D Scheduling | Merge-request: !1975 |
| Improvements to moving data into ArrayHandle | Merge-request: !2184 |
| Avoid raising errors when operating on cells | Merge-request: !2099 |
| Order asynchronous `ArrayHandle` access | Merge-request: !2130 |
| Enable setting invalid value in probe filter | Merge-request: !2122 |
| `ReadPortal().Get(idx)` | Merge-request: !2078 |
| Recombine extracted component arrays from unknown arrays | Merge-request: !2381 |
| Removed old `ArrayHandle` transfer mechanism | Merge-request: !2347 |
| Removed OpenGL Rendering Classes | Merge-request: !2099 |
| Scope ExecObjects with Tokens | Merge-request: !1988 |
| Shorter fancy array handle classnames | Merge-request: !1937 |
| Support `ArrayHandleSOA` as a "default" array | Merge-request: !2349 |
| Porting layer for future std features | Merge-request: !1977 |
| Add a vtkm::Tuple class | Merge-request: !1977 |
| UnknownArrayHandle and UncertainArrayHandle for runtime-determined types | Merge-request: !2202 |
| Added VecFlat class | Merge-request: !2354 |
| Remove VTKDataSetWriter::WriteDataSet just_points parameter | Merge-request: !2185 |
| Move VTK file readers and writers into vtkm_io | Merge-request: !2100 |
| Write uniform and rectilinear grids to legacy VTK files | Merge-request: !2173 |
| Add atomic free functions | Merge-request: !2223 |
| Replaced `vtkm::ListTag` with `vtkm::List` | Merge-request: !1918 |