The `DeviceAdapter` provides an abstract interface to the accelerator
devices worklets and other algorithms run on. As such, the programmer has
less control about how the device launches each worklet. Each device
adapter has its own configuration parameters and other ways to attempt to
optimize how things are run, but these are always a universal set of
options that are applied to everything run on the device. There is no way
to specify launch parameters for a particular worklet.
To provide this information, VTK-m now supports `Hint`s to the device
adapter. The `DeviceAdapterAlgorithm::Schedule` method takes a templated
argument that is of the type `HintList`. This object contains a template
list of `Hint` types that provide suggestions on how to launch the parallel
execution. The device adapter will pick out hints that pertain to it and
adjust its launching accordingly.
These are called hints rather than, say, directives, because they don't
force the device adapter to do anything. The device adapter is free to
ignore any (and all) hints. The point is that the device adapter can take
into account the information to try to optimize for itself.
A provided hint can be tied to specific device adapters. In this way, an
worklet can further optimize itself. If multiple hints match a device
adapter, the last one in the list will be selected.
The `Worklet` base now has an internal type named `Hints` that points to a
`HintList` that is applied when the worklet is scheduled. Derived worklet
classes can provide hints by simply defining their own `Hints` type.
With the major revision 2.0 of VTK-m, many items previously marked as
deprecated were removed. If updating to a new version of VTK-m, it is
recommended to first update to VTK-m 1.9, which will include the deprecated
features but provide warnings (with the right compiler) that will point to
the replacement code. Once the deprecations have been fixed, updating to
2.0 should be smoother.
For several versions, VTK-m has had a `Variant` templated class. This acts
like a templated union where the object will store one of a list of types
specified as the template arguments. (There are actually 2 versions for the
control and execution environments, respectively.)
Because this is a complex class that required several iterations to work
through performance and compiler issues, `Variant` was placed in the
`internal` namespace to avoid complications with backward compatibility.
However, the class has been stable for a while, so let us expose this
helpful tool for wider use.
Several revisions ago, the ability to use virtual methods in the
execution environment was deprecated. Completely remove this
functionality for the VTK-m 2.0 release.
This mechanism sets up CMake variables that allow a user to select which
modules/libraries to create. Dependencies will be tracked down to ensure
that all of a module's dependencies are also enabled.
The modules are also arranged into groups.
Groups allow you to set the enable flag for a group of modules at once.
Thus, if you have several modules that are likely to be used together,
you can create a group for them.
This can be handy in converting user-friendly CMake options (such as
`VTKm_ENABLE_RENDERING`) to the modules that enable that by pointing to
the appropriate group.
There was a precompiled version of mapping permutations in the
filter library. However, there are reasons to be able to copy
a permutation of an array outside of filters. So add the
capability to the more general filter library.
`vtkm::cont::UnknownArrayHandle` now provides a set of method that
allows you to copy data from one `UnknownArrayHandle` to another. The
first method, `DeepCopyFrom`, takes a source `UnknownArrayHandle` and
deep copies the data to the called one. If the `UnknownArrayHandle`
already points to a real `ArrayHandle`, the data is copied into that
`ArrayHandle`. If the `UnknownArrayHandle` does not point to an existing
`ArrayHandle`, then a new `ArrayHandleBasic` with the same value type as
the source is created and copied into.
The second method, `CopyShallowIfPossibleFrom` behaves similarly to
`DeepCopyFrom` except that it will perform a shallow copy if possible.
That is, if the target `UnknownArrayHandle` points to an `ArrayHandle`
of the same type as the source `UnknownArrayHandle`, then a shallow copy
occurs and the underlying `ArrayHandle` will point to the source. If the
types differ, then a deep copy is performed. If the target
`UnknownArrayHandle` does not point to an `ArrayHandle`, then the
behavior is the same as the `=` operator.
One of the intentions of these new methods is to allow you to copy
arrays without using a device compiler (e.g. `nvcc`). Calling
`ArrayCopy` requires you to include the `ArrayCopy.h` header file, and
that in turn requires device adapter algorithms. These methods insulate
you from these.
`ConvertNumComponentsToOffsets` has been changed to provide a pre-
compiled version for common arrays. This helps with the dual goals of
compiling less device code and allowing data set builders to not have to
use the device compiler. For cases where you need to compile
`ConvertNumComponentsToOffsets` for a different kind of array, you can
use the internal `ConvertNumComponentsToOffsetsTemplate`.
Deprecated the `CellLocator` class and made all methods of the
other `CellLocator` classes non-virtual. General locators can
still use the `CellLocatorGeneral` class, but this class now
only works with a predefined set of locators. (The functionality
to provide a function to select a locator has been removed.)
What was previously declared as `ArrayHandleNewStyle` is now just the
implementation of `ArrayHandle`. The old implementation of `ArrayHandle`
has been moved to `ArrayHandleDeprecated`, and `ArrayHandle`s still
using this implementation must declare `VTKM_ARRAY_HANDLE_DEPRECATED` to
use it.
This method allows you to extract an `ArrayHandle` from
`UnknownArrayHandle` when you only know the base component type.
Also removed the `Read/WritePortalForBaseComponentType` method
from `UnknownArrayHandle`. This functionality is subsumed by
`ExtractArrayFromComponents`.
This allows you to handle just about every type of array with about 10
basic types. It allows you to ignore both the size of `Vec`s and the
actual storage of the data.
This class was used indirectly by the old `ArrayHandle`, through
`ArrayHandleTransfer`, to move data to and from a device. This
functionality has been replaced in the new `ArrayHandle`s through the
`Buffer` class (which can be compiled into libraries rather than make
every translation unit compile their own template).
This commit removes `ArrayManagerExecution` and all the implementations
that the device adapters were required to make. None of this code was in
any use anymore.
The new-style `ArrayHandle` uses `Buffer` objects to manage data. Thus,
when one is decorating the other, it expects to find the `Buffer`
objects, which the old-style `ArrayHandle`s do not have. To make the two
work together, fake buffers in the old-style arrays.
The buffers in old-style arrays are empty, but have metadata that points
back to the `ArrayHandle.
This portal only works on the control environment, which means it cannot
work with the new `ArrayHandle` type. Recent changes to
`ArrayHandleMultiplexer` also do not allow this, so just remove it
rather than try to fix it.
The `Variant` class is templated to hold objects of other types.
Depending on whether those objects of are meant to be used in the
control or execution side, the methods on `Variant` might need to be
declared with (or without) special modifiers. We can sometimes try to
compile the `Variant` methods for both host and device and ask the
device compiler to ignore incompatibilities, but that does not always
work.
To get around that, create two different implementations of `Variant`.
Their API and implementation is exactly the same except one declares its
methods with `VTKM_CONT` and the other its methods `VTKM_EXEC`.
The implementation of `VariantArrayHandle` has been changed to be a
relatively trivial subclass of `UnknownArrayHandle`.
The advantage of this change is twofold. First, it removes
`VariantArrayHandle`'s dependence on `ArrayHandleVirtual`, which gets us
much closer to deprecating that class. Second, it ensures that
`UnknownArrayHandle` is a reasonable replacement for
`VariantArrayHandle`, so we can move forward with replacing that.
The buffer class encapsulates the movement of raw C arrays between
host and devices.
The `Buffer` class itself is not associated with any device. Instead,
`Buffer` is used in conjunction with a new templated class named
`DeviceAdapterMemoryManager` that can allocate data on a given
device and transfer data as necessary. `DeviceAdapterMemoryManager`
will eventually replace the more complicated device adapter classes
that manage data on a device.
The code in `DeviceAdapterMemoryManager` is actually enclosed in
virtual methods. This allows us to limit the number of classes that
need to be compiled for a device. Rather, the implementation of
`DeviceAdapterMemoryManager` is compiled once with whatever compiler
is necessary, and then the `RuntimeDeviceInformation` is used to
get the correct object instance.
Initially, the probe filter would simply not set a value if a sample was
outside the input `DataSet`. This is not great as the memory could be
left uninitalized and lead to unpredictable results. The testing
compared these invalid results to 0, which seemed to work but is
probably unstable.
This was partially fixed by a previous change that consolidated to
mapping of cell data with a general routine that permuted data. However,
the fix did not extend to point data in the input, and it was not
possible to specify a particular invalid value.
This change specifically updates the probe filter so that invalid values
are set to a user-specified value.
Previously, when a ReadPortal or a WritePortal was returned from an
ArrayHandle, it had wrapped in it a Token that was attached to the
ArrayHandle. This Token would prevent other reads and writes from the
ArrayHandle.
This added safety in the form of making sure that the ArrayPortal was
always valid. Unfortunately, it also made deadlocks very easy. They
happened when an ArrayPortal did not leave scope immediately after use
(which is not all that uncommon).
Now, the ArrayPortal no longer locks up the ArrayHandle. Instead, when
an access happens on the ArrayPortal, it checks to make sure that
nothing has happened to the data being accessed. If it has, a fatal
error is reported to the log.
To get a portal to access ArrayHandle values in the control
environment, you now use the ReadPortal and WritePortal methods.
The portals returned are wrapped in an ArrayPortalToken object
so that the data between the portal and the ArrayHandle are
guaranteed to be consistent.
- Use AtomicInterface to implement device-specific atomic operations.
- Remove DeviceAdapterAtomicArrayImplementations.
- Extend supported atomic types to include unsigned 32/64-bit ints.
- Add a static_assert to check that AtomicArray type is supported.
- Add documentation for AtomicArrayExecutionObject, including a CAS
example.
- Add a `T Get(idx)` method to AtomicArrayExecutionObject that does
an atomic load, and update existing CAS usage to use this instead
of `Add(idx, 0)`.
Fixes#277
DeviceAdapterError existed to make sure that the default device adapter
template was being handled properly. Since the default device adapter doesn't
exist, and nothing is templated over it we can now remove DeviceAdapterError.
661fb64de AtomicInterfaceControl functions are marked with VTKM_SUPPRESS_EXEC_WARNINGS
0c70f9b9a Add BitFieldIn/Out/InOut worklet signature tags.
a66510e81 Add ArrayHandleBitField, a boolean-valued AH backed by a BitField.
56cc5c3d3 Add support for BitFields.
d01b97382 Allow VTKM_SUPPRESS_EXEC_WARNINGS to be used inside macros.
2f2ca9370 Add bit operations FindFirstSetBit and CountSetBits to Math.h.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1629
BitFields are:
- Stored in memory using a contiguous buffer of bits.
- Accessible via portals, a la ArrayHandle.
- Portals operate on individual bits or words.
- Operations may be atomic for safe use from concurrent kernels.
The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle
containing the indices of all set bits, in no particular order.
The new AtomicInterface classes provide an abstraction into bitwise
atomic operations across control and execution environments and are used
to implement the BitPortals.
We want to make sure that VariantArrayHandleContainer has as little
overhead when launch worklets as possible. To do so we cache
type information to make deducing the `T` type of ArrayHandles
as fast as possible.
Also
- Renamed vtkm::cont::make_DeviceAdapterIdFromName to just overload
make_DeviceAdapterId.
- Refactored CMake logic for unit tests
- Since we're now querying the device tracker for the names, they
cannot be all caps.
- Updated usages of InitLogging to use Initialize instead.
- Added changelog.
Benchmarking in VTK showed significant overhead in the computation
of the reverse connectivity calculation in
ConnectivityExplicitInternals::ComputeCellToPointConnectivity.
This patch adds a ReverseConnectivityBuilder that reduces the amount of
time and memory needed to build the table by using an atomic histogram
approach that avoids a costly radix SortByKey.
Key operations in the new helper class are templated to allow this
approach to be reused by VTK-specific cell array converters.
183bcf109 Add initial version of an OpenMP backend.
7b5ad3e80 Expand device scheduler test to check for overlap.
e621b6ba3 Generalize the TBB radix sort implementation.
d60278434 Specialize swap for ArrayPortalValueReference types.
761f8986f Cache inputs to SSI::Unique benchmark.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1099
* Add FindDeviceAdapterTagAndCall
* Add support for multiple arguments to be passed to the functor in
'ForEachValidDevice' and 'FindDeviceAdapterTagAndCall'.