The `From` and `To` nomenclature for topology mapping has been confusing for
both users and developers, especially at lower levels where the intention of
mapping attributes from one element to another is easily conflated with the
concept of mapping indices (which maps in the exact opposite direction).
These identifiers have been renamed to `VisitTopology` and `IncidentTopology`
to clarify the direction of the mapping. The order in which these template
parameters are specified for `WorkletMapTopology` have also been reversed,
since eventually there may be more than one `IncidentTopology`, and having
`IncidentTopology` at the end will allow us to replace it with a variadic
template parameter pack in the future.
Other implementation details supporting these worklets, include `Fetch` tags,
`Connectivity` classes, and methods on the various `CellSet` classes (such as
`PrepareForInput` have also reversed their template arguments. These will need
to be cautiously updated.
The convenience implementations of `WorkletMapTopology` have been renamed for
clarity as follows:
```
WorkletMapPointToCell --> WorkletVisitCellsWithPoints
WorkletMapCellToPoint --> WorkletVisitPointsWithCells
```
The `ControlSignature` tags have been renamed as follows:
```
FieldInTo --> FieldInVisit
FieldInFrom --> FieldInMap
FromCount --> IncidentElementCount
FromIndices --> IncidentElementIndices
```
Previously, IsWriteableArrayHandle just checked to see if an
ArrayHandle's portal has a ValueType of void* because we had coded the
special read-only array handles to have fake portals for writing.
However, we recently removed that because it was more trouble than it
was worth. Now IsWritableArrayHandle checks for the existance of the Set
method. If it does not exist, then the portal is considered read-only.
Also corrected spelling (writeable -> writable).
As the RuntimeDeviceTracker is a per thread construct we now make
it explicit that you can only get a reference to the per-thread
version and can't copy it.
When TransferInfo is given memory from VirtualObjectTransferShareWithControl
it doesn't have a bound function ptr for the destruction. In those cases
we need to make sure the HostCopyOfDevice is properly deleted, otherwise
we will cause a memory leak.
ff687016e For VTK-m libs all includes of DeviceAdapterTagCuda happen from cuda files
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1648
It is very easy to cause ODR violations with DeviceAdapterTagCuda.
If you include that header from a C++ file and a CUDA file inside
the same program we an ODR violation. The reasons is that the C++
versions will say the tag is invalid, and the CUDA will say the
tag is valid.
The solution to this is that any compilation unit that includes
DeviceAdapterTagCuda from a version of VTK-m that has CUDA enabled
must be invoked by the cuda compiler.
Fixes#277
DeviceAdapterError existed to make sure that the default device adapter
template was being handled properly. Since the default device adapter doesn't
exist, and nothing is templated over it we can now remove DeviceAdapterError.
661fb64de AtomicInterfaceControl functions are marked with VTKM_SUPPRESS_EXEC_WARNINGS
0c70f9b9a Add BitFieldIn/Out/InOut worklet signature tags.
a66510e81 Add ArrayHandleBitField, a boolean-valued AH backed by a BitField.
56cc5c3d3 Add support for BitFields.
d01b97382 Allow VTKM_SUPPRESS_EXEC_WARNINGS to be used inside macros.
2f2ca9370 Add bit operations FindFirstSetBit and CountSetBits to Math.h.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1629
BitFields are:
- Stored in memory using a contiguous buffer of bits.
- Accessible via portals, a la ArrayHandle.
- Portals operate on individual bits or words.
- Operations may be atomic for safe use from concurrent kernels.
The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle
containing the indices of all set bits, in no particular order.
The new AtomicInterface classes provide an abstraction into bitwise
atomic operations across control and execution environments and are used
to implement the BitPortals.
18ff6681f Less vtkm_cont cxx files bring in the cuda device
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1612
This adds an ExecutionSignature tag named Device that passes the
DeviceAdapterTag as an argument to the worklet's operator(). This allows
worklets to specialize their code based on the device.
Previously we just took the optionparser.h file and stuck it right in
our source code. That was problematic for a variety of reasons.
1. It incorrectly assigned our license to external code.
2. It made lots of unnecessary changes to the original source (like
reformatting).
3. It made it near impossible to track patches we make and updates to
the original software.
Instead, use the third-party system to track changes to optionparser.h
in a different repository and then pull that into ours.
The RuntimeDeviceTracker had grown organically to handle multiple
different roles inside VTK-m. Now that we have device tags
that can be passed around at runtime, large portions of
the RuntimeDeviceTracker API aren't needed.
Additionally the RuntimeDeviceTracker had a dependency on knowing
the names of each device, and this wasn't possible
as that information was part of its self. Now we have moved that
information into RuntimeDeviceInformation and have broken
the recursion.
e1f5c4dd9 Modify VariantAH::AsVirtual to cast to new ValueType if needed.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1585
E.g:
```
ArrayHandle<Float64> doubleArray;
VariantArrayHandle varHandle{doubleArray};
ArrayHandleVirtual<Float32> = varHandle.AsVirtual<Float32>();
```
If there is a loss in range and/or precision, a warning is logged. If
the ValueTypes are Vecs with mismatched widths, an ErrorBadType is thrown.
Internally, an ArrayHandleCast is used between the VariantArrayHandle's
stored array and the ArrayHandleVirtual.
Query and report number of components at runtime so that it can work with
arrays of types with runtime varying number of components
Eg. ArrayHandleGroupVecVariable.
The ArrayPortalValueReference is supposed to behave just like the value
it encapsulates and does so by automatically converting to the base type
when necessary. However, when it is possible to convert that to
something else, it is possible to get errors about ambiguous overloads.
To avoid these, add specialized versions of the operators to specify
which ones should be used.
Also consolidated the CUDA version of an ArrayPortalValueReference to the
standard one. The two implementations were equivalent and we would like
changes to apply to both.
`vtkm::cont::testing` now initializes with logging enabled and support
for device being passed on the command line, `vtkm::testing` only
enables logging.
When you call VariantArrayHandle::CastAndCall, it now tries both basic
storage and virtual storage. You can modify the types of storages tried
by giving a type list of storage tags as the first argument.
We want to make sure that VariantArrayHandleContainer has as little
overhead when launch worklets as possible. To do so we cache
type information to make deducing the `T` type of ArrayHandles
as fast as possible.
Previously you had to exactly match the case of a device adapter's name to
construct it, which was a source of lots of problems ( OpenMP versus OPENMP, CUDA or Cuda ).
Now `vtkm::cont::make_DeviceAdapterId` and `vtkm::cont::RuntimeDeviceTracker` support
case-insensitive device construction.
ArrayHandleVirtual can automatically be constructed from any ArrayHandle.
In the cases where the input ArrayHandle doesn't derived from ArrayHandleVirtual,
it will automatically construct StorageAny to hold the array.
Also
- Renamed vtkm::cont::make_DeviceAdapterIdFromName to just overload
make_DeviceAdapterId.
- Refactored CMake logic for unit tests
- Since we're now querying the device tracker for the names, they
cannot be all caps.
- Updated usages of InitLogging to use Initialize instead.
- Added changelog.
VTK-m has been updated to replace old per device worklet testing executables with a device
dependent shared library so that it's able to accept a device adapter
at runtime.
Meanwhile, it updates the testing infrastructure APIs. vtkm::cont::testing::Run
function would call ForceDevice when needed and if users need the device
adapter info at runtime, RunOnDevice function would pass the adapter into the functor.
Optional Parser is bumped from 1.3 to 1.7.
Also add a throwFailedRuntimeDeviceTransfer that throws a nicely
detailed message on why a something couldn't be transfered to
the requested device adapter.
The OpenMP Device Reduction algorithm previously used a std::vector<T>
to store the reduction results of each thread. This caused problems
when T=bool as the types became a proxy type which isn't usable
with vtkm BinaryOperators.
Additionally by fixing this issue in the FunctorsOpenMP we
can remove a workaround in FunctorsGeneral that caused
compile failures when using complex BinaryOperators
such as MinAndMax.
Previously ArrayHandleBasicImpl had no support for OpenMP since
we forgot to update the implementation. This version will
work when adding new devices without any changes.
Benchmarking in VTK showed significant overhead in the computation
of the reverse connectivity calculation in
ConnectivityExplicitInternals::ComputeCellToPointConnectivity.
This patch adds a ReverseConnectivityBuilder that reduces the amount of
time and memory needed to build the table by using an atomic histogram
approach that avoids a costly radix SortByKey.
Key operations in the new helper class are templated to allow this
approach to be reused by VTK-specific cell array converters.
There is no real reason why you cannot construct an
ArrayPortalFromIterators on a device, so go ahead and let that happen.
(This removes some CUDA warnings about calling __host__ from
__device__.)
Error: Throwing an exception in CUDA code.
Fix: Change method throwing exception to VTKM_CONT.
New warning: host/device warning in taotuple.
Fix: Markup additional taotuple methods with suppressions.
This also updates our taotuple checkout to match upstream master.
Previously, to query whether an ArrayHandle was writable with
IsWriteableArrayHandle, you had to specify a device adapter. The idea
was that it would look at the portal used for that device adapter.
Instead, check the control pointer, which should give the same
indication without having to have a separate check for every type of
device.
Found via `codespell` and `grep`
more typos
includes source typo change and a typo that needs further review
follow-up typos
Follow-up typos
Revert a commit
183bcf109 Add initial version of an OpenMP backend.
7b5ad3e80 Expand device scheduler test to check for overlap.
e621b6ba3 Generalize the TBB radix sort implementation.
d60278434 Specialize swap for ArrayPortalValueReference types.
761f8986f Cache inputs to SSI::Unique benchmark.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1099
c3c8d0fd Refactor ArrayHandleCompositeVector to simplify usage and impl.
ec88b7dd Markup array portals for use in the exec env.
3159b376 Make Swizzle and ExtractComponent array parameters runtime vars.
19344707 Add vtkm_taotuple to build system.
d23c4452 Merge branch 'upstream-taotuple' into tmp
b3b14de7 taotuple 2018-05-15 (e3de8c97)
b00f6c1c Add taocpp/tuple as a thirdparty package.
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1199
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.
The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.
To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
Previously memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors
to support a Storage object that is being moved, we can clearly express that
the ArrayHandle now owns memory it didn't allocate.
Here is an example of how this is done:
```cpp
T* buffer = new T[100];
auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); };
vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic>
storage(buffer, 100, user_free_function);
vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage));
```
By hard coding the PrepareForDevice to know about all the different VTK-m
devices, we can have a single base class do the execution allocation, and not
have that logic repeated in each child class.
Generalize DeviceAdapterAlgorithm::Transform to accept input array of different value and storage type.
Add doxygen documentation in DeviceAdapterAlgorithm.h
* Add FindDeviceAdapterTagAndCall
* Add support for multiple arguments to be passed to the functor in
'ForEachValidDevice' and 'FindDeviceAdapterTagAndCall'.
The implementation of the simplified version of
DeviceAdapterAlgorithmGeneral::Unique had two errors.
First, the implementation is such that it calls the more complex version
of Unique (which specifies a binary predicate to establish equality).
However, it was not calling the Unique method in the DerivedAlgorithm
like it should have been. Instead, it was calling its own Unique
algorithm, which might not be as efficient as the specialized Unique for
the device.
Second, it was using std::equal_to as its binary predicate. Using
functors from std can be dangerous because they are not marked with
VTKM_EXEC, so have the potential to not work in the execution
environment. Instead, use the readily available vtkm::Equal binary
predicate.
The implementation of ScanExclusiveByKey in
DeviceAdapterAlgorithmGeneral by shifting values in the input values
array and then calling ScanInclusiveByKey. However, the temporary
shifted values array was created using the key type instead of the
values type. This caused a compile error when the keys and values had
different types.
In generic code, it's a pain to use the equality operators since they
requires the ValueType and Storage to match, else the operator is undefined.
This commit adds operators for such comparisons, as well as a unit test.
Often times we don't care about the output keys, and it's useful to
be able to pass an ArrayHandleDiscard into the algorithm to save
memory in these cases.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
By using the auto keyword and decltype we can reduce the number of
complex typedefs that exist when writing device adapter algorithms.
The goal being that it is easier for developers to see the actual
algorithms being implemented, by reducing the amount of template
'noise'.
Previously ArrayHandleReverse would only work if it was provided an explicit
users array to map too. But this doesn't need to be so, if a user wants to
start by constructing an ArrayHandleReverse we should allow that.
The side effect of this, is that some very tricky code in the DeviceAdapters
can be removed, that explicitly was added to allow output to ArrayHandleReverse
The old templated array transfer mechanism generated a lot of code
that ended up doing a simple, type-agnostic memcpy for most devices.
This patch specialized array handles for basic storage and uses a
fast-path array transfer implementation. This reduces the size of the
vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).
af61c7e6 Merge branch 'array_handle_reverse_write_2' of gitlab.kitware.com:ollielo/vtk-m into array_handle_reverse_write_2
869c9789 Merge branch 'array_handle_reverse_write_2' of gitlab.kitware.com:ollielo/vtk-m into array_handle_reverse_write_2
14592b82 Merge branch 'master' of gitlab.kitware.com:vtk/vtk-m into array_handle_reverse_write_2
1c50c86a remove commented out code to print values to stdout
80d4ee90 Enable output to ArrayHandleReverse
f64cb5ef Added test for using ArrayHandleReverse as output of ScanInclusiveByKey
87fe6768 thruw ErrorBadType, add PrepareForOutput from Pat
60542804 fixed a typo and added write support for ArrayHandleReverse
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !783
Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that
we can re-use the same launching logic for all Worklets, instead of generating
per worlet code. To keep the performance the same the TilingTask now is past
a range of indices to work on, rather than a single index.
Binary size reduction:
WorkletTests_SERIAL old - 19MB
WorkletTests_SERIAL new - 18MB
WorkletTests_TBB old - 39MB
WorkletTests_TBB new - 18MB
libvtkAcceleratorsVTKm old - 48MB
libvtkAcceleratorsVTKm new - 19MB
ec6589d3 Only enable -fPIC on component static libraries when necessary.
cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont.
355eea88 Get the vtkm cont cuda object to compile properly.
6ecc22bb First pass at compiling ArrayHandle into vtkm_cont.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !715
There were some issues for device adapter algorithms (like scan and
reduce) for empty arrays or arrays of size 1. This adds tests for these
short arrays to the device adapter algorithm tests and fixes these
problems.
18c4190d Update classes using RuntimeDeviceTracker
200928ef Add a global RuntimeDeviceTracker
a3573117 Compile RuntimeDeviceTracker.cxx with CUDA when appropriate
814b2db1 Add the implementation of RuntimeDeviceTracker to cont library
b9d3206e Move RuntimeDeviceTracker to vtkm::cont
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !707
Before it was in the vtkm::cont::internal namespace. However, we expect
to be using this feature quite a bit more as we want VTK-m to handle
multiple devices effectively (as in, just figure it out and go).
ConnectivityExplicitInternals::PrintSummary blindly attempted to print
all of its arrays regardless of whether they are valid. Trying to print
uninitialized arrays can be invalid in some circumstances. Instead,
check to see of the arrays are valid before printing them out. (This can
give you useful information anyway. There is a difference between an
uninitialized array and one of zero length.)
This makes it easier to see what is going on in the fancy arrays and do
diagnostics.
This change required some changes to printSummary_ArrayHandle to support
more array types.
The current design for ArrayPortalVirtual makes it a requirement for all
array portals (that it wraps) to have Set defined. Thus, make sure Set is
defined for all ArrayPortal. Where Set is invalid, an assert is thrown if
something calls it at runtime.
This reduces the number of weak vtables vtkm generates, resulting in
a reduction of binary sizes for projects that include vtkm classes in
multiple translation units.
Class that need to be passed across dynamic library boundaries such as
DynamicArrayHandle need to be properly export. One of 'tricks' of this
is that templated classes such as PolymorphicArrayHandleContainer need
the Type and Storage types to have public visibility.
This makes sure that all vtkm storage tags have public visibility so
in the future we can transfer dynamic array handles across libraries.
b97b4cc7 Allow thrust::reduce to work when iterator and initial value types differ.
64bcc343 Refactor MinAndMax to use vtkm::Vec<T,2> instead of Pair.
8d60ed57 Refactor MinAndMax to be a shared binary operator.
18375b54 Update Bound computations to always use a single Reduce call
2cfc9743 Reduce can support reduce to a T type that isn't the arrayhandles T type.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !614
We previously included windows.h in numerous locations using different
techniques to guard against bringing in parts of the file that are bad
(min/max macros, etc). This solves the problem by consistently using
vtkm/internal/Windows.h to setup everything.
This is a fancy array handle that can group entries in another array by
arbitrary amounts. This allows us to implement input and output arrays
with a different sized Vec for each instance. This is necessary for
generating new topologies with cells of different types.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.