Also add a throwFailedRuntimeDeviceTransfer that throws a nicely
detailed message on why a something couldn't be transfered to
the requested device adapter.
The OpenMP Device Reduction algorithm previously used a std::vector<T>
to store the reduction results of each thread. This caused problems
when T=bool as the types became a proxy type which isn't usable
with vtkm BinaryOperators.
Additionally by fixing this issue in the FunctorsOpenMP we
can remove a workaround in FunctorsGeneral that caused
compile failures when using complex BinaryOperators
such as MinAndMax.
Previously ArrayHandleBasicImpl had no support for OpenMP since
we forgot to update the implementation. This version will
work when adding new devices without any changes.
Benchmarking in VTK showed significant overhead in the computation
of the reverse connectivity calculation in
ConnectivityExplicitInternals::ComputeCellToPointConnectivity.
This patch adds a ReverseConnectivityBuilder that reduces the amount of
time and memory needed to build the table by using an atomic histogram
approach that avoids a costly radix SortByKey.
Key operations in the new helper class are templated to allow this
approach to be reused by VTK-specific cell array converters.
There is no real reason why you cannot construct an
ArrayPortalFromIterators on a device, so go ahead and let that happen.
(This removes some CUDA warnings about calling __host__ from
__device__.)
Error: Throwing an exception in CUDA code.
Fix: Change method throwing exception to VTKM_CONT.
New warning: host/device warning in taotuple.
Fix: Markup additional taotuple methods with suppressions.
This also updates our taotuple checkout to match upstream master.
Previously, to query whether an ArrayHandle was writable with
IsWriteableArrayHandle, you had to specify a device adapter. The idea
was that it would look at the portal used for that device adapter.
Instead, check the control pointer, which should give the same
indication without having to have a separate check for every type of
device.
Found via `codespell` and `grep`
more typos
includes source typo change and a typo that needs further review
follow-up typos
Follow-up typos
Revert a commit
183bcf109 Add initial version of an OpenMP backend.
7b5ad3e80 Expand device scheduler test to check for overlap.
e621b6ba3 Generalize the TBB radix sort implementation.
d60278434 Specialize swap for ArrayPortalValueReference types.
761f8986f Cache inputs to SSI::Unique benchmark.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1099
c3c8d0fd Refactor ArrayHandleCompositeVector to simplify usage and impl.
ec88b7dd Markup array portals for use in the exec env.
3159b376 Make Swizzle and ExtractComponent array parameters runtime vars.
19344707 Add vtkm_taotuple to build system.
d23c4452 Merge branch 'upstream-taotuple' into tmp
b3b14de7 taotuple 2018-05-15 (e3de8c97)
b00f6c1c Add taocpp/tuple as a thirdparty package.
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1199
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.
The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.
To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
Previously memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors
to support a Storage object that is being moved, we can clearly express that
the ArrayHandle now owns memory it didn't allocate.
Here is an example of how this is done:
```cpp
T* buffer = new T[100];
auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); };
vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic>
storage(buffer, 100, user_free_function);
vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage));
```
By hard coding the PrepareForDevice to know about all the different VTK-m
devices, we can have a single base class do the execution allocation, and not
have that logic repeated in each child class.
Generalize DeviceAdapterAlgorithm::Transform to accept input array of different value and storage type.
Add doxygen documentation in DeviceAdapterAlgorithm.h
* Add FindDeviceAdapterTagAndCall
* Add support for multiple arguments to be passed to the functor in
'ForEachValidDevice' and 'FindDeviceAdapterTagAndCall'.
The implementation of the simplified version of
DeviceAdapterAlgorithmGeneral::Unique had two errors.
First, the implementation is such that it calls the more complex version
of Unique (which specifies a binary predicate to establish equality).
However, it was not calling the Unique method in the DerivedAlgorithm
like it should have been. Instead, it was calling its own Unique
algorithm, which might not be as efficient as the specialized Unique for
the device.
Second, it was using std::equal_to as its binary predicate. Using
functors from std can be dangerous because they are not marked with
VTKM_EXEC, so have the potential to not work in the execution
environment. Instead, use the readily available vtkm::Equal binary
predicate.
The implementation of ScanExclusiveByKey in
DeviceAdapterAlgorithmGeneral by shifting values in the input values
array and then calling ScanInclusiveByKey. However, the temporary
shifted values array was created using the key type instead of the
values type. This caused a compile error when the keys and values had
different types.
In generic code, it's a pain to use the equality operators since they
requires the ValueType and Storage to match, else the operator is undefined.
This commit adds operators for such comparisons, as well as a unit test.
Often times we don't care about the output keys, and it's useful to
be able to pass an ArrayHandleDiscard into the algorithm to save
memory in these cases.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
By using the auto keyword and decltype we can reduce the number of
complex typedefs that exist when writing device adapter algorithms.
The goal being that it is easier for developers to see the actual
algorithms being implemented, by reducing the amount of template
'noise'.
Previously ArrayHandleReverse would only work if it was provided an explicit
users array to map too. But this doesn't need to be so, if a user wants to
start by constructing an ArrayHandleReverse we should allow that.
The side effect of this, is that some very tricky code in the DeviceAdapters
can be removed, that explicitly was added to allow output to ArrayHandleReverse
The old templated array transfer mechanism generated a lot of code
that ended up doing a simple, type-agnostic memcpy for most devices.
This patch specialized array handles for basic storage and uses a
fast-path array transfer implementation. This reduces the size of the
vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).
af61c7e6 Merge branch 'array_handle_reverse_write_2' of gitlab.kitware.com:ollielo/vtk-m into array_handle_reverse_write_2
869c9789 Merge branch 'array_handle_reverse_write_2' of gitlab.kitware.com:ollielo/vtk-m into array_handle_reverse_write_2
14592b82 Merge branch 'master' of gitlab.kitware.com:vtk/vtk-m into array_handle_reverse_write_2
1c50c86a remove commented out code to print values to stdout
80d4ee90 Enable output to ArrayHandleReverse
f64cb5ef Added test for using ArrayHandleReverse as output of ScanInclusiveByKey
87fe6768 thruw ErrorBadType, add PrepareForOutput from Pat
60542804 fixed a typo and added write support for ArrayHandleReverse
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !783
Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that
we can re-use the same launching logic for all Worklets, instead of generating
per worlet code. To keep the performance the same the TilingTask now is past
a range of indices to work on, rather than a single index.
Binary size reduction:
WorkletTests_SERIAL old - 19MB
WorkletTests_SERIAL new - 18MB
WorkletTests_TBB old - 39MB
WorkletTests_TBB new - 18MB
libvtkAcceleratorsVTKm old - 48MB
libvtkAcceleratorsVTKm new - 19MB
ec6589d3 Only enable -fPIC on component static libraries when necessary.
cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont.
355eea88 Get the vtkm cont cuda object to compile properly.
6ecc22bb First pass at compiling ArrayHandle into vtkm_cont.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !715
There were some issues for device adapter algorithms (like scan and
reduce) for empty arrays or arrays of size 1. This adds tests for these
short arrays to the device adapter algorithm tests and fixes these
problems.
18c4190d Update classes using RuntimeDeviceTracker
200928ef Add a global RuntimeDeviceTracker
a3573117 Compile RuntimeDeviceTracker.cxx with CUDA when appropriate
814b2db1 Add the implementation of RuntimeDeviceTracker to cont library
b9d3206e Move RuntimeDeviceTracker to vtkm::cont
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !707
Before it was in the vtkm::cont::internal namespace. However, we expect
to be using this feature quite a bit more as we want VTK-m to handle
multiple devices effectively (as in, just figure it out and go).
ConnectivityExplicitInternals::PrintSummary blindly attempted to print
all of its arrays regardless of whether they are valid. Trying to print
uninitialized arrays can be invalid in some circumstances. Instead,
check to see of the arrays are valid before printing them out. (This can
give you useful information anyway. There is a difference between an
uninitialized array and one of zero length.)
This makes it easier to see what is going on in the fancy arrays and do
diagnostics.
This change required some changes to printSummary_ArrayHandle to support
more array types.
The current design for ArrayPortalVirtual makes it a requirement for all
array portals (that it wraps) to have Set defined. Thus, make sure Set is
defined for all ArrayPortal. Where Set is invalid, an assert is thrown if
something calls it at runtime.
This reduces the number of weak vtables vtkm generates, resulting in
a reduction of binary sizes for projects that include vtkm classes in
multiple translation units.
Class that need to be passed across dynamic library boundaries such as
DynamicArrayHandle need to be properly export. One of 'tricks' of this
is that templated classes such as PolymorphicArrayHandleContainer need
the Type and Storage types to have public visibility.
This makes sure that all vtkm storage tags have public visibility so
in the future we can transfer dynamic array handles across libraries.
b97b4cc7 Allow thrust::reduce to work when iterator and initial value types differ.
64bcc343 Refactor MinAndMax to use vtkm::Vec<T,2> instead of Pair.
8d60ed57 Refactor MinAndMax to be a shared binary operator.
18375b54 Update Bound computations to always use a single Reduce call
2cfc9743 Reduce can support reduce to a T type that isn't the arrayhandles T type.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !614
We previously included windows.h in numerous locations using different
techniques to guard against bringing in parts of the file that are bad
(min/max macros, etc). This solves the problem by consistently using
vtkm/internal/Windows.h to setup everything.
This is a fancy array handle that can group entries in another array by
arbitrary amounts. This allows us to implement input and output arrays
with a different sized Vec for each instance. This is necessary for
generating new topologies with cells of different types.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.
There were many tests that created code paths for every base and Vec
type that VTK-m supports (up to 4 components). Although this is
admirable, it is also excessive, and our compile times for the tests are
very long.
To shorten compile times, remove the TryAllTypes method. Replace it with
a version of TryTypes that uses a default list of "exemplar" set of
integers, floats, and Vecs.
This allows callers to copy a subsection of an array into another array,
without clearing the contents of the destination array if a resize
is required.
The TryExecute function was based off of the ResolveFieldTypeAnd*
classes made for filters. To reduce the amount of code duplication,
modify these two filter classes to use the more general TryExecute
functionality.
There are various reasons why you might want to execute something but
not have a specific device to execute on. To mange this, add a general
function that will try a list of devices in order and attempt to run on
them in order.
I don't know why, but under some circumstances when compiling with CUDA
and Visual Studio, the compiler was giving syntax errors when the
IteratorFromArrayPortal was using the typename keyword to reference a
type in its superclass. Get around the problem by looking directly in
the superclass.
There are several tests in the cont directory that are in header files so
that they can be recompiled for different devices. Make sure that the
tests are exclusively using the device being tested by making the error
device adapter the default.
If any part of the test tries to use the default device (which could be
different than the one being tested), a compile error will occur. Several
of these compile errors are fixed in this commit.
There were a few places in the source code where
std::numeric_limits::min and max were used. There is an issue with these
methods on windows because the standard libraries there define macros
with the same name. Get around this problem by either places parentheses
so that they do not look like macros or use the vtkm::Infinity methods
instead.
When compiling under VisualStudio we need to first determine if checked
iterators are enabled ( _ITERATOR_DEBUG_LEVEL ). We don't want to use the
NDEBUG key, as we could be inside a project that is in Debug mode with
disabled checked iterators.
Secondly if they are enabled we need to handle the use case of NULL iterators
that get advanced by length zero. This last case is valid, but isn't supported
by the checked iterators so we need to work around it
Template instantiation is useful because when you are creating object files, as
uninstantiated template definitions are not are not added. Fully supporting
explicit instantiation like ITK does will require more code changes, but
this is a very minor step towards that goal.
These asserts are consolidated into the unified Assert.h. Also made some
minor edits to add asserts where appropriate and a little bit of
reconfiguring as found.
Previously we hinted to the compiler that it should vectorized operations
where the input and output are the same array. This obviously caused problems,
and these hints had to be removed.
In the future we need to first check for aliased arrays, and go from there.
Even when running with the serial backend, the compiler might enable SIMD
vectorization when optimizations are turned on. When this occurs, we need
to use properly atomic Add's and CAE's.
Previously each device adapter only had a unique string name. This was
not the best when it came to developing data structures to track the status
of a given device at runtime.
This adds in a unique numeric identifier to each device adapter. This will
allow classes to easily create bitmasks / lookup tables for the validity of
devices.
When writing multiple backend code users of vtkm need to use the
DeviceAdapterTraits classes, so therefore we should move them to vtkm::cont
to signify this.
Array transforms can now be created with an inverse functor, allowing for
casts back into the native array type. As a result, array transforms with
both a functor and inverse functor defined can perform read and write
operations. As an example, ArrayHandleCast now supports this operation. The
original implementation of ArrayHandleCast (i.e. read only) has been renamed
'ArrayHandleCastForInput'.
This now allows for even more efficient construction of uniform point
coordinates when running under the 3d scheduler, since we don't need to go
from 3d index to flat index to 3d index, instead we stay in 3d index
9a8809f9 Add CellSetPermutation which allows custom iteration over a cell set.
66f6db5a IsWriteableArrayHandle now can tell if an array handle can be written too
20f3fb50 Update VertexClustering to use vtkm::cont::CellSetSingleType.
154896b7 Extend the test for DataSetSingleType.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !228
Even when using implicit index's the ConnectivityExplicit would generate
the code to compute the IndexOffsets, which would than fail to compile as
the ArrayHandle would only support read operations. This fixes that issue.
98885186 Fix CopyInto tests that use different DeviceAdapterTag
69b2ad2a Add unit tests for CopyInto function
2c55b15c Add additional control logic for CopyInto function
20c1a048 CopyInto function for ArrayHandles
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !202
Xcode 7 warnings
The XCode 7 compiler has a new warning for unused typedefs. The Boost code we use has some instances where this warning gets issued. Suppress these warnings.
See merge request !199
This is to be used in place of BOOST_STATIC_ASSERT so that we can
control its implementation.
The implementation is designed to fix the issue where the latest XCode
clang compiler gives a warning about a unused typedefs when the boost
static assert is used within a function. (This warning also happens when
using the C++11 static_assert keyword.) You can suppress this warning
with _Pragma commands, but _Pragma commands inside a block is not
supported in GCC. The implementation of VTKM_STATIC_ASSERT handles all
current cases.
ArrayHandles in DAX have a CopyInto function which allows the user to copy an array handle's data into a compatible STL type iterator. Originally this was fairly straight forward to implement since array handles in DAX are templated on the DeviceAdapterTag. In contrast, VTKm array handles use a polymorphic ArrayHandleExecutionManager under the hood allowing a single array handle to interface with multiple devices at runtime. To achieve this virtual functions are used. This makes implementing the CopyInto function difficult since it is templated on the IteratorType and virtual functions cannot be templated.
To work around this, I've implemented a concrete templated CopyInto function in the class derived from ArrayHandleExecutionManagerBase. In the ArrayHandle class, CopyInto dynamically casts the base class into the derived class, then calls the CopyInto function defined in the derived class.
The drawback to this approach is that, should the user define their own class that inherits from ArrayHandleExectionManagerBase, they are not forced to implement the CopyInto function, unlike the other virtual functions.
fd685210 Always install all device headers even when device isn't enabled.
b1663b24 Add an example of using multiple backends from a single translation unit.
fc0ff69d Methods with try/catch need to be host only.
4d635d64 DeviceAdapter Tags now always exist, and contain if the device is valid.
cf32b430 Teach Configure.h to store if TBB and CUDA are enabled.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !198
Previously it was really hard to verify if a device adapter was valid. Since
you would have to check for the existence of the tag. Now the tag always
exists, but instead you query the traits of the DeviceAdapter to see if
it is a valid adapter.
This makes compiling with multiple backends alot easier.
The boost assert macros seem to have an issue where they define an
unused typedef. This is causing the XCode 7 compiler to issue a warning.
Since the offending code is in a macro, the warning is identified with
the VTK-m header even though the code is in boost. To get around this,
wrap all uses of the boost assert that is causing the warning in the
third party pre/post macros to disable the warning.
The idea of the PointCoordinate classes was to make it easier to define
new special types of point coordinate arrays. But ultimately you have to
create an array handle type, and the CoordinateSystem class pretty much
handles everything else for you. Thus, these classes where being used
nowhere.
My version of the PGI compiler was having problems with using
IteratorFromArrayPortal with STL algorithms. I traced the problem to
iterator_facade checking to see if the reference type we gave it was
a real reference (e.g. T&). It is not, iterator_facade downgraded the
iterator trait to a simple input iterator tag even though I declared
it with a random access traversal. I don't know what the reference type
has to do with random access, but in any case the value object is
designed to behave like a reference in that when you assign to it
the value gets propagated to the array. To tell boost this is the case,
I made a specialization of boost::is_reference that declares the
value type as a reference.
I'm not sure why it failed for me but not elsewhere. It might be that
this version of the PGI compiler is using "old-style" iterator traits
whereas other were using newer style that matches better the boost
iterator traits that iterator_facade is actually using.
Clean up CellSet
Underneath the CellSet implementation is a set of supporting classes that manage the actual structure in both the control and execution environments. However, the implementation of these classes was a bit confusing and inconsistent. The following changes are made:
* Most significantly, there is no longer any Connectivity classes in the control environment. This functionality has been wrapped up into the CellSet classes, which is more consistent and easier to understand. (There was a definite distinction between CellSet and Connectivity, but it was subtle and difficult to understand.) This also means that edits to CellSets happen to CellSets directly.
* The set of classes for structured and explicit cell sets match. There is different functionality within, but the class naming and meaning are consistent.
* Make the class names more consistent with the rest of VTK-m class names. Specifically classes like ExplicitConnectivity become ConnectivityExplicit. Also, the words regular and structured were being used interchangeably. Now, always use structured except when dealing specifically with grids of regular spacing.
* The connectivity classes were using the nomenclature "From" and "To" to specify topological elements of links. The same concept in worklet classes were using the nomenclature "Src" and "Dest." For consistency, all references are changed to "From" and "To".
* Unlike explicit cell sets, structured cell sets have functionality shared between control and execution environments. Rather than duplicate it or create unique exposed classes, have a shared internal implementation in vtkm::internal.
See merge request !117
(Re-) Add a helper structure that holds the connectivity information for
a particular topology connection (e.g. from points to cells) to make it
easier to manage connections in multiple different directions in
CellSetExplicit.
Unlike the previous version of connectivity, this structure is
considered "internal" and not exposed through the API so that
CellSetExplicit can better manage the data. Also, many of the helper
methods remain in CellSetExplicit since they were specific for point-to-
Also, CellSetExplicit has a mechanism to take an arbitrary pair of
TopologyElementTags and get the appropriate connectivity. This should
simplify adding connections in the future.
On one of my compile platforms, GCC was giving conversion warnings from
any boost include that was not wrapped in pragmas to disable conversion
warnings. To make things easier and more robust, I created a pair of
macros, VTKM_BOOST_PRE_INCLUDE and VTKM_BOOST_POST_INCLUDE, that should
be wrapped around any #include of a boost header file.
DynamicCellSet
Add a ```DynamicCellSet``` class to use in place of raw pointers or boost ```smart_ptr```s to make managing the anonymous class and casting easier.
See merge request !103
Previously, IteratorFromArrayPortal was declaring its difference_type
to be vtkm::Id. Although this is allowed, there is code that assumes
that iterators have a difference_type that is ptrdiff_t or something
similar. This change makes the difference_type the default for the
boost iterator facade, which should be the type other code that\
neglects to check expects.
The DynamicCellSet will be used in place of the pointer to a CellSet
in a DataSet. This will prevent us from having to cast it all the time
and also remove reliance on boost smart_ptr.
Correct type for the output portal in DeviceAdapterAlgorithmGeneral::UpperBounds
It looks like the type here was accidently written as a `ArrayHandle<T, COut>` when it should (I think) be `ArrayHandle<vtkm::Id, COut>`. I Encountered this causing a compilation error when trying to run some
benchmarks with FloatDefault input and values since it was using the
wrong type for some iterators if I recall correctly and couldn't cast to the vtkm::Id
type. This should fix that by using the correct type for the output
portal.
See merge request !75
Sometimes it is useful to know how big an array held in a
DynamicArrayHandle is. This adds two methods to DynamicArrayHandle:
GetNumberOfValues and GetNumberOfComponents. They allow you to query the
size of the array without having to statically cast to the array itself.
Encountered this causing a compilation error when trying to run some
benchmarks with FloatDefault input and values since it was using the
wrong type for some iterators iirc and couldn't cast to the vtkm::Id
type. This should fix that by using the correct type for the output
portal.
ArrayPortalFromIterators uses the GetIteratorBegin method signature.
This makes all the custom ArrayPortals have the same syntax,
which is nice when writing new backends.
See merge request !48