When expanding variadic parameter packs in ArrayHandleDecorator
implementations, make sure that the types used are appropriate.
Since std::declval is used to test whether or not a method with
specific arguments exists, it is important that the reference types
are correct to ensure that the detection works as expected.
Portals are always passed to implementation functions as rvalue
references, and array handles are always passed as lvalue refs.
The convenience functions `ArrayPortalToIteratorBegin()` and
`ArrayPortalToIteratorEnd()` wouldn't detect specializations of
`ArrayPortalToIterators<PortalType>` since the specializations aren't
visible when the `Begin`/`End` functions are declared.
Since the CUDA iterators rely on a specialization, the convenience
functions would not compile on CUDA.
Now, instead of specializing `ArrayPortalToIterators` to provide custom
iterators for a particular portal, the portal may advertise custom
iterators by defining `IteratorType`, `GetIteratorBegin()`, and
`GetIteratorEnd()`. `ArrayPortalToIterators` will detect such portals
and automatically switch to using the specialized portals.
This eliminates the need for the specializations to be visible to the
convenience functions and allows them to be usable on CUDA.
There were issues with the particle advection code where a small number
of work-heavy task invocations were needed. Since we were enforcing a
minimum of 1024 invocations per thread, this effectively serialized
scheduling.
Now the scheduler dynamically adjusts for small thread launches,
allowing finer scheduling.
4659d69c7 Remove some commented out code
aec75ab1a Suppress CUDA warning about device calling host
851864d0b Work around with Visual Studio 2015 issue
452a2e1c9 Suppress warnings about CUDA host/device mismatch
4fdefe9f1 Suppress some deprecated warnings in visual studio
5cfc14482 Implement old ListTag features with new ListTag implementations
d5fe4046c Remove instances of ListTag in favor of List
92db37623 Convert uses of ListTagBase to List
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1918
The destructors of some control side objects (such as CellSet and
ArrayHandle) are defined. These destructors are obviously only compiled
for the control environment (i.e. for CUDA only for the host). However,
not all of the subclasses implemented their own destructors. In CUDA,
when a default destructor is used, it is compiled for both host and
device. This caused a problem as the superclass's destructor was only
compiled for the host and therefore caused a warning.
Fixed the problem by defining an empty destructor to any subclasses that
needed one.
It's weird that I ran into this problem while chaning the List TMP
class, but the solution seems fine.
A new header named TypeList.h and the type lists have been redefined in
this new file. All the types have been renamed from `TypeListTag*` to
`TypeList*`. TypeListTag.h has been gutted to provide deprecated
versions of the old type list names.
There were also some other type lists that were changed from using the
old `ListTagBase` to the new `List`.
The newer List operations should still work on the old ListTags, so make
those changes first to ensure that everything still works as expected if
given an old ListTag.
Next step is to deprecate ListTagBase itself and move all the lists to
the new types.
When it was originally created, it was assumed that the ArrayHandle
class would be used by a single thread. As the use of VTK-m expands,
that is no longer a safe assumption. To ensure that operations on
ArrayHandle happen correctly, add a mutex to the Internals of
ArrayHandle and require all operations on the Internals lock that mutex.
There is some behavior of GCC compilers before GCC 9.0 that is
incompatible with the specification of OpenMP 4.0. The workaround was
using the workaround any time a GCC compiler >= 9.0 was used. The proper
behavior is to only use the workaround when the GCC compiler is being
used and the version of the compiler is less than 9.0.
Also, switch to using VTKM_GCC to check for the GCC compiler instead of
__GNUC__. The problem with using __GNUC__ is that many other compilers
pretend to be GCC by defining this macro, but in cases like compiler
workarounds it is not accurate.
Previously when ApplyPolicyFieldOfType was used in cases where not all
of the types matched not all of the storages, those invalid arrays were
replaced with an ArrayHandleDiscard. This unnecessarily increased the
length of the type names used for the resulting ArrayHandleMultiplexer.
Now, ListTagRemoveIf is used to remove any invalid arrays so that the
resulting ArrayHandleMultiplexer only includes the valid arrays.
The older Xcode 9 compiler has troubles with ArrayHandleDecorator that
are similar to those of earlier Microsoft and Cuda compilers.
Note that the logic behind the changed compiler check has a lot of
guesswork involved. I noticed this problem on a laptop with Xcode 9
installed. However, even though Xcode uses the clang compiler, it
notoriously does not return the actual clang version. Instead, it
returns some toolchain version that has nothing to do with it. I'm
pretty sure Xcode 9 is using clang version 4 under the covers, but
__clang_major__ reports 9. Oddly, Xcode 10 reports __clang_major__ as 8,
so that's not too much help. So instead, we check for
__apple_build_version__, which returns the Xcode version (sort of) and
that seems a reasonable comparison.
Although I have not tried it, I'm willing to bet that the older clang
outside of Xcode will also have issues. Here is where the real guesswork
takes place since I don't have handy compilers to try. Like I said, I
think the internet claims that Xcode 9 is using clang 4, so also add to
the check any compiler that reports itself clang 4 or below.
The previous implementation of `RuntimeDeviceTracker` occasionally
outputted a log at level `Info` about devices being enabled or disabled.
The problem was that the information given was inconsistent (so it would
sometimes announce one change but not announce a different corrective
change). This could cause weird confusions. For example, when you used a
`ColorTable`, it would use a `ScopedRuntimeDeviceTracker` to temporarily
force the device to `Serial`. The log will just tell you that the device
was forced to `Serial` but never tell you that the devices where
restored to include actual parallel devices.
This change helps correct these with the following changes:
* Added a new log level, `DevicesEnabled`, that is a higher level than
`Info`. All logging from `RuntimeDeviceTracker` goes to this log
level.
* Change the logging output of `RuntimeDeviceTracker` to output a list
of currently enabled devices whenever a change happens. That way you
don't have to guess what happend for each change.
* Change `ScopedRuntimeDeviceTracker` to log whenever the scope is
entered or left.
The tao ones won't compile otherwise. This should be okay, since these
are all just compile time helpers on host-only functions -- they won't
generate different sized data structures etc if we mix them in the
same build.
8520d70e0 Compile most frequently used VTK-m filters into a library
d1d61b9eb vtkm::filter::Filter passes filter policies by value
4ff021b08 Improve VTK-m compilation times by compiling more keys<T> types
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1836
7e01edb01 Ensure that Portal::Set isn't defined for read-only portals.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1848
VTK-m now provides the following filters with the default policy
as part of the vtkm_filter library:
- CellAverage
- CleanGrid
- ClipWithField
- ClipWithImplicitFunction
- Contour
- ExternalFaces
- ExtractStructured
- PointAverage
- Threshold
- VectorMagnitude
By building these as a library we hope to provide faster compile
times for consumers of VTK-m when using common configurations.
93e638ce4 ArrayHandleCartesianProduct can be used with implicit handles
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1842
This patch removes (or conditionally removes) the Set method from
portals that are read-only so that IsWritableArrayHandle will work as
expected. The ArrayPortal doxygen has been updated to reflect this.
The remaining exceptions are `ArrayPortalVirtual` and
`ArrayPortalMultiplexer`, since their mutability cannot be determined at
compile time.
485df972f Update the documentation on the different VTK-m namespaces
d29f5ba37 Update doxyfile to suppress documenting unnecessary components.
18b09791e All export macros use the `VTK_M_*_EXPORT` pattern
f2a3ecd01 Don't generate doxygen for serialization helpers
fd4bcd809 Move PolicyExtrude into the correct vtk-m namespace
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1843
630768600 Suppress CUDA warnings
5fa02057a Rely less on overload resolution for ApplyPolicy
26d7bfd0d Force ArrayPolicy of a specific type to the right template
6c136b978 Remove vtkm::BaseComponent
07c59fcf7 Update filters with secondary fields to use new policy method
3039a18ba Add ability to get an array from a Field for a particular type
2b6e6da6c Add ability to get VariantArrayHandle as an ArrayHandleMultiplexer
6323d6803 Add recursive component queries to VecTraits
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1829
There was a warning comes from the functors in support of the portals
for ArrayHandleMultiplexer. The template has no good way to determine
whether the object it is calling is for control or execution, so it
supports both. It is not useful to warn when it happens to compile only
for the host.
Sometimes the CUDA runtime would not allocate sufficient stack
space for the particle advection code to run. This issue was exposed by
!1737 -- for some reason, once those changes to unrelated filters/worklets
are added to VTK, CUDA allocates less stack and the following tests would
fail:
UnitTestLagrangianFilterCUDA
UnitTestLagrangianStructuresFilterCUDA
UnitTestStreamlineFilterCUDA
UnitTestStreamSurfaceFilterCUDA
These were fixed by increasing the stack size in the particle advection
worklet Run(...) methods.
An RAII helper has been added that will restore the previous stack size
in case an exception is thrown, and the KDTree code has been updated
to use this helper when it adjusts the CUDA stack allocation.
This is done through a new version of ApplyPolicy. This version takes
a type of the array as its first template argument, which must be
specified.
This requires having a list of potential storage to try. It will use
that to construct an ArrayHandleMultiplexer containing all potential
types. This list of storages comes from the policy. A StorageList
item was added to the policy.
Types are automatically converted. So if you ask for a vtkm::Float64 and
field contains a vtkm::Float32, it will the array wrapped in an
ArrayHandleCast to give the expected type.
The DataSetBuilderExplicitIterative class used to use an ArrayHandle of
vtkm::Vec3f_32 values, which are always of type Float32. This can cause
unexpected results when using double precision by default (i.e. when
FloatDefault is set to Float64). Change that to give Float32 values by
default.
IsWritableArrayHandle can now be used directly as `std::true_type` or
`std::false_type` without having to pull the `type` member out of a
dependent namespace.
This behaves just like `ScanExclusive`, but rather than returning the
total sum, it is appended to the end of the output array.
This is in preparation for the CellSetExplicit refactoring described in
issue #408.
The `MultiBlock` class has been renamed to `PartitionedDataSet`, and its API
has been refactored to refer to "partitions", rather than "blocks".
Additionally, the `AddBlocks` method has been changed to `AppendPartitions` to
more accurately reflect the operation performed. The associated
`AssignerMultiBlock` class has also been renamed to
`AssignerPartitionedDataSet`.
This change is motivated towards unifying VTK-m's data model with VTK. VTK has
started to move away from `vtkMultiBlockDataSet`, which is a hierarchical tree
of nested datasets, to `vtkPartitionedDataSet`, which is always a flat vector
of datasets used to assist geometry distribution in multi-process environments.
This simplifies traversal during processing and clarifies the intent of the
container: The component datasets are partitions for distribution, not
organizational groupings (e.g. materials).
Ref #405
One of the dashboard compilers was complaining about not being able to
resolve which overload to use for std::size_t. (Perhaps on CUDA
std::size_t is sometimes not an unsigned 64-bit integer.) Try to correct
this by adding a templated method that casts anything to vtkm::UInt64.
The functions GetHumanReadableSize and GetSizeString accepted a
vtkm::UInt64 as the size, which should hold pretty much any reasonable
memory size and is compatible with std::size_t. But it has a different
sign-ness as vtkm::Id. So if you are holding an array size with vtkm::Id
(which is common), you could get a compiler warning when using these
functions, which is annoying. So, for convenience add a second form of
these methods that takes a vtkm::Id and automatically converts.
By removing the ability to have multiple CellSets in a DataSet
we can simplify the following things:
- Cell Fields now don't require a CellSet name when being constructed
- Filters don't need to manage what the active cellset is
Although convenient, one of the issues of creating data with
MakeTestDataSet is that it is hard to look at the data created. It is
often helpful to be able to bring in the data into something like
ParaView or VisIt to play with it. To enable that, write them all out as
part of UnitTestVTKDataSetWriter.
a9bbb6ead Permute cells inline
d37ab7732 Add ability to read FIELD section in vtk legacy files
19a610f5e Fix skipping over information in vtk files
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1797
1f8030a6e Add non-const version of DataSet::Get methods
69226803c Make PointTransform actually transform the points
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1803
A non-const version of GetCoordinateSystem was added to implement a
change to the point transform filter. To make things symmetric (and
provide likely future needs), also add non-const versions of getting the
fields and cell sets.
Previously, the VTK file readers just skipped over everything in FIELD
sections. However, it is common for many point and cell fields to be
written in this section (it is how ParaView writes out most of its
data). This change will allow these fields to be read in correctly.
The primary (likely only) use of PointTransform is to perform affine
transform on the position of the mesh. However, PointTransform did not
actually move the mesh. Rather, it just created a new field with the
transformed points.
Now, the output has its coordinate system replaced with the transformed
one generated (in addition to be added as a field). This can be turned
off (but defaults to on).
Also changed the constructor to turn on UseCoordinateSystemAsField so
that by default the filter operates on the existing coordinate system
and replaces it with the new coordinate system.
Also removed the template parameter on the filter. That added an
unnecessary complication to using it.
884616788 Simplify and extend AtomicArray implementation.
9560c4f63 Limit testing dispatch of atomic array to only atomic types.
0e728c800 Update atomic interfaces to support Add/CAS for UInt32/64.
720b452eb Force AtomicArray to use only basic storage arrays.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1802
- Use AtomicInterface to implement device-specific atomic operations.
- Remove DeviceAdapterAtomicArrayImplementations.
- Extend supported atomic types to include unsigned 32/64-bit ints.
- Add a static_assert to check that AtomicArray type is supported.
- Add documentation for AtomicArrayExecutionObject, including a CAS
example.
- Add a `T Get(idx)` method to AtomicArrayExecutionObject that does
an atomic load, and update existing CAS usage to use this instead
of `Add(idx, 0)`.
This allows for developers to do things such as the following
as constexpr's:
```cxx
constexpr vtkm::Id2 dims(16,16);
constexpr vtkm::Float64 dx = vtkm::Float64(4.0 * vtkm::Pi()) / vtkm::Float64(dims[0] - 1);
```
bf96d921d Add extra braces around std::array initializers
9bbf4a5a6 Corrections and expanded testing of implicit functions
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1776
Because ArrayPortalSOA calls a delegate portal to get the actual values,
it can only implement its own Set or Get if the delegate portal supports
it. Previously this was done by calling an overloaded internal method
based on the result of PortalSupportsSets/Gets. However, regardless of
whether the delegate portal supported Set or Get, ArrayPortalSOA
provided one. Thus, if something else tried to use PortalSupportsSets/
Gets on ArrayPortalSOA, it would always report true even if it was not
really supported.
Instead, use SFINAE to remove the Set or Get if that method is not
supported in the delegate portal.
Since ArrayHandleSOA is only really used for portals from basic storage
arrays, it will be rare that Set or Get is not supported. However, a
device adapter is free to remove one of these methods on a device
portal. For example, if you call PrepareForInput on an ArrayHandle, it
is possible that the device adapter will create a portal that has no Set
method because the array is not writable.
Thanks to Allison Vacanti for recomending this solution.
I kept getting warnings from different compilers about type conversions
because I was making values by adding an index to them. Change how we
create and test values so that these type issues are less likely to come
up.
This appears to be a bug in older clang and gcc compilers where they
gave a warning asking for an extra unnecessary set of braces around the
initializer of std::array. See for example
https://bugs.llvm.org/show_bug.cgi?id=21629https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25137
I don't think newer compilers give this warning, but we still support
some older compilers it happens on, so we live with it.
To help provide a better time writing VTK-m filter this streamlines
the CreateResult API to provide a focused set of methods, based on
how CreateResult has been used by existing filters.
For demonstration purposes, I added a vector field to one of the data
sets (the cow nose). However, the serialization test was not expecting a
field of that type and therefore had not compiled the case to serialize
that type of field, which caused a failure. Fixed the problem by
instructing the test to also consider the correct type of vector fields.
The `From` and `To` nomenclature for topology mapping has been confusing for
both users and developers, especially at lower levels where the intention of
mapping attributes from one element to another is easily conflated with the
concept of mapping indices (which maps in the exact opposite direction).
These identifiers have been renamed to `VisitTopology` and `IncidentTopology`
to clarify the direction of the mapping. The order in which these template
parameters are specified for `WorkletMapTopology` have also been reversed,
since eventually there may be more than one `IncidentTopology`, and having
`IncidentTopology` at the end will allow us to replace it with a variadic
template parameter pack in the future.
Other implementation details supporting these worklets, include `Fetch` tags,
`Connectivity` classes, and methods on the various `CellSet` classes (such as
`PrepareForInput` have also reversed their template arguments. These will need
to be cautiously updated.
The convenience implementations of `WorkletMapTopology` have been renamed for
clarity as follows:
```
WorkletMapPointToCell --> WorkletVisitCellsWithPoints
WorkletMapCellToPoint --> WorkletVisitPointsWithCells
```
The `ControlSignature` tags have been renamed as follows:
```
FieldInTo --> FieldInVisit
FieldInFrom --> FieldInMap
FromCount --> IncidentElementCount
FromIndices --> IncidentElementIndices
```
The Invoker is a control side object that handles the construction
of the relevant worklet dispatcher. Moving it to control makes it
obvious that it isn't an algorithm itself but a way to launch
worklets.
There was a special case for ArrayHandleMultiplexer where if you gave it
just one type it would treat that as a value type rather than an array
to support and instead provide a default list of types. However, GCC 4.8
is having trouble compiling the code to create the default list, the
semantics are confusing, and the more I think about it the less likely I
think we will need this functionality. So, just getting rid of that.
d80a8125c Sprinkle noexcept goodness on Variant and ArrayPortalMultiplexer
a96a13cf3 Use large case statements to CastAndCall variants
866e1d7d5 Update comparison for virtual and multiplexer arrays
5416cbeb7 Add ArrayHandleMultiplexer testing to BenchmarkFieldAlgorithms
d45106452 Add changedoc for ArrayHandleMultiplexer
0aa15c97c Fix 'Failed to specialize alias template' error from Visual Studio
7b72e31df Fixes for CUDA
5e2385352 Create ArrayHandleMultiplexer
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1726
The code was working fine on all the dashboards except for the Visual
Studio 2015 compiles on delve. It gave an error like:
ArrayHandleMultiplexer.h(398): error C2938: 'ArrayHandleToStorageTag<unknown-type>' : Failed to specialize alias template
A StackOverflow article (https://stackoverflow.com/questions/43411542/
metaprogramming-failed-to-specialize-alias-template) suggests that this
is a bug in older versions of Visual Studio. Although fixed in more
recent versions, we might have to support older versions.
Currently, ListTags are implemented by having a subtype name list set to
a brigand::list. However, there is always a chance this will change. To
make things more explicit, create a vtkm::internal::ListTagToBrigandList
to make it clear what the resulting type should be (and provide some
potential future-proofing).
Also add a convenient vtkm::ListTagApply that allows you to easily
instantiate a template with the list of types in a ListTag.
Previously, IsWriteableArrayHandle just checked to see if an
ArrayHandle's portal has a ValueType of void* because we had coded the
special read-only array handles to have fake portals for writing.
However, we recently removed that because it was more trouble than it
was worth. Now IsWritableArrayHandle checks for the existance of the Set
method. If it does not exist, then the portal is considered read-only.
Also corrected spelling (writeable -> writable).
Previously, `ArrayHandleCast` was considered a read-only array handle.
However, it is trivial to reverse the cast (now that `ArrayHandleTransform`
supports an inverse transform). So now you can write to a cast array
(assuming the underlying array is writable).
One trivial consequence of this change is that you can no longer make a
cast that cannot be reversed. For example, it was possible to cast a simple
scalar to a `Vec` even though it is not possible to convert a `Vec` to a
scalar value. This was of dubious correctness (it is more of a construction
than a cast) and is easy to recreate with `ArrayHandleTransform`.
Several ArrayHandles (actuall Storage implementations) had a fake portal
type that only defined invalid value types and no Get/Set methods. The
idea was to quickly identify when using a read-only array for writing.
However, this was more trouble than it was worth as the compiler just
gives an incomprehensible error and it is hard to track down the actual
value.
Now actually define some type even if it is never used.
28484fc6a Update examples and benchmarks to use new VTK-m CMake helper function
ea50e82aa Move VTK-m CMake testing wrappers to the testing folder
0b7dd7c38 Add CMake vtkm_add_target_information() to make using vtk-m easier
e934e2273 vtkm_library WRAP_FOR_CUDA renamed to clarify the intent of the property
a2e6660fd Remove unused vtkm_compile_as_cuda CMake function
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Matt Larsen <larsen30@llnl.gov>
Merge-request: !1718
461f87dbc Fix issues with PointLocatorUniformGrid not finding all points
e473cb4bb Fix PointLocatorUniformGrid for points on boundary
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Li-Ta Lo <ollie@lanl.gov>
Merge-request: !1720
When creating the search structures in PointLocatorUniformGrid, a point
outside the boundary would be given an invalid bin id. These points
could never be found. Generally, this is not a big deal for points
outside of the boundary, but it could be a problem for points on the
boundary. A point on the boundary could be taken as outside the
boundary. Since the boundary is chosen from limits of the points, some
will almost always be on the boundary.
Fix this problem by clamping all points to the nearest valid bin. This
could cause a problem if the user has selected a boundary excluding a
lot of points. All those points could be grouped to the same edge bins,
but that is probably not a great idea anyway.
We want the option name to be clear that it might be applicable for
more than just CUDA. If VTK-m ever supported something like SYCL
it would not be clear that those sources should go in `WRAP_FOR_CUDA`.
how did any of this work?
match other CellSet file layouts.
???
compile in CUDA.
unit tests.
also only serial.
make error message accurate
Well, this compiles and works now.
Did it ever?
use CellShapeTagGeneric
UnitTest matches previous changes.
whoops
Fix linking problems.
Need the same interface
as other ThreadIndices.
add filter test
okay, let's try duplicating CellSetStructure.
okay
inching...
change to wedge in CellSetListTag
Means changing these to support it.
switch back to wedge from generic
compiles and runs
remove ExtrudedType
need vtkm_worklet
vtkm_worklet needs to be included
fix segment count for wedge specialization
need to actually save the index
for the other constructor.
specialize on Explicit
clean up warning
angled brackets not quotes.
formatting
512d0431e Cell and Point locators have correct export visibility
c7f827581 Correct signed to unsigned warning conversion found by clang-8
c28797845 Gradient's ComputeDivergence is now properly initialized
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1705
2e0f4dd37 Fix floating point error in test
b766d9a92 Improve support of testing recursive types
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1691
The UnitTestCudaArrayHandleFancy was failing because it was
was calling TestValue in a worklet and comparing that to
a value that was generated by calling TestValue on the host.
Because these were called on different architecture, it is
possible that they will not be bit-wise exact. Add a little
bit of tolerance to the check to avoid false failures for
this reason.
bcaf7d9be ScopedRuntimeDeviceTracker have better controls of setting devices.
4212d0c04 RuntimeDeviceInformation now says the AnyTag exists.
fa03dc664 ScopedRuntimeDeviceTracker requires a device to execute on when constructed.
4020f5198 RuntimeDeviceTracker can't be copied and is only accessible via reference.
e9482018e ScopedRuntimeDeviceTracker has the same API as RuntimeDeviceTracker
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1676
These methods need to be marked inline as they are used by multiple
TU's. When they aren't marked as inline we see random failures in
downstream VTK-m users generally when CUDA is enabled due to ODR
violations.
The ScopedRuntimeDeviceTracker now can force, enable, or disable
devices. Additionally the ScopedRuntimeDeviceTracker and the
RuntimeDeviceTracker handle the DeviceAdapterTagAny robustly
across all methods.
As the RuntimeDeviceTracker is a per thread construct we now make
it explicit that you can only get a reference to the per-thread
version and can't copy it.
The 8x8x8 is a better launch strategy for most VTK-m kernels.
The current problem is that a couple of VTK-m kernels use a
high number of registers and this number of threads combines to
require too many registers.
What we should do in the longer run is have more controls over
kernel launches on a per kernel basis. This will require VTK-m
to extract the number of registers being used by each kernel
The consistent API for control to execution memory transfers is
the ArrayHandle class. Previously the tests would verify memory
transfer by calling the ArrayManagerExecution class directly. This
is problematic as the class isn't used by ArrayHandle<T, StorageBasic>.