It is very easy to cause ODR violations with DeviceAdapterTagCuda.
If you include that header from a C++ file and a CUDA file inside
the same program we an ODR violation. The reasons is that the C++
versions will say the tag is invalid, and the CUDA will say the
tag is valid.
The solution to this is that any compilation unit that includes
DeviceAdapterTagCuda from a version of VTK-m that has CUDA enabled
must be invoked by the cuda compiler.
`vtkm::cont::testing` now initializes with logging enabled and support
for device being passed on the command line, `vtkm::testing` only
enables logging.
Rather than force all dispatchers to be templated on a device adapter,
instead use a TryExecute internally within the invoke to select a device
adapter.
Because this removes the need to declare a device when invoking a
worklet, this commit also removes the need to declare a device in
several other areas of the code.
The original design of invoke and the transport infrastructure
relied on the implementation behavior of vtkm::cont types
such as ArrayHandle that used an internal shared_ptr to managed
state. This allowed passing by value instead of passing by
non-const ref when needing to transfer information to the device.
As VTK-m adds support for classes that use virtuals the ability
to pass by base pointer type allows for us to invoke worklets
using a base type without the risk of type slicing.
Additional by moving over to a non-const ref Invocation we
can update all transports that have 'output' to now be
by ref and therefore support types that can't be copied while
being 'more' correct.
While making changes to how execution objects work, we had agreed to
name the base object ExecutionObjectBase instead of its original name of
ExecutionObjectFactoryBase. Somehow that change did not make it through.
In order to make the change from the current way execution obejcts are utilized to the new proposed executionObjectFactory process type checks now has to look for the new execution object factory class to check against.
Previously, when a Worklet needed a scatter, the scatter object was
stored in the Worklet object. That was problematic because that means
the Scatter, which is a control object, was shoved into the execution
environment.
To prevent that, move the Scatter into the Dispatcher object. The
worklet still declares a ScatterType alias, but no longer has a
GetScatter method. Instead, the Dispatcher now takes a Scatter object in
its constructor. If using the default scatter (ScatterIdentity), the
default constructor is used. If using another type of Scatter that
requires data to set up its state, then the caller of the worklet needs
to provide that to the dispatcher. For convenience, worklets are
encouraged to have a MakeScatter method to help construct a proper
scatter object.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
Previously the arguments to the operator of a vtkm::cont::arg::Transport
were the control object, the input domain object, and the output range.
If you wanted to, for example, check the size of an input array to make
sure it matched the input range, you would have to know the meaning of
the input domain object to query its range. This made it hard to create
generic transports, like TransportTagArrayIn, that accept data from
multiple different input domains but need to know the input range.
Previously, the operator for a Transport class took the object being
transported to the execution environment and the size of the output
domain. This change also passes in the control-side argument for the
input domain. This will help check input array sizes as well as make
other potential transformations based on the input domain.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.
This changes the interface to the ThreadIndices classes to have both
input and output indices. It also adds a visit index to ThreadIndices.
Also added the VisitIndex execution signature tag, which relies on this
behavior.
Previously, all Fetch objects received an Invocation object in their
Load and Store methods. The point of this was that it allowed the Fetch
to get data from any of the execution objects. However, every Fetch
either just got data directly from its associated execution object or
else used a secondary execution object (the input domain) to get indices
into their own execution object.
This left two potential areas for improvement. First, pulling data out
of the Invocation object was unnecessarily complicated. It would be much
nicer to get data directly from the associated execution object. Second,
when getting index information from the input domain, it was often the
case that extra computations were necessary (particularly on structured
cell sets). There was no way to share the index information among
Fetches, and therefore the computations were replicated.
This change removes the Invocation from the Fetch Load and Store.
Instead, it passes the associated execution object and a new object type
called the ThreadIndices. The ThreadIndices are customized for the input
domain and therefore have all the information needed for a redirected
lookup. It is also a thread-local object so it can cache computed
indices and save on computation time.
The ExecObject tag for the ControlSignature was not declared right so
would cause a compile error if it was ever used. Clearly this was not
being tested properly, so the dispatcher base unit test now passes an
ExecObject parameter.
It's easy to put accidently put something that is not a valid tag in a
ControlSignature or ExecutionSignature. Previously, when you did that
you got a weird error at the end of a very long template instantiation
chain that made it difficult to find the offending worklet.
This adds some type checks when the dispatcher is instantated to check
the signatures. It doesn't point directly to the signature or its
parameter, but it is much closer.
These changes support the implementation of DispatcherBase. This class
provides the basic functionality for calling an Invoke method in the
control environment, transferring data to the execution environment,
scheduling threads in the execution environment, pulling data for each
calling of the worklet method, and actually calling the worklet.