The previous implementation of `RuntimeDeviceTracker` occasionally
outputted a log at level `Info` about devices being enabled or disabled.
The problem was that the information given was inconsistent (so it would
sometimes announce one change but not announce a different corrective
change). This could cause weird confusions. For example, when you used a
`ColorTable`, it would use a `ScopedRuntimeDeviceTracker` to temporarily
force the device to `Serial`. The log will just tell you that the device
was forced to `Serial` but never tell you that the devices where
restored to include actual parallel devices.
This change helps correct these with the following changes:
* Added a new log level, `DevicesEnabled`, that is a higher level than
`Info`. All logging from `RuntimeDeviceTracker` goes to this log
level.
* Change the logging output of `RuntimeDeviceTracker` to output a list
of currently enabled devices whenever a change happens. That way you
don't have to guess what happend for each change.
* Change `ScopedRuntimeDeviceTracker` to log whenever the scope is
entered or left.
The ScopedRuntimeDeviceTracker now can force, enable, or disable
devices. Additionally the ScopedRuntimeDeviceTracker and the
RuntimeDeviceTracker handle the DeviceAdapterTagAny robustly
across all methods.
As the RuntimeDeviceTracker is a per thread construct we now make
it explicit that you can only get a reference to the per-thread
version and can't copy it.
The RuntimeDeviceTracker had grown organically to handle multiple
different roles inside VTK-m. Now that we have device tags
that can be passed around at runtime, large portions of
the RuntimeDeviceTracker API aren't needed.
Additionally the RuntimeDeviceTracker had a dependency on knowing
the names of each device, and this wasn't possible
as that information was part of its self. Now we have moved that
information into RuntimeDeviceInformation and have broken
the recursion.
Previously you had to exactly match the case of a device adapter's name to
construct it, which was a source of lots of problems ( OpenMP versus OPENMP, CUDA or Cuda ).
Now `vtkm::cont::make_DeviceAdapterId` and `vtkm::cont::RuntimeDeviceTracker` support
case-insensitive device construction.
The DeepCopy method is used when a ScopedGlobalRuntimeDeviceTracker
is constructed. This in turn causes the rebuilding of the device
names and states which isn't a free operation. Now we copy the already
computed information.
This was noticeable when using ArrayHandleTransform since it uses
ScopedGlobalRuntimeDeviceTracker when construction host side
portals.
Also
- Renamed vtkm::cont::make_DeviceAdapterIdFromName to just overload
make_DeviceAdapterId.
- Refactored CMake logic for unit tests
- Since we're now querying the device tracker for the names, they
cannot be all caps.
- Updated usages of InitLogging to use Initialize instead.
- Added changelog.
VTK-m has been updated to replace old per device worklet testing executables with a device
dependent shared library so that it's able to accept a device adapter
at runtime.
Meanwhile, it updates the testing infrastructure APIs. vtkm::cont::testing::Run
function would call ForceDevice when needed and if users need the device
adapter info at runtime, RunOnDevice function would pass the adapter into the functor.
Optional Parser is bumped from 1.3 to 1.7.
By making RuntimeDeviceInformation class template independent, vtkm is
able to detect
device info at runtime with a runtime specified deviceId. In the past
it's impossible
because the CRTP pattern does not allow function overloading(compiler
would complain
that DeviceAdapterRuntimeDetector does not have Exists() function
defined).
This change allows you to set a subclass of
vtkm::cont::ExecutionObjectBase as a functor
used in ArrayHandleTransform. This latter class will then detect that
the functor is an ExecObject and will call PrepareForExecution with the
appropriate device to get the actual Functor object.
This change allows you to use virtual objects and other device dependent
objects as functors for ArrayHandleTransform without knowing a priori
what device the portal will be used on.
It will reduce the cost of getting the thread runtime device tracker,
and will have a better runtime overhead if user constructs a lot of
short lived threads that use VTK-m.
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.
The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.
To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
By using perfect forwarding we can reduce not only the amount of TryExecute
signatures, but we can enable the ability to pass temporary functors to
TryExecute.
At the same time we have optimized TryExecute by moving the string generation
code into a single function that is compiled into the vtkm_cont library.
The end result is that the vtkm_rendering library size has been reduced from
12MB to 11MB, and we shave off about 5% of our build time.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
This allows code to include the RuntimeDeviceTracker without depending
on the device-specific adapters (I think).
Also changed the implementation to use a shared_ptr for the state so you
can pass it around and share the state easier.
Before it was in the vtkm::cont::internal namespace. However, we expect
to be using this feature quite a bit more as we want VTK-m to handle
multiple devices effectively (as in, just figure it out and go).