Deprecate `VirtualObjectHandle` and all other classes that are used to
implement objects with virtual methods in the execution environment.
Additionally, the code is updated so that if the
`VTKm_NO_DEPRECATED_VIRTUAL` flag is set none of the code is compiled at
all. This opens us up to opportunities that do not work with virtual
methods such as backends that do not support virtual methods and dynamic
libraries for CUDA.
This class was used indirectly by the old `ArrayHandle`, through
`ArrayHandleTransfer`, to move data to and from a device. This
functionality has been replaced in the new `ArrayHandle`s through the
`Buffer` class (which can be compiled into libraries rather than make
every translation unit compile their own template).
This commit removes `ArrayManagerExecution` and all the implementations
that the device adapters were required to make. None of this code was in
any use anymore.
When `DeviceAdapterAlgorithm::ScheduleTask` was called directly (i.e.
not through `Schedule`), nothing was added to the log. Adding
`VTKM_LOG_SCOPE` to these methods so that all scheduling is added to the
performance log.
Now that we have atomic free functions (e.g. `vtkm::AtomicAdd()`), we no
longer need special implementations for control and each execution
device. (Well, technically we do have special implementations for each,
but they are handled with compiler directives in the free functions.)
Convert the old atomic interface classes (`AtomicInterfaceControl` and
`AtomicInterfaceExecution`) to use the new atomic free functions. This
will allow us to test the new atomic functions everywhere that atomics
are used in VTK-m.
Once verified, we can deprecate the old atomic interface classes.
The buffer class encapsulates the movement of raw C arrays between
host and devices.
The `Buffer` class itself is not associated with any device. Instead,
`Buffer` is used in conjunction with a new templated class named
`DeviceAdapterMemoryManager` that can allocate data on a given
device and transfer data as necessary. `DeviceAdapterMemoryManager`
will eventually replace the more complicated device adapter classes
that manage data on a device.
The code in `DeviceAdapterMemoryManager` is actually enclosed in
virtual methods. This allows us to limit the number of classes that
need to be compiled for a device. Rather, the implementation of
`DeviceAdapterMemoryManager` is compiled once with whatever compiler
is necessary, and then the `RuntimeDeviceInformation` is used to
get the correct object instance.
This fixes, which where triggered since in the new CI, one of the
docker runner set `OMP_NUM_THREADS=3`:
1. `UnitTestOpenMPDeviceAdapter`
2. `UnitTestMeshQualityFilter`
In the redution optimized implementation for _OpenMP_, it unrolls
the reduce loop in iterations of four elements. The last iteration
in the loop might overflow the loop end element (when it is not a
multiple of four).
This commit fixes this by setting the OpenMP unrolled reduce loop
end element to its previous closest multiple of four of the original end
element.
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
The `ArrayHandleStreaming` class stems from an old research project
experimenting with bringing data from an `ArrayHandle` in parts and
overlapping device transfer and execution. It works, but only in very
limited contexts. Thus, it is not actually used today. Plus, the feature
requires global indexing to be permutated throughout the worklet
dispatching classes of VTK-m for no further reason.
Because it is not really used, there are other more promising approaches
on the horizon, and it makes further scheduling improvements difficult,
we are removing this functionality.
1f1688483 Initial infrastructure to allow WorkletMapField to have 3D scheduling
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1938
Marked the old versions of PrepareFor* that do not use tokens as
deprecated and moved all of the code to use the new versions that
require a token. This makes the scope of the execution object more
explicit so that it will be kept while in use and can potentially be
reclaimed afterward.
The old version of ExecutionObject (that only takes a device) is still
supported, but you will get a deprecated warning if that is what is
defined.
Supporing this also included sending vtkm::cont::Token through the
vtkm::cont::arg::Transport mechanism, which was a change that propogated
through a lot of code.
1f61c500e Remove non-atomic ops from BitField unit test.
5565848d9 Use a dynamic strategy for openmp 1D scheduling.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1925
There were issues with the particle advection code where a small number
of work-heavy task invocations were needed. Since we were enforcing a
minimum of 1024 invocations per thread, this effectively serialized
scheduling.
Now the scheduler dynamically adjusts for small thread launches,
allowing finer scheduling.
There is some behavior of GCC compilers before GCC 9.0 that is
incompatible with the specification of OpenMP 4.0. The workaround was
using the workaround any time a GCC compiler >= 9.0 was used. The proper
behavior is to only use the workaround when the GCC compiler is being
used and the version of the compiler is less than 9.0.
Also, switch to using VTKM_GCC to check for the GCC compiler instead of
__GNUC__. The problem with using __GNUC__ is that many other compilers
pretend to be GCC by defining this macro, but in cases like compiler
workarounds it is not accurate.
BitFields are:
- Stored in memory using a contiguous buffer of bits.
- Accessible via portals, a la ArrayHandle.
- Portals operate on individual bits or words.
- Operations may be atomic for safe use from concurrent kernels.
The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle
containing the indices of all set bits, in no particular order.
The new AtomicInterface classes provide an abstraction into bitwise
atomic operations across control and execution environments and are used
to implement the BitPortals.
The purpose of the TestBuild infrastructure was to confirm that
VTK-m didn't have any lexical issues when it was a pure header
only project. As we now move to have more compiled components
the need for this form of testing is mitigated. Combined
with the issue of TestBuilds causing MSVC issues, we should
just remove this infrastructure.
The OpenMP Device Reduction algorithm previously used a std::vector<T>
to store the reduction results of each thread. This caused problems
when T=bool as the types became a proxy type which isn't usable
with vtkm BinaryOperators.
Additionally by fixing this issue in the FunctorsOpenMP we
can remove a workaround in FunctorsGeneral that caused
compile failures when using complex BinaryOperators
such as MinAndMax.