As we remove more and more virtual methods from VTK-m, I expect several
users will be interested in completely removing them from the build for
several reasons.
1. They may be compiling for hardware that does not support virtual
methods.
2. They may need to compile for CUDA but need shared libraries.
3. It should go a bit faster.
To enable this, a CMake option named `VTKm_NO_DEPRECATED_VIRTUAL` is
added. It defaults to `OFF`. But when it is `ON`, none of the code that
both uses virtuals and is deprecated will be built.
Currently, only `ArrayHandleVirtual` is deprecated, so the rest of the
virtual classes will still be built. As we move forward, more will be
removed until all virtual method functionality is removed.
`UnknownArrayHandle` has special behavior when putting in or getting out
an `ArrayHandleMultiplexer` or an `ArrayHandleCast`. When putting in
either of these, `UnknownArrayHandle` gets the actual array stored in
the multiplexer and cast so that you can later retrieve the base array.
Likewise, when you get the array out with `AsArrayHandle`, you can give
it an `ArrayHandleCast` or `ArrayHandleMultiplexer`, and you will get
the base array placed inside of it.
The Visual Studio compiler has an annoying habit where if you use a
templated class or method with a deprecated class as a template
parameter, you will get a deprecation warning where that class is used
in the templated thing. Thus, if you want to suppress the warning, you
have to supress every instance of the template, not just where the
template is declared.
This is annoying behavior that is thankfully not replicated in other
compilers.
The implementation of `VariantArrayHandle` has been changed to be a
relatively trivial subclass of `UnknownArrayHandle`.
The advantage of this change is twofold. First, it removes
`VariantArrayHandle`'s dependence on `ArrayHandleVirtual`, which gets us
much closer to deprecating that class. Second, it ensures that
`UnknownArrayHandle` is a reasonable replacement for
`VariantArrayHandle`, so we can move forward with replacing that.
The new method `ArrayHandle::DeepCopy` had the pattern such that the
data in the `this` array was pushed to the provided destination array.
However, this is the opposite pattern used in the equivalent method in
VTK, which takes the data from the provided array and copies it to
`this` array.
So, swap the pattern to match that of VTK. Also, make the method name
more descriptive by renaming it to `DeepCopyFrom`. Hopefully, users will
read that to mean given the `ArrayHandle`, you copy data from the other
provided `ArrayHandle`.
A recent modification to `ArrayCopy` created a fast path for when
copying arrays of the same type. This fast path just deep copies the
buffers. The issue was that the buffer copy was creating new buffers
instead of updating the existing buffers. The passed in `ArrayHandle`
would get updated to the new buffers, but any other `ArrayHandle`s
sharing those buffers would still have the old versions. That would lead
to unexpected errors in code like this.
```cpp
vtkm::cont::ArrayHandle<T> outArray1;
vtkm::cont::ArrayHandle<T> outArray2 = outArray1;
vtkm::cont::ArrayCopy(inArray, outArray2);
```
If `inArray` was a different type than `outArray2`, then the data for
both `outArray1` and `outArray2` would be updated (which is the expected
behavior for something considered a "pointer"). However, if `inArray`
was the same type as `outArray2`, then the fast path would change
`outArray2` but leave `outArray1` the same.
This behavior has been corrected so that, in this case, the data of
`outArray1` always follows that of `outArray2`.
C++ was not resolving the overloads of `ArrayCopyImpl` as expected,
which was causing `ArrayCopy` to sometimes use a less efficient method
for copying.
Also fix an issue with `Buffer::DeepCopy` that caused a deadlock when
copying a buffer that was not actually allocated anywhere (as well as
failing to copy the metadata, which was probably the whole point).
The old style `ArrayHandle` stored most of its state, including the
data, in the `vtkm::cont::internal::Storage` object (templated to the
type of array). The new style of `ArrayHandle` stores the data itself in
`Buffer` objects, and recent changes to `Buffer` allow metadata to be
stored there, too.
These changes make it pretty unnecessary to hold any state at all in the
`Storage` object. This is good since the sharing of state from one type
of `ArrayHandle` to another (such as by transforming the data), can be
done by just sharing the `Buffer` objects.
To reinforce this behavior, the `Storage` object has been changed to
make it completely stateless. All the methods of `Storage` must be
marked as `static`.
While in the transition between two types of `ArrayHandle`
implementations, we need to declare when an `ArrayHandle` is implemented
with the new style. To consolidate, create a
`VTKM_ARRAY_HANDLE_NEW_STYLE` to override the default `ArrayHandle`
implementation with the `ArrayHandleNewStyle` implementation.
The testing helper class provided a method named `GetTestDataBasePath`
that returned the base path to all the data files stored in the VTK-m
repo. This is fine, but it was a little cumbersome to build filenames.
To make things easier, there is now a new method named `DataPath` that
takes a string of the filename (or, rather, subpath) to the file in that
directory and automatically builds the path to it.
4345fe26b Store the number of bits of a BitField in the Buffer's metadata
da0403be7 Add metadata to Buffer object.
a84891cd3 Update ArrayHandleBitField to new array style with Buffer
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2218
The number of bits in a `BitField` cannot be directly implied from the
size of the buffer (because the buffer gets padded to the nearest sized
word). Thus, the `BitField stored the number of bits in its own
internals.
Unfortunately, that caused issues when passing the `BitField` data
between it and an `ArrayHandleBitField`. If the `ArrayHandleBitField`
resized itself, the `BitField` would not see the new size because it
ignored the new buffer size.
To get around this problem, `BitField` now declares its own
`BufferMetaData` that stores the number of bits. Now, since the number
of bits is stored in the `Buffer` object, it is sufficient to just share
the `Buffer` to synchronize all of the state.
One of the goals of the `Buffer` object is to allow sharing of data
among objects that will interpret the data differently or give a
different interface over the data. However, when sharing only the array,
important metadata can become lost.
Provide a field that can store some custom metadata in the buffer object
so that the rest of the state can follow the buffer object around.
Now that we have the functions in `vtkm/Atomic.h`, we can deprecate (and
eventually remove) the more cumbersome classes `AtomicInterfaceControl`
and `AtomicInterfaceExecution`.
Also reversed the order of the `expected` and `desired` parameters of
`vtkm::AtomicCompareAndSwap`. I think the former order makes more sense
and matches more other implementations (such as `std::atomic` and the
GCC `__atomic` built ins). However, there are still some non-deprecated
classes with similar methods that cannot easily be switched. Thus, it's
better to be inconsistent with most other libraries and consistent with
ourself than to be inconsitent with ourself.
Now that we have atomic free functions (e.g. `vtkm::AtomicAdd()`), we no
longer need special implementations for control and each execution
device. (Well, technically we do have special implementations for each,
but they are handled with compiler directives in the free functions.)
Convert the old atomic interface classes (`AtomicInterfaceControl` and
`AtomicInterfaceExecution`) to use the new atomic free functions. This
will allow us to test the new atomic functions everywhere that atomics
are used in VTK-m.
Once verified, we can deprecate the old atomic interface classes.
ed41874cc Consolidate tests for base vtkm code that is device-specific
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2219
Some of the code in the base `vtkm` namespace is device specific. For
example, the functions in `Math.h` are customized for specific devices.
Thus, we want this code to be specially compiled and run on these
devices.
Previously, we made a header file and then added separate tests to each
device package. That was created before we had ways of running on any
device. Now, it is much easier to compile the test a single time for all
devices and use the `ALL_BACKENDS` feature of `vtkm_unit_tests` CMake
function to automatically create the test for all devices.
8983154e9 Add DeepCopy to ArrayHandle
694ba7e92 Change ArrayCopy to deep copy Buffer objects where possible
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2212
2d1b609b3 Use Ubuntu instead of rhel8 for cuda+kokkos
769248583 Make sure we use c++14 when using CUDA 11+
64efa6401 Kokkos: make sure we don't pass multiple rdc flags
b2f4c8e5e Switch -O3 to -O2 on Linux with Cuda 10
db57ed26a Fix warnings
452f61e29 Add Kokkos backend
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2164
`ArrayHandle::DeepCopy` creates a new `ArrayHandle` of the same type and
deep copies the data into it.
This functionality is similar to `ArrayCopy`. However, it can be used
without having to compile for the device on which the copy happens.
Now that the data in an `ArrayHandle` is stored in `Buffer` objects, we
now have a more efficient way of doing deep copies of memory. Rather
than call `Algorithm::Copy`, which iterates over the array and copies
each item, `ArrayCopy` now uses the `Buffer` interface to do direct
device-to-device (or host-to-host) mem copies. This should be more
efficent and take less time to compile.
Note that this direct `Buffer` copy only works if the two `ArrayHandle`s
are of the same type. If they are different, `ArrayCopy` still has to
fall back to using `Algorithm::Copy`.
Also note that not all `ArrayHandle`s are using the new `ArrayHandle`
interface (and therefore not using `Buffer` objects). Thus, a fallback
is still available for old `ArrayHandle` types.
This has no real change in the operation, but it will simplify code as
we convert `ArrayHandle`s to the new type. We will be able to write
simple runtime code rather than complex metaprogramming to determine the
number of buffers to use.
For `make_ArrayHandle` and `make_Field` when it is determined that the
data can be safely moved, just silently move instead of copy instead of
printing a log message saying the copy flag will be ignored.
Also fix an issue with `make_ArrayHandle` when the data was not moved
when it could have been.
The recent version of ArrayHandleBasic allocates typeless arrays without
any initialization. This can cause issues with types that require a
constructor. The UnitTestVariantArrayHandle was trying to create an
ArrayHandle with an std::string, and the uninitialized strings were
causing crashes on some platforms.
We have made several improvements to adding data into an `ArrayHandle`.
## Moving data from an `std::vector`
For numerous reasons, it is convenient to define data in a `std::vector`
and then wrap that into an `ArrayHandle`. It is often the case that an
`std::vector` is filled and then becomes unused once it is converted to an
`ArrayHandle`. In this case, what we really want is to pass the data off to
the `ArrayHandle` so that the `ArrayHandle` is now managing the data and
not the `std::vector`.
C++11 has a mechanism to do this: move semantics. You can now pass
variables to functions as an "rvalue" (right-hand value). When something is
passed as an rvalue, it can pull state out of that variable and move it
somewhere else. `std::vector` implements this movement so that an rvalue
can be moved to another `std::vector` without actually copying the data.
`make_ArrayHandle` now also takes advantage of this feature to move rvalue
`std::vector`s.
There is a special form of `make_ArrayHandle` named `make_ArrayHandleMove`
that takes an rvalue. There is also a special overload of
`make_ArrayHandle` itself that handles an rvalue `vector`. (However, using
the explicit move version is better if you want to make sure the data is
actually moved.)
## Make `ArrayHandle` from initalizer list
A common use case for using `std::vector` (particularly in our unit tests)
is to quickly add an initalizer list into an `ArrayHandle`. Now you can
by simply passing an initializer list to `make_ArrayHandle`.
## Deprecated `make_ArrayHandle` with default shallow copy
For historical reasons, passing an `std::vector` or a pointer to
`make_ArrayHandle` does a shallow copy (i.e. `CopyFlag` defaults to `Off`).
Although more efficient, this mode is inherintly unsafe, and making it the
default is asking for trouble.
To combat this, calling `make_ArrayHandle` without a copy flag is
deprecated. In this way, if you wish to do the faster but more unsafe
creation of an `ArrayHandle` you should explicitly express that.
This requried quite a few changes through the VTK-m source (particularly in
the tests).
## Similar changes to `Field`
`vtkm::cont::Field` has a `make_Field` helper function that is similar to
`make_ArrayHandle`. It also features the ability to create fields from
`std::vector`s and C arrays. It also likewise had the same unsafe behavior
by default of not copying from the source of the arrays.
That behavior has similarly been depreciated. You now have to specify a
copy flag.
The ability to construct a `Field` from an initializer list of values has
also been added.
While compiling UnitTestVariantArrayHandle, some versions of gcc
(between 6 and 8, I think) gave a warning like the following:
```
../vtkm/cont/StorageVirtual.h:227:12: warning: 'vtkm::Id vtkm::cont::internal::detail::StorageVirtualImpl<T, S>::GetNumberOfValues() const [with T = std::__cxx11::basic_string<char>; S = vtkm::cont::StorageTagImplicit<{anonymous}::UnusualPortal<std::__cxx11::basic_string<char> > >]' declared 'static' but never defined [-Wunused-function]
```
This warning makes no sense because it is refering to a method that is
not declared static. (In fact, it overrides a virtual method.)
I believe this is an obscure bug in these versions of gcc. I found a
[stackoverflow post] that seems to have the same problem, but no
workaround was found.
The warning originated from code that had little effect. It was part of
a test with a custom ArrayHandle storage type that was already disabled
for other reasons. Just removed the code.
[stackoverflow post]: https://stackoverflow.com/questions/56615695/how-to-fix-declared-static-but-never-defined-on-member-function
c689a68c5 Suppress bad deprecation warnings in MSVC
a3f23a03b Do not cast to ArrayHandleVirtual in VariantArrayHandle::CastAndCall
f6b13df51 Support coordinates of both float32 and float64
453e31404 Deprecate ArrayHandleVirtualCoordinates
be7f06bbe CoordinateSystem data is VariantArrayHandle
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2177
The Microsoft compiler has this annoying and stupid behavior where if
you have a generic templated method/function and that method is
instantiated with a deprecated class, then the compiler will issue a
C4996 warning even if the calling code is suppressing that warning
(because, for example, you are implementing other deprecated code and
the use is correct). There is no way around this other than suppressing
the warnings for all uses of the templated method.
We are moving to deprecate `ArrayHandleVirtual`, so we are removing the
feature where `VariantArrayHandle::CastAndCall` automatically casts to
an `ArrayHandleVirtual` if possible.
The big reason to make this change now (as opposed to later when
`ArrayHandleVirtual` is deprecated) is to improve compile times.
This prevents us from having to compile an extra code path using
`ArrayHandleVirtual`.
`CoordinateSystem` differed from `Field` in that its `GetData`
method returned an `ArrayHandleVirtualCoordinates` instead of
a `VariantArrayHandle`. This is probably confusing since
`CoordianteSystem` inherits `Field` and has a pretty dramatic
difference in this behavior.
In preparation to deprecate `ArrayHandleVirtualCoordinates`, this
changes `CoordiantSystem` to be much more like `Field`. (In the
future, we may change the `CoordinateSystem` to point to a `Field`
rather than be a special `Field`.)
A method named `GetDataAsMultiplexer` has been added to
`CoordinateSystem`. This method allows you to get data from
`CoordinateSystem` as a single array type without worrying
about creating functors to handle different types and without
needing virtual methods.
18b5be92d Fix issue with CUDA and ArrayHandleMultiplexer
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2168
`ArrayHandle::PrepareForOutput` often has to reallocate the array to the
specified size. Previously, this allocation was not happening with the
`Token` that is passed to `PrepareForOutput`. If the `ArrayHandle` is
already attached or enqueued for that `Token`, then the allocation would
deadlock.
You can now pass a `Token` object to `Allocate`, which is what
`PrepareForOutput` does.
When you try to call the `Reduce` operation in the CUDA device adapter
with a sufficently complex interator type, you get a compile error
that says `error: cannot pass an argument with a user-provided
copy-constructor to a device-side kernel launch`.
This appears to be a bug in either nvcc or Thrust. I believe it is
related to the following reported issues:
* https://github.com/thrust/thrust/issues/928
* https://github.com/thrust/thrust/issues/1044
Work around this problem by making a special condition for calling
`Reduce` with an `ArrayHandleMultiplexer` that calls the generic
algorithm in `DeviceAdapterAlgorithmGeneral` instead of the algorithm in
Thrust.
Often when a user gives memory to an `ArrayHandle`, she wants data to be
written into the memory given to be used elsewhere. Previously, the
`Buffer` objects would delete the given buffer as soon as a write buffer
was created elsewhere. That was a problem if a user wants VTK-m to write
results right into a given buffer.
Instead, when a user provides memory, "pin" that memory so that the
`ArrayHandle` never deletes it.
The buffer class encapsulates the movement of raw C arrays between
host and devices.
The `Buffer` class itself is not associated with any device. Instead,
`Buffer` is used in conjunction with a new templated class named
`DeviceAdapterMemoryManager` that can allocate data on a given
device and transfer data as necessary. `DeviceAdapterMemoryManager`
will eventually replace the more complicated device adapter classes
that manage data on a device.
The code in `DeviceAdapterMemoryManager` is actually enclosed in
virtual methods. This allows us to limit the number of classes that
need to be compiled for a device. Rather, the implementation of
`DeviceAdapterMemoryManager` is compiled once with whatever compiler
is necessary, and then the `RuntimeDeviceInformation` is used to
get the correct object instance.
143e3d39a remove unused type alias
01a448663 Merge branch 'master' into uniform_real
c67e5bb12 fixe warnings about implicit type conversion
1e4294392 Add deterministic seed to avoid potential spurious failure
5b0e309b9 the random source is still 64 bits
cc3061bab Avoid calling ReadPortal() all the time
9bf6dea22 remove inline initialization of seed
e69308047 Add statistics base testing, add Flot32 RNG
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2148
If a test throws any unexpected exception, the test is supposed to
detect that and fail. For the STL exceptions, the test failed to return
an error code. Fix that.
Many of the operations of `VariantArrayHandleBase` are not dependent on
the TypeList parameter of the class. Still others can operate just as
well by providing a type list to a method. Thus, it is convenient to
create a superclass that is not templated. That allows us to pass around
a `VariantArrayHandle` when the type list does not matter.
This superclass is called `VariantArrayHandleCommon` because "base" was
already taken.
This makes it clear that it returns true for an invalid array handle.
The previous name implied that it was looking for an ArrayHandle in some
"valid" set, which is the opposite.
The ArrayPortalMultiplexer always called Set on the delegate portals.
However, sometimes the portals were not supported. Instead, only call
when the delegate is supported.
Probably the "right" thing to do would be to check all posible portals
for support, but that sounds like it would slow down the compile.
Previously, the VariantArrayHandle::AsMultiplexer operation silently
failed and returned an invalid ArrayHandleMultiplexer. This is now
changed to throw an exception if the desired ArrayHandleMultiplexer
cannot hold the VariantArrayHandle's array.
de3bda373 Use deque instead of list for ArrayHandle queue
498d44548 Pass Token::Reference by value
c32c9e8e8 Fix deadlock when changing device during read
99e14ab8a Add proper enqueuing of Tokens for ArrayHandle
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2130
c0dee7402 make it explicit that we are using 64-bit unsigned integer in bit op
34f350588 Added changelog
e9f584a91 ArrayHandleRandomUniformReal
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !2116
934f085e0 Build diy as a library
f0a37ac6a Merge branch 'upstream-diy' into diy-mpi-nompi
7687aabf8 diy 2020-06-05 (b62915aa)
6ca2b9f87 Point to new version of Diy
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2123
Because ArrayHandle currently only supports one device at a time, it is
possible that a `PrepareForInput` might actually need to wait for write
access so that it could move the data between devices. However, we don't
want the `Token` to be attached for writing because that could block
other read operations.
To get around this, add a hack to WaitToWrite to allow it to attach for
reading instead of writing.
An issue that was identified for the thread safety of `ArrayHandle` is
that if several threads are waiting to use an `ArrayHandle`, there might
be an expectation of the order in which the operations happen. For
example, if one thread is modifying the contents of an `ArrayHandle` and
another is reading those results, we would need the first one to start
before the second one.
To solve this, a queue is added to `ArrayHandle` such that when waiting
to read or write an `ArrayHandle` the `Token` has to be at the top of
the queue in addition to other requirements being met.
Additionally, an `Enqueue` method is added to add a `Token` to the queue
without blocking. This allows a control thread to queue the access and
then spawn a thread where the actual work will be done. As long as
everything is enqueued on the main thread, the operations will happen in
the expected order.
Initially, the probe filter would simply not set a value if a sample was
outside the input `DataSet`. This is not great as the memory could be
left uninitalized and lead to unpredictable results. The testing
compared these invalid results to 0, which seemed to work but is
probably unstable.
This was partially fixed by a previous change that consolidated to
mapping of cell data with a general routine that permuted data. However,
the fix did not extend to point data in the input, and it was not
possible to specify a particular invalid value.
This change specifically updates the probe filter so that invalid values
are set to a user-specified value.
251bd82b8 Significantly improve FlyingEdges performance across all devices
fa9373801 Rework FlyingEdges::Pass1 to handle NUMA and CUDA requirements.
769a10b47 FlyingEdge Normal and Point generation occurs in Pass4
93d87e06f Optimize StructuredPointGradient for non boundary points.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2080
We now use SumYAxis when executing with CUDA for better memory patterns.
Instead of using the heavy Pass4/Pass4WithNormals, CUDA now uses a
2 pass approach with the second pass outputting the normals and
coordinates using with significantly less warp divergence
Generally, fields that have a WHOLE_MESH association might be valid even
if the structure of the mesh changes. Thus, it makes sense for filters
to pass this data pretty much all the time.
Also cleaned up some code and comments to make the relationship between
`MapFieldOntoOutput` and `DoMapField` more clear.
The only reason Keys has a template is so that it can hold a UniqueKeys
array and provide the key for each group. If that is not needed and you
want to implement a library function that takes a keys object, you can
now grab the Keys superclass KeysBase. KeysBase is not templated, so you
can pass it to a standard method in a library.
The version of `Filter::Execute` that takes a policy as an argument is now
deprecated. Filters are now able to specify their own fields and types,
which is often why you want to customize the policy for an execution. The
other reason is that you are compiling VTK-m into some other source that
uses a particular types of storage. However, there is now a mechanism in
the CMake configuration to allow you to provide a header that customizes
the "default" types used in filters. This is a much more convenient way to
compile filters for specific types.
One thing that filters were not able to do was to customize what cell sets
they allowed using. This allows filters to self-select what types of cell
sets they support (beyond simply just structured or unstructured). To
support this, the lists `SupportedCellSets`, `SupportedStructuredCellSets`,
and `SupportedUnstructuredCellSets` have been added to `Filter`. When you
apply a policy to a cell set, you now have to also provide the filter.
24d022b02 Implement and test ImageReader and ImageWriter capabilities in the io library
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1967
When serializing fields, you have to select what underlying data types
of the field you want to support serializing. With recent changes in the
default policy, attempts to serialize a field often resulted in trying
to use the `vtkm::ListUniversal` type list, which is infinite.
Obviously, this cannot be compiled.
Instead, when the `vtkm::ListUniversal` list is encountered, use
`vtkm::TypeListAll` instead.
This refactor aims to increase the performance of
AddField / Getfield at the expense of memory usage and
some bits of performances when only few fields are used
Moving to non-contiguous memory will impact cpu cache benefits,
however, since each of the Field has again another pointer to
its actual data, this benefits where actually just small.
Also the choice of map v.s. unordered_map is about the number of
elements, very likely few rehashing happenings from until <100
This will also reduce the memory fragmentation caused by vectors.
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
Previously, the policy specified which field types the filter should
operate on. The filter could remove some types, but it was not able to
add any types.
This is backward. Instead, the filter should specify what types its
supports and the policy may cull out some of those.
This change is needed for being able to use different thread indices types
without changing Fetchs. Basically decoupling those two areas.
1. This commit removes concrete specialization instantiations of
ThreadIndicesTypes in all of the Fetch's specializations.
2. It also moves the ThreadIndicesType template parameter from the Fetch
struct to a template parameter in their methods Load/Store.
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
This fixes, which where triggered since in the new CI, one of the
docker runner set `OMP_NUM_THREADS=3`:
1. `UnitTestOpenMPDeviceAdapter`
2. `UnitTestMeshQualityFilter`
In the redution optimized implementation for _OpenMP_, it unrolls
the reduce loop in iterations of four elements. The last iteration
in the loop might overflow the loop end element (when it is not a
multiple of four).
This commit fixes this by setting the OpenMP unrolled reduce loop
end element to its previous closest multiple of four of the original end
element.
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
If you gave ReduceByKey a fancy output array that decorated another
array, you could get a runtime error for using an invalid array (if the
device adapter used the generic algorithm). The problem was that
ReduceByKey creates a temporary array, and that array was given the same
storage as the output array. That might not be valid for fancy arrays,
so instead use the default storage for the temporary array.
If you gave ScanInclusiveByKey a fancy output array that decorated
another array, you would get a runtime error for using an invalid array.
The problem was that ScanInclusiveByKey creates a temporary output array
and then copies the result to the actual output array. The problem was
that the temporary output array was given the same storage as the output
array, which won't work if the output array is fancy. Instead, make the
storage for the temporary array default.
One of the dashboard compilers (gcc 4.8 with cuda, I think) was giving
warnings about an unused parameter. Apparently, the compiler did not
like the parameters defined for `default` assignment operators.
ef49a71aa Merge branch 'philox' of gitlab.kitware.com:ollielo/vtk-m into philox
563642400 remove outdated comments on type-punning
be7c18544 tried to supress warning about type punning
c795f74b2 Merge branch 'master' into philox
2b43fcb25 change function signature for work with MSVC
13e7ac362 Merge branch 'philox' of gitlab.kitware.com:ollielo/vtk-m into philox
1e2fd540d add Doxygen documentation
f2ea17917 add Doxygen documentation
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1990