CUDA devices have problems with recursive algorithms that have no
well-defined depth because the stack on a CUDA device tends to be small.
Fix the problem for BoundingIntervalHierarchyExec by changing to a
state-machine-based algorithm that follows the hierarchy up and down.
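The idea of replacing recursion with a state machine can be sketched as
follows. This is only an illustration on a generic binary tree with parent
links, not the actual BoundingIntervalHierarchyExec code; the Node layout,
the State enum, and SumLeaves are hypothetical names invented for this
sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node layout: indices into a flat array, -1 means "none".
struct Node
{
  int Parent = -1;
  int Left = -1;
  int Right = -1;
  int Value = 0; // payload; only leaf values are summed in this sketch
};

// Records how the traversal arrived at the current node.
enum class State
{
  Descending,    // arrived from the parent
  FromLeftChild, // returned from the left child
  FromRightChild // returned from the right child
};

// Visits every leaf under root without recursion or an explicit stack
// and returns the sum of the leaf values.
inline int SumLeaves(const std::vector<Node>& nodes, int root)
{
  int sum = 0;
  if (root < 0)
  {
    return sum;
  }
  assert(root < static_cast<int>(nodes.size()));

  int current = root;
  State state = State::Descending;
  while (true)
  {
    const Node& node = nodes[current];
    if (state == State::Descending)
    {
      if (node.Left >= 0)
      {
        current = node.Left; // keep descending to the left
        continue;
      }
      if (node.Right >= 0)
      {
        current = node.Right; // no left child, descend right
        continue;
      }
      sum += node.Value; // leaf: process it, then climb
    }
    else if (state == State::FromLeftChild && node.Right >= 0)
    {
      current = node.Right; // left subtree done, descend right
      state = State::Descending;
      continue;
    }

    // This subtree is finished: stop at the root, otherwise climb.
    if (current == root)
    {
      break;
    }
    const int parent = node.Parent;
    state = (nodes[parent].Left == current) ? State::FromLeftChild
                                            : State::FromRightChild;
    current = parent;
  }
  return sum;
}
```

The only per-traversal storage is the current index and the state, so the
memory needed does not grow with the depth of the hierarchy.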
db0f5c31b Add a Transfer object for ArrayHandleVirtual
99cb10b9a Fix warnings about override keyword
0ff83e94b Properly handle conditions when VirtualStorage is null
0571c6335 Add missing allocation methods to ArrayHandleVirtual
68b2e5e65 Add move constructors to ArrayHandle subclasses
0b32831af Make ArrayHandleVirtual conform with other ArrayHandle structure
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1557
2d6a63948 Speed up the timer test
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1567
Previously, UnitTestTimer paused for 1 second each time it wanted to
test that the timer did (or didn't) record the time elapsing. This was
done 3 times per test and tested separately on (currently) 5 devices.
The accumulated 15 seconds is not a whole lot, but it can add up when
coupled with the well over 300 other tests.
This change moves from the POSIX sleep function (which is limited to
increments of 1 second) to the updated C++ thread/chrono classes. The
amount of time to wait each time is now 0.25 seconds, which should speed
the test up by 4 times. The risk is that the shorter wait times can
throw off the results if the computer being run on is busy. If that is
the case, we can bump up the wait time (perhaps to 0.5 seconds).
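A sub-second wait with the standard thread/chrono classes might look like
the sketch below. WaitAndMeasure is a hypothetical helper written for this
illustration and is not taken from UnitTestTimer.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Sleeps for the requested number of seconds (fractions allowed) and
// returns the wall time that actually passed, measured with a steady clock.
inline double WaitAndMeasure(double seconds)
{
  assert(seconds >= 0.0);
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(std::chrono::duration<double>(seconds));
  const std::chrono::duration<double> elapsed =
    std::chrono::steady_clock::now() - start;
  return elapsed.count();
}
```

Calling WaitAndMeasure(0.25) blocks for roughly a quarter of a second. On a
busy machine the measured time can be noticeably longer than requested,
which is the risk described above.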
Previously, ArrayHandleVirtual was using the default Transfer object.
This was problematic because it would copy/allocate things in the
execution environment independently from the array that it was wrapped
around. This caused several negative effects, particularly for CUDA
devices. First, if the data were already on the device (or the array was
implicit), a second copy of the data would be made. Second, the copy to
the device is likely less efficient. Third (and worst of all), the data
did not always get pulled back to the original array correctly.
This commit also contains instantiations of ArrayHandleVirtual and its
components for the most common types.
If an ArrayHandleVirtual is constructed without an underlying concrete
array handle, then the Storage<T, StorageTagVirtual> holds a
StorageVirtual pointer that is null. Generally, a null
ArrayHandleVirtual cannot do much, but its operations should still be
correct. There were a few places where the Storage would blindly try to
use its StorageVirtual pointer without checking it first. This adds some
conditions that should correct the behavior when StorageVirtual is null.
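The kind of guard described here can be sketched as follows. The class names
below are hypothetical stand-ins and do not reproduce the real Storage or
StorageVirtual interfaces; the point is only that every operation checks the
pointer before forwarding.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for the polymorphic storage implementation.
struct VirtualStorageSketch
{
  virtual ~VirtualStorageSketch() = default;
  virtual std::size_t GetNumberOfValues() const = 0;
  virtual void ReleaseResources() = 0;
};

// Trivial concrete implementation used to exercise the wrapper.
struct FixedSizeStorageSketch : VirtualStorageSketch
{
  explicit FixedSizeStorageSketch(std::size_t size) : Size(size) {}
  std::size_t GetNumberOfValues() const override { return this->Size; }
  void ReleaseResources() override { this->Size = 0; }
  std::size_t Size;
};

// Hypothetical wrapper that owns a possibly-null implementation pointer.
class StorageWrapperSketch
{
public:
  StorageWrapperSketch() = default; // null implementation
  explicit StorageWrapperSketch(std::shared_ptr<VirtualStorageSketch> impl)
    : Impl(std::move(impl))
  {
  }

  bool IsValid() const { return this->Impl != nullptr; }

  // A null storage behaves like an empty array.
  std::size_t GetNumberOfValues() const
  {
    return this->Impl ? this->Impl->GetNumberOfValues() : 0;
  }

  // Releasing a null storage is a no-op rather than a crash.
  void ReleaseResources()
  {
    if (this->Impl)
    {
      this->Impl->ReleaseResources();
    }
  }

private:
  std::shared_ptr<VirtualStorageSketch> Impl;
};
```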
There is a test to ensure that basic VTK-m classes have proper move
constructors that do not throw exceptions. Some of these are subclasses
of ArrayHandle. Add these move constructors to the
VTK_M_ARRAY_HANDLE_SUBCLASS macros so they get automatically added.
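As a rough illustration of the pattern, the sketch below uses a hypothetical
macro (SKETCH_HANDLE_SUBCLASS, not the real VTK_M_ARRAY_HANDLE_SUBCLASS) to
inject a non-throwing move constructor into a subclass, and checks the
property at compile time the way a test might.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical base class standing in for an array handle.
class HandleBaseSketch
{
public:
  HandleBaseSketch() = default;
  HandleBaseSketch(const HandleBaseSketch&) = default;
  HandleBaseSketch(HandleBaseSketch&&) noexcept = default;
  HandleBaseSketch& operator=(const HandleBaseSketch&) = default;

  void Resize(std::size_t size) { this->Data.resize(size); }
  std::size_t Size() const { return this->Data.size(); }

private:
  std::vector<int> Data;
};

// Hypothetical macro that adds the boilerplate constructors, including a
// noexcept move constructor, to every subclass that uses it.
#define SKETCH_HANDLE_SUBCLASS(classname, superclass)                        \
  classname() = default;                                                     \
  classname(const classname&) = default;                                     \
  classname(classname&& src) noexcept : superclass(std::move(src)) {}        \
  classname& operator=(const classname&) = default;

class CountingHandleSketch : public HandleBaseSketch
{
public:
  SKETCH_HANDLE_SUBCLASS(CountingHandleSketch, HandleBaseSketch)
};

// Compile-time check that the subclass can be moved without throwing.
static_assert(std::is_nothrow_move_constructible<CountingHandleSketch>::value,
              "subclass must have a noexcept move constructor");
```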
Previously, ArrayHandleVirtual was defined as a specialization of
ArrayHandle with the virtual storage tag. This was because the storage
object was polymorphic and needed to be handled specially. These changes
moved the existing storage definition to an internal class, and then
managed the pointer to that implementation class in a Storage object
that can be managed like any other storage object.
Also moved the implementation of StorageAny into the implementation of
the internal storage object.
6797c6e33 Specify return type for GetTimerImpl
25f3432b1 Increase the conditions on which Timer is tested
4d9ce2488 Synchronize CUDA timer when stopping it
85265a9c8 Add const correctness to Timer
465508993 Allow resetting Timer with a new device
dd4a93952 Enable initializing Timer with a DeviceAdapterId
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Haocheng LIU <haocheng.liu@kitware.com>
Merge-request: !1562
The internal function GetTimerImpl has a rather complex expression for
its return type. Previously this was derived using decltype, but one of
the versions of Visual Studio barfed on that for some reason. So now the
return type is declared explicitly.
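The difference between a derived and an explicit return type can be
sketched as below, assuming the derivation in question is the C++ decltype
keyword. The tuple of timer types and both function names are hypothetical
and are not the actual GetTimerImpl signature.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical per-device timer implementations held in a tuple.
struct SerialTimerSketch
{
  int Id = 1;
};
struct CudaTimerSketch
{
  int Id = 2;
};
using TimerTupleSketch = std::tuple<SerialTimerSketch, CudaTimerSketch>;

// Derived form: the return type repeats the body expression in decltype.
template <std::size_t Index>
auto GetImplDeduced(TimerTupleSketch& timers)
  -> decltype(std::get<Index>(timers))
{
  return std::get<Index>(timers);
}

// Explicit form: the same type is spelled out with std::tuple_element.
template <std::size_t Index>
typename std::tuple_element<Index, TimerTupleSketch>::type& GetImplExplicit(
  TimerTupleSketch& timers)
{
  return std::get<Index>(timers);
}

// Both spellings name the same type.
static_assert(
  std::is_same<decltype(GetImplDeduced<1>(std::declval<TimerTupleSketch&>())),
               decltype(GetImplExplicit<1>(
                 std::declval<TimerTupleSketch&>()))>::value,
  "deduced and explicit return types should match");
```

The explicit form is more verbose, but it does not depend on how well a
given compiler handles decltype in a trailing return type.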
UnitTestTimer was changed to initialize timers on all possible devices
and to get times from all possible devices. It also tests all possible
ways to set the device in the Timer.
Previously, when Stop was called on a CUDA timer, it would record a stop
event but would not synchronize on it at that time. Instead,
synchronization happened only when GetElapsedTime was called. The problem
is that the time of the event is only marked when synchronize is called.
Thus, if the event completed before GetElapsedTime was called, the time
between when the event actually happened and when GetElapsedTime was
called would be counted as part of the elapsed time, which is incorrect.
Fix the problem by synchronizing when Stop is called. Although this
makes the Timer more invasive, generally using the Timer can cause
synchronization to happen. This behavior is consistent with the Timer
implementation for other devices.
It should be possible to query a vtkm::cont::Timer without modifying it.
As such, its query functions (such as Stopped and GetElapsedTime) should
be const.
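A minimal sketch of what const query functions allow is shown below.
TimerSketch is a hypothetical host-only class built on std::chrono and is
not the vtkm::cont::Timer implementation.

```cpp
#include <chrono>

// Hypothetical timer whose query functions do not modify its state.
class TimerSketch
{
  using Clock = std::chrono::steady_clock;

public:
  void Start()
  {
    this->StartTime = Clock::now();
    this->IsStopped = false;
  }

  void Stop()
  {
    this->StopTime = Clock::now();
    this->IsStopped = true;
  }

  // Queries are const, so they can be called through a const reference.
  bool Stopped() const { return this->IsStopped; }

  double GetElapsedTime() const
  {
    const Clock::time_point end = this->IsStopped ? this->StopTime : Clock::now();
    return std::chrono::duration<double>(end - this->StartTime).count();
  }

private:
  Clock::time_point StartTime = Clock::now();
  Clock::time_point StopTime = StartTime;
  bool IsStopped = true;
};

// Reporting code can take the timer by const reference.
inline double ReportElapsed(const TimerSketch& timer)
{
  return timer.GetElapsedTime();
}
```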