Commit Graph

412 Commits

Author SHA1 Message Date
Robert Maynard
f4840618cf Make sure ThrustPatches is included before thrust. 2019-04-03 08:51:05 -04:00
Robert Maynard
b9e0e541b8 VTK-m once again uses consistent include style 2019-03-28 14:12:08 -04:00
Robert Maynard
256e0c3c11 Merge topic 'rename_to_GetRuntimeDeviceTracker'
ae11e115a RuntimeDeviceTracker: Remove `Global` from names

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1592
2019-03-24 08:17:02 -04:00
Robert Maynard
ae11e115a0 RuntimeDeviceTracker: Remove Global from names 2019-03-22 08:53:26 -07:00
Robert Maynard
6cdf6cb672 Less aggressive defaults for VTK-m compared to summit.
Since we don't have per system checks currently built into
vtk-m we can't use the tuned values for Summit, as they
don't run on all our hardware.
2019-03-20 09:30:34 -07:00
Robert Maynard
3879479185 Improve VTK-m cuda scheduling based on Summit scaling study
When benchmarking the VTK-m algorithms on Summit I discovered
that our scheduling choices aren't optimal for the hardware.

This is a short term fix where we select good numbers for
Summit, and in the future make the defaults controllable
by the calling programming and/or environment variables.

Performance numbers can be found at:
  https://gitlab.kitware.com/snippets/755
2019-03-20 09:30:34 -07:00
Kenneth Moreland
4d9ce24888 Synchronize CUDA timer when stopping it
Previously, when Stop was called on a Cuda timer, it would record a stop
event but it would not synchronize it at that time. Instead, the
synchronize was only called when GetElapsedTime was called. The problem
is that the time of the event is only marked when synchronize is called.
Thus, if the event completed before GetElapsedTime was called, it would
record the time from when the event acutally happened to the time when
GetElapsedTime was called as part of the elapsed time, which is
incorrect.

Fix the problem by synchronizing when Stop is called. Although this
makes the Timer more invasive, generally using the Timer can cause
synchronization to happen. This behavior is consistent with the Timer
implementation for other devices.
2019-02-28 15:08:32 -07:00
Kenneth Moreland
85265a9c84 Add const correctness to Timer
It should be possible to query a vtkm::cont::Timer without modifying it.
As such, its query functions (such as Stopped and GetElapsedTime) should
be const.
2019-02-28 15:08:16 -07:00
Haocheng LIU
0696ae135e Merge topic 'asynchronize-timer'
415252c66 Introduce asynchronous and device independent timer

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Haocheng LIU <haocheng.liu@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1530
2019-02-05 12:02:59 -05:00
Haocheng LIU
415252c662 Introduce asynchronous and device independent timer
The timer class now is asynchronous and device independent. it's using an
similiar API as vtkOpenGLRenderTimer with Start(), Stop(), Reset(), Ready(),
and GetElapsedTime() function. For convenience and backward compability, Each
Start() function call will call Reset() internally and each GetElapsedTime()
function call will call Stop() function if it hasn't been called yet for keeping
backward compatibility purpose.

Bascially it can be used in two modes:

* Create a Timer without any device info. vtkm::cont::Timer time;

  * It would enable timers for all enabled devices on the machine. Users can get a
specific elapsed time by passing a device id into the GetElapsedtime function.
If no device is provided, it would pick the maximum of all timer results - the
logic behind this decision is that if cuda is disabled, openmp, serial and tbb
roughly give the same results; if cuda is enabled it's safe to return the
maximum elapsed time since users are more interested in the device execution
time rather than the kernal launch time. The Ready function can be handy here
to query the status of the timer.

* Create a Timer with a device id. vtkm::cont::Timer time((vtkm::cont::DeviceAdapterTagCuda()));

  * It works as the old timer that times for a specific device id.
2019-02-05 12:01:56 -05:00
Robert Maynard
d0a70946b8 Simplify the DeviceAdapterRuntimeDetectorCuda to not do a kernel launch.
The kernel launch component of the runtime device adapter is fairly
pointless. If the hardware supports CUDA we should expect that
VTK-m has the correct kernel versions.

Plus in the original version if the CUDA device was being used
and the kernel launch returns cudaErrorDevicesUnavailable it
was never possible to restore CUDA support. Now what happens
is that the runtime tracker is marked as failed, but the
calling code can always go back and trying the device again.
2019-02-04 13:27:20 -05:00
Robert Maynard
5508d17c31 Merge topic 'correct_broken_install'
24e71d251 VTK-m yet again has properly installed headers.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1525
2019-01-24 14:59:41 -05:00
Robert Maynard
24e71d251b VTK-m yet again has properly installed headers.
Fixes the install issues mentioned in #342
2019-01-24 14:26:40 -05:00
Allison Vacanti
03fc7b66d0 Add VTKM_CUDA_DEVICE_PASS preprocessing definition.
This is only set while compiling device code, and is useful
for code that needs different implementations on devices (e.g.
they call CUDA device intrinsics, etc).
2019-01-24 11:23:45 -05:00
Robert Maynard
d6f66d17a3 Testing run methods now take argc/argv to init logging/runtime device
`vtkm::cont::testing` now initializes with logging enabled and support
for device being passed on the command line, `vtkm::testing` only
enables logging.
2019-01-17 13:16:27 -06:00
Robert Maynard
4ec5bae02d Remove VTK-m TestBuild infrastructure
The purpose of the TestBuild infrastructure was to confirm that
VTK-m didn't have any lexical issues when it was a pure header
only project. As we now move to have more compiled components
the need for this form of testing is mitigated. Combined
with the issue of TestBuilds causing MSVC issues, we should
just remove this infrastructure.
2019-01-16 10:04:33 -06:00
Kenneth Moreland
2e426ad547 Run the update-control-signature-tags.sh script 2019-01-11 12:23:10 -07:00
Robert Maynard
628dce822e Merge topic 'require_cmake38'
f1e1a524e Require CMake 3.8 to build VTK-m.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1514
2019-01-09 17:02:52 -05:00
Abhishek Yenpure
afd0409189 Merge topic 'code_sprint_locator_fixes'
9b56d41fe Fixing Rectilinear Grid Cell Locator
10e9d47dc Removing std::out print statement from test
34c7b57d8 Merge branch 'code_sprint_locator_fixes' of gitlab.kitware.com:ayenpure/vtk-m into code_sprint_locator_fixes
62ee1a2c8 Updates to the Cell Locators
7eb0de5b7 Merge branch 'code_sprint_locator_fixes' of gitlab.kitware.com:ayenpure/vtk-m into code_sprint_locator_fixes
866b0798d Resolving type warnings
c062f2e26 Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into code_sprint_locator_fixes
797c83891 Adding default constructor and removing wrong comment
...

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1395
2019-01-09 16:23:17 -05:00
Robert Maynard
f1e1a524e9 Require CMake 3.8 to build VTK-m. 2019-01-09 16:01:22 -05:00
ayenpure
62ee1a2c8a Updates to the Cell Locators
- Adding updates to uniform grid cell locator
  - adding OpenMP test, updating copyrights
- Adding rectilinear grid cell locator
  - adding unit tests for serial, tbb, OpenMP, and cuda
- Updating CMakeLists to honor the alphabetical ordering
2019-01-06 17:18:23 -08:00
Robert Maynard
718caaaeac CudaAllocator allows managed memory to be explicitly disabled 2018-12-28 11:30:29 -05:00
Robert Maynard
90bb23de6b CudaAllocator::Initialize correctly uses managed memory when possible
Previously the logic would always think managed memory wasn't supported
2018-12-20 17:21:55 -05:00
ayenpure
c062f2e26c Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into code_sprint_locator_fixes 2018-12-03 07:44:31 -08:00
Allison Vacanti
16c4dde2ee Merge topic 'cuda10_warning'
0e105eae6 cudaPointerAttributes::isManaged deprecated in CUDA 10.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1430
2018-10-10 15:05:57 -04:00
Allison Vacanti
0e105eae6d cudaPointerAttributes::isManaged deprecated in CUDA 10.
Update code to support both the old and new way of checking this.
2018-10-10 13:51:56 -04:00
Allison Vacanti
bd337854ec Initial implementation of general logging.
Addresses #291.
2018-10-02 11:37:55 -04:00
Kenneth Moreland
98a0a20feb Allow ArrayHandleTransform to work with ExecObject
This change allows you to set a subclass of
vtkm::cont::ExecutionObjectBase as a functor
used in ArrayHandleTransform. This latter class will then detect that
the functor is an ExecObject and will call PrepareForExecution with the
appropriate device to get the actual Functor object.

This change allows you to use virtual objects and other device dependent
objects as functors for ArrayHandleTransform without knowing a priori
what device the portal will be used on.
2018-09-05 13:11:04 -06:00
ayenpure
22ca8bce15 Fixing unit test 2018-08-30 10:19:00 -07:00
ayenpure
42e2bb7f9a Updating files with copyrights 2018-08-29 19:46:49 -07:00
ayenpure
594d1934d4 Adding CellLocatorUniformGrid
- Adding a cell locator to locate points in a uniform grid
- Adding unit tests for the new cell locator
2018-08-29 19:30:07 -07:00
Kenneth Moreland
d879188de0 Make DispatcherBase invoke using a TryExecute
Rather than force all dispatchers to be templated on a device adapter,
instead use a TryExecute internally within the invoke to select a device
adapter.

Because this removes the need to declare a device when invoking a
worklet, this commit also removes the need to declare a device in
several other areas of the code.
2018-08-29 19:18:54 -07:00
Allison Vacanti
024a75821d Make DeviceAdapterId constructor protected.
This forces users to use a defined tag, since they shouldn't need
to create their own.
2018-08-24 16:38:08 -04:00
Haocheng LIU
7d22132253 Merge topic 'allow-disabling/enabling-cuda-managed-memory'
e34301eca Allow disabling/enabling of CUDA managed memory via an env variable

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1359
2018-08-17 13:14:02 -04:00
Haocheng LIU
e34301eca8 Allow disabling/enabling of CUDA managed memory via an env variable
By setting the environment variable "VTKM_MANAGEDMEMO_DISABLED" to be 1,
users are able to disable CUDA managed memory even though the hardware is
capable of doing so.
2018-08-17 11:10:15 -04:00
Sujin Philip
1212081de1 Support deferred freeing for CUDA memory
Calls to 'cudaFree' block execution on all cuda devices. Reduce the number of
times this happens by having a deferred free mechanism that frees a pool
of pointers together when a threshold is reached.

Especially helpful during virtual object transfers that requires a few small
allocations and frees.
2018-08-16 12:05:36 -04:00
Allison Vacanti
f6da092146 Use CUDA_ARCH instead of CUDACC to guard device-only code.
CUDACC is defined when compiling host code under nvcc, while
CUDA_ARCH is only defined for host code.
2018-08-09 11:57:05 -04:00
Allison Vacanti
2c079b96dd Make AtomicArrays work on CUDA 8.
CUDA 8.0 is erroring out in the cuda AtomicArray implementation:

https://open.cdash.org/viewBuildError.php?buildid=5489156

This patch fixes the error. See comments in source for more info.
2018-08-08 15:26:32 -04:00
Haocheng LIU
ce9cd8072a Use std::call_once to construct singeltons
By using `call_once` from C++11, we can simplify the logic in code
where we are querying same value variables from multiple threads.
2018-08-06 16:36:03 -04:00
Sujin Philip
259d670ab5 Merge topic 'cuda-per-thread-streams-2'
06dee259f Minimize cuda synchronizations

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1288
2018-07-25 15:07:39 -04:00
Robert Maynard
e031e64967 ExecutionArrayInterfaceBasic<T> explicitly construct DeviceAdapterId objects
Rather than implicitly presume the `VTKM_DEVICE_ADAPTER_` macros can
convert to DeviceAdapterId.
2018-07-25 12:04:30 -04:00
Robert Maynard
86b9ab9969 Refactor ExecutionArrayInterfaceBasic to use inheriting constructors 2018-07-25 12:03:48 -04:00
Robert Maynard
4240111dd8 Make sure VirtualObjectHandle tests include RuntimeDeviceTracker 2018-07-18 10:37:46 -04:00
Robert Maynard
8077b031a8 Merge topic 'uncomment_cuda_range_test'
1e478bbe6 Re-enable UnitTestCudaComputeRange

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1321
2018-07-17 13:28:05 -04:00
Robert Maynard
1e478bbe63 Re-enable UnitTestCudaComputeRange 2018-07-17 11:43:19 -04:00
Robert Maynard
bf49575e00 Remove unneeded typeinfo includes 2018-07-17 11:41:53 -04:00
Robert Maynard
64958b014b VTK-m now supports passing pointers when invoking worklets.
The original design of invoke and the transport infrastructure
relied on the implementation behavior of vtkm::cont types
such as ArrayHandle that used an internal shared_ptr to managed
state. This allowed passing by value instead of passing by
non-const ref when needing to transfer information to the device.

As VTK-m adds support for classes that use virtuals the ability
to pass by base pointer type allows for us to invoke worklets
using a base type without the risk of type slicing.

Additional by moving over to a non-const ref Invocation we
can update all transports that have 'output' to now be
by ref and therefore support types that can't be copied while
being 'more' correct.
2018-07-06 14:27:36 -04:00
Robert Maynard
9238cedcab Merge topic 'ice_nvcc_on_renar'
5ced0da8f Try to ice the ubuntu 17.10 + cuda 9.1 compiler

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1305
2018-07-05 11:36:16 -04:00
Robert Maynard
5ced0da8f5 Try to ice the ubuntu 17.10 + cuda 9.1 compiler 2018-07-05 09:14:52 -04:00
Robert Maynard
e5090e1289 Make sure the PointLocatorUniform uses the correct runtime device 2018-07-03 17:42:57 -04:00
Sujin Philip
06dee259f7 Minimize cuda synchronizations
1. Have a per-thread pinned array for cuda errors
2. Check for errors before scheduling new tasks and at explicit sync points
3. Remove explicit synchronizations from most places

Addresses part 2 of #168
2018-07-03 14:19:06 -04:00
ayenpure
e2dccee099 Merge branch 'master' of https://gitlab.kitware.com/vtk/vtk-m into spatialsearch 2018-06-30 11:56:33 -06:00
Kenneth Moreland
4459ab9174 Merge branch 'master' into 'pointlocator-general-interface'
# Conflicts:
#   vtkm/cont/PointLocatorUniformGrid.h
2018-06-28 12:51:08 -04:00
Kenneth Moreland
439beaaed9 Make point locator tests have consistent devices 2018-06-27 10:37:59 +02:00
Allison Vacanti
a8d8b3670d Suppress host/device warnings on CUDA atomics. 2018-06-25 14:53:53 -04:00
David Thompson
d8cf1f7b51 Merge topic 'geometry-squashed'
880d8a989 Add `vtkm/Geometry.h` and test it.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1262
2018-06-20 14:15:50 -04:00
David Thompson
880d8a989e Add vtkm/Geometry.h and test it.
This commit adds several geometric constructs to vtk-m
in the `vtkm/Geometry.h` header. They may be used from
both the execution and control environments.

We also add methods to perform projection and Gram-Schmidt
orthonormalization to `vtkm/VectorAnalysis.h`.

See `docs/changelog/geometry.md` included in this commit
for more information.
2018-06-20 11:58:14 -04:00
ayenpure
d8e8078099 Fixing the typos with ScanExclusiveByKey
- Fixed the typo
- Moved the test to vtkm/worklet/testing as vtkm/cont/testing does not execute with CUDA
2018-06-15 16:39:00 -07:00
Robert Maynard
8276e35cf4 Mark classes that should not be derived from as final. 2018-06-15 10:49:59 -04:00
Robert Maynard
82cdae0025 VTK-m waits for cuda streams to finish before host access
Previously it was possible for VTK-m to access memory from
the host before the computations in a stream finished.
2018-06-01 10:28:55 -04:00
Robert Maynard
9c3547bc7c VTK-m cuda runtime now handles no cuda runtime properly
Previously it would throw an uncaught exception and crash.
2018-05-31 10:07:37 -04:00
Allison Vacanti
1f6a662c0a Merge DevAdaptAlgoThrust --> DevAdaptAlgoCuda. 2018-05-29 14:07:29 -04:00
Allison Vacanti
be0c6a17a9 Move DevAdaptAtomicArrayImplementation to its own file. 2018-05-29 14:07:29 -04:00
Allison Vacanti
3af9f66083 Merge ArrayManagerExecutionThrustDevice into AMECuda. 2018-05-29 14:07:29 -04:00
Robert Maynard
4a520b7bdd Merge topic 'pascal_managed_memory_copy_non_blocking'
e0b6e698 copying cpu memory to pascal managed memory now works consistently.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1211
2018-05-18 15:15:17 -04:00
Robert Maynard
e0b6e69878 copying cpu memory to pascal managed memory now works consistently.
When copying small arrays from cpu memory to pascal memory we would
see subsequent kernels fail as the memory transfer hadn't finished.
This is a bug as each stream should act like a FIFO queue. So
for now when encountering this use case we explicitly synchronize
after the memcpy.
2018-05-16 17:56:50 -04:00
Robert Maynard
1c5feeb185 Make sure all device specific tests use the intended device.
This means that we not only setup the runtime device tracker
to force the intended device, it also means making sure
the default device is the error device.
2018-05-16 08:21:16 -04:00
Robert Maynard
e28244f345 Re-implement DeviceAdapterRuntimeDetector to avoid ODR violations.
The previous implementation of DeviceAdapterRuntimeDetector caused
multiple differing definitions of the same class to exist and
was causing the runtime device tracker to report CUDA as disabled
when it actually was enabled.

The ODR was caused by having a default implementation for
DeviceAdapterRuntimeDetector and a specific specialization for
CUDA. If a library had both CUDA and C++ sources it would pick up
both implementations and would have undefined behavior. In general
it would think the CUDA backend was disabled.

To avoid this kind of situation in the future I have reworked VTK-m
so that each device adapter must implement DeviceAdapterRuntimeDetector
for that device.
2018-05-15 13:08:34 -04:00
Robert Maynard
571556d984 CUDA's RuntimeDeviceTracker and Timer are now built as part of vtkm_cont
This is done to not only reduce the amount of code that users need
to generate but to reduce the amount of errors when using
the RuntimeDeviceTracker. If the runtime device tracker is initially
used in a library by a c++ file it will never properly detect the
cuda backend. By moving the code into vtkm_cont we can make sure
this problem doesn't occur.
2018-05-10 10:57:06 -04:00
Robert Maynard
364b366ab3 Correct signed/unsigned cast warnings from DeviceAdapterAlgorithmThrust
Found with CUDA 7.5
2018-05-08 15:29:11 -04:00
Robert Maynard
c9ba80ad93 Replace uint with vtkm::Id in DeviceAdapterAlgorithmThrust
The usage of uint was causing problems with CUDA + MSVC2015 as
type was not defined. Instead we use vtkm::Id as that was the expect
type to be passed to the task
2018-05-02 09:55:56 -04:00
Robert Maynard
b56894dd09 Move VTK-m Cuda backend over to a grid-stride iteration pattern.
This allows for easier host side logic when determining grid and block
sizes, and allows for a smaller library side by moving some logic
into compiled in functions.
2018-04-30 17:29:26 -04:00
Robert Maynard
b7e6371842 Correct issues found be enabling more CUDA warnings. 2018-04-23 14:27:53 -04:00
Matt Larsen
715141737f Merge topic 'typos'
efdf8543 Misc. Typos

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Matt Larsen <mlarsen@cs.uoregon.edu>
Merge-request: !1113
2018-04-06 18:04:46 -04:00
Robert Maynard
84311a2453 Merge branch 'master' into cmake_refactor 2018-04-05 10:18:36 -04:00
Robert Maynard
c123796949 VTK-m ArrayHandle can now take ownership of a user allocated memory location
Previously memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors
to support a Storage object that is being moved, we can clearly express that
the ArrayHandle now owns memory it didn't allocate.

Here is an example of how this is done:
```cpp
  T* buffer = new T[100];
  auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); };

  vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic>
      storage(buffer, 100, user_free_function);
  vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage));
```
2018-04-04 11:28:25 -04:00
Robert Maynard
707970f492 VTK-m StorageBasic is now able to give/take ownership of user allocated memory.
This fixes the three following issues with StorageBasic.

1. Memory that was allocated by VTK-m and Stolen by the user needed the
proper free function called which is generally StorageBasicAllocator::deallocate.
But that was hard for the user to hold onto. So now we provide a function
pointer to the correct free function.

2. Memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. This is now resolved by allowing the
user to specify a free function to be called on release.

3. When the CUDA backend allocates memory for an ArrayHandle that has no
control representation, and the location we are running on supports concurrent
managed access we want to specify that cuda managed memory as also the host memory.
This requires that StorageBasic be able to call an arbitrary new delete function
which is chosen at runtime.
2018-04-04 11:27:57 -04:00
Robert Maynard
8808b41fbd Merge branch 'master' into vtk-m-cmake_refactor 2018-03-29 22:51:26 -04:00
Robert Maynard
944bc3c0d6 Introduce vtkm::cont::ColorTable replacing vtkm::rendering::ColorTable
The new and improved vtkm::cont::ColorTable provides a more feature complete
color table implementation that is modeled after
vtkDiscretizableColorTransferFunction. This class therefore supports different
color spaces ( rgb, lab, hsv, diverging ) and supports execution across all
device adapters.
2018-03-28 16:11:23 -04:00
luz.paz
efdf854306 Misc. Typos
Found via `codespell` and `grep`
2018-03-28 09:45:07 -04:00
Robert Maynard
2bfbf0a902 Transfer of virtuals to the CUDA device now properly uses streams
This way when multiple threads are using VTK-m they all won't block while
one transfer a class with virtuals to the device.
2018-03-20 17:04:41 -04:00
Robert Maynard
6202d8d22d CudaAllocator guards all CUDA 8.0+ calls behind ifdef's. 2018-02-26 16:37:57 -05:00
Robert Maynard
e630ac5aa4 Merge branch 'master' into vtk-m-cmake_refactor 2018-02-23 14:52:00 -05:00
Robert Maynard
705528bf17 vtk-m ArrayHandle + basic storage has an optimized PrepareForDevice method
By hard coding the PrepareForDevice to know about all the different VTK-m
devices, we can have a single base class do the execution allocation, and not
have that logic repeated in each child class.
2018-02-16 10:00:28 -05:00
Robert Maynard
22f9ae3d24 vtk-m ArrayHandle + basic holds control data by StorageBasicBase
By making the array handle hold the control side data by the parent storage
class we remove significant code generation.
2018-02-16 09:59:20 -05:00
Robert Maynard
d0a68d3266 Refactor vtk-m storage basic to generate less code
By moving all common operations to a parent class we can
significantly reduce the vtk-m library size.
2018-02-16 09:59:19 -05:00
Robert Maynard
22ea58335a iVTK-m CUDA backend doesn't use thrust::cuda::pointer any more.
This was removed as CUDA 9.0 on MSVC has issues where CUB/Thrust
would fail to compile when given these types.
2018-02-02 08:33:17 -05:00
Robert Maynard
0c5a087e41 Merge topic 'dont_allow_rvalue_tasks'
ef611239 Don't allow DeviceTaskTypes to construct tasks from rvalues.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1062
2018-01-30 08:37:29 -05:00
luz.paz
80b11afa24 Misc. typos
Found via `codespell -q 3` via downstream VTK
2018-01-30 06:51:47 -05:00
Robert Maynard
ef611239f6 Don't allow DeviceTaskTypes to construct tasks from rvalues. 2018-01-18 13:55:37 -05:00
Robert Maynard
7d7c6ab1ab Don't allow DeviceTaskTypes to construct tasks from rvalues. 2018-01-18 13:51:30 -05:00
Robert Maynard
9c668b61e0 Simplify how we built the list of source files for vtkm_cont 2018-01-17 17:13:50 -05:00
Robert Maynard
0660c67fef Merge branch 'master' into vtk-m-cmake_refactor 2018-01-16 15:42:28 -05:00
Sujin Philip
950b12b1f2 Add ArrayHandleVirtualCoordinates 2018-01-09 17:23:41 -05:00
Matthew Letter
e17cfddfc8 added vtkm_cont_EXPORTS flag into the build
cuda, serial, and tbb were missing the vtkm_cont_EXPORTS flag
2018-01-08 14:00:58 -05:00
Robert Maynard
004bfe7b12 Prefer using existence of targets when looking for TBB/CUDA support 2018-01-08 14:00:57 -05:00
Robert Maynard
afc19ab0fc Setup symbol visibility controls for VTK-m 2018-01-08 14:00:57 -05:00
Robert Maynard
24e57556e6 Merge branch 'master' into vtk-m-cmake_refactor
Includes updating to cleanup benchmark code and handle the new MPI option
2017-12-28 14:23:21 -05:00
Sujin Philip
b530a5ce3f Fix issue with Managed Memory for 0 size arrays 2017-12-19 17:18:24 -05:00
Matthew Letter
fac43bd812 Merge branch 'master' into cmake_refactor 2017-11-28 13:36:02 -07:00
Sujin Philip
8c242cef91 Switch from faux to true virtuals 2017-11-06 15:25:29 -05:00
Robert Maynard
a6eecbe9ac ExecutionArrayInterface now can hint at how allocated memory will be used.
Certain backends desire the ability to mark allocations as being used for
reading versus writing to improve performance.
2017-11-02 10:12:57 -04:00
Matthew Letter
24d0e7766e Merge remote-tracking branch 'remotes/origin/master' into cmake_refactor 2017-10-31 16:57:41 -06:00
Sujin Philip
5842da4921 Remove ArrayHandle CopyInto
Fixes #170
2017-10-27 17:28:59 -04:00
Robert Maynard
ed8f4111ef Update all the code to work with CMake 3.3
Obviously this does mean that CUDA is not supported with 3.3.
2017-10-27 15:30:14 -04:00
Robert Maynard
56c7362258 A thought on what CMake 3.9 would mean to VTK-m. 2017-10-27 15:29:51 -04:00
Li-Ta Lo
3acd7c37a1 Merge topic 'pointlocator'
ed3a64a5 Coding style improvment
7fa800b7 Update TestingPointLocatorUniformGrid.h
f1974cab Update TestingPointLocatorUniformGrid.h
508882fa PointLocatorUniformGrid

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !973
2017-10-25 10:42:07 -04:00
Allison Vacanti
5a99dd761b Only use cuda hints for CUDA 8.0+. 2017-10-24 11:55:07 -04:00
Li-Ta Lo
508882fa21 PointLocatorUniformGrid
Provide an accelerated neareast neighbor search of points in the
dataset using a one layer uniform grid.
2017-10-19 11:44:36 -06:00
Allison Vacanti
1018d981a0 Check for overlap in CopySubRange.
Some parallel copy implementations will not handle this sanely.
2017-10-11 16:52:32 -04:00
Sujin Philip
41679cb5f9 Add a CellLocator
Implements a two-level uniform grid cell locator
2017-10-10 14:01:41 -04:00
Robert Maynard
f8f1adc962 Forward decleare DeviceAdapterAlgorithm correctly as a struct 2017-10-06 09:50:12 -04:00
Allison Vacanti
75f88b4c46 Add versioning to VTKM installed include/share dirs. 2017-10-02 11:39:10 -04:00
Robert Maynard
311618a15f Enable highest level of warnings(W4) under MSVC
This will make VTK-m warning level match the one used by VTK. This commit
also resolves the first round of warnings that W4 exposes.
2017-09-22 13:04:28 -04:00
Kenneth Moreland
c3a3184d51 Update copyright for Sandia
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
2017-09-20 15:33:44 -06:00
Allison Vacanti
00320e5dc0 Use vtkm::UInt64 for byte sizes.
vtkm::Id could be just 32bits, which limits the number of values we
would be able to store for large arrays.
2017-09-19 09:21:14 -04:00
Allison Vacanti
28ab480a40 Fix warnings on renar. 2017-09-18 15:33:02 -04:00
Robert Maynard
b9e69217ae Merge topic 'typedef_to_using_round_4'
f6863594 Convert VTK-m over to use 'using' instead of 'typedef'

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !885
2017-08-17 16:38:49 -04:00
Sujin Philip
72a6cf4a21 Change cuda calls to use the per-thread stream. 2017-08-17 11:03:02 -04:00
Robert Maynard
f68635941e Convert VTK-m over to use 'using' instead of 'typedef' 2017-08-17 10:47:25 -04:00
Robert Maynard
89f439999a Reduce the amount of typedef statements in DeviceAdapters
By using the auto keyword and decltype we can reduce the number of
complex typedefs that exist when writing device adapter algorithms.
The goal being that it is easier for developers to see the actual
algorithms being implemented, by reducing the amount of template
'noise'.
2017-08-16 14:23:21 -04:00
Allison Vacanti
326757b571 Remove ArrayHandleCuda.
!861 (b0dba9a1) adds this functionality to basic ArrayHandles.
2017-08-10 15:21:52 -04:00
Allison Vacanti
0a828189ad Reuse user-provided cuda device pointers when possible. 2017-08-03 17:21:13 -04:00
David C. Lonie
bd042ec567 Add CudaAllocator to encapsulate runtime managed memory logic.
Unified memory is now used when we detect that the hardware supports it.
2017-07-31 09:08:27 -04:00
Robert Maynard
76532264c3 Correct signed to unsigned warnings in ArrayManagerExecutionCuda. 2017-07-20 10:42:09 -04:00
David C. Lonie
379c3a0fad Use current device when allocating managed memory. 2017-07-13 12:55:22 -04:00
Li-Ta Lo
65910f139c put return value back to ScanInclusivePortal 2017-07-10 12:11:51 -06:00
Li-Ta Lo
16b61d8697 Make ScanInclusiveByKey and ScanInclusiveByKey void functions.
These two algorithms does not return meaningful return values. Generic interface and
implementation are both void. Remove erronous return type and statement for CUDA backend.
2017-07-07 15:11:42 -06:00
Robert Maynard
09a08fea8d Correct errors in TaskTuner found from the TestBuilds. 2017-07-07 08:40:03 -04:00
Robert Maynard
c11f29c093 Move the parameter sweeping code to a separate header.
The parameter sweeping code is only enabled when tuning for new GPU's
so we should move it to a separate header to make DeviceAdapterAlgorithmThrust
easier to read.
2017-07-05 14:58:19 -04:00
David C. Lonie
b2c3e41645 Refactor array transfer logic for basic storage.
The old templated array transfer mechanism generated a lot of code
that ended up doing a simple, type-agnostic memcpy for most devices.
This patch specialized array handles for basic storage and uses a
fast-path array transfer implementation. This reduces the size of the
vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).
2017-06-29 13:18:44 -04:00
Robert Maynard
cfda0593be ArrayManagerExecutionThrustDevice stops generating casting warnings. 2017-06-16 08:50:32 -04:00
Robert Maynard
5dd346007b Respect VTK-m convention of parameters all or nothing on a line
clang-format BinPack settings have been disabled to make sure that the
VTK-m style guideline is obeyed.
2017-05-26 13:53:28 -04:00
Robert Maynard
60a405ef65 Add TaskTiling1D/3D which use faux virtuals to reduce binary size.
Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that
we can re-use the same launching logic for all Worklets, instead of generating
per worlet code. To keep the performance the same the TilingTask now is past
a range of indices to work on, rather than a single index.

Binary size reduction:
WorkletTests_SERIAL old - 19MB
WorkletTests_SERIAL new - 18MB

WorkletTests_TBB old - 39MB
WorkletTests_TBB new - 18MB

libvtkAcceleratorsVTKm old - 48MB
libvtkAcceleratorsVTKm new - 19MB
2017-05-25 11:00:01 -04:00
Kitware Robot
4ade5f5770 clang-format: apply to the entire tree 2017-05-25 07:51:37 -04:00
Kitware Robot
efbde1d54b clang-format: sort include directives 2017-05-18 12:59:33 -04:00
Robert Maynard
022c36fa4f Add vtkm::exec::TaskBase, and rename WorkletInvokeFunctor to TaskSingular
Previously WorkletInvokeFunctor inherited from vtkm::exec::FunctorBase,
which is also the base class for all users Worklets and for all functors
based to DeviceAdapter::Schedule.

This is done for a few reasons. The first is that we reduce the
minimum size of user worklets. Previously the users worklet would hold
a reference to the error message, and so would the wrapper class added
when calling DeviceAdapter::Schedule. Now we only have the users worklet
holding a reference.

Second, by refactoring to have two base classes we can better improve
the documentation on what responsibilities FunctorBase.h has, compared
to TaskBase.
2017-05-02 16:38:43 -04:00
Sujin Philip
e9898cc5cf Merge topic 'virtual-methods'
4049b5b2 Add ClipWithImplicitFunction Filter
82d02e46 Modify ImplicitFunctions to use Virtual Methods
968960c1 Add Virtual Methods Framework

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !750
2017-05-02 16:12:04 -04:00
Sujin Philip
82d02e46ef Modify ImplicitFunctions to use Virtual Methods 2017-05-01 16:55:59 -04:00
Sujin Philip
968960c1a1 Add Virtual Methods Framework 2017-05-01 16:51:42 -04:00
Robert Maynard
80b9d74a23 Merge topic 'embed_more_into_vtkm_cont'
ec6589d3 Only enable -fPIC on component static libraries when necessary.
cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont.
355eea88 Get the vtkm cont cuda object to compile properly.
6ecc22bb First pass at compiling ArrayHandle into vtkm_cont.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !715
2017-04-26 13:47:10 -04:00
Li-Ta Lo
0ba9784082 Merge topic 'scanbykey'
5c735a38 this should resovle all the type conversion warnings
d8b02329 Merge branch 'scanbykey' of gitlab.kitware.com:ollielo/vtk-m into scanbykey
58ef7c8d one more attemp to get the data type right
22d0e355 attempt to fix warning on type conversion
ded4583a attempt to fix warning on type conversion
897b2f0f add tests for 1, 2 and ARRAY_SIZE elements for both ScanInclusiveByKey and ScanExclusiveByKey
c05a2c32 Merge branch 'scanbykey' of gitlab.kitware.com:ollielo/vtk-m into scanbykey
0e97fcb9 handle the case of 0 or 1 element in the input for ScanExclusiveByKey
...

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !746
2017-04-25 15:54:25 -04:00
Li-Ta Lo
897b2f0f63 add tests for 1, 2 and ARRAY_SIZE elements for both ScanInclusiveByKey and ScanExclusiveByKey 2017-04-19 13:38:28 -06:00
Li-Ta Lo
a205f21043 make ScanExclusiveByKey return void, rearrange parameter ordering 2017-04-17 16:11:02 -06:00
Li-Ta Lo
7023266585 add both generic and Thrust ScanExclusiveByKey 2017-04-17 15:03:49 -06:00
Li-Ta Lo
e77f9fac6a add CUDA implementation of ScanInclusiveByKey using Thrust library 2017-04-14 11:25:25 -06:00
David C. Lonie
4807b3c472 Silence warnings about unavoidable weak vtables.
- Exception classes cannot be exported due to MSVC's design decisions.
  See http://stackoverflow.com/questions/24511376. We must leave these
  classes as header only and silence the warnings.
- TransferResource in BufferState.h must remain a header-only class since
  there is no vtkm_interop library to compile the class into.
- The VTKDataSetReader hierarchy must similarly remain header-only since
  there is no vtkm_io library.
- The OptionParser Action classes are part of a header-only utility and
  cannot be easily compiled into a library.
-
2017-04-13 14:06:33 -04:00
David C. Lonie
cbfe5fddd9 Fix up various issues with ArrayHandles in vtkm_cont. 2017-04-05 15:45:11 -07:00
Robert Maynard
355eea887c Get the vtkm cont cuda object to compile properly. 2017-04-05 15:45:10 -07:00
David C. Lonie
6ecc22bb8c First pass at compiling ArrayHandle into vtkm_cont. 2017-04-05 15:45:01 -07:00