Commit Graph

38 Commits

Author SHA1 Message Date
Robert Maynard
bb90493920 Resolves Issue 52: we now install all vtkm files correctly. 2016-02-22 14:20:35 -05:00
Robert Maynard
bd3d29577a Fix ArrayPortalFromThrust to re-enable texture memory fast path. 2016-01-26 14:30:25 -05:00
Robert Maynard
b2cd41d765 Fix ArrayPortalFromThrust to re-enable texture memory fast path. 2016-01-26 14:29:52 -05:00
Kenneth Moreland
1a538ca196 Merge branch 'scatter-worklets' into 'master'
Scatter in worklets

Add the functionality to perform a scatter operation from input to output in a worklet invocation. This allows you to, for example, specify a variable number of outputs generated for each input.

See merge request !221
2015-11-11 13:09:47 -05:00
Robert Maynard
b3687c6f3c Workaround inclusive_scan issues in thrust 1.8.X for complex value types.
The original workaround for inclusive_scan bugs in thrust 1.8 only solved the
issue for basic arithmetic types such as int, float, and double. Now we go one
step further and fix the problem for all types.

The solution is to provide a proper implementation of destructive_accumulate_n
and make sure it exists before any includes of thrust occur.
2015-11-09 17:14:30 -05:00
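
For orientation, here is a hedged, serial sketch of the idea behind a routine with that name: an in-place scan that also returns the accumulated total. This is an illustration only; the real workaround must supply a definition with the exact signature and thrust-internal namespace that the targeted thrust 1.8 release expects, neither of which is reproduced here.

// Illustrative serial sketch, not the actual patch: scan the n values in
// place ("destructive") and fold the caller's init value into the returned
// total. The real definition must be visible before any thrust header is
// included, per the commit above.
template <typename Iterator, typename Size, typename T, typename BinaryFunction>
T destructive_accumulate_n(Iterator first, Size n, T init, BinaryFunction binary_op)
{
  if (n == 0) { return init; }
  for (Size i = 1; i < n; ++i)
  {
    first[i] = binary_op(first[i - 1], first[i]);  // in-place inclusive scan
  }
  return binary_op(init, first[n - 1]);            // total, seeded with init
}
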
Kenneth Moreland
f7789f0ed7 Fix issue with const types in Thrust array management
Previously, there was a declaration ConstArrayPortalFromThrust<const T>
in ArrayManagerExecutionThrustDevice. This proved problematic because
values read from the array in the worklet were typed as const T rather
than simply T. Any Vec or Matrix built from that type would then fail
because they are not meant to work with a const value (which means they
have to be set on construction and never changed).

Instead, declare ConstArrayPortalFromThrust<T> and internally set all
the Thrust pointers to have type const T. Also declare other thrust
pointers used as method parameters to have const T rather than T. This
should work as conversion from T to const T should be fine, but not the
other way around.
2015-11-06 18:05:21 -07:00
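
A hedged sketch of the pattern the commit above describes: the portal is declared on the plain value type T while its internal thrust pointers are const-qualified, so reads come back typed as T rather than const T. The class and member names, the use of __host__/__device__ in place of VTK-m's export macros, and the exact pointer type are assumptions for illustration, not the actual VTK-m code.

#include <thrust/system/cuda/memory.h>
#include <vtkm/Types.h>

// Sketch only: const-ness lives on the internal thrust pointers, not on the
// template parameter.
template <typename T>
class ConstArrayPortalFromThrustSketch
{
public:
  typedef T ValueType;
  typedef thrust::system::cuda::pointer<const T> PointerType;

  __host__ __device__
  ConstArrayPortalFromThrustSketch(PointerType begin, PointerType end)
    : BeginIterator(begin), EndIterator(end) { }

  __host__ __device__
  vtkm::Id GetNumberOfValues() const
  {
    return static_cast<vtkm::Id>(this->EndIterator - this->BeginIterator);
  }

  __device__
  ValueType Get(vtkm::Id index) const
  {
    // Reads yield a plain T, so Vec/Matrix types built from the result do
    // not inherit an unwanted const qualifier.
    return *(this->BeginIterator + index);
  }

private:
  PointerType BeginIterator;
  PointerType EndIterator;
};
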
Robert Maynard
97550d5e2d Update Cuda so that UnaryPredicates work with fancy cuda array handles. 2015-11-03 13:28:07 -05:00
T.J. Corona
829c1b1f7f Install missing cuda device backend header. 2015-11-02 16:44:19 -05:00
Robert Maynard
056f69bf96 Remove unused variable and conversion warnings from cuda code. 2015-09-21 14:17:25 -04:00
Robert Maynard
9b877ef49b Merge topic 'multiple_backend_example'
fd685210 Always install all device headers even when device isn't enabled.
b1663b24 Add an example of using multiple backends from a single translation unit.
fc0ff69d Methods with try/catch need to be host only.
4d635d64 DeviceAdapter Tags now always exist, and indicate whether the device is valid.
cf32b430 Teach Configure.h to store whether TBB and CUDA are enabled.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !198
2015-09-17 09:49:49 -04:00
Robert Maynard
fd68521066 Always install all device headers even when device isn't enabled.
vtkm_declare_headers is now able to skip testing headers by using the
TESTABLE keyword.
2015-09-17 09:28:21 -04:00
Robert Maynard
1d97f886e0 Remove the thrust pragma statements that are not needed. 2015-09-15 14:20:56 -04:00
Robert Maynard
5b8cc44ed4 Merge branch 'improve_sort_perf_on_thrust' into 'master'
Tell thrust to use fast code paths when using our predicates and operators.

See merge request !176
2015-09-07 10:38:17 -04:00
Robert Maynard
72450e87f3 Make thrust use fast paths when doing sort and scan.
By introducing our own custom thrust execution policy we can make sure
to hit the fastest code paths in thrust for the sort operation. This makes
sure that for UInt32, Int32, and Float32 we use the radix sort from thrust,
which offers a 2x to 3x speed improvement over the merge sort implementation.

Secondly, by telling thrust that our BinaryOperators are commutative, we
make sure that we get the fastest code paths when executing Inclusive
and Exclusive Scan.

Benchmark 'Radix Sort on 1048576 random values vtkm::Int32' results:
  median = 0.0117049s
  median abs dev = 0.00324614s
  mean = 0.0167615s
  std dev = 0.00786269s
  min = 0.00845875s
  max = 0.0389063s
Benchmark 'Radix Sort on 1048576 random values vtkm::Float32' results:
  median = 0.0234463s
  median abs dev = 0.000317249s
  mean = 0.021452s
  std dev = 0.00470307s
  min = 0.011255s
  max = 0.0250643s
Benchmark 'Merge Sort on 1048576 random values vtkm::Int32' results:
  median = 0.0310486s
  median abs dev = 0.000182129s
  mean = 0.0286914s
  std dev = 0.00634102s
  min = 0.0116225s
  max = 0.0317379s
Benchmark 'Merge Sort on 1048576 random values vtkm::Float32' results:
  median = 0.0310617s
  median abs dev = 0.000193583s
  mean = 0.0295779s
  std dev = 0.00491531s
  min = 0.0147257s
  max = 0.032307s
2015-09-03 16:00:37 -04:00
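
For context on why the custom execution policy matters, here is a small illustration of the two dispatch outcomes the commit above is steering between: thrust only takes its radix-sort fast path when the key type is a primitive and the comparator is visibly thrust::less<T>. The functor name MyLess and the surrounding function are made up for this sketch.

#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/sort.h>

// Hypothetical comparator standing in for a wrapped VTK-m predicate.
struct MyLess
{
  __host__ __device__ bool operator()(int a, int b) const { return a < b; }
};

void SortBothWays(thrust::device_vector<int>& values)
{
  // Primitive key type + thrust::less<int>: thrust can dispatch its radix sort.
  thrust::sort(values.begin(), values.end(), thrust::less<int>());

  // Opaque custom comparator: thrust falls back to its merge sort.
  thrust::sort(values.begin(), values.end(), MyLess());
}
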
Robert Maynard
0d6dfb1e40 Make it possible to use Cuda TextureMemory from device/host method. 2015-09-03 11:52:40 -04:00
Robert Maynard
37403237c6 Allow us to still use __ldg texture load with the new VTKM_EXEC_CONT_EXPORT. 2015-09-02 11:34:36 -04:00
Robert Maynard
157d8efee4 Workaround thrust 1.8 inclusive scan issue.
Starting in thrust 1.8 the implementation of scan inclusive inside
thrust became highly optimized by using parallel task groups. This
new implementation has a bug that only exists when using custom
binary operators, large size arrays, release mode, and no
debugger or mem-checker attached.

While I have submitted the issue to thrust, we need to be able
to work around the existing issue. The solution I have chosen is
to mark all vtkm::exec::cuda::internal::WrappedBinaryOperators
as being commutative as far as thrust is concerned. To make
sure we don't get any unexpected behavior, I have also had
to create WrappedBinaryPredicate so that we don't mark any
predicate as commutative.
2015-08-17 10:39:14 -04:00
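
A purely illustrative sketch of the split described above: one wrapper for binary operators, which may be advertised to thrust as commutative, and a separate wrapper for binary predicates, which never is. The type and member names here follow the commit text but are otherwise assumptions, and the trait machinery thrust uses to query commutativity is deliberately omitted.

// Sketch only: the point is the type-level separation, not the real API.
template <typename ResultType, typename Functor>
struct WrappedBinaryOperator
{
  // Only this wrapper is ever marked commutative for thrust's scan fast path.
  Functor Fun;

  __host__ __device__
  ResultType operator()(const ResultType& a, const ResultType& b) const
  {
    return this->Fun(a, b);
  }
};

template <typename ValueType, typename Functor>
struct WrappedBinaryPredicate
{
  // Predicates (e.g. less-than) are deliberately never marked commutative.
  Functor Fun;

  __host__ __device__
  bool operator()(const ValueType& a, const ValueType& b) const
  {
    return this->Fun(a, b);
  }
};
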
Robert Maynard
ab59e34a2f Rename pragma header guard so it makes sense for tbb and thrust.
Boost is not the only third party whose warnings we are suppressing, so
make the name more generic.
2015-08-13 09:04:23 -04:00
Robert Maynard
8204db2f6a Use VTKM_BOOST_PRE_INCLUDE around thrust headers too. 2015-08-13 08:26:41 -04:00
Kenneth Moreland
21b3b318ba Always disable conversion warnings when including boost header files
On one of my compile platforms, GCC was giving conversion warnings from
any boost include that was not wrapped in pragmas to disable conversion
warnings. To make things easier and more robust, I created a pair of
macros, VTKM_BOOST_PRE_INCLUDE and VTKM_BOOST_POST_INCLUDE, that should
be wrapped around any #include of a boost header file.
2015-07-30 17:40:40 -06:00
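
A minimal sketch of the mechanism behind such a macro pair, shown only for GCC; the real VTKM_BOOST_PRE_INCLUDE/VTKM_BOOST_POST_INCLUDE definitions cover more compilers and warning flags, so treat the details below as assumptions.

// Sketch: push the diagnostic state, silence -Wconversion around the boost
// include, then restore the previous state afterwards.
#if defined(__GNUC__) && !defined(__clang__)
#  define VTKM_BOOST_PRE_INCLUDE \
     _Pragma("GCC diagnostic push") \
     _Pragma("GCC diagnostic ignored \"-Wconversion\"")
#  define VTKM_BOOST_POST_INCLUDE \
     _Pragma("GCC diagnostic pop")
#else
#  define VTKM_BOOST_PRE_INCLUDE
#  define VTKM_BOOST_POST_INCLUDE
#endif

// Usage: wrap every boost include.
VTKM_BOOST_PRE_INCLUDE
#include <boost/static_assert.hpp>
VTKM_BOOST_POST_INCLUDE
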
Robert Maynard
bb582ae4ec Update the cuda IteratorFromArrayPortal to use ptrdiff_t.
This makes the advance/distance_to function signatures constant no matter
whether we are building with 32- or 64-bit ids.
2015-07-29 09:57:42 -04:00
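
To make the point concrete, here is a tiny illustrative fragment; the struct and member names are stand-ins, not the real IteratorFromArrayPortal. With std::ptrdiff_t the advance/distance_to signatures stay identical whether vtkm::Id is compiled as a 32-bit or a 64-bit integer.

#include <cstddef>

// Sketch only: the iterator's difference type is fixed to std::ptrdiff_t
// instead of vtkm::Id, so these signatures never change with the id width.
struct PortalIndexCursor
{
  std::ptrdiff_t Index;

  void advance(std::ptrdiff_t delta) { this->Index += delta; }

  std::ptrdiff_t distance_to(const PortalIndexCursor& other) const
  {
    return other.Index - this->Index;
  }
};
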
Robert Maynard
e74ded809a Defer more thrust iterator deduction logic to ArrayPortalToIterators. 2015-07-14 10:11:12 -04:00
Robert Maynard
4ba1f7c853 Remove the need for any portal to define an IteratorType. 2015-07-13 17:16:27 -04:00
Kenneth Moreland
4fc3626712 Fix compiler directives for icc
The Intel icc compiler tries to pretend it is gcc, but it sometimes
behaves differently. Add more explicit checks for what compiler is
being used.
2015-07-06 10:35:06 -06:00
Robert Maynard
2d7e44de62 Make it so that we actually use cuda texture memory loads.
By mistake the cuda texture memory load code was not being used, so correct
that issue and allow loading of vtkm::Vec and primitive types. Currently
the only remaining issue is loading int8/int16 and uint8/uint16 through texture memory.
2015-06-25 08:21:49 -04:00
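
A small CUDA sketch of the read-only load path these texture-memory commits refer to. The helper name and the __CUDA_ARCH__ guard are assumptions for illustration, not VTK-m's actual code.

// Sketch: route a load through the read-only data cache via __ldg when the
// hardware supports it (sm_35+), otherwise fall back to a normal load.
// __ldg is only defined for a fixed set of built-in scalar/vector types, so
// generic value types need per-type dispatch in the real code.
template <typename T>
__device__ inline T ReadThroughReadOnlyCache(const T* address)
{
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 350)
  return __ldg(address);
#else
  return *address;
#endif
}
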
Robert Maynard
9ef7fa9b3a Merge branch 'iterator_operator_square_bracket_corrected' into 'master'
Make sure we use ptrdiff_t for index into arrays.

This is a requirement since you can use negative indices into arrays.

See merge request !41
2015-06-16 13:09:17 -04:00
Robert Maynard
726c914ee5 Make sure we use ptrdiff_t for index into arrays.
This is a requirement since you can use negative indices into arrays.
2015-06-16 11:00:01 -04:00
Robert Maynard
eb6c698e63 Remove unneeded overloads from WrappedOperators for cuda. 2015-06-16 09:48:12 -04:00
Robert Maynard
2c91cdfa3b Update cuda/thrust backend scan algorithms to work with vec types. 2015-06-16 08:28:31 -04:00
Robert Maynard
2a2159b1e1 Generalize the support for zip handles inside the cuda backend.
Instead of having a single specialization for sort and zip handles,
we now handle any fancy handle being passed to the cuda device adapter.
This was done by reworking how we represent fancy iterators inside thrust:
instead of using a transform iterator + counting iterator, we just use
an iterator_facade.
2015-06-12 11:56:46 -04:00
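
A condensed sketch of the iterator_facade approach mentioned above: wrap an arbitrary array portal in a thrust::iterator_facade so that any fancy handle can be handed to thrust algorithms. PortalType, its Get method, and the by-value reference type are assumptions standing in for VTK-m's portal concept; the real implementation differs (for example, it also supports writing through the iterator).

#include <thrust/iterator/iterator_categories.h>
#include <thrust/iterator/iterator_facade.h>
#include <cstddef>

// Sketch: a read-only random-access iterator over a hypothetical portal.
template <typename PortalType>
class PortalIteratorSketch
  : public thrust::iterator_facade<PortalIteratorSketch<PortalType>,
                                   typename PortalType::ValueType,
                                   thrust::device_system_tag,
                                   thrust::random_access_traversal_tag,
                                   typename PortalType::ValueType,  // by-value reference
                                   std::ptrdiff_t>
{
public:
  __host__ __device__
  PortalIteratorSketch(PortalType portal, std::ptrdiff_t index = 0)
    : Portal(portal), Index(index) { }

private:
  friend class thrust::iterator_core_access;

  __device__
  typename PortalType::ValueType dereference() const { return this->Portal.Get(this->Index); }

  __host__ __device__
  bool equal(const PortalIteratorSketch& other) const { return this->Index == other.Index; }

  __host__ __device__ void increment() { ++this->Index; }
  __host__ __device__ void decrement() { --this->Index; }
  __host__ __device__ void advance(std::ptrdiff_t n) { this->Index += n; }

  __host__ __device__
  std::ptrdiff_t distance_to(const PortalIteratorSketch& other) const
  { return other.Index - this->Index; }

  PortalType Portal;
  std::ptrdiff_t Index;
};
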
Robert Maynard
07970cf476 Correct all signed/unsigned and narrowing warnings (64-bit to 32-bit). 2015-05-28 09:05:17 -04:00
Robert Maynard
6b8e7822be The Copyright statement now has all the periods in the correct location. 2015-05-21 10:30:11 -04:00
Robert Maynard
078d623173 Allow ArrayPortalFromThrust to be used inside a zip portal. 2015-05-05 14:03:40 -04:00
Kenneth Moreland
ec0adf8b16 Change interface of ArrayTransfer to be more like ArrayHandle.
This includes changing methods like LoadDataForInput to PrepareForInput.
It also changed the interface a bit to save a reference to the storage
object. (Maybe it would be better to save a pointer?) These changes also
extend up to the ArrayManagerExecution class, so it can affect device
adapter implementations.
2015-04-30 21:07:36 -06:00
Robert Maynard
63b1f03187 Simplify the implementation of loading through textures.
We don't need this super complicated system for texture loading.
2015-03-09 16:37:45 -04:00
Robert Maynard
9b49973621 Use __ldg instead of texture object. 2015-03-05 18:31:44 -05:00
Robert Maynard
1b5c5a6ce5 Add in initial support for texture binding of input arrays. 2014-12-19 13:47:28 -05:00
Robert Maynard
d9270e408d Adding a cuda device adapter to vtkm.
Porting the dax device adapter over to vtkm. Unlike the dax version, this doesn't
use thrust::device_vector, but instead uses thrust::system calls so that
we can support multiple thrust-based backends.

Also this has Texture Memory support for input array handles. Some more work
will need to be done to ArrayHandle so that everything works when using an
ArrayHandle in place with texture memory bindings.
2014-12-19 13:47:28 -05:00
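
A minimal sketch of the "thrust::system calls" idea described above; the function name and the usage comments are illustrative. Algorithms receive an explicit execution policy instead of being tied to a thrust::device_vector, so the same call site can be retargeted at other thrust backends by swapping the policy.

#include <thrust/execution_policy.h>
#include <thrust/sort.h>

// Sketch: the policy, not the container, decides which thrust backend runs.
template <typename ExecutionPolicy, typename T>
void SortValues(const ExecutionPolicy& policy, T* begin, T* end)
{
  thrust::sort(policy, begin, end);
}

// Illustrative call sites:
//   SortValues(thrust::device, rawDevicePtr, rawDevicePtr + n);  // CUDA backend
//   SortValues(thrust::host,   rawHostPtr,   rawHostPtr + n);    // host backend
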