Commit Graph

3922 Commits

Author SHA1 Message Date
Allison Vacanti
b582b07983 Modify Benchmarker to expose samples and reduce iterations. 2017-10-18 10:42:50 -04:00
Allison Vacanti
d0fa70deb5 Reduce overhead and fix bugs in device adapter benchmarks. 2017-10-18 10:14:09 -04:00
Allison Vacanti
05419719d5 Add more options to the device adapter algorithm benchmark. 2017-10-17 16:21:17 -04:00
Sujin Philip
502787b1a0 Merge topic 'workaround-intel'
d6ce8000 Workaround an Intel compiler bug

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !970
2017-10-16 15:19:09 -04:00
Sujin Philip
d6ce8000f4 Workaround an Intel compiler bug
Fixes a linker error about not finding 'LinearBVH::ConstructOnDevice'
2017-10-16 12:17:04 -04:00
Sujin Philip
800bcf3124 Merge topic 'fix-intel-link-bug'
ecb99acb Workaround intel compiler bug

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Matt Larsen <mlarsen@cs.uoregon.edu>
Merge-request: !969
2017-10-12 16:46:33 -04:00
Sujin Philip
ecb99acb5e Workaround intel compiler bug
Fixes issue #179
2017-10-12 13:32:39 -04:00
Allison Vacanti
ae8d2f1ffa Merge topic 'serial_copy'
7b66dece Add equality operators that handle different handle types.
1653f20e Add missing typedef to portal.
6c2f22b5 Overcome narrowing warning on MSVC.
1018d981 Check for overlap in CopySubRange.
374321e0 Use std::copy in TBB copy routines.
825f351d Use std::copy in serial Copy implementation.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !967
2017-10-12 09:14:50 -04:00
Allison Vacanti
7b66dece45 Add equality operators that handle different handle types.
In generic code, it's a pain to use the equality operators since they
require the ValueType and Storage to match; otherwise the operator is undefined.
This commit adds operators for such comparisons, as well as a unit test.
2017-10-11 17:25:13 -04:00
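As a rough illustration of the idea in the commit above, the sketch below adds templated comparison operators to a small handle class so that handles with different value types or storage tags can still be compared. `Handle`, `StorageBasic`, and `StorageOther` are hypothetical stand-ins invented for this example; this is not the VTK-m ArrayHandle implementation, and the choice that mismatched handles always compare unequal is an assumption.

```cpp
#include <memory>
#include <vector>

// Hypothetical storage tags, used only to make two handle types differ.
struct StorageBasic {};
struct StorageOther {};

// Hypothetical stand-in for an array handle: a shared buffer tagged with a
// value type and a storage tag.
template <typename T, typename Storage = StorageBasic>
class Handle
{
public:
  Handle() : Buffer(std::make_shared<std::vector<T>>()) {}

  // Same-type comparison: two handles are equal when they share a buffer.
  bool operator==(const Handle& rhs) const { return this->Buffer == rhs.Buffer; }
  bool operator!=(const Handle& rhs) const { return !(*this == rhs); }

  // Mixed-type comparison (assumption): a handle with a different value type
  // or storage cannot refer to the same buffer, so it never compares equal.
  template <typename U, typename S>
  bool operator==(const Handle<U, S>&) const { return false; }
  template <typename U, typename S>
  bool operator!=(const Handle<U, S>&) const { return true; }

private:
  std::shared_ptr<std::vector<T>> Buffer;
};
```

With these overloads, generic code can write `a == b` without first checking that both handle types match; the non-template overload is still selected when the types are identical.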
Allison Vacanti
1653f20e7c Add missing typedef to portal. 2017-10-11 17:24:05 -04:00
Allison Vacanti
6c2f22b5ce Overcome narrowing warning on MSVC. 2017-10-11 17:24:04 -04:00
Allison Vacanti
1018d981a0 Check for overlap in CopySubRange.
Some parallel copy implementations will not handle this sanely.
2017-10-11 16:52:32 -04:00
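The kind of check described in the commit above might look like the following sketch. `RangesOverlap` and `CopySubRangeChecked` are hypothetical names, `std::vector` stands in for an array handle, and rejecting the copy by returning `false` is an assumption; the commit does not say how the real CopySubRange reacts to an overlap.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// True when [srcStart, srcStart + count) and [dstStart, dstStart + count)
// share at least one index.
inline bool RangesOverlap(std::size_t srcStart, std::size_t dstStart, std::size_t count)
{
  if (count == 0)
  {
    return false;
  }
  const std::size_t lo = std::min(srcStart, dstStart);
  const std::size_t hi = std::max(srcStart, dstStart);
  return hi - lo < count;
}

// Copies count values from input[inStart...] to output[outStart...].
// Returns false (assumed behavior) when the source range is out of bounds or
// when input and output are the same array and the two ranges overlap.
template <typename T>
bool CopySubRangeChecked(const std::vector<T>& input, std::size_t inStart,
                         std::size_t count, std::vector<T>& output,
                         std::size_t outStart)
{
  if (inStart > input.size() || count > input.size() - inStart)
  {
    return false;
  }
  if (&input == &output && RangesOverlap(inStart, outStart, count))
  {
    return false;
  }
  if (output.size() < outStart + count)
  {
    output.resize(outStart + count); // grow the destination if needed
  }
  const auto first = input.begin() + static_cast<std::ptrdiff_t>(inStart);
  std::copy(first, first + static_cast<std::ptrdiff_t>(count),
            output.begin() + static_cast<std::ptrdiff_t>(outStart));
  return true;
}
```

Rejecting the overlapping case up front matters because a parallel copy may write some destination elements before the corresponding source elements have been read.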
Allison Vacanti
374321e027 Use std::copy in TBB copy routines. 2017-10-11 16:52:32 -04:00
Allison Vacanti
825f351d04 Use std::copy in serial Copy implementation.
I had assumed that the compiler would be clever enough to turn the
iterative implementation of Copy into a memcpy, but inspecting the
disassembly on a release GCC build shows that this is not the case,
likely because it can't assume that the memory ranges do not overlap.

Replacing the loop with std::copy speeds things up (about 30-50%) for
most data types, though there is a slight (usually < 5%) slowdown for
Vec types. The uint8 copy improved by a factor of about 7.

Comparison:
| Speedup | iteration            | std::copy            | Benchmark (Type) |
|---------|----------------------|----------------------|------------------|
|   1.363 | 0.001590 +- 0.000087 | 0.001166 +- 0.000049 | Copy 2097152 values (vtkm::Float32) |
|   1.487 | 0.003429 +- 0.000185 | 0.002305 +- 0.000146 | Copy 2097152 values (vtkm::Float64) |
|   1.379 | 0.001568 +- 0.000072 | 0.001137 +- 0.000093 | Copy 2097152 values (vtkm::Int32) |
|   1.420 | 0.003410 +- 0.000173 | 0.002402 +- 0.000101 | Copy 2097152 values (vtkm::Int64) |
|   1.303 | 0.001564 +- 0.000083 | 0.001201 +- 0.000078 | Copy 2097152 values (vtkm::UInt32) |
|   7.204 | 0.002441 +- 0.000104 | 0.000339 +- 0.000029 | Copy 2097152 values (vtkm::UInt8) |
|   0.987 | 0.006602 +- 0.000266 | 0.006688 +- 0.000291 | Copy 2097152 values (vtkm::Vec< vtkm::Float32, 4 >) |
|   0.965 | 0.010065 +- 0.000528 | 0.010427 +- 0.000617 | Copy 2097152 values (vtkm::Vec< vtkm::Float64, 3 >) |
|   0.979 | 0.003327 +- 0.000191 | 0.003398 +- 0.000142 | Copy 2097152 values (vtkm::Vec< vtkm::Int32, 2 >) |
|   0.851 | 0.001579 +- 0.000090 | 0.001856 +- 0.000098 | Copy 2097152 values (vtkm::Vec< vtkm::UInt8, 4 >) |
2017-10-11 16:52:32 -04:00
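The two variants compared in the commit above can be sketched as follows. `CopyIterative` and `CopyStd` are hypothetical names and raw pointers stand in for array portals; this is not the VTK-m serial Copy code. `std::copy` requires that the destination start not lie inside the source range, which is what lets a standard library forward trivially copyable types to a bulk memory move; whether it actually does so is implementation-dependent.

```cpp
#include <algorithm>
#include <cstddef>

// Element-by-element loop. Without aliasing information the compiler may
// not be able to turn this into a single bulk copy.
template <typename T>
void CopyIterative(const T* src, T* dst, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
  {
    dst[i] = src[i];
  }
}

// Same result via std::copy. Precondition: dst must not point into
// [src, src + n).
template <typename T>
void CopyStd(const T* src, T* dst, std::size_t n)
{
  std::copy(src, src + n, dst);
}
```

Both functions produce identical output for non-overlapping ranges, so any difference is purely one of generated code and should be measured per type, as the table above does.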
Allison Vacanti
b396716f86 Merge topic 'vertexclustering-reducepoints'
8fabece1 Use median point from cluster as representative vertex.
c7bf0c95 Compute PointIdMap while reducing cluster ids.
5dee7c6a Select input point from cluster rather than averaging.
28e76ddb Update vertex clustering benchmarking code.
e3c9e7bb Optimize cell map computation.
d7669650 Use requested grid in VertexClustering worklet.
0472dc11 Fix warning on Cuda.
3f4e17e2 Add field mapping to VertexClustering.
...

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !960
2017-10-11 16:25:30 -04:00
Sujin Philip
4253d12062 Merge topic 'cell-locator'
41679cb5 Add a CellLocator
02f48cfa Fix multiple declaration of DistributeCellData
9e0650ad Update Newton's Method to return solution status

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !957
2017-10-11 09:37:51 -04:00
Robert Maynard
6c695a1b6e Merge topic 'fixes_184_empty_reduce'
34361dd1 DeviceAdapterAlgorithmSerial ReduceByKey handles zero size key/values

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !965
2017-10-11 08:55:09 -04:00
Sujin Philip
41679cb5f9 Add a CellLocator
Implements a two-level uniform grid cell locator
2017-10-10 14:01:41 -04:00
Sujin Philip
02f48cfaaa Fix multiple declaration of DistributeCellData 2017-10-10 14:01:41 -04:00
Sujin Philip
9e0650adf2 Update Newton's Method to return solution status 2017-10-10 14:01:41 -04:00
Sujin Philip
a482ace3c6 Merge topic 'update-cuda9-workaround'
730fa439 Update cuda 9 workaround for cuda 9 final release

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !966
2017-10-10 14:01:03 -04:00
Sujin Philip
730fa4390a Update cuda 9 workaround for cuda 9 final release 2017-10-10 11:34:29 -04:00
Allison Vacanti
8fabece187 Use median point from cluster as representative vertex.
Also uses the recently added StableSortIndices worklet to produce the
cell map.
2017-10-10 10:28:51 -04:00
Allison Vacanti
c7bf0c95f4 Compute PointIdMap while reducing cluster ids. 2017-10-10 10:28:51 -04:00
Allison Vacanti
5dee7c6a5d Select input point from cluster rather than averaging.
This is a bit counterintuitive, but choosing a random point from each
cluster rather than averaging them gives a better visual result. The
averages poorly represent a surface that runs through the grid block and
tend to bias the output points towards the center of each block, creating
very noticeable grid artifacts that look blocky.
2017-10-10 10:28:51 -04:00
Allison Vacanti
28e76ddbfd Update vertex clustering benchmarking code.
Reset the timer after each print to make the output more usable.
2017-10-10 10:28:51 -04:00
Allison Vacanti
e3c9e7bbce Optimize cell map computation.
Rather than sorting and reducing the list by key, sorting and uniquing
an index array with an indirection functor is faster.
2017-10-10 10:28:51 -04:00
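A serial sketch of the technique named in the commit above: instead of sorting the keys themselves and reducing by key, sort an array of indices using a comparator that looks through to the keys, then remove indices whose keys repeat. `UniqueKeyIndices` is a hypothetical name, and the standard algorithms here stand in for the device adapter algorithms the real code would use.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns one index per distinct key, ordered by key value. Because the
// sort is stable, the index kept for each key is the first occurrence.
inline std::vector<std::size_t> UniqueKeyIndices(const std::vector<int>& keys)
{
  std::vector<std::size_t> indices(keys.size());
  std::iota(indices.begin(), indices.end(), std::size_t{ 0 });

  // Indirection functors: compare indices by the keys they refer to.
  auto keyLess = [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; };
  auto keyEqual = [&keys](std::size_t a, std::size_t b) { return keys[a] == keys[b]; };

  std::stable_sort(indices.begin(), indices.end(), keyLess);
  indices.erase(std::unique(indices.begin(), indices.end(), keyEqual), indices.end());
  return indices;
}
```

For keys `{5, 3, 5, 1, 3}` this returns `{3, 1, 0}`: the positions of the first 1, the first 3, and the first 5. The key array itself is never moved or modified.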
Allison Vacanti
d766965000 Use requested grid in VertexClustering worklet.
This is to match the default behavior of vtkQuadricClustering. If we
want to add this functionality back, it should go into the filter as
an option that adjusts nDivisions before calling the worklet.
2017-10-10 10:28:51 -04:00
Allison Vacanti
0472dc1198 Fix warning on Cuda.
assert(false && ""); emitted a

"warning : controlling expression is constant"

Replace the assertion with an exception, which is more appropriate here
anyway.
2017-10-10 10:28:51 -04:00
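The change described in the commit above follows a common pattern, sketched here on a made-up function. `PointsPerCell` and its shape ids are hypothetical and do not come from the VertexClustering code. Besides silencing the warning, an exception still fires in builds that define `NDEBUG`, where `assert` compiles to nothing.

```cpp
#include <stdexcept>

// Hypothetical lookup with an "unreachable" fallthrough.
inline int PointsPerCell(int shapeId)
{
  switch (shapeId)
  {
    case 0:
      return 3; // triangle (illustrative id)
    case 1:
      return 4; // quad (illustrative id)
    default:
      break;
  }
  // Before: assert(false && "Unknown shape id.");
  // That form can trigger "controlling expression is constant" warnings.
  throw std::invalid_argument("Unknown shape id.");
}
```

Callers that can recover catch `std::invalid_argument`; otherwise the error propagates instead of being silently skipped in release builds.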
Allison Vacanti
3f4e17e2a2 Add field mapping to VertexClustering. 2017-10-10 10:28:51 -04:00
Allison Vacanti
9c332674ae Remove unused comparison operator. 2017-10-10 10:28:51 -04:00
Allison Vacanti
5420368ae0 Add fields to the cow nose testing dataset. 2017-10-10 10:28:51 -04:00
Allison Vacanti
c2a7e4faba Adapt ReduceByKey to handle ArrayHandleDiscard for output keys.
Often we don't care about the output keys, and it's useful to
be able to pass an ArrayHandleDiscard into the algorithm to save
memory in these cases.
2017-10-10 10:28:51 -04:00
Allison Vacanti
9fe7cb4542 Add struct to simplify detection of ArrayHandleDiscards.
This makes it easier to adapt algorithms to avoid reading from
discard arrays.
2017-10-10 10:28:51 -04:00
Robert Maynard
34361dd15a DeviceAdapterAlgorithmSerial ReduceByKey handles zero size key/values 2017-10-10 10:12:59 -04:00
Kenneth Moreland
82c97e0d14 Merge topic 'array-copy-regression'
e149dcf1 Specify device adapter for array copy in worklet tests

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !961
2017-10-10 09:30:56 -04:00
Robert Maynard
62fede1ba7 Merge topic 'delve_cuda_interop_testing'
189b668e The interop/cuda headers aren't testable.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !964
2017-10-09 13:17:41 -04:00
Robert Maynard
189b668e54 The interop/cuda headers aren't testable.
The headers in this location presume you are linking/including GL which
isn't done as part of the header tests.
2017-10-09 09:40:52 -04:00
Robert Maynard
e20b64151c Merge topic 'consistent_forward_declares'
f8f1adc9 Forward declare DeviceAdapterAlgorithm correctly as a struct

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !962
2017-10-09 08:42:38 -04:00
Matt Larsen
86b2757505 Merge topic 'simplify_triangulator'
2326dca0 simplify triangulator

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !963
2017-10-06 23:30:18 -04:00
Matt Larsen
2326dca0a2 simplify triangulator 2017-10-06 16:08:29 -07:00
Robert Maynard
f8f1adc962 Forward declare DeviceAdapterAlgorithm correctly as a struct 2017-10-06 09:50:12 -04:00
Allison Vacanti
5ec29128de Merge topic 'install_hooks'
c0a73159 Clean up install/VTK issues in VTKmConfig.cmake.
30f4151b Add missing headers to examples.
a2a55eda Install cuda interop headers.
85062a3a Add version info to installed cmake config files.
75f88b4c Add versioning to VTKM installed include/share dirs.
b3852e8d Add versioning to VTKM libraries.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !958
2017-10-04 10:55:36 -04:00
Kenneth Moreland
e149dcf1c1 Specify device adapter for array copy in worklet tests
Rob noticed a degradation in performance in some worklet tests when
ArrayCopy was added. I hypothesize that this slowdown comes from doing the
array copy with TBB instead of serial in the serial tests. (There have been
some checks in the existing code to suggest that some operations in TBB
can be slower than serial.) This change forces the array copy to be on
the device for which we are testing.
2017-10-03 14:18:39 -07:00
Robert Maynard
fe028d828a Merge topic 'marching_cubes_faster_structured_normals'
1147edb1 MarchingCubes now uses Gradient fast paths when possible.
d7d5da4f More changes to Neighborhood code to make it easier to use.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !916
2017-10-03 15:44:16 -04:00
Allison Vacanti
a934ab82f8 Merge topic 'stable_sort_keys'
fd311f57 Update worklet::Keys to stable sort keys and not modify input.
d6b2896a Add StableSortIndices worklet.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !954
2017-10-03 14:04:53 -04:00
Allison Vacanti
c0a7315906 Clean up install/VTK issues in VTKmConfig.cmake. 2017-10-03 11:28:15 -04:00
Allison Vacanti
30f4151bf8 Add missing headers to examples. 2017-10-02 12:33:30 -04:00
Allison Vacanti
a2a55edaff Install cuda interop headers.
Looks like a missing subdirectory call. Also disabled testbuilds since
these need external OpenGL setup.
2017-10-02 12:33:30 -04:00
Allison Vacanti
85062a3ab3 Add version info to installed cmake config files.
Files like VTKmConfig.cmake are now under:

prefix/lib/cmake/vtkm-1.0/, rather than
prefix/lib

to allow multiple vtkm versions to share an installation prefix.
2017-10-02 12:33:30 -04:00