TBB's ReduceByKey was using the generic DeviceAdapterGeneral
implementation and was about 50x slower than the serial implementation,
which is very efficient.
This patch significantly improves TBB's ReduceByKey implementation, though
it still does not scale well. On a quad-core processor, this implementation
performs comparably to or slightly worse than the highly efficient serial
algorithm. More than 4 cores may be needed to see enough parallel speedup
to overcome the TBB overhead. Grain size does not seem to affect the
performance significantly.
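For reference, the serial semantics being matched can be sketched as below.
This is a minimal standalone illustration, not the VTK-m or TBB code:
ReduceByKeySerial is a hypothetical name, and it assumes that only runs of
consecutive equal keys are combined.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not VTK-m API): combines the values of each run of
// consecutive equal keys with `op` and emits one key/value pair per run.
template <typename K, typename V, typename BinaryOp>
std::pair<std::vector<K>, std::vector<V>> ReduceByKeySerial(
  const std::vector<K>& keys, const std::vector<V>& values, BinaryOp op)
{
  assert(keys.size() == values.size());
  std::vector<K> outKeys;
  std::vector<V> outValues;
  for (std::size_t i = 0; i < keys.size(); ++i)
  {
    if (!outKeys.empty() && outKeys.back() == keys[i])
    {
      // Same key as the current run: fold the value into the run's result.
      outValues.back() = op(outValues.back(), values[i]);
    }
    else
    {
      // New key: start a new run.
      outKeys.push_back(keys[i]);
      outValues.push_back(values[i]);
    }
  }
  return { outKeys, outValues };
}
```

A single pass like this does very little work per element, which is
consistent with the serial version being hard to beat on a few cores.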
In the case where the number of steps taken by each particle is explicitly
provided, the code to initialize the step and status arrays for the
particles was missing.
e9f9a3d8 remove setting of DeviceAdapter from cosmotools worklet
cdf84ccb Add sample input
6ca2683f Remove the data file for examples
f3766449 Cosmology halo finder
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !919
The filter classes have an internal CellSetIndex member that tracks which
cell set to operate on. The get accessor is called
GetActiveCellSetIndex (note the descriptive "Index" at the end of the
function name). However, the set accessor was called SetActiveCellSet
(sans "Index"). This discrepancy does not make a lot of sense.
This commit changes SetActiveCellSet to SetActiveCellSetIndex. Not only
do I like the extra descriptor (in case we later want to set cells by
name), the set method is also used much less than the get method, so
renaming it is less disruptive.
Previously, ConvertNumComponentsToOffsets always used TryCompile on the
global set of runtime devices. That is still the default behavior, but
now you are able to specify your own runtime tracker. Also, there are
now versions of ConvertNumComponentsToOffsets that take a device adapter
tag.
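As a rough illustration of the conversion itself, the offsets are an
exclusive prefix sum of the component counts. The sketch below is standalone
C++ rather than the VTK-m implementation: NumComponentsToOffsets and
ComponentEnd are hypothetical names, and the runtime tracker and device
adapter tag are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not VTK-m API): exclusive prefix sum of the counts.
// offsets[i] is where group i starts in the flat source array; totalSize
// receives the length that the source array must have.
std::vector<std::size_t> NumComponentsToOffsets(
  const std::vector<std::size_t>& numComponents, std::size_t& totalSize)
{
  std::vector<std::size_t> offsets(numComponents.size());
  std::size_t running = 0;
  for (std::size_t i = 0; i < numComponents.size(); ++i)
  {
    offsets[i] = running;
    running += numComponents[i];
  }
  totalSize = running;
  return offsets;
}

// Hypothetical helper: one-past-the-end index of group i in the source array.
std::size_t ComponentEnd(const std::vector<std::size_t>& offsets,
                         std::size_t totalSize, std::size_t i)
{
  assert(i < offsets.size());
  return (i + 1 < offsets.size()) ? offsets[i + 1] : totalSize;
}
```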
Run the friends-of-friends algorithm and then the NxN most bound particle
(MBP) algorithm to find the halo center. The cosmology center finder runs
the NxN MBP algorithm, followed by an estimator that reduces the problem to
MxN MBP to speed up the run.
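The NxN MBP step can be sketched as a brute-force search for the particle
with the lowest gravitational potential. This is a standalone illustration
under stated assumptions, not the worklet code: the Particle struct and
MostBoundParticleNxN are hypothetical names, the gravitational constant is
dropped because it does not change which particle is the minimum, and the
MxN estimator is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical particle record used only for this sketch.
struct Particle
{
  double x, y, z, mass;
};

// Returns the index of the particle with the lowest potential, where the
// potential of particle i is the sum over all j != i of -mass_j / r_ij.
std::size_t MostBoundParticleNxN(const std::vector<Particle>& p)
{
  assert(!p.empty());
  std::size_t best = 0;
  double bestPotential = std::numeric_limits<double>::max();
  for (std::size_t i = 0; i < p.size(); ++i)
  {
    double potential = 0.0;
    for (std::size_t j = 0; j < p.size(); ++j)
    {
      if (i == j)
      {
        continue;
      }
      const double dx = p[i].x - p[j].x;
      const double dy = p[i].y - p[j].y;
      const double dz = p[i].z - p[j].z;
      const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
      if (r > 0.0) // skip coincident particles to avoid dividing by zero
      {
        potential -= p[j].mass / r;
      }
    }
    if (potential < bestPotential)
    {
      bestPotential = potential;
      best = i;
    }
  }
  return best;
}
```

The double loop is what makes this N x N. Restricting the outer loop to M
candidate particles would give an M x N cost.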
This allows you to defer its construction. The default constructor
will set up the scatter to have 0 inputs and 0 outputs, so using
it will likely quickly reveal an error.
c5232e99 Simplify the implementation of vtkm::ForEach
6069c19f Brigand.hpp now works around CUDA 9 compiler issues.
6a4e91d5 ExecutionPolicy now handles CUDA9 removal of __CUDACC_VER__
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !902
f492f7ff reviewed version 1
a0cdad52 reviewed version 1
bc3b2e2b reviewed version 1
839185a5 reviewed version 1
6f509a8b reviewed version 1
babb154a reviewed version 1
90c870f2 reviewed version 1
1d80d5b6 reviewed version 1
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !882
Simplifying the implementation also works around a CUDA 9 issue: compiling
the old version caused an internal compiler error that crashed the
compilation.
b12a20a5 Fix issue where auto type was not resolving template parameters
3471dc27 Expand usage of AverageByKey
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !903
bbf84c11 Clear framebuffer to black instead of white.
3210e502 Adds a small Z offset to the wireframe edges to solve z-fights
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !907
This commit solved z-fighting issues with two changes:
- A small offset, proportional to distance between near and far planes, is
applied in camera space to the edges.
- The minimum screenspace offset is increased as well for the same reason.
A new test case is added for uniform grids.
Additionally, the line plotting algorithm is changed to round off the
edge endpoints to fill in empty pixels seen on uniform grids.
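The two ideas above can be sketched roughly as follows. This is not the
wireframer code: OffsetEdgeDepth, RoundEndpoint, and the offsetFactor
parameter are hypothetical, and the sketch assumes depth is a positive
distance from the camera, so subtracting the offset moves an edge toward
the viewer.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: pulls an edge's camera-space depth toward the viewer
// by an amount proportional to the distance between the near and far planes.
double OffsetEdgeDepth(double depth, double nearPlane, double farPlane,
                       double offsetFactor)
{
  assert(farPlane > nearPlane);
  return depth - offsetFactor * (farPlane - nearPlane);
}

// Hypothetical helper: rounds an edge endpoint coordinate to the nearest
// pixel instead of truncating it, so the first and last pixels of a line
// are not dropped.
long RoundEndpoint(double coordinate)
{
  return std::lround(coordinate);
}
```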