Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
3b03177c Add TBB specialization of Unique.
94d668dd Add serial version of Unique.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !933
TBB's ReduceByKey was using the generic DeviceAdapterGeneral
implementation and was about 50x slower than the serial implementation,
which is very efficient.
This patch improves TBB's RBK implementation significantly, though it still
does not scale well. On a quad core processor, this implementation performs
comparably or slightly worse than the highly efficient serial algorithm.
More than 4 cores may be needed to see sufficient parallel speedup that
would overcome the TBB overhead, and grain size does not seem to affect the
performance significantly.
The old templated array transfer mechanism generated a lot of code
that ended up doing a simple, type-agnostic memcpy for most devices.
This patch specialized array handles for basic storage and uses a
fast-path array transfer implementation. This reduces the size of the
vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).
Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that
we can re-use the same launching logic for all Worklets, instead of generating
per worlet code. To keep the performance the same the TilingTask now is past
a range of indices to work on, rather than a single index.
Binary size reduction:
WorkletTests_SERIAL old - 19MB
WorkletTests_SERIAL new - 18MB
WorkletTests_TBB old - 39MB
WorkletTests_TBB new - 18MB
libvtkAcceleratorsVTKm old - 48MB
libvtkAcceleratorsVTKm new - 19MB
ec6589d3 Only enable -fPIC on component static libraries when necessary.
cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont.
355eea88 Get the vtkm cont cuda object to compile properly.
6ecc22bb First pass at compiling ArrayHandle into vtkm_cont.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !715
windows.h was only being included for MSVC, while in UnitTestTimer.cxx, the
Windows function Sleep was being called after check for _WIN32. This was
causing compilation failure in MINGW.
Fixes#122
There were some issues for device adapter algorithms (like scan and
reduce) for empty arrays or arrays of size 1. This adds tests for these
short arrays to the device adapter algorithm tests and fixes these
problems.
b97b4cc7 Allow thrust::reduce to work when iterator and initial value types differ.
64bcc343 Refactor MinAndMax to use vtkm::Vec<T,2> instead of Pair.
8d60ed57 Refactor MinAndMax to be a shared binary operator.
18375b54 Update Bound computations to always use a single Reduce call
2cfc9743 Reduce can support reduce to a T type that isn't the arrayhandles T type.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !614
6b1094c7 Consistently include windows.h by making a wrapper header.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !620
We previously included windows.h in numerous locations using different
techniques to guard against bringing in parts of the file that are bad
(min/max macros, etc). This solves the problem by consistently using
vtkm/internal/Windows.h to setup everything.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.