Commit Graph

131 Commits

Author SHA1 Message Date
Robert Maynard
043afd326a Merge topic 'refactor_arrayhandle_to_reduce_lib_size'
9bf14b78 Correct warnings inside worklet::Clip when making array handles
1b6d67e0 Always defer to the serial allocator when allocating basic storage
bf2b4169 Refactor vtk-m ArrayHandle to use mutable over const_cast
705528bf vtk-m ArrayHandle + basic storage has an optimized PrepareForDevice method
22f9ae3d vtk-m ArrayHandle + basic holds control data by StorageBasicBase
b1d0060d Make Storage and ArrayHandle export for the same value types.
d0a68d32 Refactor vtk-m storage basic to generate less code

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1084
2018-02-21 16:56:53 -05:00
Thomas Otahal
1dabda4216 Bug fix for max threads in radix sort
Use PlainType to get max threads instead of
ValueType for key-value radix sorts.
2018-02-20 08:28:29 -07:00
Robert Maynard
705528bf17 vtk-m ArrayHandle + basic storage has an optimized PrepareForDevice method
By hard coding the PrepareForDevice to know about all the different VTK-m
devices, we can have a single base class do the execution allocation, and not
have that logic repeated in each child class.
2018-02-16 10:00:28 -05:00
Thomas Otahal
30f6e53c27 Fixed compiler warning for char type with kxsort
Added check for long double arrays, use TBB parallel_sort

Added radix sort instantiations for char16_t, char32_t, and
wchar_t. std::is_arithmetic<T> will evaluate to true for these
types.

Removed VTKM_CONT_EXPORT in DeviceAdapterAlgorithmTBB.h to try
and fix dll related error on Windows.
2018-02-14 09:44:11 -07:00
Thomas Otahal
8f75df65c5 Fix Visual Studio compiler warnings. 2018-02-14 07:44:37 -07:00
Thomas Otahal
d7a98057b1 Move kxsort.h inside VTKM_THIRDPARTY_PRE_INCLUDE 2018-02-13 15:27:59 -07:00
Thomas Otahal
f53d0789b0 Removed debug output statements. 2018-02-13 12:32:53 -07:00
Thomas Otahal
7c0b09deb4 Removed Gnu specific __attribute__ macro for unused variables
Replaced with (void)parameter
2018-02-13 11:36:47 -07:00
Thomas Otahal
773897655c Added missing files from Rob's patch. 2018-02-13 09:23:10 -07:00
Thomas Otahal
84de519250 Applied patch from Rob Maynard
This makes finding the implementation and explicit instantiations easier.
It also removes most macro usage from RadixSort.
2018-02-13 09:16:45 -07:00
Thomas Otahal
9cf12c48ce Merge branch 'master' into cpu_parallel_radix_sort 2018-02-07 10:23:54 -07:00
Thomas Otahal
5e72f96b99 CPU parallel radix sorting
Created split implementation. Parallel radix
sort calls moved to vtkm_cont library.

Added key value radix sorts. SortByKey will invoke
radix sort when the key is a fundamental C++ numeric
or character type.

Added fast path for vtkm::SortLess and vtkm::SortGreater
calls to Sort and SortByKey.
2018-01-31 14:08:14 -07:00
Robert Maynard
ef611239f6 Don't allow DeviceTaskTypes to construct tasks from rvalues. 2018-01-18 13:55:37 -05:00
Thomas Otahal
0d5deec473 Merge branch 'master' into cpu_parallel_radix_sort 2018-01-10 08:42:49 -07:00
Thomas Otahal
250888f7af CPU parallel radix sorting
Parallel radix sorting will be invoked in DeviceAdapterAlgorthmTBB.h when
the input is ArrayHandle<T, vtkm::cont::StorageTagBasic> where T is one of
the following basic C++ types:

unsigned int
unsigned short int
unsigned long int
unsigned long long int
unsigned char
char16_t
char32_t
wchar_t
char
short
int
long long
signed char
float
double

If a comparison operator is provided, it must be type std::less<T> or std::greater<T>.

Radix sort implementation is Satish parallel radix sort as documented in the
following citation:

  Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort.
    N. Satish, C. Kim, J. Chhugani, A. D. Nguyen, V. W. Lee, D. Kim, and P. Dubey.
    In Proc. SIGMOD, pages 351–362, 2010

Implementation is based on Takuya Akiba's GitHub source code with the following
changes:

   - Changed parallel threading from OpenMP to TBB tasks
   - Removed pair sorting
   - Added minimum threshold for parallel, will instead invoke serial radix sort (kxsort)
   - Added std::greater<T> and std::less<T> to interface for descending order sorts
   - Added can_use_parallel_radix_sort<T, F>() function to determine if parallel radix sorting
     is possible for type T and compare function F (fallback is std::sort() if not possible)
   - Added linear scaling of threads used by the algorithm for more stable performance
     on machines with lots of available threads (KNL and Haswell)

Added kxsort (serial MSD radix sort by Dinghua Li via GitHub) implementation without modification.
2018-01-10 07:28:21 -07:00
Robert Maynard
93bc0198fe Suppress false positive warnings about calling host device functions. 2018-01-02 10:40:49 -05:00
Sujin Philip
8c242cef91 Switch from faux to true virtuals 2017-11-06 15:25:29 -05:00
Allison Vacanti
6c2f22b5ce Overcome narrowing warning on MSVC. 2017-10-11 17:24:04 -04:00
Allison Vacanti
1018d981a0 Check for overlap in CopySubRange.
Some parallel copy implementations will not handle this sanely.
2017-10-11 16:52:32 -04:00
Allison Vacanti
374321e027 Use std::copy in TBB copy routines. 2017-10-11 16:52:32 -04:00
Allison Vacanti
75f88b4c46 Add versioning to VTKM installed include/share dirs. 2017-10-02 11:39:10 -04:00
Robert Maynard
311618a15f Enable highest level of warnings(W4) under MSVC
This will make VTK-m warning level match the one used by VTK. This commit
also resolves the first round of warnings that W4 exposes.
2017-09-22 13:04:28 -04:00
Robert Maynard
427ff728ad Merge topic 'restore_tbb_schedule_explicit_grain_size'
9607f71c TBB 1D scheduling restored to using the explicit grain size.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !946
2017-09-21 16:45:42 -04:00
Robert Maynard
9607f71cd3 TBB 1D scheduling restored to using the explicit grain size. 2017-09-21 09:17:44 -04:00
Kenneth Moreland
c3a3184d51 Update copyright for Sandia
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
2017-09-20 15:33:44 -06:00
Allison Vacanti
0b36596fd5 Merge topic '173_tbb_unique'
3b03177c Add TBB specialization of Unique.
94d668dd Add serial version of Unique.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !933
2017-09-20 14:35:08 -04:00
Allison Vacanti
3b03177c3f Add TBB specialization of Unique.
This performs roughly an order of magnitude better than the old
implementation on a quad core processor.
2017-09-20 09:47:22 -04:00
Allison Vacanti
3638b340ef Specialize CopyIf for TBB backend. 2017-09-19 11:09:27 -04:00
Allison Vacanti
c07c37aea2 Remove assert for incorrect assumption new TBB::RBK impl.
The assertion assumes that the RHS of the join operation has not been
yet reduced, which is not correct.
2017-09-18 16:12:07 -04:00
Allison Vacanti
d174c0fe3b Add TBB specialization for ReduceByKey.
TBB's ReduceByKey was using the generic DeviceAdapterGeneral
implementation and was about 50x slower than the serial implementation,
which is very efficient.

This patch improves TBB's RBK implementation significantly, though it still
does not scale well. On a quad core processor, this implementation performs
comparably or slightly worse than the highly efficient serial algorithm.
More than 4 cores may be needed to see sufficient parallel speedup that
would overcome the TBB overhead, and grain size does not seem to affect the
performance significantly.
2017-09-15 14:25:16 -04:00
Robert Maynard
f68635941e Convert VTK-m over to use 'using' instead of 'typedef' 2017-08-17 10:47:25 -04:00
Robert Maynard
a487017fd1 Remove lines that only contain a semi-colon. 2017-08-16 14:31:17 -04:00
Robert Maynard
b85cdd9080 Convert VTK-m over to use 'using' instead of 'typedef' 2017-08-07 14:05:43 -04:00
Robert Maynard
c09e88d214 Improve the overall doxygen content for vtk-m. 2017-07-07 11:14:25 -04:00
David C. Lonie
b2c3e41645 Refactor array transfer logic for basic storage.
The old templated array transfer mechanism generated a lot of code
that ended up doing a simple, type-agnostic memcpy for most devices.
This patch specialized array handles for basic storage and uses a
fast-path array transfer implementation. This reduces the size of the
vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).
2017-06-29 13:18:44 -04:00
Robert Maynard
5dd346007b Respect VTK-m convention of parameters all or nothing on a line
clang-format BinPack settings have been disabled to make sure that the
VTK-m style guideline is obeyed.
2017-05-26 13:53:28 -04:00
Robert Maynard
60a405ef65 Add TaskTiling1D/3D which use faux virtuals to reduce binary size.
Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that
we can re-use the same launching logic for all Worklets, instead of generating
per worlet code. To keep the performance the same the TilingTask now is past
a range of indices to work on, rather than a single index.

Binary size reduction:
WorkletTests_SERIAL old - 19MB
WorkletTests_SERIAL new - 18MB

WorkletTests_TBB old - 39MB
WorkletTests_TBB new - 18MB

libvtkAcceleratorsVTKm old - 48MB
libvtkAcceleratorsVTKm new - 19MB
2017-05-25 11:00:01 -04:00
Kitware Robot
4ade5f5770 clang-format: apply to the entire tree 2017-05-25 07:51:37 -04:00
Ben Boeckel
db36ee22b0 cont: move VTKM_SUPPRESS_EXEC_WARNINGS to above declarations
Most uses of this macro appeared before any associated `template` lines.
Make them consistent. This also makes clang-format happier.
2017-05-23 14:34:20 -04:00
Robert Maynard
9d75e7b775 Remove unneeded member variables from tbb ScheduleKernelId3 2017-05-23 10:49:19 -04:00
Kitware Robot
efbde1d54b clang-format: sort include directives 2017-05-18 12:59:33 -04:00
Robert Maynard
57ab48fe8e Replace occurrences of NULL with nullptr. 2017-05-04 10:50:57 -04:00
Sujin Philip
e9898cc5cf Merge topic 'virtual-methods'
4049b5b2 Add ClipWithImplicitFunction Filter
82d02e46 Modify ImplicitFunctions to use Virtual Methods
968960c1 Add Virtual Methods Framework

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !750
2017-05-02 16:12:04 -04:00
Sujin Philip
968960c1a1 Add Virtual Methods Framework 2017-05-01 16:51:42 -04:00
Robert Maynard
80b9d74a23 Merge topic 'embed_more_into_vtkm_cont'
ec6589d3 Only enable -fPIC on component static libraries when necessary.
cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont.
355eea88 Get the vtkm cont cuda object to compile properly.
6ecc22bb First pass at compiling ArrayHandle into vtkm_cont.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !715
2017-04-26 13:47:10 -04:00
David C. Lonie
ec6589d391 Only enable -fPIC on component static libraries when necessary. 2017-04-20 12:15:31 -04:00
Robert Maynard
5f55be17e7 Update the TBB grain size to be a more reasonable default.
We not only update the TBB grain size to be a more value, we also specify
a grain size for the 3D scheduler.
2017-04-06 16:34:15 -04:00
David C. Lonie
cbfe5fddd9 Fix up various issues with ArrayHandles in vtkm_cont. 2017-04-05 15:45:11 -07:00
David C. Lonie
6ecc22bb8c First pass at compiling ArrayHandle into vtkm_cont. 2017-04-05 15:45:01 -07:00
Sujin Philip
566d70c450 Merge topic 'fix-windows_h-include-logic'
25f9f88f Fix windows.h include logic

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !722
2017-03-13 09:59:24 -04:00