Commit Graph

37 Commits

Author SHA1 Message Date
Robert Maynard
311618a15f Enable highest level of warnings (W4) under MSVC
This will make VTK-m's warning level match the one used by VTK. This commit
also resolves the first round of warnings that W4 exposes.
2017-09-22 13:04:28 -04:00
Kenneth Moreland
c3a3184d51 Update copyright for Sandia
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
2017-09-20 15:33:44 -06:00
Allison Vacanti
de44c58bb2 Prevent compiler from optimizing out benchmark body.
Since the result of the Reduce call was not used, several compilers
were omitting the call completely on release builds.
2017-09-19 14:22:42 -04:00
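As a generic illustration of the issue described in the commit above (not the actual VTK-m change), the sketch below shows the usual way to keep a reduction from being optimized away in a release build: make its result observable.

  #include <numeric>
  #include <vector>

  int main()
  {
    std::vector<double> data(1 << 20, 1.0);

    // If the result is never used, an optimizer may elide the reduction entirely.
    double sum = std::accumulate(data.begin(), data.end(), 0.0);

    // Writing the result to a volatile keeps the call alive in release builds.
    volatile double sink = sum;
    (void)sink;
    return 0;
  }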
Sujin Philip
4db096bfb0 Update CellToPoint worklets
Update CellToPoint worklets that assume a point will have at least
one cell.
2017-08-04 12:43:26 -04:00
David C. Lonie
760c5856f0 Add BenchmarkArrayTransfer.
This will let us measure performance while tuning CUDA managed memory
hints.
2017-07-13 15:17:02 -04:00
Robert Maynard
5dd346007b Respect the VTK-m convention of parameters being all or nothing on a line
The clang-format BinPack settings have been disabled to make sure that the
VTK-m style guideline is obeyed.
2017-05-26 13:53:28 -04:00
Kitware Robot
4ade5f5770 clang-format: apply to the entire tree 2017-05-25 07:51:37 -04:00
Ben Boeckel
0a6a2ad83a Benchmarker: include required headers 2017-05-18 12:59:33 -04:00
Kitware Robot
efbde1d54b clang-format: sort include directives 2017-05-18 12:59:33 -04:00
Sujin Philip
82d02e46ef Modify ImplicitFunctions to use Virtual Methods 2017-05-01 16:55:59 -04:00
Li-Ta Lo - 194699
6ce8a0135a Merge branch 'master' into unified-memory 2017-03-09 14:54:03 -07:00
Li-Ta Lo - 194699
b470175f98 new unified memory effort with the new Thrust device 2017-03-09 14:51:45 -07:00
Sujin Philip
9eddce6c99 Rename StreamCompact to CopyIf
Plus, removes the version that uses one array as both input and stencil.
2017-03-06 11:08:27 -05:00
Robert Maynard
6b1094c767 Consistently include windows.h by making a wrapper header.
We previously included windows.h in numerous locations using different
techniques to guard against bringing in parts of the file that are bad
(min/max macros, etc.). This solves the problem by consistently using
vtkm/internal/Windows.h to set everything up.
2016-11-28 09:54:37 -05:00
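A minimal sketch of what such a wrapper header typically does is shown below; the actual contents of vtkm/internal/Windows.h may differ.

  // Sketch of a central windows.h wrapper (assumed, not the real file).
  #ifndef vtk_m_internal_Windows_sketch_h
  #define vtk_m_internal_Windows_sketch_h

  #if defined(_WIN32)

  // Keep windows.h from defining min()/max() macros that break std::min/std::max.
  #ifndef NOMINMAX
  #define NOMINMAX
  #endif

  // Trim rarely used parts of the Windows API to speed up compilation.
  #ifndef WIN32_LEAN_AND_MEAN
  #define WIN32_LEAN_AND_MEAN
  #endif

  #include <windows.h>

  #endif // _WIN32

  #endif // vtk_m_internal_Windows_sketch_h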
Kenneth Moreland
fdaccc22db Remove exports for header-only functions/methods
Change VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.

Because inline is no longer part of these modifiers, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.
2016-11-15 22:22:13 -07:00
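A hedged sketch of the usage change described in the commit above; the Widget class is made up, while the VTKM_CONT macro is the one named in the commit.

  #include <vtkm/Types.h>

  // Illustrative only; Widget is not a real VTK-m class.
  struct Widget
  {
    // Implementation inside the class definition is implicitly inline, so the
    // VTKM_CONT modifier (formerly VTKM_CONT_EXPORT) is enough on its own.
    VTKM_CONT vtkm::Id GetSize() const { return this->Size; }

    VTKM_CONT vtkm::Id ComputeChecksum() const;

    vtkm::Id Size = 0;
  };

  // Implementation outside the class but still in a header: since the macro no
  // longer carries inline, the keyword must now be written explicitly.
  VTKM_CONT inline vtkm::Id Widget::ComputeChecksum() const
  {
    return this->Size * 31;
  }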
Robert Maynard
f31d6c2258 Refactor vtkm::Types to be concise and move math helpers out of internal.
I have verified that the optimized assembly for Vec<3> and Vec<4> is consistent
with what we generated before.
2016-10-28 14:57:16 -04:00
Kenneth Moreland
4205b94f72 Fix warnings about unused method parameters
This probably sneaked by the dashboards because not many of them compile
the benchmarking code.
2016-10-13 11:51:17 -06:00
Robert Maynard
e6bbfbe5ce Update the Worklet-based benchmarks to also test dynamic arrays.
This is useful so that we know the performance impact of different
ways to implement virtual arrays in VTK-m.
2016-10-06 11:54:06 -04:00
Robert Maynard
c2769b81e6 Move over from boost/random to C++11 random. 2016-09-08 17:10:39 -04:00
Robert Maynard
912e236241 Adjust the range so we don't potentially divide by zero. 2016-09-03 17:04:25 -04:00
Kenneth Moreland
ece53f514a Fix inappropriate placement of typename keyword
How did any compiler accept that?

Also fix minor warning with topology algorithm benchmark.
2016-08-17 15:51:43 -06:00
Robert Maynard
02929c79e4 Add more benchmarks that work at the Worklet level.
These benchmarks are the foundation for expanding the benchmarking folder
to verify the performance of more than just the device adapter.
2016-08-05 16:30:20 -04:00
John Biddiscombe
bafee5dd71 Add a Copy benchmark 2016-06-16 09:18:26 -04:00
Kenneth Moreland
7f005562ac List all benchmarking sources in the build
The benchmarking header files were not listed. This is not a huge deal since
these files do not need to be installed, but they should be listed
anyway. Changed the vtkm_save_benchmarks CMake macro to be able to list
headers.

Also moved everything from BenchmarkDeviceAdapter.h to
BenchmarkDeviceAdapter.cxx. Since this code shouldn't need to be
included by anything except this benchmark, there is no need to have it
in a header file. Plus, the build changes would mean that any change in
the header (where most of the source was) could cause all code in this
directory to recompile. I do not want to set that precedent.
2016-06-02 10:24:21 -06:00
Robert Maynard
19a941cc6e It is now easier to use the device adapter benchmark code.
For example, to benchmark only the sortbykey algorithm:

  ./bin/BenchmarkDeviceAdapter_TBB sortbykey
2016-05-03 15:53:09 -04:00
Kenneth Moreland
cc497e6a1b Remove cont/Assert.h and exec/Assert.h
These asserts are consolidated into the unified Assert.h. Also made some
minor edits to add asserts where appropriate and did a bit of
reconfiguring along the way.
2016-04-20 15:41:14 -06:00
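A small sketch of using the unified macro from the consolidated header; the helper function is hypothetical.

  #include <vtkm/Assert.h> // the unified header mentioned above
  #include <vtkm/Types.h>

  // Hypothetical helper, only to show VTKM_ASSERT replacing the old
  // cont/exec-specific assert macros.
  vtkm::Id MidPoint(vtkm::Id begin, vtkm::Id end)
  {
    VTKM_ASSERT(begin <= end); // checked in debug builds, like assert()
    return begin + (end - begin) / 2;
  }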
Robert Maynard
022898072c Update Benchmark code to properly verify all algorithms. 2015-10-28 14:20:06 -04:00
Robert Maynard
f38673f618 Replace ErrorControlOutOfMemory with ErrorControlBadAllocation. 2015-10-01 14:25:28 -04:00
Robert Maynard
72450e87f3 Make thrust use fast paths when doing sort and scan.
By introducing our own custom thrust execution policy we can make sure
to hit the fastest code paths in thrust for the sort operation. This makes
sure that for UInt32, Int32, and Float32 we use the radix sort from thrust,
which offers a 2x to 3x speed improvement over the merge sort implementation.

Secondly, by telling thrust that our BinaryOperators are commutative we
make sure that we get the fastest code paths when executing Inclusive
and Exclusive Scan.

Benchmark 'Radix Sort on 1048576 random values vtkm::Int32' results:
  median = 0.0117049s
  median abs dev = 0.00324614s
  mean = 0.0167615s
  std dev = 0.00786269s
  min = 0.00845875s
  max = 0.0389063s
Benchmark 'Radix Sort on 1048576 random values vtkm::Float32' results:
  median = 0.0234463s
  median abs dev = 0.000317249s
  mean = 0.021452s
  std dev = 0.00470307s
  min = 0.011255s
  max = 0.0250643s
Benchmark 'Merge Sort on 1048576 random values vtkm::Int32' results:
  median = 0.0310486s
  median abs dev = 0.000182129s
  mean = 0.0286914s
  std dev = 0.00634102s
  min = 0.0116225s
  max = 0.0317379s
Benchmark 'Merge Sort on 1048576 random values vtkm::Float32' results:
  median = 0.0310617s
  median abs dev = 0.000193583s
  mean = 0.0295779s
  std dev = 0.00491531s
  min = 0.0147257s
  max = 0.032307s
2015-09-03 16:00:37 -04:00
Robert Maynard
ab59e34a2f Rename pragma header guard so it makes sense for TBB and Thrust.
Boost is not the only third party that we are suppressing warnings for, so
make the name more generic.
2015-08-13 09:04:23 -04:00
Kenneth Moreland
c637bf94b1 The use of is_sorted in Benchmarker.h was ambiguous
Benchmarker provides its own implementation of is_sorted since this
method was not introduced until C++11 and not all compilers necessarily
support it. However, for those that did, the system is_sorted conflicted
with the provided is_sorted. To get around the problem, specify the full
namespace of the is_sorted being used (which is standard practice in
VTK-m anyway).
2015-08-12 09:16:54 -06:00
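A generic illustration of the fix described above; the namespace shown is illustrative, not necessarily the one used in Benchmarker.h.

  #include <vector>

  namespace vtkm {
  namespace benchmarking { // illustrative namespace

  // Project-provided fallback for compilers that lack std::is_sorted.
  template <typename Iterator>
  bool is_sorted(Iterator first, Iterator last)
  {
    if (first == last) { return true; }
    Iterator next = first;
    for (++next; next != last; ++first, ++next)
    {
      if (*next < *first) { return false; }
    }
    return true;
  }

  }
  } // namespace vtkm::benchmarking

  int main()
  {
    std::vector<int> values{ 1, 2, 3 };
    // An unqualified call can be ambiguous on C++11 compilers where
    // std::is_sorted is also visible; the fully qualified name is not.
    return vtkm::benchmarking::is_sorted(values.begin(), values.end()) ? 0 : 1;
  }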
Will Usher
046cd2d2b9 Change StorageBasic to use an aligned allocator.
The storage used will now be aligned to `VTKM_CACHE_LINE_SIZE` bytes,
resulting in slightly better cache usage and load/store performance.
This define is set in `StorageBasic.h`. We also now detect if POSIX is
available in Configure.h and will define VTKM_POSIX with _POSIX_VERSION
if it's available.

The AlignedAllocator used by StorageBasic is also STL compatible
and can be used in STL containers, so users can use it in their
std::vector and pass aligned user memory to the storage.
2015-08-11 13:42:55 -06:00
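A self-contained sketch of an STL-compatible aligned allocator in this spirit; VTK-m's actual AlignedAllocator differs in detail, a POSIX platform is assumed, and 64 bytes stands in for VTKM_CACHE_LINE_SIZE.

  #include <cstddef>
  #include <cstdlib>
  #include <new>
  #include <vector>

  // Sketch only; not VTK-m's AlignedAllocator.
  template <typename T, std::size_t Alignment = 64>
  struct AlignedAllocatorSketch
  {
    using value_type = T;

    template <typename U>
    struct rebind { using other = AlignedAllocatorSketch<U, Alignment>; };

    AlignedAllocatorSketch() = default;
    template <typename U>
    AlignedAllocatorSketch(const AlignedAllocatorSketch<U, Alignment>&) {}

    T* allocate(std::size_t n)
    {
      // posix_memalign requires a power-of-two multiple of sizeof(void*).
      void* ptr = nullptr;
      if (posix_memalign(&ptr, Alignment, n * sizeof(T)) != 0)
      {
        throw std::bad_alloc();
      }
      return static_cast<T*>(ptr);
    }

    void deallocate(T* p, std::size_t) { std::free(p); }
  };

  template <typename T, typename U, std::size_t A>
  bool operator==(const AlignedAllocatorSketch<T, A>&,
                  const AlignedAllocatorSketch<U, A>&) { return true; }
  template <typename T, typename U, std::size_t A>
  bool operator!=(const AlignedAllocatorSketch<T, A>&,
                  const AlignedAllocatorSketch<U, A>&) { return false; }

  int main()
  {
    // Cache-line-aligned storage that still behaves like an ordinary std::vector.
    std::vector<float, AlignedAllocatorSketch<float> > data(1024, 1.0f);
    return data.empty() ? 1 : 0;
  }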
Will Usher
1ea6f73297 Add our own version of is_sorted to check the assert
Also caught a bug where I incorrectly assumed abs_deviations would be sorted
in MedianAbsDeviation.
2015-08-03 10:56:59 -06:00
Will Usher
311b5dcc6b Remove C++11 feature is_sorted and increase time allotted to run benchmarks 2015-07-31 16:33:04 -06:00
Kenneth Moreland
21b3b318ba Always disable conversion warnings when including boost header files
On one of my compile platforms, GCC was giving conversion warnings from
any boost include that was not wrapped in pragmas to disable conversion
warnings. To make things easier and more robust, I created a pair of
macros, VTKM_BOOST_PRE_INCLUDE and VTKM_BOOST_POST_INCLUDE, that should
be wrapped around any #include of a boost header file.
2015-07-30 17:40:40 -06:00
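The wrapping pattern at a use site looks like the sketch below, assuming Boost is available. The macro definitions shown are an assumed pragma-based approach, not necessarily VTK-m's exact ones; only the macro names come from the commit.

  // Assumed definitions: suppress conversion warnings around third-party headers.
  #if defined(__GNUC__)
  #define VTKM_BOOST_PRE_INCLUDE \
    _Pragma("GCC diagnostic push") \
    _Pragma("GCC diagnostic ignored \"-Wconversion\"")
  #define VTKM_BOOST_POST_INCLUDE _Pragma("GCC diagnostic pop")
  #else
  #define VTKM_BOOST_PRE_INCLUDE
  #define VTKM_BOOST_POST_INCLUDE
  #endif

  // Use site: wrap every boost include in the macro pair.
  VTKM_BOOST_PRE_INCLUDE
  #include <boost/random/mersenne_twister.hpp>
  VTKM_BOOST_POST_INCLUDE

  int main()
  {
    boost::random::mt19937 rng(42);
    return static_cast<int>(rng() & 1u);
  }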
Will Usher
e982ebe41e Measurement and general improvements to the benchmark suite
- A warm-up run is done and not timed to allow for any allocation of
  room for output data without accounting for it in the run times.
  Previously this time spent allocating memory would be included in the
  time we measured for the benchmark.

- Benchmarks are run multiple times and we then compute some statistics
  about the run time of the benchmark to give a better picture of the
  expected run time of the function. To this end we run the benchmark
  either 500 times or for 1.5s, whichever comes sooner (though these are
  easily changeable). We then perform outlier limiting by Winsorising the
  data (similar to how Rust's benchmarking library works) and print out
  the median, mean, min, and max run times along with the median absolute
  deviation and standard deviation.

- Because benchmarks are run many times they can now perform some
  initial setup in the constructor, e.g. to fill some test input data
  array with values to let the main benchmark loop run faster.

- To allow for benchmarks to have members of the data type being
  benchmarked, the struct must now be templated on this type, leading to
  a bit of awkwardness. I've worked around this by adding the
  `VTKM_MAKE_BENCHMARK` and `VTKM_RUN_BENCHMARK` macros. The make
  benchmark macro generates a struct that has an `operator()` templated on
  the value type which will construct and return the benchmark functor
  templated on that type. The run macro will then use this generated
  struct to run the benchmark functor on the type list passed. You can
  also pass arguments to the benchmark functor's constructor through the
  make macro; however, this makes things more awkward because the name of
  the MakeBench struct must be different for each variation of constructor
  arguments (for example, see `BenchLowerBounds`).

- Added a short comment on how to add benchmarks in
  `vtkm/benchmarking/Benchmarker.h`, as the new system is a bit different
  from how the tests work.

- You can now pass an extra argument when running the benchmark suite to
  only benchmark specific functions, e.g. `Benchmarks_TBB
  BenchmarkDeviceAdapter ScanInclusive Sort` will only benchmark
  ScanInclusive and Sort. Running without any extra arguments will run all
  the benchmarks as before.
2015-07-28 11:03:28 -06:00
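A rough usage sketch of the macro pattern described in the commit above, based only on the commit text; the BenchFill functor, its interface details, and the type list are illustrative, and vtkm/benchmarking/Benchmarker.h remains the authoritative reference.

  #include <vtkm/ListTag.h>
  #include <vtkm/Types.h>
  #include <vtkm/benchmarking/Benchmarker.h>

  #include <algorithm>
  #include <chrono>
  #include <string>
  #include <vector>

  // Illustrative benchmark functor: templated on the value type so it can hold
  // typed members that are filled once in the constructor (setup is not timed).
  template <typename Value>
  struct BenchFill
  {
    std::vector<Value> Data;

    BenchFill() : Data(1024 * 1024) {}

    // The timed body; it returns the elapsed seconds for one run.
    vtkm::Float64 operator()()
    {
      auto start = std::chrono::high_resolution_clock::now();
      std::fill(this->Data.begin(), this->Data.end(), Value(1));
      std::chrono::duration<vtkm::Float64> elapsed =
        std::chrono::high_resolution_clock::now() - start;
      return elapsed.count();
    }

    std::string Description() const { return "Fill a 1M-element array"; }
  };

  // Generates a helper struct whose templated operator() constructs BenchFill
  // for a given value type, as described in the commit message.
  VTKM_MAKE_BENCHMARK(Fill, BenchFill);

  void RunFillBenchmarks()
  {
    // Runs the generated benchmark over every type in the list.
    VTKM_RUN_BENCHMARK(Fill, vtkm::ListTagBase<vtkm::Float32, vtkm::Int32>());
  }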
Will Usher
238d4fa759 Adding micro benchmark suite 2015-07-09 13:56:06 -06:00