vtk-m

mirror of https://gitlab.kitware.com/vtk/vtk-m synced 2024-09-19 10:35:42 +00:00

Author	SHA1	Message	Date
Robert Maynard	022898072c	Update Benchmark code to properly verify all algorithms.	2015-10-28 14:20:06 -04:00
Robert Maynard	f38673f618	Replace ErrorControlOutOfMemory with ErrorControlBadAllocation.	2015-10-01 14:25:28 -04:00
Robert Maynard	72450e87f3	Make thrust use fast paths when doing sort and scan. By introducing our own custom thrust execution policy we can make sure to hit the fastest code paths in thrust for the sort operation. This makes sure that for UInt32, Int32, and Float32 we use the radix sort from thrust, which offers a 2x to 3x speed improvement over the merge sort implementation. Secondly, by telling thrust that our BinaryOperators are commutative we make sure that we get the fastest code paths when executing Inclusive and Exclusive Scan. Benchmark 'Radix Sort on 1048576 random values vtkm::Int32' results: median = 0.0117049s, median abs dev = 0.00324614s, mean = 0.0167615s, std dev = 0.00786269s, min = 0.00845875s, max = 0.0389063s. Benchmark 'Radix Sort on 1048576 random values vtkm::Float32' results: median = 0.0234463s, median abs dev = 0.000317249s, mean = 0.021452s, std dev = 0.00470307s, min = 0.011255s, max = 0.0250643s. Benchmark 'Merge Sort on 1048576 random values vtkm::Int32' results: median = 0.0310486s, median abs dev = 0.000182129s, mean = 0.0286914s, std dev = 0.00634102s, min = 0.0116225s, max = 0.0317379s. Benchmark 'Merge Sort on 1048576 random values vtkm::Float32' results: median = 0.0310617s, median abs dev = 0.000193583s, mean = 0.0295779s, std dev = 0.00491531s, min = 0.0147257s, max = 0.032307s.	2015-09-03 16:00:37 -04:00
Robert Maynard	ab59e34a2f	Rename pragma header guard so it makes sense for tbb and thrust. Boost is not the only third party that we are suppressing warnings for, so make the name more generic.	2015-08-13 09:04:23 -04:00
Kenneth Moreland	c637bf94b1	The use of is_sorted in Benchmarker.h was ambiguous. Benchmarker provides its own implementation of is_sorted since this method was not introduced until C++11 and not all compilers necessarily support it. However, for those that did, the system is_sorted conflicted with the provided is_sorted. To get around the problem, specify the full namespace of the is_sorted being used (which is standard practice in VTK-m anyway).	2015-08-12 09:16:54 -06:00
Will Usher	046cd2d2b9	Change StorageBasic to use an aligned allocator. The storage used will now be aligned to `VTKM_CACHE_LINE_SIZE` bytes, resulting in slightly better cache usage and load/store performance. This define is set in `StorageBasic.h`. We also now detect if POSIX is available in Configure.h and will define VTKM_POSIX with _POSIX_VERSION if it's available. The AlignedAllocator used by StorageBasic is also STL compatible and can be used in STL containers, so users can use it in their std::vector and pass aligned user memory to the storage.	2015-08-11 13:42:55 -06:00
Will Usher	1ea6f73297	Add our own version of is_sorted to check the assert. Also caught a bug where I incorrectly assumed abs_deviations would be sorted in MedianAbsDeviation.	2015-08-03 10:56:59 -06:00
Will Usher	311b5dcc6b	Remove C++11 feature is_sorted and increase time allotted to run bench	2015-07-31 16:33:04 -06:00
Kenneth Moreland	21b3b318ba	Always disable conversion warnings when including boost header files. On one of my compile platforms, GCC was giving conversion warnings from any boost include that was not wrapped in pragmas to disable conversion warnings. To make things easier and more robust, I created a pair of macros, VTKM_BOOST_PRE_INCLUDE and VTKM_BOOST_POST_INCLUDE, that should be wrapped around any #include of a boost header file.	2015-07-30 17:40:40 -06:00
Will Usher	e982ebe41e	Measurement and general improvements to the benchmark suite
- A warm up run is done and not timed to allow for any allocation of room for output data without accounting for it in the run times. Previously this time spent allocating memory would be included in the time we measured for the benchmark.
- Benchmarks are run multiple times and we then compute some statistics about the run time of the benchmark to give a better picture of the expected run time of the function. To this end we run the benchmark either 500 times or for 1.5s, whichever comes sooner (though these are easily changeable). We then perform outlier limiting by Winsorising the data (similar to how Rust's benchmarking library works) and print out the median, mean, min and max run times along with the median absolute deviation and standard deviation.
- Because benchmarks are run many times they can now perform some initial setup in the constructor, e.g. to fill some test input data array with values to let the main benchmark loop run faster.
- To allow for benchmarks to have members of the data type being benchmarked, the struct must now be templated on this type, leading to a bit of awkwardness. I've worked around this by adding the `VTKM_MAKE_BENCHMARK` and `VTKM_RUN_BENCHMARK` macros. The make benchmark macro generates a struct that has an `operator()` templated on the value type which will construct and return the benchmark functor templated on that type. The run macro will then use this generated struct to run the benchmark functor on the type list passed. You can also pass arguments to the benchmark functor's constructor through the make macro; however, this makes things more awkward because the name of the MakeBench struct must be different for each variation of constructor arguments (for example see `BenchLowerBounds`).
- Added a short comment on how to add benchmarks in `vtkm/benchmarking/Benchmarker.h` as the new system is a bit different from how the tests work.
- You can now pass an extra argument when running the benchmark suite to only benchmark specific functions, e.g. `Benchmarks_TBB BenchmarkDeviceAdapter ScanInclusive Sort` will only benchmark ScanInclusive and Sort. Running without any extra arguments will run all the benchmarks as before.	2015-07-28 11:03:28 -06:00
Will Usher	238d4fa759	Adding micro benchmark suite	2015-07-09 13:56:06 -06:00
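
Commit e982ebe41e describes an untimed warm up run followed by repeated timed runs that stop after 500 runs or 1.5s, whichever comes sooner. The following is a minimal sketch of that measurement loop, not the code in `vtkm/benchmarking/Benchmarker.h`. The name `CollectSamples`, its parameters, and the use of `std::chrono` are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not VTK-m's Benchmarker): runs `functor` once without
// timing it, then records per-run times in seconds until either the run
// count limit or the total time limit is reached.
template <typename Functor>
std::vector<double> CollectSamples(Functor&& functor,
                                   std::size_t maxRuns = 500,
                                   double maxSeconds = 1.5)
{
  using Clock = std::chrono::steady_clock;
  assert(maxRuns > 0);

  // Warm up run: lets the functor allocate output storage so that cost is
  // not charged to the first timed sample.
  functor();

  std::vector<double> samples;
  samples.reserve(maxRuns);
  double total = 0.0;
  while (samples.size() < maxRuns && total < maxSeconds)
  {
    const Clock::time_point start = Clock::now();
    functor();
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    samples.push_back(elapsed.count());
    total += elapsed.count();
  }
  return samples;
}
```

The time limit is checked against the sum of the timed runs only, so the warm up run does not count toward the 1.5s budget in this sketch.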

11 Commits
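
Commits e982ebe41e and 1ea6f73297 mention Winsorising the timings and reporting the median and median absolute deviation. The sketch below shows one way those statistics can be computed, under stated assumptions: all function names are hypothetical, percentiles use linear interpolation between neighbouring samples, and none of this is taken from the VTK-m sources. Note that the absolute deviations are not sorted when they are produced, which is the kind of mistake 1ea6f73297 reports fixing, so `Median` always sorts its own copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; names are not VTK-m's.

// Value at the given percentile (0..100) of an already sorted sample,
// linearly interpolating between the two nearest entries.
double PercentileOfSorted(const std::vector<double>& sorted, double percent)
{
  assert(!sorted.empty());
  if (sorted.size() == 1)
  {
    return sorted[0];
  }
  const double rank = (percent / 100.0) * static_cast<double>(sorted.size() - 1);
  const std::size_t low = static_cast<std::size_t>(std::floor(rank));
  const std::size_t high = std::min(low + 1, sorted.size() - 1);
  const double frac = rank - static_cast<double>(low);
  return sorted[low] + (sorted[high] - sorted[low]) * frac;
}

// Winsorise in place: sort the samples, then clamp every value into the
// range [percent-th percentile, (100 - percent)-th percentile].
void Winsorize(std::vector<double>& samples, double percent)
{
  std::sort(samples.begin(), samples.end());
  const double low = PercentileOfSorted(samples, percent);
  const double high = PercentileOfSorted(samples, 100.0 - percent);
  for (double& s : samples)
  {
    s = std::min(std::max(s, low), high);
  }
}

// Median of an arbitrary (possibly unsorted) sample; works on a copy.
double Median(std::vector<double> samples)
{
  std::sort(samples.begin(), samples.end());
  return PercentileOfSorted(samples, 50.0);
}

// Median of the absolute deviations from the median.
double MedianAbsDeviation(const std::vector<double>& samples)
{
  const double med = Median(samples);
  std::vector<double> deviations;
  deviations.reserve(samples.size());
  for (double s : samples)
  {
    deviations.push_back(std::abs(s - med));
  }
  // The deviations are in input order, not sorted; Median sorts its copy.
  return Median(deviations);
}
```

Unlike trimming, Winsorising keeps the sample count unchanged: outliers are pulled in to the chosen percentiles rather than discarded, so a single slow run moves the mean less while the median stays where it was.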