72450e87f3
By introducing our own custom thrust execution policy, we make sure the sort operation hits the fastest code paths in thrust. For UInt32, Int32, and Float32 this selects thrust's radix sort, which offers a 2x to 3x speed improvement over the merge sort implementation. Second, by telling thrust that our BinaryOperators are commutative, we make sure we get the fastest code paths when executing Inclusive and Exclusive Scan.

Benchmark results (1048576 random values):

| Benchmark | median | median abs dev | mean | std dev | min | max |
|---|---|---|---|---|---|---|
| Radix Sort, vtkm::Int32 | 0.0117049s | 0.00324614s | 0.0167615s | 0.00786269s | 0.00845875s | 0.0389063s |
| Radix Sort, vtkm::Float32 | 0.0234463s | 0.000317249s | 0.021452s | 0.00470307s | 0.011255s | 0.0250643s |
| Merge Sort, vtkm::Int32 | 0.0310486s | 0.000182129s | 0.0286914s | 0.00634102s | 0.0116225s | 0.0317379s |
| Merge Sort, vtkm::Float32 | 0.0310617s | 0.000193583s | 0.0295779s | 0.00491531s | 0.0147257s | 0.032307s |
Changed files:
- BenchmarkDeviceAdapter.cxx
- BenchmarkDeviceAdapter.h
- Benchmarker.h
- CMakeLists.txt