vtk-m2/benchmarking
Kenneth Moreland c72256e555 Fix deprecation warning from Google benchmark
The warning is that asking not to optimize a const variable does not
always work. Fix by simply not declaring the variable as const.
2024-05-13 08:20:44 -06:00
BenchmarkArrayTransfer.cxx Make BenchmarkArrayTransfer actually benchmark transfers 2020-08-04 09:16:46 -06:00
BenchmarkAtomicArray.cxx Prefer ArrayHandle::Fill over Algorithm::Fill 2022-01-04 08:50:57 -07:00
BenchmarkCopySpeeds.cxx Implement tbb runtime device configuration and update vtkm to use it 2021-09-20 10:24:23 -06:00
BenchmarkDeviceAdapter.cxx Fix deprecation warning from Google benchmark 2024-05-13 08:20:44 -06:00
Benchmarker.h Remove brigand from Benchmarker.h 2022-03-08 07:25:08 -07:00
BenchmarkFieldAlgorithms.cxx Merge topic 'no-execution-whole-array' 2022-10-31 14:41:54 -04:00
BenchmarkFilters.cxx Consolidate WarpScalar and WarpVector filter 2023-09-26 07:20:09 -04:00
BenchmarkInSitu.cxx Split flying edges and marching cells into separate filters 2023-05-04 15:20:20 +02:00
BenchmarkLocators.cxx Verbose function names. 2023-12-12 15:55:00 -05:00
BenchmarkODEIntegrators.cxx Isosurface Uncertainty Visualization Filter 2023-11-15 21:04:37 -05:00
BenchmarkRayTracing.cxx add include CanvasRayTracer.h 2023-05-30 13:01:02 -06:00
BenchmarkTopologyAlgorithms.cxx Remove testing headers from benchmarking 2021-06-10 09:41:26 -06:00
CMakeLists.txt Add benchmark for 2D explicit grids. 2023-12-05 09:39:08 -05:00
README_insitu.md Switch how InSitu benchmark iterates 2022-09-12 09:24:47 -06:00
README.md benchmarks: pass unparsed args to Google benchmark 2020-04-21 10:52:31 -04:00
vtkm.module Fix some deprecated hacks in modules 2022-10-27 10:24:28 -06:00

BENCHMARKING VTK-m

TL;DR

When configuring VTK-m with CMake, pass the flag -DVTKm_ENABLE_BENCHMARKS=1. In the build directory you will then find the following binaries:

$ ls bin/Benchmark*
bin/BenchmarkArrayTransfer*  bin/BenchmarkCopySpeeds* bin/BenchmarkFieldAlgorithms*
bin/BenchmarkRayTracing* bin/BenchmarkAtomicArray*    bin/BenchmarkDeviceAdapter*
bin/BenchmarkFilters* bin/BenchmarkTopologyAlgorithms*

Taking as an example BenchmarkArrayTransfer, we can run it as:

$ bin/BenchmarkArrayTransfer -d Any

Parts of this Document

  1. TL;DR
  2. Devices
  3. Filters
  4. Compare with baseline
  5. Installing compare-benchmarks.py

Choosing devices

Taking BenchmarkArrayTransfer as an example, we can determine which devices it can run on by simply running it without arguments:

$ bin/BenchmarkArrayTransfer
...
Valid devices: "Any" "Serial"
...

From the list of valid devices, you can choose which device runs the benchmark with the -d option:

$ bin/BenchmarkArrayTransfer -d Serial
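
If you drive the benchmarks from a script, the device list can be read from that output. The following is a minimal sketch, not part of VTK-m: parse_valid_devices is a hypothetical helper, and it assumes the line looks exactly like the sample above, with each device name in double quotes.

```python
import re


def parse_valid_devices(output):
    """Return the device names found on the 'Valid devices:' line, if any.

    Hypothetical helper; assumes the format shown in the sample output.
    """
    for line in output.splitlines():
        if line.strip().startswith("Valid devices:"):
            # Each device name is printed as a double-quoted token.
            return re.findall(r'"([^"]+)"', line)
    return []


# Sample text copied from the example output above.
sample = '...\nValid devices: "Any" "Serial"\n...'
devices = parse_valid_devices(sample)
print(devices)  # ['Any', 'Serial']
```

In practice the text would come from running the binary (for example with subprocess.run and captured output). Whether the list is printed on stdout or stderr is not stated here, so a script may need to check both.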

Run a subset of your benchmarks

VTK-m benchmarks use Google Benchmark, which lets you select a subset of benchmarks with the flag --benchmark_filter=REGEX.

For instance, if you want to run all the benchmarks that write something, you would run:

$ bin/BenchmarkArrayTransfer -d Serial --benchmark_filter='Write'

Note that you can list all of the available benchmarks with the option --benchmark_list_tests.
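
To get a feel for which names a given filter selects, the sketch below applies a pattern to benchmark names copied from the sample output later in this document. It is only an approximation: it assumes the filter matches anywhere in the name (a search rather than a full match), and Python's re dialect is not guaranteed to behave identically to the regex engine used by Google Benchmark. The function select is a hypothetical helper.

```python
import re

# Benchmark names copied from the sample output in this document.
names = [
    "BenchContToExecRead<F32>/Bytes:1024/manual_time",
    "BenchContToExecWrite<F32>/Bytes:1024/manual_time",
    "BenchContToExecReadWrite<F32>/Bytes:1024/manual_time",
    "BenchRoundTripRead<F32>/Bytes:1024/manual_time",
    "BenchRoundTripReadWrite<F32>/Bytes:1024/manual_time",
    "BenchExecToContRead<F32>/Bytes:1024/manual_time",
    "BenchExecToContWrite<F32>/Bytes:1024/manual_time",
    "BenchExecToContReadWrite<F32>/Bytes:1024/manual_time",
]


def select(pattern, candidates):
    """Keep the names in which the pattern matches anywhere (assumed semantics)."""
    rx = re.compile(pattern)
    return [name for name in candidates if rx.search(name)]


selected = select("Write", names)
for name in selected:
    print(name)
```

With these names, the pattern Write keeps the five Write and ReadWrite variants and drops the three Read-only ones. The authoritative check remains running the binary itself with --benchmark_list_tests and the same filter.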

Compare with baseline

VTK-m ships with a helper script named compare-benchmarks.py, based on Google Benchmark's compare.py, which lets you compare benchmarks across different devices, filters, and binaries. After building VTK-m, it should appear in the bin directory within your build directory.

When running compare-benchmarks.py:

  • Specify the baseline benchmark binary path and its arguments in --benchmark1=
  • Specify the contender benchmark binary path and its arguments in --benchmark2=
  • Pass any extra options for compare.py after --
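
The main point of the options above is that each --benchmarkN= value is a single argument carrying both the binary path and that binary's own flags. The sketch below assembles such an argument list in Python. build_compare_args is a hypothetical helper, and joining with spaces assumes that none of the inner arguments contain whitespace; how compare-benchmarks.py splits the value internally is not covered here.

```python
import shlex


def build_compare_args(script, benchmark1, benchmark2, mode):
    """Assemble an argument list for compare-benchmarks.py (hypothetical helper)."""
    return [
        script,
        # One argument each: the binary path followed by its own flags.
        "--benchmark1=" + " ".join(benchmark1),
        "--benchmark2=" + " ".join(benchmark2),
        "--",  # everything after this is meant for compare.py
        mode,
    ]


args = build_compare_args(
    "./compare-benchmarks.py",
    ["./BenchmarkArrayTransfer", "-d", "Serial", "--benchmark_filter=1024"],
    ["./BenchmarkArrayTransfer", "-d", "Cuda", "--benchmark_filter=1024"],
    "benchmarks",
)

# Print a shell-quoted form that can be pasted into a terminal.
print(shlex.join(args))
```

Passing args as a list to subprocess.run avoids a second layer of shell quoting, which is where hand-written command lines of this kind usually go wrong.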

Compare between filters

When comparing filters, we can only use one benchmark binary with a single device, as shown in the following example:

$ ./compare-benchmarks.py --benchmark1='./BenchmarkArrayTransfer -d Any
--benchmark_filter=1024' --filter1='Read' --filter2=Write -- filters

# It will output something like this:

Benchmark                                                                          Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------------
BenchContToExec[Read vs. Write]<F32>/Bytes:1024/manual_time                     +0.2694         +0.2655         18521         23511         18766         23749
BenchExecToCont[Read vs. Write]<F32>/Bytes:1024/manual_time                     +0.0212         +0.0209         25910         26460         26152         26698
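
The Time and CPU columns can be read as relative changes from the old (baseline) value to the new (contender) value. The sketch below recomputes them from the rounded Old and New columns of the two rows above, assuming the formula (new - old) / old. relative_change is a hypothetical helper, not something provided by compare.py.

```python
def relative_change(old, new):
    """Relative difference of new with respect to old (assumed formula)."""
    return (new - old) / old


# Values copied from the two sample rows above: (old, new) pairs.
rows = {
    "BenchContToExec[Read vs. Write]": {"time": (18521, 23511), "cpu": (18766, 23749)},
    "BenchExecToCont[Read vs. Write]": {"time": (25910, 26460), "cpu": (26152, 26698)},
}

for name, cols in rows.items():
    time_delta = relative_change(*cols["time"])
    cpu_delta = relative_change(*cols["cpu"])
    print(f"{name}: time {time_delta:+.4f}, cpu {cpu_delta:+.4f}")
# BenchContToExec[Read vs. Write]: time +0.2694, cpu +0.2655
# BenchExecToCont[Read vs. Write]: time +0.0212, cpu +0.0209
```

A positive value means the contender (here the Write filter) took longer than the baseline: +0.2694 corresponds to roughly 27% more time.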

Compare between devices

When comparing two benchmarks that run on two different devices, pass the option benchmarks after -- and call ./compare-benchmarks.py as follows:

$ ./compare-benchmarks.py --benchmark1='./BenchmarkArrayTransfer -d Serial
--benchmark_filter=1024' --benchmark2='./BenchmarkArrayTransfer -d Cuda
--benchmark_filter=1024' -- benchmarks


# It will output something like this:

Benchmark                                                              Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------
BenchContToExecRead<F32>/Bytes:1024/manual_time                     +0.0127         +0.0120         18388         18622         18632         18856
BenchContToExecWrite<F32>/Bytes:1024/manual_time                    +0.0010         +0.0006         23471         23496         23712         23726
BenchContToExecReadWrite<F32>/Bytes:1024/manual_time                -0.0034         -0.0041         26363         26274         26611         26502
BenchRoundTripRead<F32>/Bytes:1024/manual_time                      +0.0055         +0.0056         20635         20748         21172         21291
BenchRoundTripReadWrite<F32>/Bytes:1024/manual_time                 +0.0084         +0.0082         29288         29535         29662         29905
BenchExecToContRead<F32>/Bytes:1024/manual_time                     +0.0025         +0.0021         25883         25947         26122         26178
BenchExecToContWrite<F32>/Bytes:1024/manual_time                    -0.0027         -0.0038         26375         26305         26622         26522
BenchExecToContReadWrite<F32>/Bytes:1024/manual_time                +0.0041         +0.0039         25639         25745         25871         25972

Installing compare-benchmarks.py

compare-benchmarks.py relies on compare.py from Google Benchmark, which in turn relies on SciPy. You can find instructions regarding its installation here.