vtk-m/docs/changelog/google_benchmark.md

134 lines
4.9 KiB
Markdown
Raw Normal View History

# Updated Benchmark Framework
The benchmarking framework has been updated to use Google Benchmark.
A benchmark is now a single function, which is passed to a macro:
```
void MyBenchmark(::benchmark::State& state)
{
MyClass someClass;
// Optional: Add a descriptive label with additional benchmark details:
state.SetLabel("Blah blah blah.");
// Must use a vtkm timer to properly capture eg. CUDA execution times.
vtkm::cont::Timer timer;
for (auto _ : state)
{
someClass.Reset();
timer.Start();
someClass.DoWork();
timer.Stop();
state.SetIterationTime(timer.GetElapsedTime());
}
// Optional: Report items and/or bytes processed per iteration in output:
state.SetItemsProcessed(state.iterations() * someClass.GetNumberOfItems());
state.SetBytesProcessed(state.iterations() * someClass.GetNumberOfBytes());
}
}
VTKM_BENCHMARK(MyBenchmark);
```
Google benchmark also makes it easy to implement parameter sweep benchmarks:
```
void MyParameterSweep(::benchmark::State& state)
{
// The current value in the sweep:
const vtkm::Id currentValue = state.range(0);
MyClass someClass;
someClass.SetSomeParameter(currentValue);
vtkm::cont::Timer timer;
for (auto _ : state)
{
someClass.Reset();
timer.Start();
someClass.DoWork();
timer.Stop();
state.SetIterationTime(timer.GetElapsedTime());
}
}
VTKM_BENCHMARK_OPTS(MyBenchmark, ->ArgName("Param")->Range(32, 1024 * 1024));
```
will generate and launch several benchmarks, exploring the parameter space of
`SetSomeParameter` between the values of 32 and (1024*1024). The chain of
functions calls in the second argument is applied to an instance of
::benchmark::internal::Benchmark. See Google Benchmark's documentation for
more details.
For more complex benchmark configurations, the VTKM_BENCHMARK_APPLY macro
accepts a function with the signature
`void Func(::benchmark::internal::Benchmark*)` that may be used to generate
more complex configurations.
To instantiate a templated benchmark across a list of types, the
VTKM_BENCHMARK_TEMPLATE* macros take a vtkm::List of types as an additional
parameter. The templated benchmark function will be instantiated and called
for each type in the list:
```
template <typename T>
void MyBenchmark(::benchmark::State& state)
{
MyClass<T> someClass;
// Must use a vtkm timer to properly capture eg. CUDA execution times.
vtkm::cont::Timer timer;
for (auto _ : state)
{
someClass.Reset();
timer.Start();
someClass.DoWork();
timer.Stop();
state.SetIterationTime(timer.GetElapsedTime());
}
}
}
VTKM_BENCHMARK_TEMPLATE(MyBenchmark, vtkm::List<vtkm::Float32, vtkm::Vec3f_32>);
```
The benchmarks are executed by calling the `VTKM_EXECUTE_BENCHMARKS(argc, argv)`
macro from `main`. There is also a `VTKM_EXECUTE_BENCHMARKS_PREAMBLE(argc, argv, some_string)`
macro that appends the contents of `some_string` to the Google Benchmark preamble.
If a benchmark is not compatible with some configuration, it may call
`state.SkipWithError("Error message");` on the `::benchmark::State` object and return. This is
useful, for instance in the filter tests when the input is not compatible with the filter.
When launching a benchmark executable, the following options are supported by Google Benchmark:
- `--benchmark_list_tests`: List all available tests.
- `--benchmark_filter="[regex]"`: Only run benchmark with names that match `[regex]`.
- `--benchmark_filter="-[regex]"`: Only run benchmark with names that DON'T match `[regex]`.
- `--benchmark_min_time=[float]`: Make sure each benchmark repetition gathers `[float]` seconds
of data.
- `--benchmark_repetitions=[int]`: Run each benchmark `[int]` times and report aggregate statistics
(mean, stdev, etc). A "repetition" refers to a single execution of the benchmark function, not
an "iteration", which is a loop of the `for(auto _:state){...}` section.
- `--benchmark_report_aggregates_only="true|false"`: If true, only the aggregate statistics are
reported (affects both console and file output). Requires `--benchmark_repetitions` to be useful.
- `--benchmark_display_aggregates_only="true|false"`: If true, only the aggregate statistics are
printed to the terminal. Any file output will still contain all repetition info.
- `--benchmark_format="console|json|csv"`: Specify terminal output format: human readable
(`console`) or `csv`/`json` formats.
- `--benchmark_out_format="console|json|csv"`: Specify file output format: human readable
(`console`) or `csv`/`json` formats.
- `--benchmark_out=[filename]`: Specify output file.
- `--benchmark_color="true|false"`: Toggle color output in terminal when using `console` output.
- `--benchmark_counters_tabular="true|false"`: Print counter information (e.g. bytes/sec, items/sec)
in the table, rather than appending them as a label.
For more information and examples of practical usage, take a look at the existing benchmarks in
vtk-m/benchmarking/.