vtk-m/vtkm/benchmarking/BenchmarkDeviceAdapter.cxx

80 lines
2.5 KiB
C++
Raw Normal View History

2015-07-06 21:44:29 +00:00
//============================================================================
// Copyright (c) Kitware, Inc.
// All rights reserved.
// See LICENSE.txt for details.
// This software is distributed WITHOUT ANY WARRANTY; without even
// the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
// PURPOSE. See the above copyright notice for more information.
//
// Copyright 2014 Sandia Corporation.
// Copyright 2014 UT-Battelle, LLC.
// Copyright 2014 Los Alamos National Security.
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// the U.S. Government retains certain rights in this software.
//
// Under the terms of Contract DE-AC52-06NA25396 with Los Alamos National
// Laboratory (LANL), the U.S. Government retains certain rights in
// this software.
//============================================================================
#include <vtkm/cont/DeviceAdapter.h>
#include <vtkm/benchmarking/BenchmarkDeviceAdapter.h>
Measurement and general improvements to the benchmark suite - A warm up run is done and not timed to allow for any allocation of room for output data without accounting for it in the run times. Previously this time spent allocating memory would be included in the time we measured for the benchmark. - Benchmarks are run multiple times and we then compute some statistics about the run time of the benchmark to give a better picture of the expected run time of the function. To this end we run the benchmark either 500 times or for 1.5s, whichever comes sooner (though these are easily changeable). We then perform outlier limiting by Winsorising the data (similar to how Rust's benchmarking library works) and print out the median, mean, min and max run times along with the median absolute deviation and standard deviation. - Because benchmarks are run many times they can now perform some initial setup in the constructor, eg. to fill some test input data array with values to let the main benchmark loop run faster. - To allow for benchmarks to have members of the data type being benchmarked the struct must now be templated on this type, leading to a bit of awkwardness. I've worked around this by adding the `VTKM_MAKE_BENCHMARK` and `VTKM_RUN_BENCHMARK` macros, the make benchmark macro generates a struct that has an `operator()` templated on the value type which will construct and return the benchmark functor templated on that type. The run macro will then use this generated struct to run the benchmark functor on the type list passed. You can also pass arguments to the benchmark functor's constructor through the make macro however this makes things more awkward because the name of the MakeBench struct must be different for each variation of constructor arguments (for example see `BenchLowerBounds`). - Added a short comment on how to add benchmarks in `vtkm/benchmarking/Benchmarker.h` as the new system is a bit different from how the tests work. - You can now pass an extra argument when running the benchmark suite to only benchmark specific functions, eg. `Benchmarks_TBB BenchmarkDeviceAdapter ScanInclusive Sort` will only benchmark ScanInclusive and Sort. Running without any extra arguments will run all the benchmarks as before.
2015-07-24 21:22:10 +00:00
#include <iostream>
#include <algorithm>
#include <string>
#include <cctype>
int BenchmarkDeviceAdapter(int argc, char *argv[])
2015-07-06 21:44:29 +00:00
{
Measurement and general improvements to the benchmark suite - A warm up run is done and not timed to allow for any allocation of room for output data without accounting for it in the run times. Previously this time spent allocating memory would be included in the time we measured for the benchmark. - Benchmarks are run multiple times and we then compute some statistics about the run time of the benchmark to give a better picture of the expected run time of the function. To this end we run the benchmark either 500 times or for 1.5s, whichever comes sooner (though these are easily changeable). We then perform outlier limiting by Winsorising the data (similar to how Rust's benchmarking library works) and print out the median, mean, min and max run times along with the median absolute deviation and standard deviation. - Because benchmarks are run many times they can now perform some initial setup in the constructor, eg. to fill some test input data array with values to let the main benchmark loop run faster. - To allow for benchmarks to have members of the data type being benchmarked the struct must now be templated on this type, leading to a bit of awkwardness. I've worked around this by adding the `VTKM_MAKE_BENCHMARK` and `VTKM_RUN_BENCHMARK` macros, the make benchmark macro generates a struct that has an `operator()` templated on the value type which will construct and return the benchmark functor templated on that type. The run macro will then use this generated struct to run the benchmark functor on the type list passed. You can also pass arguments to the benchmark functor's constructor through the make macro however this makes things more awkward because the name of the MakeBench struct must be different for each variation of constructor arguments (for example see `BenchLowerBounds`). - Added a short comment on how to add benchmarks in `vtkm/benchmarking/Benchmarker.h` as the new system is a bit different from how the tests work. - You can now pass an extra argument when running the benchmark suite to only benchmark specific functions, eg. `Benchmarks_TBB BenchmarkDeviceAdapter ScanInclusive Sort` will only benchmark ScanInclusive and Sort. Running without any extra arguments will run all the benchmarks as before.
2015-07-24 21:22:10 +00:00
int benchmarks = 0;
if (argc < 2){
benchmarks = vtkm::benchmarking::ALL;
}
else {
for (int i = 1; i < argc; ++i){
std::string arg = argv[i];
std::transform(arg.begin(), arg.end(), arg.begin(), ::tolower);
if (arg == "lowerbounds"){
benchmarks |= vtkm::benchmarking::LOWER_BOUNDS;
}
else if (arg == "reduce"){
benchmarks |= vtkm::benchmarking::REDUCE;
}
else if (arg == "reducebykey"){
benchmarks |= vtkm::benchmarking::REDUCE_BY_KEY;
}
else if (arg == "scaninclusive"){
benchmarks |= vtkm::benchmarking::SCAN_INCLUSIVE;
}
else if (arg == "scanexclusive"){
benchmarks |= vtkm::benchmarking::SCAN_EXCLUSIVE;
}
else if (arg == "sort"){
benchmarks |= vtkm::benchmarking::SORT;
}
else if (arg == "sortbykey"){
benchmarks |= vtkm::benchmarking::SORT_BY_KEY;
}
else if (arg == "streamcompact"){
benchmarks |= vtkm::benchmarking::STREAM_COMPACT;
}
else if (arg == "unique"){
benchmarks |= vtkm::benchmarking::UNIQUE;
}
else if (arg == "upperbounds"){
benchmarks |= vtkm::benchmarking::UPPER_BOUNDS;
}
else {
std::cout << "Unrecognized benchmark: " << argv[i] << std::endl;
return 1;
}
}
}
return vtkm::benchmarking::BenchmarkDeviceAdapter
<VTKM_DEFAULT_DEVICE_ADAPTER_TAG>::Run(benchmarks);
2015-07-06 21:44:29 +00:00
}