// vtk-m/vtkm/benchmarking/BenchmarkDeviceAdapter.cxx
//============================================================================
// Copyright (c) Kitware, Inc.
// All rights reserved.
// See LICENSE.txt for details.
// This software is distributed WITHOUT ANY WARRANTY; without even
// the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
// PURPOSE. See the above copyright notice for more information.
//
// Copyright 2014 National Technology & Engineering Solutions of Sandia, LLC (NTESS).
// Copyright 2014 UT-Battelle, LLC.
// Copyright 2014 Los Alamos National Security.
//
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Under the terms of Contract DE-AC52-06NA25396 with Los Alamos National
// Laboratory (LANL), the U.S. Government retains certain rights in
// this software.
//============================================================================
#include <vtkm/TypeTraits.h>
#include <vtkm/benchmarking/Benchmarker.h>
#include <vtkm/cont/ArrayHandle.h>
#include <vtkm/cont/ArrayHandleConstant.h>
#include <vtkm/cont/ArrayHandleCounting.h>
#include <vtkm/cont/ArrayHandleIndex.h>
#include <vtkm/cont/ArrayHandlePermutation.h>
#include <vtkm/cont/ArrayHandleZip.h>
#include <vtkm/cont/ArrayPortalToIterators.h>
#include <vtkm/cont/DeviceAdapterAlgorithm.h>
#include <vtkm/cont/ErrorExecution.h>
#include <vtkm/cont/StorageBasic.h>
#include <vtkm/cont/Timer.h>
#include <vtkm/cont/internal/DeviceAdapterError.h>
#include <vtkm/cont/testing/Testing.h>
#include <vtkm/worklet/StableSortIndices.h>
#include <algorithm>
#include <cmath>
#include <random>
#include <sstream>
#include <string>
#include <utility>
#include <vtkm/internal/Windows.h>
#if VTKM_DEVICE_ADAPTER == VTKM_DEVICE_ADAPTER_TBB
#include <tbb/task_scheduler_init.h>
#endif // TBB
// This benchmark has a number of command-line options to customize its
// behavior. See the BenchDevAlgoConfig documentation for details.
// For the TBB implementation, the number of threads can be customized using a
// "NumThreads [numThreads]" argument.
namespace vtkm
{
namespace benchmarking
{
enum BenchmarkName
{
COPY = 1,
COPY_IF = 1 << 1,
LOWER_BOUNDS = 1 << 2,
REDUCE = 1 << 3,
REDUCE_BY_KEY = 1 << 4,
SCAN_INCLUSIVE = 1 << 5,
SCAN_EXCLUSIVE = 1 << 6,
SORT = 1 << 7,
SORT_BY_KEY = 1 << 8,
STABLE_SORT_INDICES = 1 << 9,
STABLE_SORT_INDICES_UNIQUE = 1 << 10,
UNIQUE = 1 << 11,
UPPER_BOUNDS = 1 << 12,
ALL = COPY | COPY_IF | LOWER_BOUNDS | REDUCE | REDUCE_BY_KEY | SCAN_INCLUSIVE | SCAN_EXCLUSIVE |
SORT |
SORT_BY_KEY |
STABLE_SORT_INDICES |
STABLE_SORT_INDICES_UNIQUE |
UNIQUE |
UPPER_BOUNDS
};
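// The enumerators above form a bitmask, so any subset of benchmarks can be
// requested at once, e.g.:
//   Config.BenchmarkFlags = COPY | SORT; // run only the Copy and Sort suites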
/// Configuration options. Can be modified via command-line args as
/// described below:
struct BenchDevAlgoConfig
{
/// Benchmarks to run. Possible values:
/// Copy, CopyIf, LowerBounds, Reduce, ReduceByKey, ScanInclusive,
/// ScanExclusive, Sort, SortByKey, StableSortIndices, StableSortIndicesUnique,
/// Unique, UpperBounds, or All. (Default: All).
// Zero means 'not set'; main() changes this to ALL if no benchmarks are requested.
int BenchmarkFlags{ 0 };
/// ValueTypes to test.
/// CLI arg: "TypeList [Base|Extended]" (Base is default).
bool ExtendedTypeList{ false };
/// Run benchmarks using the same number of bytes for all arrays.
/// CLI arg: "FixBytes [n|off]" (n is the number of bytes, default: 2097152, ie. 2MiB)
/// @note FixBytes and FixSizes are not mutually exclusive. If both are
/// specified, both will run.
bool TestArraySizeBytes{ true };
vtkm::UInt64 ArraySizeBytes{ 1 << 21 };
/// Run benchmarks using the same number of values for all arrays.
/// CLI arg: "FixSizes [n|off]" (n is the number of values, default: off)
/// @note FixBytes and FixSizes are not mutually exclusive. If both are
/// specified, both will run.
bool TestArraySizeValues{ false };
vtkm::Id ArraySizeValues{ 1 << 21 };
/// If true, operations like "Unique" will test with a wider range of unique
/// values (5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 75%, 100%
/// unique). If false (default), the range is limited to 5%, 25%, 50%, 75%,
/// 100%.
/// CLI arg: "DetailedOutputRange" enables the extended range.
bool DetailedOutputRangeScaling{ false };
// Internal: The benchmarking code will set this depending on execution phase:
bool DoByteSizes{ false };
// Compute the number of values for an array with the given type:
template <typename T>
VTKM_CONT vtkm::Id ComputeSize()
{
return this->DoByteSizes
? static_cast<vtkm::Id>(this->ArraySizeBytes / static_cast<vtkm::UInt64>(sizeof(T)))
: this->ArraySizeValues;
}
};
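// For example, in fixed-byte mode with the default ArraySizeBytes of 2 MiB,
// ComputeSize<vtkm::Float64>() returns 2097152 / 8 = 262144 values, while
// ComputeSize<vtkm::UInt8>() returns the full 2097152 values.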
// Share a global instance of the config (only way to get it into the benchmark
// functors):
static BenchDevAlgoConfig Config = BenchDevAlgoConfig();
struct BaseTypes : vtkm::ListTagBase<vtkm::UInt8,
vtkm::Int32,
vtkm::Int64,
vtkm::Pair<vtkm::Id, vtkm::Float32>,
vtkm::Float32,
vtkm::Vec<vtkm::Float32, 3>,
vtkm::Float64,
vtkm::Vec<vtkm::Float64, 3>>
{
};
struct ExtendedTypes : vtkm::ListTagBase<vtkm::UInt8,
vtkm::Vec<vtkm::UInt8, 4>,
vtkm::Int32,
vtkm::Int64,
vtkm::Pair<vtkm::Int32, vtkm::Float32>,
vtkm::Pair<vtkm::Int32, vtkm::Float64>,
vtkm::Pair<vtkm::Int64, vtkm::Float32>,
vtkm::Pair<vtkm::Int64, vtkm::Float64>,
vtkm::Float32,
vtkm::Vec<vtkm::Float32, 3>,
vtkm::Float64,
vtkm::Vec<vtkm::Float64, 3>>
{
};
const static std::string DIVIDER(40, '-');
/// This class runs a series of micro-benchmarks to measure the performance
/// of the parallel primitives provided by each device adapter.
template <class DeviceAdapterTag>
class BenchmarkDeviceAdapter
{
typedef vtkm::cont::StorageTagBasic StorageTag;
typedef vtkm::cont::ArrayHandle<vtkm::Id, StorageTag> IdArrayHandle;
typedef vtkm::cont::DeviceAdapterAlgorithm<DeviceAdapterTag> Algorithm;
typedef vtkm::cont::Timer<DeviceAdapterTag> Timer;
public:
// Various kernels used by the different benchmarks to accelerate
// initialization of data
template <typename Value>
struct FillTestValueKernel : vtkm::exec::FunctorBase
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
typedef typename ValueArrayHandle::template ExecutionTypes<DeviceAdapterTag>::Portal PortalType;
PortalType Output;
VTKM_CONT
FillTestValueKernel(PortalType out)
: Output(out)
{
}
VTKM_EXEC void operator()(vtkm::Id i) const { Output.Set(i, TestValue(i, Value())); }
};
template <typename Value>
struct FillScaledTestValueKernel : vtkm::exec::FunctorBase
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
typedef typename ValueArrayHandle::template ExecutionTypes<DeviceAdapterTag>::Portal PortalType;
PortalType Output;
const vtkm::Id IdScale;
VTKM_CONT
FillScaledTestValueKernel(vtkm::Id id_scale, PortalType out)
: Output(out)
, IdScale(id_scale)
{
}
VTKM_EXEC void operator()(vtkm::Id i) const { Output.Set(i, TestValue(i * IdScale, Value())); }
};
template <typename Value>
struct FillModuloTestValueKernel : vtkm::exec::FunctorBase
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
typedef typename ValueArrayHandle::template ExecutionTypes<DeviceAdapterTag>::Portal PortalType;
PortalType Output;
const vtkm::Id Modulus;
VTKM_CONT
FillModuloTestValueKernel(vtkm::Id modulus, PortalType out)
: Output(out)
, Modulus(modulus)
{
}
VTKM_EXEC void operator()(vtkm::Id i) const { Output.Set(i, TestValue(i % Modulus, Value())); }
};
template <typename Value>
struct FillBinaryTestValueKernel : vtkm::exec::FunctorBase
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
typedef typename ValueArrayHandle::template ExecutionTypes<DeviceAdapterTag>::Portal PortalType;
PortalType Output;
const vtkm::Id Modulus;
VTKM_CONT
FillBinaryTestValueKernel(vtkm::Id modulus, PortalType out)
: Output(out)
, Modulus(modulus)
{
}
VTKM_EXEC void operator()(vtkm::Id i) const
{
Output.Set(i, i % Modulus == 0 ? TestValue(vtkm::Id(1), Value()) : Value());
}
};
private:
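// Each benchmark below is a functor templated on the value type under test.
// Expensive setup (filling input arrays, allocating output space) happens in
// the constructor so that it is excluded from the timings; operator() times a
// single run of the algorithm and returns the elapsed time; Description()
// labels the output. The harness in Benchmarker.h runs each functor
// repeatedly (after an untimed warm-up) and reports summary statistics.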
template <typename Value>
struct BenchCopy
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
ValueArrayHandle ValueHandle_src;
ValueArrayHandle ValueHandle_dst;
std::mt19937 Rng;
VTKM_CONT
BenchCopy()
{
ValueHandle_src.PrepareForOutput(Config.ComputeSize<Value>(), DeviceAdapterTag());
ValueHandle_dst.PrepareForOutput(Config.ComputeSize<Value>(), DeviceAdapterTag());
}
VTKM_CONT
vtkm::Float64 operator()()
{
for (vtkm::Id i = 0; i < ValueHandle_src.GetNumberOfValues(); ++i)
{
ValueHandle_src.GetPortalControl().Set(vtkm::Id(i), TestValue(vtkm::Id(Rng()), Value()));
}
Timer timer;
Algorithm::Copy(ValueHandle_src, ValueHandle_dst);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "Copy " << arraySize << " values (" << HumanSize(arraySize * sizeof(Value))
<< ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(Copy, BenchCopy);
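// VTKM_MAKE_BENCHMARK(Copy, BenchCopy) generates a maker struct whose call
// operator is templated on the value type. Roughly (the name and exact
// signature below are illustrative; the real expansion lives in Benchmarker.h):
//
//   struct MakeBenchCopy
//   {
//     template <typename Value>
//     VTKM_CONT BenchCopy<Value> operator()(const Value&) const
//     {
//       return BenchCopy<Value>();
//     }
//   };
//
// VTKM_RUN_BENCHMARK(Copy, ValueTypes()) then uses the maker to instantiate
// and time the benchmark functor for every type in the given list.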
template <typename Value>
struct BenchCopyIf
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
const vtkm::Id PERCENT_VALID;
const vtkm::Id N_VALID;
ValueArrayHandle ValueHandle, OutHandle;
IdArrayHandle StencilHandle;
VTKM_CONT
BenchCopyIf(vtkm::Id percent_valid)
: PERCENT_VALID(percent_valid)
, N_VALID((Config.ComputeSize<Value>() * percent_valid) / 100)
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
vtkm::Id modulo = arraySize / N_VALID;
Algorithm::Schedule(
FillTestValueKernel<Value>(ValueHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Algorithm::Schedule(FillBinaryTestValueKernel<vtkm::Id>(
modulo, StencilHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Algorithm::CopyIf(ValueHandle, StencilHandle, OutHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "CopyIf on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ") with " << PERCENT_VALID
<< "% valid values";
return description.str();
}
};
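// Constructor arguments are baked into the maker struct generated by
// VTKM_MAKE_BENCHMARK, so each percent-valid configuration needs its own
// uniquely named invocation: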
VTKM_MAKE_BENCHMARK(CopyIf5, BenchCopyIf, 5);
VTKM_MAKE_BENCHMARK(CopyIf10, BenchCopyIf, 10);
VTKM_MAKE_BENCHMARK(CopyIf15, BenchCopyIf, 15);
VTKM_MAKE_BENCHMARK(CopyIf20, BenchCopyIf, 20);
VTKM_MAKE_BENCHMARK(CopyIf25, BenchCopyIf, 25);
VTKM_MAKE_BENCHMARK(CopyIf30, BenchCopyIf, 30);
VTKM_MAKE_BENCHMARK(CopyIf35, BenchCopyIf, 35);
VTKM_MAKE_BENCHMARK(CopyIf40, BenchCopyIf, 40);
VTKM_MAKE_BENCHMARK(CopyIf45, BenchCopyIf, 45);
VTKM_MAKE_BENCHMARK(CopyIf50, BenchCopyIf, 50);
VTKM_MAKE_BENCHMARK(CopyIf75, BenchCopyIf, 75);
VTKM_MAKE_BENCHMARK(CopyIf100, BenchCopyIf, 100);
template <typename Value>
struct BenchLowerBounds
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
const vtkm::Id N_VALS;
const vtkm::Id PERCENT_VALUES;
ValueArrayHandle InputHandle, ValueHandle;
IdArrayHandle OutHandle;
VTKM_CONT
BenchLowerBounds(vtkm::Id value_percent)
: N_VALS((Config.ComputeSize<Value>() * value_percent) / 100)
, PERCENT_VALUES(value_percent)
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillTestValueKernel<Value>(InputHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Algorithm::Schedule(FillScaledTestValueKernel<Value>(
2, ValueHandle.PrepareForOutput(N_VALS, DeviceAdapterTag())),
N_VALS);
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Algorithm::LowerBounds(InputHandle, ValueHandle, OutHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "LowerBounds on " << arraySize << " input values ("
<< "(" << HumanSize(arraySize * sizeof(Value)) << ") (" << PERCENT_VALUES
<< "% configuration)";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(LowerBounds5, BenchLowerBounds, 5);
VTKM_MAKE_BENCHMARK(LowerBounds10, BenchLowerBounds, 10);
VTKM_MAKE_BENCHMARK(LowerBounds15, BenchLowerBounds, 15);
VTKM_MAKE_BENCHMARK(LowerBounds20, BenchLowerBounds, 20);
VTKM_MAKE_BENCHMARK(LowerBounds25, BenchLowerBounds, 25);
VTKM_MAKE_BENCHMARK(LowerBounds30, BenchLowerBounds, 30);
VTKM_MAKE_BENCHMARK(LowerBounds35, BenchLowerBounds, 35);
VTKM_MAKE_BENCHMARK(LowerBounds40, BenchLowerBounds, 40);
VTKM_MAKE_BENCHMARK(LowerBounds45, BenchLowerBounds, 45);
VTKM_MAKE_BENCHMARK(LowerBounds50, BenchLowerBounds, 50);
VTKM_MAKE_BENCHMARK(LowerBounds75, BenchLowerBounds, 75);
VTKM_MAKE_BENCHMARK(LowerBounds100, BenchLowerBounds, 100);
template <typename Value>
struct BenchReduce
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
ValueArrayHandle InputHandle;
// We don't actually use this, but we need it to prevent sufficiently
// smart compilers from optimizing the Reduce call out.
Value Result;
VTKM_CONT
BenchReduce()
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillTestValueKernel<Value>(InputHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
this->Result =
Algorithm::Reduce(this->InputHandle, vtkm::TypeTraits<Value>::ZeroInitialization());
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Value tmp = Algorithm::Reduce(InputHandle, vtkm::TypeTraits<Value>::ZeroInitialization());
vtkm::Float64 time = timer.GetElapsedTime();
if (tmp != this->Result)
{
this->Result = tmp;
}
return time;
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "Reduce on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(Reduce, BenchReduce);
template <typename Value>
struct BenchReduceByKey
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
const vtkm::Id N_KEYS;
const vtkm::Id PERCENT_KEYS;
ValueArrayHandle ValueHandle, ValuesOut;
IdArrayHandle KeyHandle, KeysOut;
VTKM_CONT
BenchReduceByKey(vtkm::Id key_percent)
: N_KEYS((Config.ComputeSize<Value>() * key_percent) / 100)
, PERCENT_KEYS(key_percent)
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillTestValueKernel<Value>(ValueHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Algorithm::Schedule(FillModuloTestValueKernel<vtkm::Id>(
N_KEYS, KeyHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Algorithm::SortByKey(KeyHandle, ValueHandle);
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Algorithm::ReduceByKey(KeyHandle, ValueHandle, KeysOut, ValuesOut, vtkm::Add());
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "ReduceByKey on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ") with " << N_KEYS << " ("
<< PERCENT_KEYS << "%) distinct vtkm::Id keys";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(ReduceByKey5, BenchReduceByKey, 5);
VTKM_MAKE_BENCHMARK(ReduceByKey10, BenchReduceByKey, 10);
VTKM_MAKE_BENCHMARK(ReduceByKey15, BenchReduceByKey, 15);
VTKM_MAKE_BENCHMARK(ReduceByKey20, BenchReduceByKey, 20);
VTKM_MAKE_BENCHMARK(ReduceByKey25, BenchReduceByKey, 25);
VTKM_MAKE_BENCHMARK(ReduceByKey30, BenchReduceByKey, 30);
VTKM_MAKE_BENCHMARK(ReduceByKey35, BenchReduceByKey, 35);
VTKM_MAKE_BENCHMARK(ReduceByKey40, BenchReduceByKey, 40);
VTKM_MAKE_BENCHMARK(ReduceByKey45, BenchReduceByKey, 45);
VTKM_MAKE_BENCHMARK(ReduceByKey50, BenchReduceByKey, 50);
VTKM_MAKE_BENCHMARK(ReduceByKey75, BenchReduceByKey, 75);
VTKM_MAKE_BENCHMARK(ReduceByKey100, BenchReduceByKey, 100);
template <typename Value>
struct BenchScanInclusive
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
ValueArrayHandle ValueHandle, OutHandle;
VTKM_CONT
BenchScanInclusive()
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillTestValueKernel<Value>(ValueHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Algorithm::ScanInclusive(ValueHandle, OutHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "ScanInclusive on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(ScanInclusive, BenchScanInclusive);
template <typename Value>
struct BenchScanExclusive
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
ValueArrayHandle ValueHandle, OutHandle;
VTKM_CONT
BenchScanExclusive()
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillTestValueKernel<Value>(ValueHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Algorithm::ScanExclusive(ValueHandle, OutHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "ScanExclusive on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(ScanExclusive, BenchScanExclusive);
template <typename Value>
struct BenchSort
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
ValueArrayHandle ValueHandle;
std::mt19937 Rng;
VTKM_CONT
BenchSort() { ValueHandle.PrepareForOutput(Config.ComputeSize<Value>(), DeviceAdapterTag()); }
VTKM_CONT
vtkm::Float64 operator()()
{
for (vtkm::Id i = 0; i < ValueHandle.GetNumberOfValues(); ++i)
{
ValueHandle.GetPortalControl().Set(vtkm::Id(i), TestValue(vtkm::Id(Rng()), Value()));
}
Timer timer;
Algorithm::Sort(ValueHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "Sort on " << arraySize << " random values ("
<< HumanSize(arraySize * sizeof(Value)) << ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(Sort, BenchSort);
template <typename Value>
struct BenchSortByKey
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
std::mt19937 Rng;
vtkm::Id N_KEYS;
vtkm::Id PERCENT_KEYS;
ValueArrayHandle ValueHandle;
IdArrayHandle KeyHandle;
VTKM_CONT
BenchSortByKey(vtkm::Id percent_key)
: N_KEYS((Config.ComputeSize<Value>() * percent_key) / 100)
, PERCENT_KEYS(percent_key)
{
ValueHandle.PrepareForOutput(Config.ComputeSize<Value>(), DeviceAdapterTag());
}
VTKM_CONT
vtkm::Float64 operator()()
{
for (vtkm::Id i = 0; i < ValueHandle.GetNumberOfValues(); ++i)
{
ValueHandle.GetPortalControl().Set(vtkm::Id(i), TestValue(vtkm::Id(Rng()), Value()));
}
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(FillModuloTestValueKernel<vtkm::Id>(
N_KEYS, KeyHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Timer timer;
Algorithm::SortByKey(ValueHandle, KeyHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "SortByKey on " << arraySize << " random values ("
<< HumanSize(arraySize * sizeof(Value)) << ") with " << N_KEYS << " ("
<< PERCENT_KEYS << "%) different vtkm::Id keys";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(SortByKey5, BenchSortByKey, 5);
VTKM_MAKE_BENCHMARK(SortByKey10, BenchSortByKey, 10);
VTKM_MAKE_BENCHMARK(SortByKey15, BenchSortByKey, 15);
VTKM_MAKE_BENCHMARK(SortByKey20, BenchSortByKey, 20);
VTKM_MAKE_BENCHMARK(SortByKey25, BenchSortByKey, 25);
VTKM_MAKE_BENCHMARK(SortByKey30, BenchSortByKey, 30);
VTKM_MAKE_BENCHMARK(SortByKey35, BenchSortByKey, 35);
VTKM_MAKE_BENCHMARK(SortByKey40, BenchSortByKey, 40);
VTKM_MAKE_BENCHMARK(SortByKey45, BenchSortByKey, 45);
VTKM_MAKE_BENCHMARK(SortByKey50, BenchSortByKey, 50);
VTKM_MAKE_BENCHMARK(SortByKey75, BenchSortByKey, 75);
VTKM_MAKE_BENCHMARK(SortByKey100, BenchSortByKey, 100);
template <typename Value>
struct BenchStableSortIndices
{
using SSI = vtkm::worklet::StableSortIndices<DeviceAdapterTag>;
using ValueArrayHandle = vtkm::cont::ArrayHandle<Value, StorageTag>;
ValueArrayHandle ValueHandle;
std::mt19937 Rng;
VTKM_CONT
BenchStableSortIndices()
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
this->ValueHandle.Allocate(arraySize);
auto portal = this->ValueHandle.GetPortalControl();
for (vtkm::Id i = 0; i < portal.GetNumberOfValues(); ++i)
{
portal.Set(vtkm::Id(i), TestValue(vtkm::Id(Rng()), Value()));
}
}
VTKM_CONT
vtkm::Float64 operator()()
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
vtkm::cont::ArrayHandle<vtkm::Id> indices;
Algorithm::Copy(vtkm::cont::ArrayHandleIndex(arraySize), indices);
Timer timer;
SSI::Sort(ValueHandle, indices);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "StableSortIndices::Sort on " << arraySize << " random values ("
<< HumanSize(arraySize * sizeof(Value)) << ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(StableSortIndices, BenchStableSortIndices);
template <typename Value>
struct BenchStableSortIndicesUnique
{
using SSI = vtkm::worklet::StableSortIndices<DeviceAdapterTag>;
using IndexArrayHandle = typename SSI::IndexArrayType;
using ValueArrayHandle = vtkm::cont::ArrayHandle<Value, StorageTag>;
const vtkm::Id N_VALID;
const vtkm::Id PERCENT_VALID;
ValueArrayHandle ValueHandle;
VTKM_CONT
BenchStableSortIndicesUnique(vtkm::Id percent_valid)
: N_VALID((Config.ComputeSize<Value>() * percent_valid) / 100)
, PERCENT_VALID(percent_valid)
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillModuloTestValueKernel<Value>(
N_VALID, this->ValueHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
}
VTKM_CONT
vtkm::Float64 operator()()
{
IndexArrayHandle indices = SSI::Sort(this->ValueHandle);
Timer timer;
SSI::Unique(this->ValueHandle, indices);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "StableSortIndices::Unique on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ") with " << this->N_VALID << " ("
<< PERCENT_VALID << "%) valid values";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique5, BenchStableSortIndicesUnique, 5);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique10, BenchStableSortIndicesUnique, 10);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique15, BenchStableSortIndicesUnique, 15);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique20, BenchStableSortIndicesUnique, 20);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique25, BenchStableSortIndicesUnique, 25);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique30, BenchStableSortIndicesUnique, 30);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique35, BenchStableSortIndicesUnique, 35);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique40, BenchStableSortIndicesUnique, 40);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique45, BenchStableSortIndicesUnique, 45);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique50, BenchStableSortIndicesUnique, 50);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique75, BenchStableSortIndicesUnique, 75);
VTKM_MAKE_BENCHMARK(StableSortIndicesUnique100, BenchStableSortIndicesUnique, 100);
template <typename Value>
struct BenchUnique
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
const vtkm::Id N_VALID;
const vtkm::Id PERCENT_VALID;
ValueArrayHandle ValueHandle;
VTKM_CONT
BenchUnique(vtkm::Id percent_valid)
: N_VALID((Config.ComputeSize<Value>() * percent_valid) / 100)
, PERCENT_VALID(percent_valid)
{
}
VTKM_CONT
vtkm::Float64 operator()()
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(FillModuloTestValueKernel<Value>(
N_VALID, ValueHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Algorithm::Sort(ValueHandle);
Timer timer;
Algorithm::Unique(ValueHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "Unique on " << arraySize << " values ("
<< HumanSize(arraySize * sizeof(Value)) << ") with " << N_VALID << " ("
<< PERCENT_VALID << "%) valid values";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(Unique5, BenchUnique, 5);
VTKM_MAKE_BENCHMARK(Unique10, BenchUnique, 10);
VTKM_MAKE_BENCHMARK(Unique15, BenchUnique, 15);
VTKM_MAKE_BENCHMARK(Unique20, BenchUnique, 20);
VTKM_MAKE_BENCHMARK(Unique25, BenchUnique, 25);
VTKM_MAKE_BENCHMARK(Unique30, BenchUnique, 30);
VTKM_MAKE_BENCHMARK(Unique35, BenchUnique, 35);
VTKM_MAKE_BENCHMARK(Unique40, BenchUnique, 40);
VTKM_MAKE_BENCHMARK(Unique45, BenchUnique, 45);
VTKM_MAKE_BENCHMARK(Unique50, BenchUnique, 50);
VTKM_MAKE_BENCHMARK(Unique75, BenchUnique, 75);
VTKM_MAKE_BENCHMARK(Unique100, BenchUnique, 100);
template <typename Value>
struct BenchUpperBounds
{
typedef vtkm::cont::ArrayHandle<Value, StorageTag> ValueArrayHandle;
const vtkm::Id N_VALS;
const vtkm::Id PERCENT_VALS;
ValueArrayHandle InputHandle, ValueHandle;
IdArrayHandle OutHandle;
VTKM_CONT
BenchUpperBounds(vtkm::Id percent_vals)
: N_VALS((Config.ComputeSize<Value>() * percent_vals) / 100)
, PERCENT_VALS(percent_vals)
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
Algorithm::Schedule(
FillTestValueKernel<Value>(InputHandle.PrepareForOutput(arraySize, DeviceAdapterTag())),
arraySize);
Algorithm::Schedule(FillScaledTestValueKernel<Value>(
2, ValueHandle.PrepareForOutput(N_VALS, DeviceAdapterTag())),
N_VALS);
}
VTKM_CONT
vtkm::Float64 operator()()
{
Timer timer;
Algorithm::UpperBounds(InputHandle, ValueHandle, OutHandle);
return timer.GetElapsedTime();
}
VTKM_CONT
std::string Description() const
{
vtkm::Id arraySize = Config.ComputeSize<Value>();
std::stringstream description;
description << "UpperBounds on " << arraySize << " input and " << N_VALS << " ("
<< PERCENT_VALS
<< "%) values (input array size: " << HumanSize(arraySize * sizeof(Value)) << ")";
return description.str();
}
};
VTKM_MAKE_BENCHMARK(UpperBounds5, BenchUpperBounds, 5);
VTKM_MAKE_BENCHMARK(UpperBounds10, BenchUpperBounds, 10);
VTKM_MAKE_BENCHMARK(UpperBounds15, BenchUpperBounds, 15);
VTKM_MAKE_BENCHMARK(UpperBounds20, BenchUpperBounds, 20);
VTKM_MAKE_BENCHMARK(UpperBounds25, BenchUpperBounds, 25);
VTKM_MAKE_BENCHMARK(UpperBounds30, BenchUpperBounds, 30);
VTKM_MAKE_BENCHMARK(UpperBounds35, BenchUpperBounds, 35);
VTKM_MAKE_BENCHMARK(UpperBounds40, BenchUpperBounds, 40);
VTKM_MAKE_BENCHMARK(UpperBounds45, BenchUpperBounds, 45);
VTKM_MAKE_BENCHMARK(UpperBounds50, BenchUpperBounds, 50);
VTKM_MAKE_BENCHMARK(UpperBounds75, BenchUpperBounds, 75);
VTKM_MAKE_BENCHMARK(UpperBounds100, BenchUpperBounds, 100);
public:
static VTKM_CONT int Run()
{
std::cout << DIVIDER << "\nRunning DeviceAdapter benchmarks\n";
// Run fixed bytes / size tests:
for (int sizeType = 0; sizeType < 2; ++sizeType)
{
if (sizeType == 0 && Config.TestArraySizeBytes)
{
std::cout << DIVIDER << "\nTesting fixed array byte sizes\n";
Config.DoByteSizes = true;
if (!Config.ExtendedTypeList)
{
RunInternal<BaseTypes>();
}
else
{
RunInternal<ExtendedTypes>();
}
}
if (sizeType == 1 && Config.TestArraySizeValues)
{
std::cout << DIVIDER << "\nTesting fixed array element counts\n";
Config.DoByteSizes = false;
if (!Config.ExtendedTypeList)
{
RunInternal<BaseTypes>();
}
else
{
RunInternal<ExtendedTypes>();
}
}
}
return 0;
}
template <typename ValueTypes>
static VTKM_CONT void RunInternal()
{
if (Config.BenchmarkFlags & COPY)
{
std::cout << DIVIDER << "\nBenchmarking Copy\n";
VTKM_RUN_BENCHMARK(Copy, ValueTypes());
}
if (Config.BenchmarkFlags & COPY_IF)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking CopyIf\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(CopyIf5, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf10, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf15, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf20, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf25, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf30, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf35, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf40, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf45, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf50, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf75, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(CopyIf5, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf25, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf50, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf75, ValueTypes());
VTKM_RUN_BENCHMARK(CopyIf100, ValueTypes());
}
}
if (Config.BenchmarkFlags & LOWER_BOUNDS)
{
std::cout << DIVIDER << "\nBenchmarking LowerBounds\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(LowerBounds5, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds10, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds15, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds20, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds25, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds30, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds35, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds40, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds45, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds50, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds75, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(LowerBounds5, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds25, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds50, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds75, ValueTypes());
VTKM_RUN_BENCHMARK(LowerBounds100, ValueTypes());
}
}
if (Config.BenchmarkFlags & REDUCE)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking Reduce\n";
VTKM_RUN_BENCHMARK(Reduce, ValueTypes());
}
if (Config.BenchmarkFlags & REDUCE_BY_KEY)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking ReduceByKey\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(ReduceByKey5, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey10, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey15, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey20, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey25, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey30, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey35, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey40, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey45, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey50, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey75, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(ReduceByKey5, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey25, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey50, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey75, ValueTypes());
VTKM_RUN_BENCHMARK(ReduceByKey100, ValueTypes());
}
}
if (Config.BenchmarkFlags & SCAN_INCLUSIVE)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking ScanInclusive\n";
VTKM_RUN_BENCHMARK(ScanInclusive, ValueTypes());
}
if (Config.BenchmarkFlags & SCAN_EXCLUSIVE)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking ScanExclusive\n";
VTKM_RUN_BENCHMARK(ScanExclusive, ValueTypes());
}
if (Config.BenchmarkFlags & SORT)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking Sort\n";
VTKM_RUN_BENCHMARK(Sort, ValueTypes());
}
if (Config.BenchmarkFlags & SORT_BY_KEY)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking SortByKey\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(SortByKey5, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey10, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey15, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey20, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey25, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey30, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey35, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey40, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey45, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey50, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey75, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(SortByKey5, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey25, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey50, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey75, ValueTypes());
VTKM_RUN_BENCHMARK(SortByKey100, ValueTypes());
}
}
if (Config.BenchmarkFlags & STABLE_SORT_INDICES)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking StableSortIndices::Sort\n";
VTKM_RUN_BENCHMARK(StableSortIndices, ValueTypes());
}
if (Config.BenchmarkFlags & STABLE_SORT_INDICES_UNIQUE)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking StableSortIndices::Unique\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(StableSortIndicesUnique5, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique10, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique15, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique20, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique25, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique30, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique35, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique40, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique45, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique50, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique75, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(StableSortIndicesUnique5, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique25, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique50, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique75, ValueTypes());
VTKM_RUN_BENCHMARK(StableSortIndicesUnique100, ValueTypes());
}
}
if (Config.BenchmarkFlags & UNIQUE)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking Unique\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(Unique5, ValueTypes());
VTKM_RUN_BENCHMARK(Unique10, ValueTypes());
VTKM_RUN_BENCHMARK(Unique15, ValueTypes());
VTKM_RUN_BENCHMARK(Unique20, ValueTypes());
VTKM_RUN_BENCHMARK(Unique25, ValueTypes());
VTKM_RUN_BENCHMARK(Unique30, ValueTypes());
VTKM_RUN_BENCHMARK(Unique35, ValueTypes());
VTKM_RUN_BENCHMARK(Unique40, ValueTypes());
VTKM_RUN_BENCHMARK(Unique45, ValueTypes());
VTKM_RUN_BENCHMARK(Unique50, ValueTypes());
VTKM_RUN_BENCHMARK(Unique75, ValueTypes());
VTKM_RUN_BENCHMARK(Unique100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(Unique5, ValueTypes());
VTKM_RUN_BENCHMARK(Unique25, ValueTypes());
VTKM_RUN_BENCHMARK(Unique50, ValueTypes());
VTKM_RUN_BENCHMARK(Unique75, ValueTypes());
VTKM_RUN_BENCHMARK(Unique100, ValueTypes());
}
}
if (Config.BenchmarkFlags & UPPER_BOUNDS)
{
std::cout << "\n" << DIVIDER << "\nBenchmarking UpperBounds\n";
if (Config.DetailedOutputRangeScaling)
{
VTKM_RUN_BENCHMARK(UpperBounds5, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds10, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds15, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds20, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds25, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds30, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds35, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds40, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds45, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds50, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds75, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds100, ValueTypes());
}
else
{
VTKM_RUN_BENCHMARK(UpperBounds5, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds25, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds50, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds75, ValueTypes());
VTKM_RUN_BENCHMARK(UpperBounds100, ValueTypes());
}
}
}
};
}
} // namespace vtkm::benchmarking
int main(int argc, char* argv[])
{
#if VTKM_DEVICE_ADAPTER == VTKM_DEVICE_ADAPTER_TBB
int numThreads = tbb::task_scheduler_init::automatic;
#endif // TBB
vtkm::benchmarking::BenchDevAlgoConfig& config = vtkm::benchmarking::Config;
for (int i = 1; i < argc; ++i)
{
std::string arg = argv[i];
std::transform(arg.begin(), arg.end(), arg.begin(), ::tolower);
if (arg == "copy")
{
config.BenchmarkFlags |= vtkm::benchmarking::COPY;
}
else if (arg == "copyif")
{
config.BenchmarkFlags |= vtkm::benchmarking::COPY_IF;
}
else if (arg == "lowerbounds")
{
config.BenchmarkFlags |= vtkm::benchmarking::LOWER_BOUNDS;
}
else if (arg == "reduce")
{
config.BenchmarkFlags |= vtkm::benchmarking::REDUCE;
}
else if (arg == "reducebykey")
{
config.BenchmarkFlags |= vtkm::benchmarking::REDUCE_BY_KEY;
}
else if (arg == "scaninclusive")
{
config.BenchmarkFlags |= vtkm::benchmarking::SCAN_INCLUSIVE;
}
else if (arg == "scanexclusive")
{
config.BenchmarkFlags |= vtkm::benchmarking::SCAN_EXCLUSIVE;
}
else if (arg == "sort")
{
config.BenchmarkFlags |= vtkm::benchmarking::SORT;
}
else if (arg == "sortbykey")
{
config.BenchmarkFlags |= vtkm::benchmarking::SORT_BY_KEY;
}
else if (arg == "stablesortindices")
{
config.BenchmarkFlags |= vtkm::benchmarking::STABLE_SORT_INDICES;
}
else if (arg == "stablesortindicesunique")
{
config.BenchmarkFlags |= vtkm::benchmarking::STABLE_SORT_INDICES_UNIQUE;
}
else if (arg == "unique")
{
config.BenchmarkFlags |= vtkm::benchmarking::UNIQUE;
}
else if (arg == "upperbounds")
{
config.BenchmarkFlags |= vtkm::benchmarking::UPPER_BOUNDS;
}
else if (arg == "typelist")
{
++i;
std::string arg = argv[i];
std::transform(arg.begin(), arg.end(), arg.begin(), ::tolower);
if (arg == "base")
{
config.ExtendedTypeList = false;
}
else if (arg == "extended")
{
config.ExtendedTypeList = true;
}
else
{
std::cerr << "Unrecognized TypeList: " << argv[i] << std::endl;
return 1;
}
}
else if (arg == "fixbytes")
{
++i;
std::string arg = argv[i];
std::transform(arg.begin(), arg.end(), arg.begin(), ::tolower);
if (arg == "off")
{
config.TestArraySizeBytes = false;
}
else
{
std::istringstream parse(arg);
config.TestArraySizeBytes = true;
parse >> config.ArraySizeBytes;
}
}
else if (arg == "fixsizes")
{
++i;
std::string arg = argv[i];
std::transform(arg.begin(), arg.end(), arg.begin(), ::tolower);
if (arg == "off")
{
config.TestArraySizeValues = false;
}
else
{
std::istringstream parse(arg);
config.TestArraySizeValues = true;
parse >> config.ArraySizeValues;
}
}
else if (arg == "detailedoutputrange")
{
config.DetailedOutputRangeScaling = true;
}
else if (arg == "numthreads")
{
++i;
#if VTKM_DEVICE_ADAPTER == VTKM_DEVICE_ADAPTER_TBB
std::istringstream parse(argv[i]);
parse >> numThreads;
std::cout << "Selected " << numThreads << " TBB threads." << std::endl;
#else
std::cerr << "NumThreads valid only on TBB. Ignoring." << std::endl;
#endif // TBB
}
else
{
std::cerr << "Unrecognized benchmark: " << argv[i] << std::endl;
return 1;
}
}
#if VTKM_DEVICE_ADAPTER == VTKM_DEVICE_ADAPTER_TBB
// Must not be destroyed as long as benchmarks are running:
tbb::task_scheduler_init init(numThreads);
#endif // TBB
if (config.BenchmarkFlags == 0)
{
config.BenchmarkFlags = vtkm::benchmarking::ALL;
}
// Now actually execute the benchmarks.
return vtkm::benchmarking::BenchmarkDeviceAdapter<VTKM_DEFAULT_DEVICE_ADAPTER_TAG>::Run();
}