1ea6f732 Add our own version of is_sorted to check the assert
311b5dcc Remove C++11 feature is_sorted and increase time allotted to run bench
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !115
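For illustration, a replacement for std::is_sorted that avoids C++11 could look roughly like the sketch below. The names `IsSortedSketch` and `SmallestOfSorted` are hypothetical and are not the code from these commits.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not the project's code): returns true when
// [first, last) is in non-decreasing order, using only C++98 features.
template <typename ForwardIterator>
bool IsSortedSketch(ForwardIterator first, ForwardIterator last)
{
  if (first == last)
  {
    return true; // an empty range is trivially sorted
  }
  ForwardIterator next = first;
  for (++next; next != last; ++first, ++next)
  {
    if (*next < *first)
    {
      return false; // found an element smaller than its predecessor
    }
  }
  return true;
}

// Example use: guard a precondition with an assert.
inline int SmallestOfSorted(const std::vector<int>& values)
{
  assert(!values.empty());
  assert(IsSortedSketch(values.begin(), values.end()));
  return values.front();
}
```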
Don't use patched version of TBB on newer versions.
We have a patched version of TBB's parallel_for.h in our files that
fixes a problem with using std::swap. This issue has since been fixed in
TBB, so for newer versions we should revert back to TBB's
implementation.
See merge request !118
Robert Maynard tells me that the TBB backend has been tried on versions
of TBB back to 4.0. Since the patch appears to work across them, allow
those versions too.
GCC warnings
Attempt to fix any compiler warnings that appear on GCC dashboards.
I was also using a pretty picky compiler, so there are probably several fixes that have no impact on the current dashboard set.
See merge request !110
One of the dashboards gave yet another warning in the boost random
library. This time the warning is on unused parameters. Go ahead and add
that to the list of things to not check in boost.
The renar dashboard gave some warnings about shadowed variables in the
boost random library (version 1.57.0). This might be a bug in boost that
is already fixed (I didn't get the same warning on my gcc compile with
boost 1.58.0), but I don't see a problem with disabling the shadow
warning everywhere in boost.
C and C++ have a quirk where arithmetic on small integers (char
and short) promotes the operands to int, so the result is (typically) a
32 bit integer. Most often in our code the result is pushed back to the
same type, and picky compilers can then give a warning about an implicit
type conversion (that we invariably don't care about). Here are a lot of
changes to suppress the warnings.
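A minimal sketch of the pattern, assuming the warnings are silenced with an explicit cast (the helper names `AddShorts` and `SumIsPromotedToInt` are made up for this example and do not appear in the change):

```cpp
#include <cassert>

// Hypothetical helper illustrating the pattern; not code from the project.
// a + b is computed as int because both operands are promoted, so storing
// the sum back into a short is a narrowing conversion that a picky
// compiler (e.g. GCC with -Wconversion) may report.
inline short AddShorts(short a, short b)
{
  // short sum = a + b;              // may warn: conversion from int to short
  return static_cast<short>(a + b);  // explicit cast states the intent
}

// sizeof reveals the promoted type of the expression.
inline bool SumIsPromotedToInt()
{
  short a = 1;
  short b = 2;
  return sizeof(a + b) == sizeof(int);
}
```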
On one of my compile platforms, GCC was giving conversion warnings from
any boost include that was not wrapped in pragmas to disable conversion
warnings. To make things easier and more robust, I created a pair of
macros, VTKM_BOOST_PRE_INCLUDE and VTKM_BOOST_POST_INCLUDE, that should
be wrapped around any #include of a boost header file.
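As a rough sketch of how such a macro pair might work on GCC-compatible compilers: the definitions below are assumptions, the `SKETCH_` names are hypothetical stand-ins for the real macros, and `<vector>` stands in for a boost header so the example is self-contained.

```cpp
#include <cassert>

// Assumed definitions for illustration only; the real macros may differ.
#if defined(__GNUC__)
#define SKETCH_PRE_INCLUDE                                   \
  _Pragma("GCC diagnostic push")                             \
  _Pragma("GCC diagnostic ignored \"-Wconversion\"")         \
  _Pragma("GCC diagnostic ignored \"-Wshadow\"")             \
  _Pragma("GCC diagnostic ignored \"-Wunused-parameter\"")
#define SKETCH_POST_INCLUDE _Pragma("GCC diagnostic pop")
#else
// Other compilers: expand to nothing.
#define SKETCH_PRE_INCLUDE
#define SKETCH_POST_INCLUDE
#endif

// Usage: wrap the third-party include so its warnings stay disabled only
// for the duration of that header.
SKETCH_PRE_INCLUDE
#include <vector> // stands in for a boost header
SKETCH_POST_INCLUDE

// Code after the POST macro is checked with the original warning settings.
inline bool WrappedIncludeWorks()
{
  std::vector<int> values(3, 0);
  return values.size() == 3;
}
```

The push/pop pair restores whatever warning state was active before the include, so the suppression does not leak into the project's own code.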
DynamicCellSet
Add a `DynamicCellSet` class to use in place of raw pointers or boost `smart_ptr`s to make managing the anonymous class and casting easier.
See merge request !103
Update the cuda IteratorFromArrayPortal to use ptrdiff_t.
This makes the advance / distance_to function signatures the same whether
we are building with 32 or 64 bit ids.
See merge request !106
Make it faster to detect when a cuda 3+ gpu is running cuda 2 code.
The original implementation tried to run 2^31 kernels and detect a
launch failure to determine this use-case. The issue with this approach
is that on a cuda 3+ gpu, this would take multiple seconds and cause
the gpu to terminate the kernel when opengl was also loaded.
See merge request !104
The Invoke of the topology dispatcher is also changed to expect a
concrete cell set (which the DynamicCellSet is automatically cast to)
rather than a connectivity structure. The dispatcher calls the
GetNodeToCellConnectivity method for you. (That is currently the only
one supported.)
Previously, IteratorFromArrayPortal was declaring its difference_type
to be vtkm::Id. Although this is allowed, there is code that assumes
that iterators have a difference_type that is ptrdiff_t or something
similar. This change makes the difference_type the default for the
boost iterator facade, which should be the type expected by other code
that neglects to check.
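To illustrate why the difference_type matters, here is a small sketch that does not use boost. `IntArrayIterator` and `CountElements` are hypothetical names; the point is only that generic code often stores distances in std::ptrdiff_t without consulting std::iterator_traits.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical minimal iterator over a raw int array; illustrative only.
struct IntArrayIterator
{
  typedef std::random_access_iterator_tag iterator_category;
  typedef int value_type;
  typedef std::ptrdiff_t difference_type; // the conventional choice
  typedef const int* pointer;
  typedef const int& reference;

  const int* Position;

  explicit IntArrayIterator(const int* position) : Position(position) {}

  reference operator*() const { return *this->Position; }
  IntArrayIterator& operator++() { ++this->Position; return *this; }
  difference_type operator-(const IntArrayIterator& other) const
  {
    return this->Position - other.Position;
  }
  bool operator==(const IntArrayIterator& other) const
  {
    return this->Position == other.Position;
  }
  bool operator!=(const IntArrayIterator& other) const
  {
    return this->Position != other.Position;
  }
};

// Generic code that assumes the distance fits the ptrdiff_t convention.
template <typename Iterator>
std::ptrdiff_t CountElements(Iterator first, Iterator last)
{
  return std::distance(first, last);
}
```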
- A warm-up run is done and not timed, so that any allocation of
room for output data is not counted in the run times.
Previously this time spent allocating memory would be included in the
time we measured for the benchmark.
- Benchmarks are run multiple times and we then compute some statistics
about the run time of the benchmark to give a better picture of the
expected run time of the function. To this end we run the benchmark
either 500 times or for 1.5s, whichever comes sooner (though these are
easily changeable). We then perform outlier limiting by Winsorising the
data (similar to how Rust's benchmarking library works) and print out
the median, mean, min and max run times along with the median absolute
deviation and standard deviation.
- Because benchmarks are run many times, they can now perform some
initial setup in the constructor, e.g. to fill a test input data
array with values so the main benchmark loop runs faster.
- To allow benchmarks to have members of the data type being
benchmarked, the struct must now be templated on this type, leading to
a bit of awkwardness. I've worked around this by adding the
`VTKM_MAKE_BENCHMARK` and `VTKM_RUN_BENCHMARK` macros. The make
benchmark macro generates a struct that has an `operator()` templated on
the value type, which constructs and returns the benchmark functor
templated on that type. The run macro then uses this generated
struct to run the benchmark functor on the type list passed. You can
also pass arguments to the benchmark functor's constructor through the
make macro; however, this makes things more awkward because the name of
the MakeBench struct must be different for each variation of constructor
arguments (for example, see `BenchLowerBounds`).
- Added a short comment on how to add benchmarks in
`vtkm/benchmarking/Benchmarker.h` as the new system is a bit different
from how the tests work.
- You can now pass extra arguments when running the benchmark suite to
only benchmark specific functions, e.g. `Benchmarks_TBB
BenchmarkDeviceAdapter ScanInclusive Sort` will only benchmark
ScanInclusive and Sort. Running without any extra arguments will run all
the benchmarks as before.
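The measurement scheme described in the list above can be sketched as follows. This is not the code in `vtkm/benchmarking/Benchmarker.h`: every name here (`CollectSamples`, `Winsorise`, `MedianOfSorted`, `MedianAbsoluteDeviation`, `SumBench`) is hypothetical, the Winsorising fraction is a free parameter, and std::clock (CPU time) is used as a stand-in timer.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <ctime>
#include <numeric>
#include <vector>

// Hypothetical benchmark functor: setup happens once in the constructor.
struct SumBench
{
  std::vector<double> Input;
  double Result;
  SumBench() : Input(1000, 1.0), Result(0.0) {}
  void operator()()
  {
    this->Result = std::accumulate(this->Input.begin(), this->Input.end(), 0.0);
  }
};

// One untimed warm-up run, then timed runs until either limit is reached.
template <typename Functor>
std::vector<double> CollectSamples(Functor& bench, std::size_t maxRuns, double maxSeconds)
{
  bench(); // warm-up: allocations made here are not measured
  std::vector<double> samples;
  double total = 0.0;
  while (samples.size() < maxRuns && total < maxSeconds)
  {
    const std::clock_t start = std::clock();
    bench();
    const double elapsed = static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
    samples.push_back(elapsed);
    total += elapsed;
  }
  return samples;
}

// Sorts the samples, then clamps the lowest and highest `fraction` of them
// to the nearest retained value (outlier limiting).
inline void Winsorise(std::vector<double>& samples, double fraction)
{
  assert(!samples.empty());
  assert(fraction >= 0.0 && fraction < 0.5);
  std::sort(samples.begin(), samples.end());
  const std::size_t n = samples.size();
  const std::size_t cut = static_cast<std::size_t>(fraction * static_cast<double>(n));
  const double low = samples[cut];
  const double high = samples[n - 1 - cut];
  for (std::size_t i = 0; i < cut; ++i)
  {
    samples[i] = low;
    samples[n - 1 - i] = high;
  }
}

// Median of an already sorted, non-empty vector.
inline double MedianOfSorted(const std::vector<double>& sorted)
{
  assert(!sorted.empty());
  const std::size_t mid = sorted.size() / 2;
  return (sorted.size() % 2 == 1) ? sorted[mid] : 0.5 * (sorted[mid - 1] + sorted[mid]);
}

// Median of the absolute deviations from the median.
inline double MedianAbsoluteDeviation(const std::vector<double>& sorted)
{
  const double median = MedianOfSorted(sorted);
  std::vector<double> deviations(sorted.size());
  for (std::size_t i = 0; i < sorted.size(); ++i)
  {
    deviations[i] = std::fabs(sorted[i] - median);
  }
  std::sort(deviations.begin(), deviations.end());
  return MedianOfSorted(deviations);
}
```

With the limits mentioned above, a call would look like `CollectSamples(bench, 500, 1.5)` followed by `Winsorise` and the summary statistics.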
The test_equal method compares the ratio of the two values to decide if
they are close enough. Although there is a previous check to make sure
that neither value is too close to zero, MSVC sometimes gives a
warning because it cannot trace the flow of the check. Add another
conditional (that will never actually be executed) to check a second time
that we never divide by 0.
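A sketch of the idea, assuming a fixed tolerance and an absolute-difference fallback near zero (the name `RatioEqualSketch` and these details are assumptions, not the actual test_equal):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical ratio-based comparison; not the project's test_equal.
inline bool RatioEqualSketch(double a, double b, double tolerance = 0.0001)
{
  // First check: if either value is too close to zero, a ratio is
  // meaningless, so compare the absolute difference instead.
  if (std::fabs(a) < tolerance || std::fabs(b) < tolerance)
  {
    return std::fabs(a - b) < tolerance;
  }

  double ratio;
  if (b != 0.0)
  {
    ratio = a / b;
  }
  else
  {
    // Never reached because of the check above. The branch exists only so
    // that a compiler which cannot trace that check sees a guarded divide.
    ratio = 1.0;
  }
  return (ratio > 1.0 - tolerance) && (ratio < 1.0 + tolerance);
}
```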
The DynamicCellSet will be used in place of the pointer to a CellSet
in a DataSet. This will prevent us from having to cast it all the time
and also remove reliance on boost smart_ptr.
MSVC ArrayHandle fail
Fix the UnitTestArrayHandle failure on the Windows dashboards. Also fix some of the MSVC warnings.
See merge request !101