825f351d04
I had assumed that the compiler would be clever enough to turn the iterative implementation of Copy into a memcpy, but inspecting the disassembly of a release GCC build shows that this is not the case, likely because it can't assume that the memory ranges do not overlap.

Replacing the loop with std::copy speeds things up (about 30-50%) for most data types, though there is a slight slowdown for Vec types (under 5% in most cases, about 15% for Vec< UInt8, 4 >). The UInt8 copy improved by a factor of about 7.

Comparison:

| Speedup | iteration            | std::copy            | Benchmark (Type) |
|---------|----------------------|----------------------|------------------|
| 1.363 | 0.001590 +- 0.000087 | 0.001166 +- 0.000049 | Copy 2097152 values (vtkm::Float32) |
| 1.487 | 0.003429 +- 0.000185 | 0.002305 +- 0.000146 | Copy 2097152 values (vtkm::Float64) |
| 1.379 | 0.001568 +- 0.000072 | 0.001137 +- 0.000093 | Copy 2097152 values (vtkm::Int32) |
| 1.420 | 0.003410 +- 0.000173 | 0.002402 +- 0.000101 | Copy 2097152 values (vtkm::Int64) |
| 1.303 | 0.001564 +- 0.000083 | 0.001201 +- 0.000078 | Copy 2097152 values (vtkm::UInt32) |
| 7.204 | 0.002441 +- 0.000104 | 0.000339 +- 0.000029 | Copy 2097152 values (vtkm::UInt8) |
| 0.987 | 0.006602 +- 0.000266 | 0.006688 +- 0.000291 | Copy 2097152 values (vtkm::Vec< vtkm::Float32, 4 >) |
| 0.965 | 0.010065 +- 0.000528 | 0.010427 +- 0.000617 | Copy 2097152 values (vtkm::Vec< vtkm::Float64, 3 >) |
| 0.979 | 0.003327 +- 0.000191 | 0.003398 +- 0.000142 | Copy 2097152 values (vtkm::Vec< vtkm::Int32, 2 >) |
| 0.851 | 0.001579 +- 0.000090 | 0.001856 +- 0.000098 | Copy 2097152 values (vtkm::Vec< vtkm::UInt8, 4 >) |
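To illustrate the change described above, here is a minimal sketch of the two approaches. It is not the actual VTK-m code: `CopyWithLoop` and `CopyWithStdCopy` are hypothetical names, and raw pointers stand in for whatever portal or iterator types the real implementation uses. The loop version copies one element at a time. The second version hands the whole range to `std::copy`, which common standard library implementations can lower to a single `memmove` for trivially copyable types.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the old iterative copy: one assignment per
// element. The compiler may not collapse this into a bulk copy because it
// cannot prove that src and dst do not overlap.
template <typename T>
void CopyWithLoop(const T* src, std::size_t count, T* dst)
{
  for (std::size_t i = 0; i < count; ++i)
  {
    dst[i] = src[i];
  }
}

// Hypothetical stand-in for the new version: pass the whole range to
// std::copy and let the library pick a bulk strategy where it can.
// std::copy requires that dst does not point into [src, src + count).
template <typename T>
void CopyWithStdCopy(const T* src, std::size_t count, T* dst)
{
  std::copy(src, src + count, dst);
}

// Small helper for trying both versions on the same input.
// Returns true when both copies reproduce the source exactly.
inline bool BothCopiesMatch(const std::vector<std::uint8_t>& input)
{
  std::vector<std::uint8_t> a(input.size());
  std::vector<std::uint8_t> b(input.size());
  CopyWithLoop(input.data(), input.size(), a.data());
  CopyWithStdCopy(input.data(), input.size(), b.data());
  return a == input && b == input;
}
```

Both functions produce the same result. Any difference is in the generated code, so the disassembly and the benchmark numbers are what decide between them.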
||
---|---|---|
benchmarking | ||
cont | ||
exec | ||
filter | ||
internal | ||
interop | ||
io | ||
rendering | ||
testing | ||
worklet | ||
.gitattributes | ||
Assert.h | ||
BaseComponent.h | ||
BinaryOperators.h | ||
BinaryPredicates.h | ||
Bounds.h | ||
CellShape.h | ||
CellTraits.h | ||
CMakeLists.txt | ||
Hash.h | ||
ListTag.h | ||
Math.h | ||
Math.h.in | ||
Matrix.h | ||
NewtonsMethod.h | ||
Pair.h | ||
Range.h | ||
RangeId3.h | ||
RangeId.h | ||
StaticAssert.h | ||
TopologyElementTag.h | ||
Transform3D.h | ||
TypeListTag.h | ||
Types.h | ||
TypeTraits.h | ||
UnaryPredicates.h | ||
VecAxisAlignedPointCoordinates.h | ||
VecFromPortal.h | ||
VecFromPortalPermute.h | ||
VectorAnalysis.h | ||
VecTraits.h | ||
VecVariable.h | ||
Version.h.in |