vtk-m

mirror of https://gitlab.kitware.com/vtk/vtk-m synced 2024-09-19 18:45:43 +00:00

History

Robert Maynard c1560e2d3f Perform less unnecessary copies when deducing a worklets parameters. One of the causes of the large library size and slow compile times has been that vtkm has been creating unnecessary copies when not needed. When the objects being copied use shared_ptr this causes a bloom in library size. I presume this bloom is caused by the atomic increment/decrement that is required by shared_ptr. For testing I used the following example: ``` struct ExampleFieldWorklet : public vtkm::worklet::WorkletMapField { typedef void ControlSignature( FieldIn<>, FieldIn<>, FieldIn<>, FieldOut<>, FieldOut<>, FieldOut<> ); typedef void ExecutionSignature( _1, _2, _3, _4, _5, _6 ); template<typename T, typename U, typename V> VTKM_EXEC_EXPORT void operator()( const vtkm::Vec< T, 3 > & vec, const U & scalar1, const V& scalar2, vtkm::Vec<T, 3>& out_vec, U& out_scalar1, V& out_scalar2 ) const { out_vec = vec * scalar1; out_scalar1 = scalar1 + scalar2; out_scalar2 = scalar2; } template<typename T, typename U, typename V, typename W, typename X, typename Y> VTKM_EXEC_EXPORT void operator()( const T & vec, const U & scalar1, const V& scalar2, W& out_vec, X& out_scalar, Y& ) const { //no-op } }; int main(int argc, char** argv) { std::vector< vtkm::Vec<vtkm::Float32, 3> > inputVec; std::vector< vtkm::Int32 > inputScalar1; std::vector< vtkm::Float64 > inputScalar2; vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleV = vtkm::cont::make_ArrayHandle(inputVec); vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleS1 = vtkm::cont::make_ArrayHandle(inputVec); vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleS2 = vtkm::cont::make_ArrayHandle(inputVec); vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOV; vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOS1; vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOS2; std::cout << "Making 3 output DynamicArrayHandles " << std::endl; vtkm::cont::DynamicArrayHandle out1(handleOV), out2(handleOS1), out3(handleOS2); typedef vtkm::worklet::DispatcherMapField<ExampleFieldWorklet> DispatcherType; std::cout << "Invoking ExampleFieldWorklet" << std::endl; DispatcherType dispatcher; dispatcher.Invoke(handleV, handleS1, handleS2, out1, out2, out3); } ``` Original vtkm would generate a binary of size 4684kb and would perform 91 ArrayHandle copies or assignments. With this branch the binary size is reduced to 2392kb and will perform 36 copies or assignments.		2016-01-19 09:20:49 -05:00
..
testing	Enable output to input map in fetch mechanism.	2015-11-06 18:05:20 -07:00
ClipTables.h	CellSetExplicit uses UInt8 for shape, and IdComponent for numIndices.	2015-09-23 11:17:04 -04:00
CMakeLists.txt	Make new tetrahedralize functors work on CUDA	2015-11-07 10:09:19 -07:00
DispatcherBase.h	Perform less unnecessary copies when deducing a worklets parameters.	2016-01-19 09:20:49 -05:00
DispatcherBaseDetailInvoke.h	Fix pyexpander errors	2015-09-02 13:47:33 -07:00
DispatcherBaseDetailInvoke.h.in	The Copyright statement now has all the periods in the correct location.	2015-05-21 10:30:11 -04:00
TriangulateTables.h	Make new tetrahedralize functors work on CUDA	2015-11-07 10:09:19 -07:00
WorkletBase.h	Reduce the name length on Args<1>, etc to help reduce symbol size.	2016-01-15 08:32:50 -05:00