vtk-m/vtkm/worklet
Robert Maynard c1560e2d3f Perform less unnecessary copies when deducing a worklets parameters.
One cause of the large library size and slow compile times is that vtkm
makes unnecessary copies while deducing a worklet's parameters. When the
objects being copied hold a shared_ptr, this bloats the library. I presume
the bloat comes from the atomic increment/decrement that shared_ptr
requires each time it is copied and destroyed.
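
To make the presumed cost concrete, here is a minimal standard C++ sketch. `SharedHandle` is a hypothetical stand-in for any handle that shares storage through shared_ptr; it is not the actual vtkm::cont::ArrayHandle. Passing it by value bumps the reference count (typically with atomic operations), while passing it by const reference leaves the count alone.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a handle that shares its storage through
// std::shared_ptr. This is an illustration, not a vtkm type.
struct SharedHandle
{
  std::shared_ptr<int> Storage = std::make_shared<int>(0);
};

// Pass by value: the handle is copied, so the reference count is
// incremented on entry and decremented again when the copy is destroyed.
long UseCountByValue(SharedHandle handle)
{
  return handle.Storage.use_count();
}

// Pass by const reference: no copy is made and the count is untouched.
long UseCountByReference(const SharedHandle& handle)
{
  return handle.Storage.use_count();
}

// Small self-check tying the two together.
void CheckUseCounts()
{
  SharedHandle handle;
  assert(UseCountByValue(handle) == 2);     // caller's handle + parameter copy
  assert(UseCountByReference(handle) == 1); // caller's handle only
  assert(handle.Storage.use_count() == 1);  // the temporary copy is gone
}
```

Every by-value layer between the dispatcher and the worklet would repeat that increment/decrement pair, which is why passing by reference is the cheaper choice in this sketch.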

For testing I used the following example:
```
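#include <vtkm/cont/ArrayHandle.h>
#include <vtkm/cont/DynamicArrayHandle.h>
#include <vtkm/worklet/DispatcherMapField.h>
#include <vtkm/worklet/WorkletMapField.h>

#include <iostream>
#include <vector>

struct ExampleFieldWorklet : public vtkm::worklet::WorkletMapField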
{
  typedef void ControlSignature( FieldIn<>, FieldIn<>, FieldIn<>,
                                 FieldOut<>, FieldOut<>, FieldOut<> );
  typedef void ExecutionSignature( _1, _2, _3, _4, _5, _6 );

  template<typename T, typename U, typename V>
  VTKM_EXEC_EXPORT
  void operator()( const vtkm::Vec< T, 3 > & vec,
                   const U & scalar1,
                   const V& scalar2,
                   vtkm::Vec<T, 3>& out_vec,
                   U& out_scalar1,
                   V& out_scalar2 ) const
  {
    out_vec = vec * scalar1;
    out_scalar1 = scalar1 + scalar2;
    out_scalar2 = scalar2;
  }

  template<typename T, typename U, typename V, typename W, typename X, typename Y>
  VTKM_EXEC_EXPORT
  void operator()( const T & vec,
                   const U & scalar1,
                   const V& scalar2,
                   W& out_vec,
                   X& out_scalar,
                   Y& ) const
  {
    // No-op fallback for argument types that do not match the overload above.
  }
};

int main(int argc, char** argv)
{
  std::vector< vtkm::Vec<vtkm::Float32, 3> > inputVec;
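  // Note: inputScalar1 and inputScalar2 are never used; all three input
  // handles below are built from inputVec.
  std::vector< vtkm::Int32 > inputScalar1;
  std::vector< vtkm::Float64 > inputScalar2;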

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleV =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleS1 =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleS2 =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOV;
  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOS1;
  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOS2;

  std::cout << "Making 3 output DynamicArrayHandles " << std::endl;
  vtkm::cont::DynamicArrayHandle out1(handleOV), out2(handleOS1), out3(handleOS2);

  typedef vtkm::worklet::DispatcherMapField<ExampleFieldWorklet> DispatcherType;

  std::cout << "Invoking ExampleFieldWorklet" << std::endl;
  DispatcherType dispatcher;

  dispatcher.Invoke(handleV, handleS1, handleS2, out1, out2, out3);

}
```

Before this change, vtkm generated a 4684 KB binary and performed 91
ArrayHandle copies or assignments. With this branch, the binary shrinks to
2392 KB and performs 36 copies or assignments.
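
For reference, copy and assignment counts like these can be gathered by instrumenting the copied type. The sketch below is only an illustration under assumed names (`CountedHandle` and the forwarding functions are hypothetical, not vtkm code, and not necessarily how the numbers above were measured). It shows two by-value forwarding layers costing two copies where the const-reference version costs none.

```cpp
#include <cassert>

// Hypothetical instrumented type (not part of vtkm): counts its own copies.
struct CountedHandle
{
  static int Copies;      // number of copy constructions
  static int Assignments; // number of copy assignments

  CountedHandle() = default;
  CountedHandle(const CountedHandle&) { ++Copies; }
  CountedHandle& operator=(const CountedHandle&)
  {
    ++Assignments;
    return *this;
  }
};
int CountedHandle::Copies = 0;
int CountedHandle::Assignments = 0;

// Two forwarding layers taking their argument by value: one copy per layer.
template <typename T> void InnerByValue(T) {}
template <typename T> void OuterByValue(T arg) { InnerByValue(arg); }

// The same two layers taking a const reference: no copies.
template <typename T> void InnerByReference(const T&) {}
template <typename T> void OuterByReference(const T& arg) { InnerByReference(arg); }

int CopiesWhenPassedByValue()
{
  CountedHandle handle;
  CountedHandle::Copies = 0;
  OuterByValue(handle);
  return CountedHandle::Copies;
}

int CopiesWhenPassedByReference()
{
  CountedHandle handle;
  CountedHandle::Copies = 0;
  OuterByReference(handle);
  return CountedHandle::Copies;
}

// Small self-check covering copies and assignments.
void CheckCounts()
{
  assert(CopiesWhenPassedByValue() == 2);
  assert(CopiesWhenPassedByReference() == 0);

  CountedHandle a, b;
  CountedHandle::Assignments = 0;
  a = b;
  assert(CountedHandle::Assignments == 1);
}
```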
2016-01-19 09:20:49 -05:00
internal Perform less unnecessary copies when deducing a worklets parameters. 2016-01-19 09:20:49 -05:00
splatkernels Rename kernels directory to splatkernels to avoid confusion 2015-09-15 19:46:53 +02:00
testing Simplify and unify cast interface. 2016-01-18 15:58:04 -07:00
AverageByKey.h Add ArrayHandleIndex class. 2015-09-14 22:11:09 -06:00
CellAverage.h Add convenience tags like FieldInPoint, FieldInCell, to WorkletMapPointToCell 2015-10-23 09:50:48 -04:00
Clip.h Update Clip worklets to work with more types 2015-11-12 10:28:22 -05:00
CMakeLists.txt Merge topic 'marching-cubes' 2015-12-03 14:54:02 -05:00
DispatcherMapField.h Perform less unnecessary copies when deducing a worklets parameters. 2016-01-19 09:20:49 -05:00
DispatcherMapTopology.h Perform less unnecessary copies when deducing a worklets parameters. 2016-01-19 09:20:49 -05:00
ExternalFaces.h Turn off the benchmarking ExternalsFaces. 2016-01-14 15:56:10 -05:00
FieldHistogram.h Attempt to fix compiler errors and warnings. 2015-10-01 15:08:36 -06:00
FieldStatistics.h Attempt to fix compiler errors and warnings. 2015-10-01 15:08:36 -06:00
KernelSplatter.h Rename kernels directory to splatkernels to avoid confusion 2015-09-15 19:46:53 +02:00
Magnitude.h Restrict the Magnitude worklet signature to vectors of 3 components. 2015-08-31 22:49:53 -04:00
MarchingCubes.h Generalize MarchingCubes input with additional template parameters. 2016-01-08 14:56:10 -05:00
MarchingCubesDataTables.h Move marching cubes edge table out of the worklet. 2015-12-02 15:33:52 -05:00
PointElevation.h Have CoordinateSystem inherit from Field 2015-08-25 14:38:41 -06:00
ScatterCounting.h MarchingCubes is now able to not generate normals. 2016-01-08 10:42:49 -05:00
ScatterIdentity.h Adding ScatterCounting 2015-11-06 18:05:20 -07:00
ScatterUniform.h Change tetrahedralize filters to use new Scatter mechanism 2015-11-07 04:57:16 -07:00
StreamLineUniformGrid.h Simplify and unify cast interface. 2016-01-18 15:58:04 -07:00
TetrahedralizeExplicitGrid.h Simplify and unify cast interface. 2016-01-18 15:58:04 -07:00
TetrahedralizeUniformGrid.h Simplify and unify cast interface. 2016-01-18 15:58:04 -07:00
Threshold.h Threshold worklet is not templated on device adapter. 2016-01-14 15:55:22 -05:00
VertexClustering.h VertexClustering worklet is not templated on device adapter. 2016-01-14 15:42:27 -05:00
WorkletMapField.h Add in-place (in-out) arrays to worklets. 2015-08-12 14:41:56 -06:00
WorkletMapTopology.h Add PointCount to WorkletMapPointToCell. 2016-01-07 15:26:29 -07:00