Commit Graph

104 Commits

Author SHA1 Message Date
Kenneth Moreland
731bb64a0b Make .in files match new formatting
More corrections for the autoformatter and .in files.
2017-05-31 09:37:29 -06:00
Kenneth Moreland
071c792148 Merge topic 'indent-generated'
b03a61da Make .in files match new formatting

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !784
2017-05-28 10:28:54 -04:00
Kenneth Moreland
b03a61da5d Make .in files match new formatting
The automatic formatter formatted the result of the .in files, but not
the .in files themselves. This caused the .in file check to fail.
2017-05-27 09:46:32 -06:00
Robert Maynard
5dd346007b Respect VTK-m convention of parameters all or nothing on a line
clang-format BinPack settings have been disabled to make sure that the
VTK-m style guideline is obeyed.
2017-05-26 13:53:28 -04:00
Robert Maynard
60a405ef65 Add TaskTiling1D/3D which use faux virtuals to reduce binary size.
Redesigns the TBB and Serial backends and the vtkm::exec::Task concept so that
we can re-use the same launching logic for all Worklets, instead of generating
per worlet code. To keep the performance the same the TilingTask now is past
a range of indices to work on, rather than a single index.

Binary size reduction:
WorkletTests_SERIAL old - 19MB
WorkletTests_SERIAL new - 18MB

WorkletTests_TBB old - 39MB
WorkletTests_TBB new - 18MB

libvtkAcceleratorsVTKm old - 48MB
libvtkAcceleratorsVTKm new - 19MB
2017-05-25 11:00:01 -04:00
Kitware Robot
4ade5f5770 clang-format: apply to the entire tree 2017-05-25 07:51:37 -04:00
Kitware Robot
efbde1d54b clang-format: sort include directives 2017-05-18 12:59:33 -04:00
Robert Maynard
57ab48fe8e Replace occurrences of NULL with nullptr. 2017-05-04 10:50:57 -04:00
Robert Maynard
022c36fa4f Add vtkm::exec::TaskBase, and rename WorkletInvokeFunctor to TaskSingular
Previously WorkletInvokeFunctor inherited from vtkm::exec::FunctorBase,
which is also the base class for all users Worklets and for all functors
based to DeviceAdapter::Schedule.

This is done for a few reasons. The first is that we reduce the
minimum size of user worklets. Previously the users worklet would hold
a reference to the error message, and so would the wrapper class added
when calling DeviceAdapter::Schedule. Now we only have the users worklet
holding a reference.

Second, by refactoring to have two base classes we can better improve
the documentation on what responsibilities FunctorBase.h has, compared
to TaskBase.
2017-05-02 16:38:43 -04:00
Andrew Bauer
7b165842cd Fixing documentation typo 2017-01-25 10:41:09 -05:00
Kenneth Moreland
58eb8f168d Add WorkletReduceByKey and dispatcher
And the basic type for a reduce by key worklet and its associated
adapter. Right now the worklet only supports passing in keys. Values
come next.
2017-01-17 15:53:06 -07:00
Kenneth Moreland
b3d0e1f99b Move VecFromPortal classes to vtkm package
These Vec-like objects can be generally usable in both the control and
execution environments.
2016-11-22 17:04:55 -07:00
Kenneth Moreland
fdaccc22db Remove exports for header-only functions/methods
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.

Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.
2016-11-15 22:22:13 -07:00
Christopher Sewell
93d7956daf Attempt 13 to resolve Windows compiler warning with streaming storage 2016-10-21 16:08:13 -06:00
Christopher Sewell
05975a2325 Attempt 3 to resolve Windows compiler warning with streaming storage 2016-10-20 10:32:30 -06:00
Christopher Sewell
72d9783c38 Merge remote-tracking branch 'upstream/master' into StreamingArray 2016-10-17 17:51:23 -06:00
Robert Maynard
3e09b2cebc VecFromPortal::CopyTo can now handle const value Portals. 2016-10-12 13:28:36 -04:00
Christopher Sewell
d92f39df12 Merge branch 'master' into StreamingArray 2016-09-15 17:54:59 -06:00
Robert Maynard
12810165bb Switch over to c++11 type_traits. 2016-08-31 16:11:26 -04:00
Christopher Sewell
9f01b59b97 Adding global thread index offset to ThreadIndicesTopologyMap, eliminating warnings 2016-08-18 17:22:17 -06:00
Christopher Sewell
7892ebf67c Making work index return global value when streaming 2016-08-17 21:33:04 -06:00
Robert Maynard
76cd2ac4da More corrections needed to suppress false positive host / device warnings. 2016-06-30 16:04:37 -04:00
Robert Maynard
90099d1c55 Simplify ThreadIndicies so link time is reduced.
ThreadIndicies constructor was templated on the invocation type, which created
thousand's of versions of that symbol which all had the same behavior. So now
remove that and move that logic into a Worklet function since it requires
the invocation info.
2016-05-04 14:48:42 -04:00
Robert Maynard
12ffd536fd Suppress false positive warnings from nvcc about host/device. 2016-04-01 15:50:52 -04:00
Robert Maynard
8683240b85 vtkm::exec::FunctorBase now properly initializes ErrorMessageBuffer. 2016-03-14 16:57:35 -04:00
Robert Maynard
821096cfd7 Perform necessary copies when deducing a worklets parameters.
As part of the work to reduce the number of copies of array handles the CUDA
backend was broken. The transportation of stack allocated classes to CUDA
relies on all member variables being value based, not references/pointers.
This correct the issue of sending references to host side memory to CUDA, at
the cost of two copies of the Invocation object.

When we move to C++11 we need to revisit this work and see if std::move
can help reduce the cost of these copies.
2016-01-26 15:08:46 -05:00
Robert Maynard
dd85fc1366 Document why we certain classes member variables need to be const ref. 2016-01-19 09:29:55 -05:00
Robert Maynard
c1560e2d3f Perform less unnecessary copies when deducing a worklets parameters.
One of the causes of the large library size and slow compile times has been
that vtkm has been creating unnecessary copies when not needed. When the
objects being copied use shared_ptr this causes a bloom in library size. I
presume this bloom is caused by the atomic increment/decrement that is
required by shared_ptr.

For testing I used the following example:
```
struct ExampleFieldWorklet : public vtkm::worklet::WorkletMapField
{
  typedef void ControlSignature( FieldIn<>, FieldIn<>, FieldIn<>,
                                 FieldOut<>, FieldOut<>, FieldOut<> );
  typedef void ExecutionSignature( _1, _2, _3, _4, _5, _6 );

  template<typename T, typename U, typename V>
  VTKM_EXEC_EXPORT
  void operator()( const vtkm::Vec< T, 3 > & vec,
                   const U & scalar1,
                   const V& scalar2,
                   vtkm::Vec<T, 3>& out_vec,
                   U& out_scalar1,
                   V& out_scalar2 ) const
  {
    out_vec = vec * scalar1;
    out_scalar1 = scalar1 + scalar2;
    out_scalar2 = scalar2;
  }

  template<typename T, typename U, typename V, typename W, typename X, typename Y>
  VTKM_EXEC_EXPORT
  void operator()( const T & vec,
                   const U & scalar1,
                   const V& scalar2,
                   W& out_vec,
                   X& out_scalar,
                   Y& ) const
  {
  //no-op
  }
};

int main(int argc, char** argv)
{
  std::vector< vtkm::Vec<vtkm::Float32, 3> > inputVec;
  std::vector< vtkm::Int32 > inputScalar1;
  std::vector< vtkm::Float64 > inputScalar2;

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleV =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleS1 =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleS2 =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOV;
  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOS1;
  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOS2;

  std::cout << "Making 3 output DynamicArrayHandles " << std::endl;
  vtkm::cont::DynamicArrayHandle out1(handleOV), out2(handleOS1), out3(handleOS2);

  typedef vtkm::worklet::DispatcherMapField<ExampleFieldWorklet> DispatcherType;

  std::cout << "Invoking ExampleFieldWorklet" << std::endl;
  DispatcherType dispatcher;

  dispatcher.Invoke(handleV, handleS1, handleS2, out1, out2, out3);

}
```

Original vtkm would generate a binary of size 4684kb and would perform 91
ArrayHandle copies or assignments. With this branch the binary size is
reduced to 2392kb and will perform 36 copies or assignments.
2016-01-19 09:20:49 -05:00
Kenneth Moreland
45abbb5c75 Share from indices vector.
Previously, each VecFromPortalPermute (the type that held the from field
values) held its own copy of the indices. For point to cell on
structured grids, this was a lot of repeated data values, which has the
potential to fill up cache and registers. Instead, just use pointer
references.
2015-11-06 18:05:21 -07:00
Kenneth Moreland
7b6e6e4a66 Enable output to input map in fetch mechanism.
This changes the interface to the ThreadIndices classes to have both
input and output indices. It also adds a visit index to ThreadIndices.

Also added the VisitIndex execution signature tag, which relies on this
behavior.
2015-11-06 18:05:20 -07:00
Kenneth Moreland
b0c5a32611 Add Scatter parameters to Invocation.
We are passing in execution objects with the Invocation when the Worklet
is scheduled, but we are not using it yet.
2015-11-06 18:05:20 -07:00
Robert Maynard
8de216c088 Propagate vtkm::Id3 scheduling down to the ThreadIndex classes.
This now allows for even more efficient construction of uniform point
coordinates when running under the 3d scheduler, since we don't need to go
from 3d index to flat index to 3d index, instead we stay in 3d index
2015-10-20 09:29:41 -04:00
Kenneth Moreland
99ce66c6fe Change Fetches to use ThreadIndices instead of Invocation.
Previously, all Fetch objects received an Invocation object in their
Load and Store methods. The point of this was that it allowed the Fetch
to get data from any of the execution objects. However, every Fetch
either just got data directly from its associated execution object or
else used a secondary execution object (the input domain) to get indices
into their own execution object.

This left two potential areas for improvement. First, pulling data out
of the Invocation object was unnecessarily complicated. It would be much
nicer to get data directly from the associated execution object. Second,
when getting index information from the input domain, it was often the
case that extra computations were necessary (particularly on structured
cell sets). There was no way to share the index information among
Fetches, and therefore the computations were replicated.

This change removes the Invocation from the Fetch Load and Store.
Instead, it passes the associated execution object and a new object type
called the ThreadIndices. The ThreadIndices are customized for the input
domain and therefore have all the information needed for a redirected
lookup. It is also a thread-local object so it can cache computed
indices and save on computation time.
2015-10-07 17:01:42 -06:00
Robert Maynard
fc79055f76 Add suppression pragmas to exec::Fetch classes 2015-09-24 10:39:48 -04:00
Kenneth Moreland
b15940c1e3 Declare new VTKM_STATIC_ASSERT
This is to be used in place of BOOST_STATIC_ASSERT so that we can
control its implementation.

The implementation is designed to fix the issue where the latest XCode
clang compiler gives a warning about a unused typedefs when the boost
static assert is used within a function. (This warning also happens when
using the C++11 static_assert keyword.) You can suppress this warning
with _Pragma commands, but _Pragma commands inside a block is not
supported in GCC. The implementation of VTKM_STATIC_ASSERT handles all
current cases.
2015-09-17 14:40:39 -06:00
Kenneth Moreland
cf6af174eb Merge branch 'variable-topology-fields' into 'master'
Variable topology fields

Changes to fetching in topology maps that lets you properly deal with cases where you do not know how many values are being fetched at compile time. For example, explicit cell sets can have any number of cell shapes that have different numbers of nodes.

This change should resolve issue #26.

See merge request !128
2015-08-19 11:27:43 -04:00
Kenneth Moreland
e301ba0a98 Replace BOOST_MPL_ASSERT with BOOST_STATIC_ASSERT
BOOST_MPL_ASSERT is causing warnings in the PGI compiler. Apparently,
when BOOST_MPL_ASSERT succeeds it declares a static object with a unqiue
name scoped to the file. The problem is that the PGI compiler is pretty
picky about things being declared without being used, so it was emitting
useless warnings about successful BOOST_MPL_ASSERTs. However,
BOOST_STATIC_ASSERT does not seem to have this problem, so for the benefit
of PGI change the compile-time assert method.
2015-08-14 21:16:12 +00:00
Kenneth Moreland
50ac3af910 Fix PGI compiler issues.
The PGI compiler is fussy about finding declared variables and methods
that have limited scope and are never used. Thus, it is complaining about
some internal test classes that are properly implementing the ArrayPortal
interface even though not all of it is being accessed.

To get around the problem, put them in a non-anonymous namespace with a
name unlikely to conflict with anything. The compiler will recognize that
it is possible to access these classes outside the scope of the file and
shut up about items not being used.
2015-08-14 21:03:38 +00:00
Kenneth Moreland
891979e3fa Added VecFromPortalPermute class. 2015-08-14 09:15:47 -06:00
Kenneth Moreland
c9d95298b0 Added VecFromPortal class. 2015-08-14 09:15:47 -06:00
Robert Maynard
ab59e34a2f Rename pragma header guard so it makes sense for tbb and thrust.
Boost is not the only thirdparty that we are supressing warnings for, so
make the name more generic.
2015-08-13 09:04:23 -04:00
Kenneth Moreland
21b3b318ba Always disable conversion warnings when including boost header files
On one of my compile platforms, GCC was giving conversion warnings from
any boost include that was not wrapped in pragmas to disable conversion
warnings. To make things easier and more robust, I created a pair of
macros, VTKM_BOOST_PRE_INCLUDE and VTKM_BOOST_POST_INCLUDE, that should
be wrapped around any #include of a boost header file.
2015-07-30 17:40:40 -06:00
Brent Lessley
0a72789304 Resolved all implicit conversions between unsigned int and vtkm::Id. 2015-05-26 09:34:43 -04:00
Robert Maynard
6b8e7822be The Copyright statement now has all the periods in the correct location. 2015-05-21 10:30:11 -04:00
Kenneth Moreland
ef093d5c07 The DoWorkletInvokeFunctor methods were missing VTKM_EXEC_EXPORT. 2015-01-15 22:47:28 -07:00
Kenneth Moreland
40efb51342 Fix MSVC warnings
MSVC is picky about type conversions. To get it to shut up, explicitly
cast the worklet return value to the fetch value in the
WorkletInvokeFunctor. The good is that it will help with needing
explicit conversions on these return values. But that is also bad in
that it might make some unexpected conversions possible.
2014-10-23 07:12:01 -06:00
Kenneth Moreland
9ac538b6b9 MSVC fixes
One fix is a simple (pointless) compiler warning about precision. The
other fix is an error in one of the test codes that did not clear out
the message string in an error message buffer like it was supposed to.
2014-10-22 10:52:35 -06:00
Kenneth Moreland
53a454fe77 Add basic dispatcher functionality.
These changes support the implementation of DispatcherBase. This class
provides the basic functionality for calling an Invoke method in the
control environment, transferring data to the execution environment,
scheduling threads in the execution environment, pulling data for each
calling of the worklet method, and actually calling the worklet.
2014-10-21 11:49:23 -06:00
Kenneth Moreland
01d6619774 Add basic worklet superclasses and signature tags 2014-10-15 15:47:39 -06:00
Kenneth Moreland
b012668345 Add a FunctorBase class for scheduling non-worklets
Whenever creating a functor to be launched in the execution environment
using the device adapter Schedule algorithm, you had to also create a
couple of methods to handle error message buffers. For convenience, lots
of code started to just inherit from WorkletBase. Although this worked,
it was a misnomer (and might cause problems in the future if worklets
later require different things from its base). To get around this
problem, add a FunctorBase class that is intended to be used as the
superclass to functors called with Schedule.
2014-06-10 11:35:13 -06:00
Kenneth Moreland
3ed7093945 Add test for ErrorMessageBuffer 2014-06-10 11:21:55 -06:00
Kenneth Moreland
b692cb3d89 Add CMake configuration for execution environment code 2014-06-10 11:14:11 -06:00
Robert Maynard
c80fb9259f Update the initial repository to use the correct indentation style. 2014-02-11 16:20:30 -05:00
Robert Maynard
c2101b8ffc Add in a serial device adapter and required supporting classes.
We now can verify that the array handle is usable by a device adapter.
2014-02-11 12:34:56 -05:00