Commit Graph

160 Commits

Author SHA1 Message Date
Chuck Atkins
f74c0d3c88 Remove type conversion related warnings for GCC 2016-03-17 13:05:38 -04:00
Robert Maynard
8683240b85 vtkm::exec::FunctorBase now properly initializes ErrorMessageBuffer. 2016-03-14 16:57:35 -04:00
Matt Larsen
5ddade7a44 Adding some basic documentation on atomics. 2016-03-09 14:29:59 -05:00
Matt Larsen
43131ee02b Adding comments about CAS 2016-03-08 09:58:20 -08:00
Matt Larsen
3b46706e1f Adding compare and swap and removing unsigned atomics 2016-03-08 09:41:02 -08:00
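The compare-and-swap (CAS) technique these atomics commits build on follows a standard retry loop. A generic C++11 sketch using std::atomic (illustrative only, not the actual VTK-m device code):

```
#include <atomic>

// Build an atomic add out of compare-and-swap: re-read the current value,
// compute the desired result, and commit it only if no other thread
// modified the value in the meantime.
float AtomicAddViaCAS(std::atomic<float>& target, float value)
{
  float expected = target.load();
  float desired = expected + value;
  // On failure, compare_exchange_weak refreshes `expected` with the
  // current value, so each retry works with up-to-date data.
  while (!target.compare_exchange_weak(expected, desired))
  {
    desired = expected + value;
  }
  return desired;
}
```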
Matt Larsen
e8b08f2e00 Merge branch 'master' into feature/atomics 2016-03-04 08:03:33 -08:00
Robert Maynard
bb90493920 Resolves Issue 52, we now install all vtkm files correctly. 2016-02-22 14:20:35 -05:00
Matt Larsen
2baac9cd8b initial commit of atomic adds 2016-02-10 07:51:31 -08:00
Robert Maynard
07d299209e Merge topic 'fix/ExecutionWholeArray'
5b705a52 Fixing return value for void function

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !334
2016-01-27 15:57:40 -05:00
mclarsen
5b705a5239 Fixing return value for void function 2016-01-27 08:07:20 -08:00
Robert Maynard
821096cfd7 Perform necessary copies when deducing a worklet's parameters.
As part of the work to reduce the number of copies of array handles, the CUDA
backend was broken. The transportation of stack-allocated classes to CUDA
relies on all member variables being value-based, not references/pointers.
This corrects the issue of sending references to host-side memory to CUDA, at
the cost of two copies of the Invocation object.

When we move to C++11 we need to revisit this work and see if std::move
can help reduce the cost of these copies.
2016-01-26 15:08:46 -05:00
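The std::move idea mentioned above would look roughly like this; a hedged C++11 sketch with a hypothetical Invocation-like type, showing why moving a shared_ptr-holding object avoids the cost a copy incurs:

```
#include <memory>
#include <utility>

// Hypothetical stand-in for an Invocation-like object whose members hold
// shared_ptr-backed state, as array handles do.
struct InvocationLike
{
  std::shared_ptr<int> Storage; // copying bumps an atomic reference count
};

// Hypothetical consumer that takes its argument by value.
void Schedule(InvocationLike inv) { (void)inv; }

void Example()
{
  InvocationLike inv{ std::make_shared<int>(42) };
  // Copying would atomically increment (and later decrement) the reference
  // count; moving transfers ownership without touching it.
  Schedule(std::move(inv));
}
```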
Robert Maynard
bd3d29577a Fix ArrayPortalFromThrust to re-enable texture memory fast path. 2016-01-26 14:30:25 -05:00
Robert Maynard
b2cd41d765 Fix ArrayPortalFromThrust to re-enable texture memory fast path. 2016-01-26 14:29:52 -05:00
Robert Maynard
dd85fc1366 Document why certain classes' member variables need to be const ref. 2016-01-19 09:29:55 -05:00
Robert Maynard
c1560e2d3f Perform fewer unnecessary copies when deducing a worklet's parameters.
One of the causes of the large library size and slow compile times has been
that vtkm has been creating unnecessary copies of objects. When the
objects being copied use shared_ptr this causes bloat in library size. I
presume this bloat is caused by the atomic increment/decrement that is
required by shared_ptr.

For testing I used the following example:
```
#include <iostream>
#include <vector>

#include <vtkm/cont/ArrayHandle.h>
#include <vtkm/cont/DynamicArrayHandle.h>
#include <vtkm/worklet/DispatcherMapField.h>
#include <vtkm/worklet/WorkletMapField.h>

struct ExampleFieldWorklet : public vtkm::worklet::WorkletMapField
{
  typedef void ControlSignature( FieldIn<>, FieldIn<>, FieldIn<>,
                                 FieldOut<>, FieldOut<>, FieldOut<> );
  typedef void ExecutionSignature( _1, _2, _3, _4, _5, _6 );

  template<typename T, typename U, typename V>
  VTKM_EXEC_EXPORT
  void operator()( const vtkm::Vec< T, 3 > & vec,
                   const U & scalar1,
                   const V& scalar2,
                   vtkm::Vec<T, 3>& out_vec,
                   U& out_scalar1,
                   V& out_scalar2 ) const
  {
    out_vec = vec * scalar1;
    out_scalar1 = scalar1 + scalar2;
    out_scalar2 = scalar2;
  }

  template<typename T, typename U, typename V, typename W, typename X, typename Y>
  VTKM_EXEC_EXPORT
  void operator()( const T & vec,
                   const U & scalar1,
                   const V& scalar2,
                   W& out_vec,
                   X& out_scalar,
                   Y& ) const
  {
  //no-op
  }
};

int main(int argc, char** argv)
{
  std::vector< vtkm::Vec<vtkm::Float32, 3> > inputVec;
  std::vector< vtkm::Int32 > inputScalar1;
  std::vector< vtkm::Float64 > inputScalar2;

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleV =
    vtkm::cont::make_ArrayHandle(inputVec);

  vtkm::cont::ArrayHandle< vtkm::Int32 > handleS1 =
    vtkm::cont::make_ArrayHandle(inputScalar1);

  vtkm::cont::ArrayHandle< vtkm::Float64 > handleS2 =
    vtkm::cont::make_ArrayHandle(inputScalar2);

  vtkm::cont::ArrayHandle< vtkm::Vec<vtkm::Float32, 3> > handleOV;
  vtkm::cont::ArrayHandle< vtkm::Int32 > handleOS1;
  vtkm::cont::ArrayHandle< vtkm::Float64 > handleOS2;

  std::cout << "Making 3 output DynamicArrayHandles " << std::endl;
  vtkm::cont::DynamicArrayHandle out1(handleOV), out2(handleOS1), out3(handleOS2);

  typedef vtkm::worklet::DispatcherMapField<ExampleFieldWorklet> DispatcherType;

  std::cout << "Invoking ExampleFieldWorklet" << std::endl;
  DispatcherType dispatcher;

  dispatcher.Invoke(handleV, handleS1, handleS2, out1, out2, out3);

}
```

The original vtkm would generate a binary of 4684 KB and perform 91
ArrayHandle copies or assignments. With this branch, the binary size is
reduced to 2392 KB and 36 copies or assignments are performed.
2016-01-19 09:20:49 -05:00
Kenneth Moreland
1a538ca196 Merge branch 'scatter-worklets' into 'master'
Scatter in worklets

Add the functionality to perform a scatter operation from input to output in a worklet invocation. This allows you to, for example, specify a variable number of outputs generated for each input.

See merge request !221
2015-11-11 13:09:47 -05:00
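A hedged sketch of what the mechanism looks like from a worklet author's point of view (class and tag names taken from related commits in this log; details may differ from the merged code):

```
#include <vtkm/worklet/DispatcherMapField.h>
#include <vtkm/worklet/ScatterCounting.h>
#include <vtkm/worklet/WorkletMapField.h>

// A worklet that produces a variable number of outputs per input.
// ScatterCounting replicates each input according to a counts array;
// VisitIndex reports which copy (0..count-1) is being produced.
struct Duplicate : public vtkm::worklet::WorkletMapField
{
  typedef void ControlSignature( FieldIn<>, FieldOut<> );
  typedef void ExecutionSignature( _1, _2, VisitIndex );
  typedef vtkm::worklet::ScatterCounting ScatterType;

  template<typename T>
  VTKM_EXEC_EXPORT
  void operator()( const T& in, T& out, vtkm::IdComponent visitIndex ) const
  {
    out = in + T(visitIndex); // later copies are offset from the input
  }
};
```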
Kenneth Moreland
7b05604a66 Add more tolerance to UnitTestParametricCoordinates
I noticed a failure in a dashboard run of UnitTestParametricCoordinates.
This test uses randomly generated numbers to test the behavior of some
cell shapes, and there was an instance that occurred with seed 1447261681
that caused one of the comparisons to be just slightly larger than the
default tolerance but still within reasonable value.

I just increased the tolerance of that particular comparison. Hopefully
this will prevent all future failures.
2015-11-11 10:38:17 -07:00
Robert Maynard
b3687c6f3c Workaround inclusive_scan issues in thrust 1.8.X for complex value types.
The original workaround for inclusive_scan bugs in thrust 1.8 only solved the
issue for basic arithmetic types such as int, float, double. Now we go one
step further and fix the problem for all types.

The solution is to provide a proper implementation of destructive_accumulate_n
and make sure it exists before any includes of thrust occur.
2015-11-09 17:14:30 -05:00
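destructive_accumulate_n is an internal thrust helper used by its scan implementation. A rough, hedged sketch of the declare-before-include pattern the fix relies on (the real helper also writes partial results back into the input range, which this simplified fold omits, and it lives in a thrust detail namespace omitted here):

```
// Must be visible before any thrust header is included so that this
// definition is the one found when the scan code instantiates it.
template <typename InputIterator, typename Size, typename T, typename BinaryFunction>
T destructive_accumulate_n(InputIterator first, Size n, T init, BinaryFunction binary_op)
{
  for (Size i = 0; i < n; ++i, ++first)
  {
    init = binary_op(init, *first);
  }
  return init;
}
```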
Kenneth Moreland
8ef0a4ee50 Fix conversion warnings.
Recent changes to algorithm implementations caused CellDerivative to be
called in a way such that it gave conversion warnings on some compilers.
Fix that.
2015-11-07 13:13:17 -07:00
Kenneth Moreland
342a57efcd Fix double reference compile error. 2015-11-07 11:23:03 -07:00
Kenneth Moreland
0d394db0ce Fix conversion warnings when using double precision.
There were some conversion warnings issued when the default float was set
to 64-bit. Fixed these (on clang).
2015-11-07 06:35:24 -07:00
Kenneth Moreland
bf03243516 Add ability to multiply any Vec by vtkm::Float64.
This has been requested on the mailing list to make it easier to
interpolate integer vectors.

There are a couple of downsides to this addition. First, it implicitly
casts doubles back to whatever the vector type is, which can cause a
loss of precision. Second, it makes it more likely to get overload
errors when multiplying with Vec. In particular, the operator to cast
Vec of size 1 to the component class had to be removed.
2015-11-07 06:33:50 -07:00
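A short usage example of the new overload (a minimal sketch; the precision caveat is the one noted in the commit message):

```
#include <vtkm/Types.h>

// Midpoint of two integer vectors via the new Vec * vtkm::Float64 overload.
// The result is implicitly cast back to the Int32 component type, so
// fractional parts are lost.
vtkm::Vec<vtkm::Int32, 3> MidPoint( const vtkm::Vec<vtkm::Int32, 3>& a,
                                    const vtkm::Vec<vtkm::Int32, 3>& b )
{
  return a * 0.5 + b * 0.5;
}
```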
Kenneth Moreland
d44860c3cf Change tetrahedralize filters to use new Scatter mechanism
The tetrahedralize algorithms have been changed to use the Scatter
classes to build indices rather than build them on their own.

To implement this efficiently with structured grids, a new ScatterUniform
class was made. I also added a new execution argument tag that allows
you to get the thread indices object from within the worklet.
2015-11-07 04:57:16 -07:00
Kenneth Moreland
45abbb5c75 Share from indices vector.
Previously, each VecFromPortalPermute (the type that held the from field
values) held its own copy of the indices. For point to cell on
structured grids, this was a lot of repeated data values, which has the
potential to fill up cache and registers. Instead, just use pointer
references.
2015-11-06 18:05:21 -07:00
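The change amounts to the following pattern (a hedged sketch with hypothetical names, not the actual VecFromPortalPermute code):

```
// Before: every instance carried its own copy of the index vector,
// filling cache and registers with repeated values.
template <typename IndexVecType, typename PortalType>
struct VecFromPermuteByCopy
{
  IndexVecType Indices; // duplicated per thread
  PortalType Portal;
};

// After: instances share a single index vector through a pointer.
template <typename IndexVecType, typename PortalType>
struct VecFromPermuteByPointer
{
  const IndexVecType* Indices; // one shared copy, cheap to pass around
  PortalType Portal;
};
```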
Kenneth Moreland
f7789f0ed7 Fix issue with const types in Thrust array management
Previously, there was a declaration ConstArrayPortalFromThrust<const T>
in ArrayManagerExecutionThrustDevice. This proved problematic because
values read from the array in the worklet were typed as const T rather
than simply T. Any Vec or Matrix built from that type would then fail
because they are not meant to work with a const value (which means they
have to be set on construction and never changed).

Instead, declare ConstArrayPortalFromThrust<T> and internally set all
the Thrust pointers to have type const T. Also declare other thrust
pointers used as method parameters to have const T rather than T. This
should work, since conversion from T to const T is fine, but not the
other way around.
2015-11-06 18:05:21 -07:00
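The conversion rule relied on here is the usual one-way const qualification; a minimal illustration:

```
void ConstQualificationExample(float* data)
{
  const float* readOnly = data; // fine: T* converts implicitly to const T*
  // float* writable = readOnly; // error: const T* does not convert to T*
  (void)readOnly;
}
```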
Kenneth Moreland
7b6e6e4a66 Enable output to input map in fetch mechanism.
This changes the interface to the ThreadIndices classes to have both
input and output indices. It also adds a visit index to ThreadIndices.

Also added the VisitIndex execution signature tag, which relies on this
behavior.
2015-11-06 18:05:20 -07:00
Kenneth Moreland
b0c5a32611 Add Scatter parameters to Invocation.
We are passing in execution objects with the Invocation when the Worklet
is scheduled, but we are not using it yet.
2015-11-06 18:05:20 -07:00
Robert Maynard
97550d5e2d Update Cuda so that UnaryPredicates work with fancy cuda array handles. 2015-11-03 13:28:07 -05:00
T.J. Corona
829c1b1f7f Install missing cuda device backend header. 2015-11-02 16:44:19 -05:00
Robert Maynard
8de216c088 Propagate vtkm::Id3 scheduling down to the ThreadIndex classes.
This now allows for even more efficient construction of uniform point
coordinates when running under the 3D scheduler, since we no longer need
to go from a 3D index to a flat index and back; instead we stay in 3D
index space.
2015-10-20 09:29:41 -04:00
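For reference, the conversion the 3D scheduler now avoids is the usual row-major mapping (a minimal sketch):

```
#include <vtkm/Types.h>

// Flat index from a 3D index (row-major: x varies fastest).
vtkm::Id FlatIndex(const vtkm::Id3& idx, const vtkm::Id3& dims)
{
  return idx[0] + dims[0] * (idx[1] + dims[1] * idx[2]);
}

// 3D index recovered from a flat index; these divisions and modulos are
// exactly the per-thread work that Id3 scheduling eliminates.
vtkm::Id3 LogicalIndex(vtkm::Id flat, const vtkm::Id3& dims)
{
  return vtkm::Id3(flat % dims[0],
                   (flat / dims[0]) % dims[1],
                   flat / (dims[0] * dims[1]));
}
```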
Kenneth Moreland
99ce66c6fe Change Fetches to use ThreadIndices instead of Invocation.
Previously, all Fetch objects received an Invocation object in their
Load and Store methods. The point of this was that it allowed the Fetch
to get data from any of the execution objects. However, every Fetch
either just got data directly from its associated execution object or
else used a secondary execution object (the input domain) to get indices
into their own execution object.

This left two potential areas for improvement. First, pulling data out
of the Invocation object was unnecessarily complicated. It would be much
nicer to get data directly from the associated execution object. Second,
when getting index information from the input domain, it was often the
case that extra computations were necessary (particularly on structured
cell sets). There was no way to share the index information among
Fetches, and therefore the computations were replicated.

This change removes the Invocation from the Fetch Load and Store.
Instead, it passes the associated execution object and a new object type
called the ThreadIndices. The ThreadIndices are customized for the input
domain and therefore have all the information needed for a redirected
lookup. It is also a thread-local object so it can cache computed
indices and save on computation time.
2015-10-07 17:01:42 -06:00
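Schematically, the interface change has this shape (a rough sketch only; the exact template parameters and signatures differ):

```
#include <vtkm/Types.h>

// Before, Load/Store received the whole Invocation and dug out what they
// needed; now they receive thread-local indices plus only their own
// associated execution object.
template <typename ThreadIndicesType, typename ExecObjectType>
struct FetchSketch
{
  typedef vtkm::Float64 ValueType; // placeholder value type for the sketch

  ValueType Load(const ThreadIndicesType& indices,
                 const ExecObjectType& execObject) const;

  void Store(const ThreadIndicesType& indices,
             const ExecObjectType& execObject,
             const ValueType& value) const;
};
```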
Robert Maynard
9a8809f933 Add CellSetPermutation which allows custom iteration over a cell set.
When you create a CellSetPermutation you provide an array of the cell ids that
you want to iterate. This allows the user to do custom blanking of a data set,
or to iterate over a set of cells multiple times.
2015-10-01 09:23:10 -04:00
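A hedged usage sketch (header name, template parameter order, and constructor arguments are assumed from the description above and may not match the merged signature exactly):

```
#include <vector>

#include <vtkm/cont/ArrayHandle.h>
#include <vtkm/cont/CellSetPermutation.h>
#include <vtkm/cont/CellSetStructured.h>

// Visit only cells 2, 5, and 7 of an existing structured cell set.
void MakeSubset(const vtkm::cont::CellSetStructured<3>& originalCells)
{
  std::vector<vtkm::Id> ids;
  ids.push_back(2);
  ids.push_back(5);
  ids.push_back(7);
  vtkm::cont::ArrayHandle<vtkm::Id> cellIds = vtkm::cont::make_ArrayHandle(ids);

  typedef vtkm::cont::CellSetPermutation<
      vtkm::cont::ArrayHandle<vtkm::Id>,    // which cells, in which order
      vtkm::cont::CellSetStructured<3> > PermutedCellSet;

  PermutedCellSet subset(cellIds, originalCells);
}
```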
Robert Maynard
9965977f47 Merge topic 'FetchTagTopologyIn_return_shape_type'
a1f5bc9f FetchTagTopologyIn updated to properly return CellShape.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !209
2015-09-30 10:18:17 -04:00
Robert Maynard
fc79055f76 Add suppression pragmas to exec::Fetch classes 2015-09-24 10:39:48 -04:00
Robert Maynard
a1f5bc9f0a FetchTagTopologyIn updated to properly return CellShape. 2015-09-23 10:45:06 -04:00
Robert Maynard
056f69bf96 Remove unused variable and conversion warnings from cuda code. 2015-09-21 14:17:25 -04:00
Kenneth Moreland
fd21a12f4a Merge branch 'xcode-7-warnings' into 'master'
Xcode 7 warnings

The XCode 7 compiler has a new warning for unused typedefs. The Boost code we use has some instances where this warning gets issued. Suppress these warnings.

See merge request !199
2015-09-17 18:12:31 -04:00
Kenneth Moreland
b15940c1e3 Declare new VTKM_STATIC_ASSERT
This is to be used in place of BOOST_STATIC_ASSERT so that we can
control its implementation.

The implementation is designed to fix the issue where the latest XCode
clang compiler gives a warning about a unused typedefs when the boost
static assert is used within a function. (This warning also happens when
using the C++11 static_assert keyword.) You can suppress this warning
with _Pragma commands, but _Pragma commands inside a block is not
supported in GCC. The implementation of VTKM_STATIC_ASSERT handles all
current cases.
2015-09-17 14:40:39 -06:00
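Usage mirrors the Boost macro it replaces; a trivial example (header name assumed):

```
#include <vtkm/StaticAssert.h>
#include <vtkm/Types.h>

void StaticAssertExample()
{
  // Works inside a function body, which is exactly the case where
  // BOOST_STATIC_ASSERT triggered the unused-typedef warning.
  VTKM_STATIC_ASSERT(sizeof(vtkm::Float64) == 8);
}
```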
Robert Maynard
9b877ef49b Merge topic 'multiple_backend_example'
fd685210 Always install all device headers even when device isn't enabled.
b1663b24 Add an example of using multiple backends from a single translation unit.
fc0ff69d Methods with try/catch need to be host only.
4d635d64 DeviceAdapter Tags now always exist, and contain if the device is valid.
cf32b430 Teach Configure.h to store if TBB and CUDA are enabled.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !198
2015-09-17 09:49:49 -04:00
Robert Maynard
fd68521066 Always install all device headers even when device isn't enabled.
vtkm_declare_headers is now able to skip testing headers by using the
TESTABLE keyword.
2015-09-17 09:28:21 -04:00
Kenneth Moreland
2ff6576c65 Add third party wrappers around boost macros.
The boost assert macros seem to have an issue where they define an
unused typedef. This is causing the XCode 7 compiler to issue a warning.
Since the offending code is in a macro, the warning is identified with
the VTK-m header even though the code is in boost. To get around this,
wrap all uses of the boost assert that is causing the warning in the
third party pre/post macros to disable the warning.
2015-09-16 23:34:49 -06:00
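The wrapping pattern looks like the following (macro names assumed from their use elsewhere in VTK-m):

```
// Suppress warnings that originate inside third-party headers/macros
// without silencing them for VTK-m's own code.
VTKM_THIRDPARTY_PRE_INCLUDE
#include <boost/static_assert.hpp>
VTKM_THIRDPARTY_POST_INCLUDE
```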
Robert Maynard
1d97f886e0 Remove the thrust pragma statements that are not needed. 2015-09-15 14:20:56 -04:00
Kenneth Moreland
13d4087657 Change ExecutionWholeArray interface to match expected for ArrayPortal
When ExecutionWholeArray is passed to a worklet, it is expected to
behave like an array portal. However, it was missing the
GetNumberOfValues method and the ValueType typedef. These are now added.
2015-09-09 13:30:12 -06:00
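The array-portal contract being matched is small; a minimal sketch of its shape (not the actual ExecutionWholeArray code):

```
#include <vtkm/Types.h>

template <typename T>
class PortalLike
{
public:
  typedef T ValueType;                 // the typedef that was missing

  vtkm::Id GetNumberOfValues() const;  // the method that was missing
  ValueType Get(vtkm::Id index) const;
  void Set(vtkm::Id index, const ValueType& value) const;
};
```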
Robert Maynard
5b8cc44ed4 Merge branch 'improve_sort_perf_on_thrust' into 'master'
Tell thrust to use fast code paths when using our predicates and operators.

See merge request !176
2015-09-07 10:38:17 -04:00
Hendrik Schroots
801d4dd1e5 Merge topic 'make_cont_export_macro_be_device_host'
0d6dfb1e Make it possible to use Cuda TextureMemory from device/host method.

Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !181
2015-09-04 13:50:13 -04:00
Robert Maynard
72450e87f3 Make thrust use fast paths when doing sort and scan.
By introducing our own custom thrust execution policy, we can make sure
to hit the fastest code paths in thrust for the sort operation. This makes
sure that for UInt32, Int32, and Float32 we use the radix sort from thrust,
which offers a 2x to 3x speed improvement over the merge sort implementation.

Second, by telling thrust that our BinaryOperators are commutative, we
make sure that we get the fastest code paths when executing Inclusive
and Exclusive Scan.

Benchmark 'Radix Sort on 1048576 random values vtkm::Int32' results:
  median = 0.0117049s
  median abs dev = 0.00324614s
  mean = 0.0167615s
  std dev = 0.00786269s
  min = 0.00845875s
  max = 0.0389063s
Benchmark 'Radix Sort on 1048576 random values vtkm::Float32' results:
  median = 0.0234463s
  median abs dev = 0.000317249s
  mean = 0.021452s
  std dev = 0.00470307s
  min = 0.011255s
  max = 0.0250643s
Benchmark 'Merge Sort on 1048576 random values vtkm::Int32' results:
  median = 0.0310486s
  median abs dev = 0.000182129s
  mean = 0.0286914s
  std dev = 0.00634102s
  min = 0.0116225s
  max = 0.0317379s
Benchmark 'Merge Sort on 1048576 random values vtkm::Float32' results:
  median = 0.0310617s
  median abs dev = 0.000193583s
  mean = 0.0295779s
  std dev = 0.00491531s
  min = 0.0147257s
  max = 0.032307s
2015-09-03 16:00:37 -04:00
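The mechanism here is thrust's execution-policy dispatch: algorithms invoked with a user-defined policy pick up any overloads declared for that policy. A hedged sketch (type name hypothetical):

```
#include <thrust/execution_policy.h>

// Thrust algorithms called with an instance of this policy dispatch to
// overloads declared for it, which is how 32-bit integer/float sorts can
// be routed to radix sort and operators flagged as commutative for scans.
struct vtkm_cuda_policy : thrust::device_execution_policy<vtkm_cuda_policy> {};
```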
Robert Maynard
0d6dfb1e40 Make it possible to use Cuda TextureMemory from device/host method. 2015-09-03 11:52:40 -04:00
Kenneth Moreland
20c5819397 Remove unused typedef
A typedef in a method was left over from a copy/paste. Although
harmless, it was causing a (valid) warning on some compilers.
2015-09-02 13:54:54 -07:00
Kenneth Moreland
08f9c04fab Add specialization of topology map fetch for regular point coords
In the special case where you are loading the point coordinates for a
structured grid in a point to cell map (an important use case), create a
VecRectilinearPointCoordinates rather than build a Vec of the values.
This will activate the cell specializations in previous commits.

These changes also added some flat-to-logical index conversion and vice
versa in ConnectivityStructuredInternals. This change also fixed a bug
in getting cells attached to points in 2D grids. (Actually, technically
someone else fixed it and checked it in first. The changes were merged
during a rebase.)

I also added a specialization to Vec for 1D that implicitly converts
between the 1D Vec and the component. This can be convenient when
templating on the Vec length.
2015-09-02 13:54:51 -07:00
Kenneth Moreland
b58543297a Special implementation of cell derivative for rectilinear cells 2015-09-02 13:50:31 -07:00