There was a warning comes from the functors in support of the portals
for ArrayHandleMultiplexer. The template has no good way to determine
whether the object it is calling is for control or execution, so it
supports both. It is not useful to warn when it happens to compile only
for the host.
Sometimes the CUDA runtime would not allocate sufficient stack
space for the particle advection code to run. This issue was exposed by
!1737 -- for some reason, once those changes to unrelated filters/worklets
are added to VTK, CUDA allocates less stack and the following tests would
fail:
UnitTestLagrangianFilterCUDA
UnitTestLagrangianStructuresFilterCUDA
UnitTestStreamlineFilterCUDA
UnitTestStreamSurfaceFilterCUDA
These were fixed by increasing the stack size in the particle advection
worklet Run(...) methods.
An RAII helper has been added that will restore the previous stack size
in case an exception is thrown, and the KDTree code has been updated
to use this helper when it adjusts the CUDA stack allocation.
This is done through a new version of ApplyPolicy. This version takes
a type of the array as its first template argument, which must be
specified.
This requires having a list of potential storage to try. It will use
that to construct an ArrayHandleMultiplexer containing all potential
types. This list of storages comes from the policy. A StorageList
item was added to the policy.
Types are automatically converted. So if you ask for a vtkm::Float64 and
field contains a vtkm::Float32, it will the array wrapped in an
ArrayHandleCast to give the expected type.
The DataSetBuilderExplicitIterative class used to use an ArrayHandle of
vtkm::Vec3f_32 values, which are always of type Float32. This can cause
unexpected results when using double precision by default (i.e. when
FloatDefault is set to Float64). Change that to give Float32 values by
default.
IsWritableArrayHandle can now be used directly as `std::true_type` or
`std::false_type` without having to pull the `type` member out of a
dependent namespace.
This behaves just like `ScanExclusive`, but rather than returning the
total sum, it is appended to the end of the output array.
This is in preparation for the CellSetExplicit refactoring described in
issue #408.
The `MultiBlock` class has been renamed to `PartitionedDataSet`, and its API
has been refactored to refer to "partitions", rather than "blocks".
Additionally, the `AddBlocks` method has been changed to `AppendPartitions` to
more accurately reflect the operation performed. The associated
`AssignerMultiBlock` class has also been renamed to
`AssignerPartitionedDataSet`.
This change is motivated towards unifying VTK-m's data model with VTK. VTK has
started to move away from `vtkMultiBlockDataSet`, which is a hierarchical tree
of nested datasets, to `vtkPartitionedDataSet`, which is always a flat vector
of datasets used to assist geometry distribution in multi-process environments.
This simplifies traversal during processing and clarifies the intent of the
container: The component datasets are partitions for distribution, not
organizational groupings (e.g. materials).
Ref #405
One of the dashboard compilers was complaining about not being able to
resolve which overload to use for std::size_t. (Perhaps on CUDA
std::size_t is sometimes not an unsigned 64-bit integer.) Try to correct
this by adding a templated method that casts anything to vtkm::UInt64.
The functions GetHumanReadableSize and GetSizeString accepted a
vtkm::UInt64 as the size, which should hold pretty much any reasonable
memory size and is compatible with std::size_t. But it has a different
sign-ness as vtkm::Id. So if you are holding an array size with vtkm::Id
(which is common), you could get a compiler warning when using these
functions, which is annoying. So, for convenience add a second form of
these methods that takes a vtkm::Id and automatically converts.
By removing the ability to have multiple CellSets in a DataSet
we can simplify the following things:
- Cell Fields now don't require a CellSet name when being constructed
- Filters don't need to manage what the active cellset is
Although convenient, one of the issues of creating data with
MakeTestDataSet is that it is hard to look at the data created. It is
often helpful to be able to bring in the data into something like
ParaView or VisIt to play with it. To enable that, write them all out as
part of UnitTestVTKDataSetWriter.
a9bbb6ead Permute cells inline
d37ab7732 Add ability to read FIELD section in vtk legacy files
19a610f5e Fix skipping over information in vtk files
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1797
1f8030a6e Add non-const version of DataSet::Get methods
69226803c Make PointTransform actually transform the points
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1803
A non-const version of GetCoordinateSystem was added to implement a
change to the point transform filter. To make things symmetric (and
provide likely future needs), also add non-const versions of getting the
fields and cell sets.
Previously, the VTK file readers just skipped over everything in FIELD
sections. However, it is common for many point and cell fields to be
written in this section (it is how ParaView writes out most of its
data). This change will allow these fields to be read in correctly.
The primary (likely only) use of PointTransform is to perform affine
transform on the position of the mesh. However, PointTransform did not
actually move the mesh. Rather, it just created a new field with the
transformed points.
Now, the output has its coordinate system replaced with the transformed
one generated (in addition to be added as a field). This can be turned
off (but defaults to on).
Also changed the constructor to turn on UseCoordinateSystemAsField so
that by default the filter operates on the existing coordinate system
and replaces it with the new coordinate system.
Also removed the template parameter on the filter. That added an
unnecessary complication to using it.
884616788 Simplify and extend AtomicArray implementation.
9560c4f63 Limit testing dispatch of atomic array to only atomic types.
0e728c800 Update atomic interfaces to support Add/CAS for UInt32/64.
720b452eb Force AtomicArray to use only basic storage arrays.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1802
- Use AtomicInterface to implement device-specific atomic operations.
- Remove DeviceAdapterAtomicArrayImplementations.
- Extend supported atomic types to include unsigned 32/64-bit ints.
- Add a static_assert to check that AtomicArray type is supported.
- Add documentation for AtomicArrayExecutionObject, including a CAS
example.
- Add a `T Get(idx)` method to AtomicArrayExecutionObject that does
an atomic load, and update existing CAS usage to use this instead
of `Add(idx, 0)`.
This allows for developers to do things such as the following
as constexpr's:
```cxx
constexpr vtkm::Id2 dims(16,16);
constexpr vtkm::Float64 dx = vtkm::Float64(4.0 * vtkm::Pi()) / vtkm::Float64(dims[0] - 1);
```
bf96d921d Add extra braces around std::array initializers
9bbf4a5a6 Corrections and expanded testing of implicit functions
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1776
Because ArrayPortalSOA calls a delegate portal to get the actual values,
it can only implement its own Set or Get if the delegate portal supports
it. Previously this was done by calling an overloaded internal method
based on the result of PortalSupportsSets/Gets. However, regardless of
whether the delegate portal supported Set or Get, ArrayPortalSOA
provided one. Thus, if something else tried to use PortalSupportsSets/
Gets on ArrayPortalSOA, it would always report true even if it was not
really supported.
Instead, use SFINAE to remove the Set or Get if that method is not
supported in the delegate portal.
Since ArrayHandleSOA is only really used for portals from basic storage
arrays, it will be rare that Set or Get is not supported. However, a
device adapter is free to remove one of these methods on a device
portal. For example, if you call PrepareForInput on an ArrayHandle, it
is possible that the device adapter will create a portal that has no Set
method because the array is not writable.
Thanks to Allison Vacanti for recomending this solution.
I kept getting warnings from different compilers about type conversions
because I was making values by adding an index to them. Change how we
create and test values so that these type issues are less likely to come
up.
This appears to be a bug in older clang and gcc compilers where they
gave a warning asking for an extra unnecessary set of braces around the
initializer of std::array. See for example
https://bugs.llvm.org/show_bug.cgi?id=21629https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25137
I don't think newer compilers give this warning, but we still support
some older compilers it happens on, so we live with it.
To help provide a better time writing VTK-m filter this streamlines
the CreateResult API to provide a focused set of methods, based on
how CreateResult has been used by existing filters.
For demonstration purposes, I added a vector field to one of the data
sets (the cow nose). However, the serialization test was not expecting a
field of that type and therefore had not compiled the case to serialize
that type of field, which caused a failure. Fixed the problem by
instructing the test to also consider the correct type of vector fields.
The `From` and `To` nomenclature for topology mapping has been confusing for
both users and developers, especially at lower levels where the intention of
mapping attributes from one element to another is easily conflated with the
concept of mapping indices (which maps in the exact opposite direction).
These identifiers have been renamed to `VisitTopology` and `IncidentTopology`
to clarify the direction of the mapping. The order in which these template
parameters are specified for `WorkletMapTopology` have also been reversed,
since eventually there may be more than one `IncidentTopology`, and having
`IncidentTopology` at the end will allow us to replace it with a variadic
template parameter pack in the future.
Other implementation details supporting these worklets, include `Fetch` tags,
`Connectivity` classes, and methods on the various `CellSet` classes (such as
`PrepareForInput` have also reversed their template arguments. These will need
to be cautiously updated.
The convenience implementations of `WorkletMapTopology` have been renamed for
clarity as follows:
```
WorkletMapPointToCell --> WorkletVisitCellsWithPoints
WorkletMapCellToPoint --> WorkletVisitPointsWithCells
```
The `ControlSignature` tags have been renamed as follows:
```
FieldInTo --> FieldInVisit
FieldInFrom --> FieldInMap
FromCount --> IncidentElementCount
FromIndices --> IncidentElementIndices
```
The Invoker is a control side object that handles the construction
of the relevant worklet dispatcher. Moving it to control makes it
obvious that it isn't an algorithm itself but a way to launch
worklets.
There was a special case for ArrayHandleMultiplexer where if you gave it
just one type it would treat that as a value type rather than an array
to support and instead provide a default list of types. However, GCC 4.8
is having trouble compiling the code to create the default list, the
semantics are confusing, and the more I think about it the less likely I
think we will need this functionality. So, just getting rid of that.
d80a8125c Sprinkle noexcept goodness on Variant and ArrayPortalMultiplexer
a96a13cf3 Use large case statements to CastAndCall variants
866e1d7d5 Update comparison for virtual and multiplexer arrays
5416cbeb7 Add ArrayHandleMultiplexer testing to BenchmarkFieldAlgorithms
d45106452 Add changedoc for ArrayHandleMultiplexer
0aa15c97c Fix 'Failed to specialize alias template' error from Visual Studio
7b72e31df Fixes for CUDA
5e2385352 Create ArrayHandleMultiplexer
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1726
The code was working fine on all the dashboards except for the Visual
Studio 2015 compiles on delve. It gave an error like:
ArrayHandleMultiplexer.h(398): error C2938: 'ArrayHandleToStorageTag<unknown-type>' : Failed to specialize alias template
A StackOverflow article (https://stackoverflow.com/questions/43411542/
metaprogramming-failed-to-specialize-alias-template) suggests that this
is a bug in older versions of Visual Studio. Although fixed in more
recent versions, we might have to support older versions.
Currently, ListTags are implemented by having a subtype name list set to
a brigand::list. However, there is always a chance this will change. To
make things more explicit, create a vtkm::internal::ListTagToBrigandList
to make it clear what the resulting type should be (and provide some
potential future-proofing).
Also add a convenient vtkm::ListTagApply that allows you to easily
instantiate a template with the list of types in a ListTag.
Previously, IsWriteableArrayHandle just checked to see if an
ArrayHandle's portal has a ValueType of void* because we had coded the
special read-only array handles to have fake portals for writing.
However, we recently removed that because it was more trouble than it
was worth. Now IsWritableArrayHandle checks for the existance of the Set
method. If it does not exist, then the portal is considered read-only.
Also corrected spelling (writeable -> writable).
Previously, `ArrayHandleCast` was considered a read-only array handle.
However, it is trivial to reverse the cast (now that `ArrayHandleTransform`
supports an inverse transform). So now you can write to a cast array
(assuming the underlying array is writable).
One trivial consequence of this change is that you can no longer make a
cast that cannot be reversed. For example, it was possible to cast a simple
scalar to a `Vec` even though it is not possible to convert a `Vec` to a
scalar value. This was of dubious correctness (it is more of a construction
than a cast) and is easy to recreate with `ArrayHandleTransform`.
Several ArrayHandles (actuall Storage implementations) had a fake portal
type that only defined invalid value types and no Get/Set methods. The
idea was to quickly identify when using a read-only array for writing.
However, this was more trouble than it was worth as the compiler just
gives an incomprehensible error and it is hard to track down the actual
value.
Now actually define some type even if it is never used.
28484fc6a Update examples and benchmarks to use new VTK-m CMake helper function
ea50e82aa Move VTK-m CMake testing wrappers to the testing folder
0b7dd7c38 Add CMake vtkm_add_target_information() to make using vtk-m easier
e934e2273 vtkm_library WRAP_FOR_CUDA renamed to clarify the intent of the property
a2e6660fd Remove unused vtkm_compile_as_cuda CMake function
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Matt Larsen <larsen30@llnl.gov>
Merge-request: !1718
461f87dbc Fix issues with PointLocatorUniformGrid not finding all points
e473cb4bb Fix PointLocatorUniformGrid for points on boundary
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Li-Ta Lo <ollie@lanl.gov>
Merge-request: !1720
When creating the search structures in PointLocatorUniformGrid, a point
outside the boundary would be given an invalid bin id. These points
could never be found. Generally, this is not a big deal for points
outside of the boundary, but it could be a problem for points on the
boundary. A point on the boundary could be taken as outside the
boundary. Since the boundary is chosen from limits of the points, some
will almost always be on the boundary.
Fix this problem by clamping all points to the nearest valid bin. This
could cause a problem if the user has selected a boundary excluding a
lot of points. All those points could be grouped to the same edge bins,
but that is probably not a great idea anyway.
We want the option name to be clear that it might be applicable for
more than just CUDA. If VTK-m ever supported something like SYCL
it would not be clear that those sources should go in `WRAP_FOR_CUDA`.
how did any of this work?
match other CellSet file layouts.
???
compile in CUDA.
unit tests.
also only serial.
make error message accurate
Well, this compiles and works now.
Did it ever?
use CellShapeTagGeneric
UnitTest matches previous changes.
whoops
Fix linking problems.
Need the same interface
as other ThreadIndices.
add filter test
okay, let's try duplicating CellSetStructure.
okay
inching...
change to wedge in CellSetListTag
Means changing these to support it.
switch back to wedge from generic
compiles and runs
remove ExtrudedType
need vtkm_worklet
vtkm_worklet needs to be included
fix segment count for wedge specialization
need to actually save the index
for the other constructor.
specialize on Explicit
clean up warning
angled brackets not quotes.
formatting
512d0431e Cell and Point locators have correct export visibility
c7f827581 Correct signed to unsigned warning conversion found by clang-8
c28797845 Gradient's ComputeDivergence is now properly initialized
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1705
2e0f4dd37 Fix floating point error in test
b766d9a92 Improve support of testing recursive types
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1691
The UnitTestCudaArrayHandleFancy was failing because it was
was calling TestValue in a worklet and comparing that to
a value that was generated by calling TestValue on the host.
Because these were called on different architecture, it is
possible that they will not be bit-wise exact. Add a little
bit of tolerance to the check to avoid false failures for
this reason.
bcaf7d9be ScopedRuntimeDeviceTracker have better controls of setting devices.
4212d0c04 RuntimeDeviceInformation now says the AnyTag exists.
fa03dc664 ScopedRuntimeDeviceTracker requires a device to execute on when constructed.
4020f5198 RuntimeDeviceTracker can't be copied and is only accessible via reference.
e9482018e ScopedRuntimeDeviceTracker has the same API as RuntimeDeviceTracker
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1676
These methods need to be marked inline as they are used by multiple
TU's. When they aren't marked as inline we see random failures in
downstream VTK-m users generally when CUDA is enabled due to ODR
violations.
The ScopedRuntimeDeviceTracker now can force, enable, or disable
devices. Additionally the ScopedRuntimeDeviceTracker and the
RuntimeDeviceTracker handle the DeviceAdapterTagAny robustly
across all methods.
As the RuntimeDeviceTracker is a per thread construct we now make
it explicit that you can only get a reference to the per-thread
version and can't copy it.
The 8x8x8 is a better launch strategy for most VTK-m kernels.
The current problem is that a couple of VTK-m kernels use a
high number of registers and this number of threads combines to
require too many registers.
What we should do in the longer run is have more controls over
kernel launches on a per kernel basis. This will require VTK-m
to extract the number of registers being used by each kernel
The consistent API for control to execution memory transfers is
the ArrayHandle class. Previously the tests would verify memory
transfer by calling the ArrayManagerExecution class directly. This
is problematic as the class isn't used by ArrayHandle<T, StorageBasic>.
When TransferInfo is given memory from VirtualObjectTransferShareWithControl
it doesn't have a bound function ptr for the destruction. In those cases
we need to make sure the HostCopyOfDevice is properly deleted, otherwise
we will cause a memory leak.
bdabfbe11 Make sure ArrayPortalUniformPointCoordinates constructor is explicit
b3d951b50 vtkm::Range Include function now requires half as many min/max calls
ddaa0df26 ArrayHandleVirtualCoordinates now calls the proper parent constructor
61e800379 Make sure all execution side CellLocator objects have explicit destructors
307898ff6 Cleanup the CellLocatorBoundingIntervalHierarchy.cxx style.
0f31c69f3 Remove unnecessary constructor from ParameterContainer
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1657
Previously it was calling the ArrayHandle<T,StorageTagVirtual>
constructor and not the ArrayHandleVirtual constructor which
generated a warning with some compilers
ff687016e For VTK-m libs all includes of DeviceAdapterTagCuda happen from cuda files
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1648
It is very easy to cause ODR violations with DeviceAdapterTagCuda.
If you include that header from a C++ file and a CUDA file inside
the same program we an ODR violation. The reasons is that the C++
versions will say the tag is invalid, and the CUDA will say the
tag is valid.
The solution to this is that any compilation unit that includes
DeviceAdapterTagCuda from a version of VTK-m that has CUDA enabled
must be invoked by the cuda compiler.
Fixes#277
DeviceAdapterError existed to make sure that the default device adapter
template was being handled properly. Since the default device adapter doesn't
exist, and nothing is templated over it we can now remove DeviceAdapterError.
9c2920072 UnitTestBoundingIntervalHierarchy handles systems under load better
671c1df5c Timer logs the proper device name when called with an invalid device
d3d66a331 GameOfLife example always uses the proper device adapter
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1645
VTK-m now offers a more GPU aware set of defaults for kernel scheduling.
When VTK-m first launches a kernel we do system introspection and determine
what GPU's are on the machine and than match this information to a preset
table of values. The implementation is designed in a way that allows for
VTK-m to offer both specific presets for a given GPU ( V100 ) or for
an entire generation of cards ( Pascal ).
Currently VTK-m offers preset tables for the following GPU's:
- Tesla V100
- Tesla P100
If the hardware doesn't match a specific GPU card we than try to find the
nearest know hardware generation and use those defaults. Currently we offer
defaults for
- Older than Pascal Hardware
- Pascal Hardware
- Volta+ Hardware
Some users have workloads that don't align with the defaults provided by
VTK-m. When that is the cause, it is possible to override the defaults
by binding a custom function to `vtkm::cont::cuda::InitScheduleParameters`.
As shown below:
```cpp
ScheduleParameters CustomScheduleValues(char const* name,
int major,
int minor,
int multiProcessorCount,
int maxThreadsPerMultiProcessor,
int maxThreadsPerBlock)
{
ScheduleParameters params {
64 * multiProcessorCount, //1d blocks
64, //1d threads per block
64 * multiProcessorCount, //2d blocks
{ 8, 8, 1 }, //2d threads per block
64 * multiProcessorCount, //3d blocks
{ 4, 4, 4 } }; //3d threads per block
return params;
}
vtkm::cont::cuda::InitScheduleParameters(&CustomScheduleValues);
```
661fb64de AtomicInterfaceControl functions are marked with VTKM_SUPPRESS_EXEC_WARNINGS
0c70f9b9a Add BitFieldIn/Out/InOut worklet signature tags.
a66510e81 Add ArrayHandleBitField, a boolean-valued AH backed by a BitField.
56cc5c3d3 Add support for BitFields.
d01b97382 Allow VTKM_SUPPRESS_EXEC_WARNINGS to be used inside macros.
2f2ca9370 Add bit operations FindFirstSetBit and CountSetBits to Math.h.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1629
BitFields are:
- Stored in memory using a contiguous buffer of bits.
- Accessible via portals, a la ArrayHandle.
- Portals operate on individual bits or words.
- Operations may be atomic for safe use from concurrent kernels.
The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle
containing the indices of all set bits, in no particular order.
The new AtomicInterface classes provide an abstraction into bitwise
atomic operations across control and execution environments and are used
to implement the BitPortals.
When reducing an input type that differs from the output type
you need to write a custom binary operator that also implements
how to do the unary transformation.
0130088b8 Suppress more self-assign-overloaded warnings found by clang
fa5455854 UnitTestArrayPortalValueReference doesn't warn when compiled with appleclang
5cc0d03f6 Merge branch 'upstream-diy' into clang_warnings
dbd3781d5 diy 2019-04-09 (f7a68da4)
5c2f2ebce suppress warnings found by Wself-assign-overloaded
77426f044 Correct casting long to long long warning from clang.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1635
After the change to `ArrayHandleVirtual` where it became a subclass of
`vtkm::cont::ArrayHandle`, a few extra changes are required.
1. Functions with `ArrayHandleVirtual` as parameters will not be callable
with the superclass `ArrayHandle`. This is fixed by changing the argument
types to the superclass.
2. Add Serialization classes specializations for the superclass, as "is-a"
relation is not considered for class template parameters.
18f3b0dcc Document all the free functions for VariantArrayHandle
5180d6a73 DynamicCellSet has the same free function support as VariantArrayHandle
8f545662d CellSet uses NewInstance to be consistent with the rest of VTK-m
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !1620
f1056affa Move select functions to host only to remove host/device suppressions
4f2156dfa Thrust detail::aligned_reinterpret_cast doesn't warn now
f4840618c Make sure ThrustPatches is included before thrust.
b2bbd66e6 Merge branch 'upstream-taotuple' into update_taoo
4ec6fc812 taotuple 2019-04-03 (8e70fa8a)
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1607
566e220ea Suppress dashboard warnings
f20d7e788 Document the changes that are part of this MR.
f78e763be Add CellLocatorGeneral
c6bead838 Rename CellLocatorTwoLevelUniformGrid to CellLocatorUniformBins
ee838b829 Stylistic changes to CellLocators to match VTK-m
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1615
The CastAndCall for ArrayHandleVirtual was always a bad idea.
The ArrayHandleVirtual is not considered a dynamic type by the
dispatch engine so it would never have tried to use this specialization.
A general purpose `CellLocator` that should work with any type of dataset.
Internally, it tries to chose one of the existing cell locators that would be
optimal for the input data.
Also removes `CellLocatorHelper`.
18ff6681f Less vtkm_cont cxx files bring in the cuda device
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1612
This adds an ExecutionSignature tag named Device that passes the
DeviceAdapterTag as an argument to the worklet's operator(). This allows
worklets to specialize their code based on the device.
0581b368f Fix issues with importing installed vtkm::cuda after refactor.
ed4374b05 diy 2019-03-13 (e8e68c7c)
b85dcf229 Add more vtkm_cont compilation units to device_sources.
9014de0eb Suppress new host/device warnings for cuda 10.1.
04a464cc4 Update cuda detection to create vtkm::cuda as an interface library.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1582
Previously we just took the optionparser.h file and stuck it right in
our source code. That was problematic for a variety of reasons.
1. It incorrectly assigned our license to external code.
2. It made lots of unnecessary changes to the original source (like
reformatting).
3. It made it near impossible to track patches we make and updates to
the original software.
Instead, use the third-party system to track changes to optionparser.h
in a different repository and then pull that into ours.
When a library requires reading some command line arguments through a
function like Initialize, it is typical that it will parse through
arguments it supports and then remove those arguments from argc and argv
so that the remaining arguments can be parsed by the calling program.
VTK-m's initialize did not do that, so add that functionality.
The RuntimeDeviceTracker had grown organically to handle multiple
different roles inside VTK-m. Now that we have device tags
that can be passed around at runtime, large portions of
the RuntimeDeviceTracker API aren't needed.
Additionally the RuntimeDeviceTracker had a dependency on knowing
the names of each device, and this wasn't possible
as that information was part of its self. Now we have moved that
information into RuntimeDeviceInformation and have broken
the recursion.
e1f5c4dd9 Modify VariantAH::AsVirtual to cast to new ValueType if needed.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1585
E.g:
```
ArrayHandle<Float64> doubleArray;
VariantArrayHandle varHandle{doubleArray};
ArrayHandleVirtual<Float32> = varHandle.AsVirtual<Float32>();
```
If there is a loss in range and/or precision, a warning is logged. If
the ValueTypes are Vecs with mismatched widths, an ErrorBadType is thrown.
Internally, an ArrayHandleCast is used between the VariantArrayHandle's
stored array and the ArrayHandleVirtual.
a09bb9eca Add a new test for CellSet API
8868fb989 Improve CellSet API
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1589
5960b8abf Suppress nvlink warnings about virtual methods not used
0af017b03 Move virtual methods of other CellLocators to vtkm_cont
e87864b0e Put CellLocatorBoundingIntervalHierarchy in vtkm_cont library
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1594
These changes caused some warnings in clang to show up based on virtual
methods in other cell locators. Hence, the rest of the cell locators
have also had some of their code moved to vtkm_cont.
All of the methods in CellLocatorBoundingIntervalHierarchy were listed in
header files. This is sometimes problematic with virtual methods. Since
everything implemented in it can just be embedded in a library, move the
code into the vtkm_cont library.
When benchmarking the VTK-m algorithms on Summit I discovered
that our scheduling choices aren't optimal for the hardware.
This is a short term fix where we select good numbers for
Summit, and in the future make the defaults controllable
by the calling programming and/or environment variables.
Performance numbers can be found at:
https://gitlab.kitware.com/snippets/755
546a1d14a VariantArrayHandle should report run time NumberOfComponents
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1586
3868a5b30 Remove the commented out stack code.
e7066ad94 compiler errors.
de7d9cc27 fix compile errors, remove cudastack hacks.
9b9742f43 Merge branch 'no-recurse-bih' of https://gitlab.kitware.com/kmorel/vtk-m into gridEval
ab9d0fad2 Remove warning exceptions for BoundingIntervalHierarchy recursive calls
8127093a2 Add CellLocator to name of BoundingIntervalHierarchy
c008df90c Non-recursive method to find cells in BoundingIntervalHierarchyExec
c463bbec9 Add parent index to BIH tree
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1536
d4e1c1825 Update existing code to use the new functions
7b072b159 Add useful Get functions to Fields and Coordinates
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1578
Query and report number of components at runtime so that it can work with
arrays of types with runtime varying number of components
Eg. ArrayHandleGroupVecVariable.
The timer test would blindly wait for an expected interval each time.
However, in a busy system that might wait more than expected, this extra
wait time can accumulate. Fix the issue by monitoring how much time has
actually elapsed and only wait to the next expected point. Also add a
check to make sure that we have not waited too long (and adjust the
error message to hopefully make it more clear that the system waited
more than expected).
db0f5c31b Add a Transfer object for ArrayHandleVirtual
99cb10b9a Fix warnings about override keyword
0ff83e94b Properly handle conditions when VirtualStorage is null
0571c6335 Add missing allocation methods to ArrayHandleVirtual
68b2e5e65 Add move constructors to ArrayHandle subclasses
0b32831af Make ArrayHandleVirtual conform with other ArrayHandle structure
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1557
Previously, UnitTestTimer paused for 1 second each time it wanted to
test that the timer did (or didn't) record the time elapsing. This was
done 3 times per test and tested seperately on (currently) 5 devices.
The accumulated 15 seconds is not a whole lot, but it can add up when
coupled with the well over 300 other tests.
This change moves from the POSIX sleep function (which is limited to
increments of 1 second) to the updated C++ thread/chrono classes. The
amount of time to wait each time is now 0.25 seconds, which should speed
the test up by 4 times. The risk is that the shorter wait times can
throw off the results if the computer being run on is busy. If that is
the case, we can bump up the wait time (perhaps to 0.5 seconds).
Previously, ArrayHandleVirtual was using the default Transfer object.
This was problematic because it would copy/allocate things in the
execution environment independently from the array that it was wrapped
around. This caused several negative effects, particularly for CUDA
devices. First, if the data were already on the device (or the array is
implicit), a second copy of the data would be made. Second, the copy to
the device is likely less efficient. Third (and worst of all), the data
did not always get pulled back to the original array correctly.
This commit also contains instantiations of ArrayHandleVirtual and its
components for the most common types.
If an ArrayHandleVirtual is constructed without an underlying concrete
array handle, then the Storage<T, StorageTagVirtual> holds a
StorageVirtual pointer that is null. Generally, a null
ArrayHandleVirtual cannot do much, but its operations should still be
correct. There were a few places where the Storage would blindly try to
use its StorageVirtual pointer without checking it first. This adds some
conditions that should correct the behavior when StorageVirtual is null.
There is a test to ensure that basic VTK-m classes have proper move
constructors that do not throw exceptions. Some of these are subclasses
of ArrayHandle. Add these move constructors to the
VTK_M_ARRAY_HANDLE_SUBCLASS macros so they get automatically added.
Previously, ArrayHandleVirtual was defined as a specialization of
ArrayHandle with the virtual storage tag. This was because the storage
object was polymorphic and needed to be handled special. These changes
moved the existing storage definition to an internal class, and then
managed the pointer to that implementation class in a Storage object
that can be managed like any other storage object.
Also moved the implementation of StorageAny into the implementation of
the internal storage object.
6797c6e33 Specify return type for GetTimerImpl
25f3432b1 Increase the conditions on which Timer is tested
4d9ce2488 Synchronize CUDA timer when stopping it
85265a9c8 Add const correctness to Timer
465508993 Allow resetting Timer with a new device
dd4a93952 Enable initializing Timer with a DeviceAdapterId
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Haocheng LIU <haocheng.liu@kitware.com>
Merge-request: !1562
The internal function GetTimerImpl has a rather complex expression for
its return type. Prevously this was derived using declspec, but one of
the versions of Visual Studio barfed on that for some reason. So now
declare the return type explicitly.
UnitTestTimer was changed to be initialized across all possible devices
and getting times across all possible devices. Also test all possible
ways to set the device in the Timer.
Previously, when Stop was called on a Cuda timer, it would record a stop
event but it would not synchronize it at that time. Instead, the
synchronize was only called when GetElapsedTime was called. The problem
is that the time of the event is only marked when synchronize is called.
Thus, if the event completed before GetElapsedTime was called, it would
record the time from when the event acutally happened to the time when
GetElapsedTime was called as part of the elapsed time, which is
incorrect.
Fix the problem by synchronizing when Stop is called. Although this
makes the Timer more invasive, generally using the Timer can cause
synchronization to happen. This behavior is consistent with the Timer
implementation for other devices.
It should be possible to query a vtkm::cont::Timer without modifying it.
As such, its query functions (such as Stopped and GetElapsedTime) should
be const.
Previously, in order to specify a device with the timer, it had to be
specified in the timer's construction or had to be specified every time
GetElapsedTime was called. The first method was inconvienient in the
case where there are multiple code paths to define the device and the
latter method was inconvienient because you would have to pass around a
device id.
Both these techniques still exist, but we have also added a new form of
Reset that allows you to change the device the timer is used on.
Previously, a vtkm::cont::Timer had to be initialized with either no
device or with a device adapter tag. However, this precluded
initializing the timer with a DeviceAdapterId, which made it difficult
to create a timer at runtime. Instead, just accept a DeviceAdapterId
(which all device adapter tags inherit from) and do runtime checks.
In certain circumstances (currently, when logging is enabled), VTK-m
libraries depend on the threading library. However, when the VTK-m
package was included from an external project, it did not automatically
find the threads package. This change makes the Threads library loaded
when the VTK-m package is found.
VTK-m has been updated to replace old per device benchmark executables with a device
dependent shared library so that it's able to accept a device adapter at runtime through
the "--device=" argument.
The ArrayPortalValueReference is supposed to behave just like the value
it encapsulates and does so by automatically converting to the base type
when necessary. However, when it is possible to convert that to
something else, it is possible to get errors about ambiguous overloads.
To avoid these, add specialized versions of the operators to specify
which ones should be used.
Also consolidated the CUDA version of an ArrayPortalValueReference to the
standard one. The two implementations were equivalent and we would like
changes to apply to both.
The timer class now is asynchronous and device independent. it's using an
similiar API as vtkOpenGLRenderTimer with Start(), Stop(), Reset(), Ready(),
and GetElapsedTime() function. For convenience and backward compability, Each
Start() function call will call Reset() internally and each GetElapsedTime()
function call will call Stop() function if it hasn't been called yet for keeping
backward compatibility purpose.
Bascially it can be used in two modes:
* Create a Timer without any device info. vtkm::cont::Timer time;
* It would enable timers for all enabled devices on the machine. Users can get a
specific elapsed time by passing a device id into the GetElapsedtime function.
If no device is provided, it would pick the maximum of all timer results - the
logic behind this decision is that if cuda is disabled, openmp, serial and tbb
roughly give the same results; if cuda is enabled it's safe to return the
maximum elapsed time since users are more interested in the device execution
time rather than the kernal launch time. The Ready function can be handy here
to query the status of the timer.
* Create a Timer with a device id. vtkm::cont::Timer time((vtkm::cont::DeviceAdapterTagCuda()));
* It works as the old timer that times for a specific device id.
The kernel launch component of the runtime device adapter is fairly
pointless. If the hardware supports CUDA we should expect that
VTK-m has the correct kernel versions.
Plus in the original version if the CUDA device was being used
and the kernel launch returns cudaErrorDevicesUnavailable it
was never possible to restore CUDA support. Now what happens
is that the runtime tracker is marked as failed, but the
calling code can always go back and trying the device again.
This is only set while compiling device code, and is useful
for code that needs different implementations on devices (e.g.
they call CUDA device intrinsics, etc).
`vtkm::cont::testing` now initializes with logging enabled and support
for device being passed on the command line, `vtkm::testing` only
enables logging.
When you call VariantArrayHandle::CastAndCall, it now tries both basic
storage and virtual storage. You can modify the types of storages tried
by giving a type list of storage tags as the first argument.
The purpose of the TestBuild infrastructure was to confirm that
VTK-m didn't have any lexical issues when it was a pure header
only project. As we now move to have more compiled components
the need for this form of testing is mitigated. Combined
with the issue of TestBuilds causing MSVC issues, we should
just remove this infrastructure.
The script fixed up most of the issues. However, there were some
instances that the script was not able to pick up on. There were
also some instances that still needed a means to select types.
We want to make sure that VariantArrayHandleContainer has as little
overhead when launch worklets as possible. To do so we cache
type information to make deducing the `T` type of ArrayHandles
as fast as possible.
543255c37 Move the HumanSize function to Logging.cxx as it is only used by the logger
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1508
Previously you had to exactly match the case of a device adapter's name to
construct it, which was a source of lots of problems ( OpenMP versus OPENMP, CUDA or Cuda ).
Now `vtkm::cont::make_DeviceAdapterId` and `vtkm::cont::RuntimeDeviceTracker` support
case-insensitive device construction.
The DeepCopy method is used when a ScopedGlobalRuntimeDeviceTracker
is constructed. This in turn causes the rebuilding of the device
names and states which isn't a free operation. Now we copy the already
computed information.
This was noticeable when using ArrayHandleTransform since it uses
ScopedGlobalRuntimeDeviceTracker when construction host side
portals.
Previously these two functions would give compile errors when asked to
compare against an Array with a different value type. This makes it easier
to write generic code that compares virtual handles.
ArrayHandleVirtual can automatically be constructed from any ArrayHandle.
In the cases where the input ArrayHandle doesn't derived from ArrayHandleVirtual,
it will automatically construct StorageAny to hold the array.