These were previously suppressed because they are unavoidable when
calling virtual methods. But we no longer support virtual methods on
devices (it is deprecated).
These warnings can still happen if you have unbounded recursion. But we
would like to avoid unbounded recursion, so we would like to see these
warnings.
Also turned on other nvlink warnings, which include when a recursive
function call means that the compiler cannot figure out the full
stack depth.
Often times you have an array of an unknown type (likely from a data set),
and you need it to be of a particular type (or can make a reasonable but
uncertain assumption about it being a particular type). You really just
want a shallow copy (a reference in a concrete `ArrayHandle`) if that is
possible.
`ArrayCopyShallowIfPossible` pulls an array of a specific type from an
`UnknownArrayHandle`. If the type is compatible, it will perform a shallow
copy. If it is not possible, a deep copy is performed to get it to the
correct type.
Fixes#572.
It updates the ReleaseProcess.md document to describe our bi-branchial
workflow composed by a release branch and master branch
It also adds ReleaseHotFix.md which describes how to perform a HotFix
onto master/release branch.
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
The major changes to VTK-m from 1.5.1 can be found in:
docs/changelog/1.6.0/release-notes.md
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>
* commit 'd2d1c854adc8c0518802f153b48afd17646b6252': (1346 commits)
extend the default clipping plane
Fix unintended cast in TBB Reduce's return value
follow coding conventions
make scalar normilization consistent across rendering
correct a potential divide by zero
Do not assume CUDA reduce operator is unary
Fix casting issues in TBB functors
Add casts to FunctorsGeneral.h
Allow for different types in basic type operators
kick the builds
Cleanup per review.
Fixes per review
Add ArrayHandleSOA to default
Disallow references in Variant
Be more conservative about is_trivial support
Removed two TODO comments after verifying parameters
Port bug fix from distributed to augmented contour tree filter
Fix hang in distributed contour tree
more missing sstream headers
add another missing header
...
04f020ae6 Update Field to use new ArrayRangeCompute features
2a41428fe Add implementation of ArrayRangeCompute for UnknownArrayHandle
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Sujin Philip <sujin.philip@kitware.com>
Merge-request: !2409
This is a fancy array that takes an array of offsets and converts it to
an array of the number of components for each packed entry.
This replaces the use of `ArrayHandleDecorator` in `CellSetExplicit`.
The two implementation should do the same thing, but the new
`ArrayHandleOffsetsToNumComponents` should be less complex for
compilers.
cecd81d5d Add types appropriate for Ascent
865855ea0 Add changelog for making ArrayHandleSOA a default array
50ff9c22a Add support of `ArrayHandleSOA` as a default storage type
bc09a9cd1 Add precompiled versions of `ArrayRangeCompute` for `ArrayHandleSOA`
77f9ae653 Support `ArrayHandleSOA` only for `Vec` value types
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2349
`vtkm::VecFlat` is a wrapper around a `Vec`-like class that may be a
nested series of vectors. For example, if you run a gradient operation
on a vector field, you are probably going to get a `Vec` of `Vec`s that
looks something like `vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 3>`. That
is fine, but what if you want to treat the result simply as a `Vec` of
size 9?
The `VecFlat` wrapper class allows you to do this. Simply place the
nested `Vec` as an argument to `VecFlat` and it will behave as a flat
`Vec` class. (In fact, `VecFlat` is a subclass of `Vec`.) The `VecFlat`
class can be copied to and from the nested `Vec` it is wrapping.
There is a `vtkm::make_VecFlat` convenience function that takes an
object and returns a `vtkm::VecFlat` wrapped around it.
The changelog is not quite accurate because it is claiming that all
virtual methods are removed when that is not quite the case.
Hopefully soon the changelog text will be accurate.
Previously, all atomic functions were stored in classes named
`AtomicInterfaceControl` and `AtomicInterfaceExecution`, which required
you to know at compile time which device was using the methods. That in
turn means that anything using an atomic needed to be templated on the
device it is running on.
That can be a big hassle (and is problematic for some code structure).
Instead, these methods are moved to free functions in the `vtkm`
namespace. These functions operate like those in `Math.h`. Using
compiler directives, an appropriate version of the function is compiled
for the current device the compiler is using.
We have made several improvements to adding data into an `ArrayHandle`.
## Moving data from an `std::vector`
For numerous reasons, it is convenient to define data in a `std::vector`
and then wrap that into an `ArrayHandle`. It is often the case that an
`std::vector` is filled and then becomes unused once it is converted to an
`ArrayHandle`. In this case, what we really want is to pass the data off to
the `ArrayHandle` so that the `ArrayHandle` is now managing the data and
not the `std::vector`.
C++11 has a mechanism to do this: move semantics. You can now pass
variables to functions as an "rvalue" (right-hand value). When something is
passed as an rvalue, it can pull state out of that variable and move it
somewhere else. `std::vector` implements this movement so that an rvalue
can be moved to another `std::vector` without actually copying the data.
`make_ArrayHandle` now also takes advantage of this feature to move rvalue
`std::vector`s.
There is a special form of `make_ArrayHandle` named `make_ArrayHandleMove`
that takes an rvalue. There is also a special overload of
`make_ArrayHandle` itself that handles an rvalue `vector`. (However, using
the explicit move version is better if you want to make sure the data is
actually moved.)
## Make `ArrayHandle` from initalizer list
A common use case for using `std::vector` (particularly in our unit tests)
is to quickly add an initalizer list into an `ArrayHandle`. Now you can
by simply passing an initializer list to `make_ArrayHandle`.
## Deprecated `make_ArrayHandle` with default shallow copy
For historical reasons, passing an `std::vector` or a pointer to
`make_ArrayHandle` does a shallow copy (i.e. `CopyFlag` defaults to `Off`).
Although more efficient, this mode is inherintly unsafe, and making it the
default is asking for trouble.
To combat this, calling `make_ArrayHandle` without a copy flag is
deprecated. In this way, if you wish to do the faster but more unsafe
creation of an `ArrayHandle` you should explicitly express that.
This requried quite a few changes through the VTK-m source (particularly in
the tests).
## Similar changes to `Field`
`vtkm::cont::Field` has a `make_Field` helper function that is similar to
`make_ArrayHandle`. It also features the ability to create fields from
`std::vector`s and C arrays. It also likewise had the same unsafe behavior
by default of not copying from the source of the arrays.
That behavior has similarly been depreciated. You now have to specify a
copy flag.
The ability to construct a `Field` from an initializer list of values has
also been added.
As a programming convenience, all `vtkm::cont::DataSet` written by
`vtkm::io::VTKDataSetWriter` were written as a structured grid. Although
technically correct, it changed the structure of the data. This meant that
if you wanted to capture data to run elsewhere, it would run as a different
data type. This was particularly frustrating if the data of that structure
was causing problems and you wanted to debug it.
Now, `VTKDataSetWriter` checks the type of the `CoordinateSystem` to
determine whether the data should be written out as `STRUCTURED_POINTS`
(i.e. a uniform grid), `RECTILINEAR_GRID`, or `STRUCTURED_GRID`
(curvilinear).
`assert` is supported on recent CUDA cards, but compiling it appears to be
very slow. By default, the `VTKM_ASSERT` macro has been disabled whenever
compiling for a CUDA device (i.e. when `__CUDA_ARCH__` is defined).
Asserts for CUDA devices can be turned back on by turning the
`VTKm_NO_ASSERT_CUDA` CMake variable off. Turning this CMake variable off
will enable assertions in CUDA kernels unless there is another reason
turning off all asserts (such as a release build).
ECP is no longer offering CI via NMC, so we can now remove the
infrastructure we used for that.
See merge request 2115 for how we are adding gitlab-ci at OLCF
de3bda373 Use deque instead of list for ArrayHandle queue
498d44548 Pass Token::Reference by value
c32c9e8e8 Fix deadlock when changing device during read
99e14ab8a Add proper enqueuing of Tokens for ArrayHandle
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !2130
c0dee7402 make it explicit that we are using 64-bit unsigned integer in bit op
34f350588 Added changelog
e9f584a91 ArrayHandleRandomUniformReal
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !2116
An issue that was identified for the thread safety of `ArrayHandle` is
that if several threads are waiting to use an `ArrayHandle`, there might
be an expectation of the order in which the operations happen. For
example, if one thread is modifying the contents of an `ArrayHandle` and
another is reading those results, we would need the first one to start
before the second one.
To solve this, a queue is added to `ArrayHandle` such that when waiting
to read or write an `ArrayHandle` the `Token` has to be at the top of
the queue in addition to other requirements being met.
Additionally, an `Enqueue` method is added to add a `Token` to the queue
without blocking. This allows a control thread to queue the access and
then spawn a thread where the actual work will be done. As long as
everything is enqueued on the main thread, the operations will happen in
the expected order.
Initially, the probe filter would simply not set a value if a sample was
outside the input `DataSet`. This is not great as the memory could be
left uninitalized and lead to unpredictable results. The testing
compared these invalid results to 0, which seemed to work but is
probably unstable.
This was partially fixed by a previous change that consolidated to
mapping of cell data with a general routine that permuted data. However,
the fix did not extend to point data in the input, and it was not
possible to specify a particular invalid value.
This change specifically updates the probe filter so that invalid values
are set to a user-specified value.
544a078cd Remove use of deprecated policies in examples
06f5119c2 Fix deprecation warning
f29a4712b Correct field types for ComputeMoments filter
a20ec03d0 Disable proxies in filter benchmark
72cd0107e Deprecate Execute with policy
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !2093
There have been several new features that were merged without
appropriate documentation in the changelogs. This adds some
new changelogs for some of these new features.
The version of `Filter::Execute` that takes a policy as an argument is now
deprecated. Filters are now able to specify their own fields and types,
which is often why you want to customize the policy for an execution. The
other reason is that you are compiling VTK-m into some other source that
uses a particular types of storage. However, there is now a mechanism in
the CMake configuration to allow you to provide a header that customizes
the "default" types used in filters. This is a much more convenient way to
compile filters for specific types.
One thing that filters were not able to do was to customize what cell sets
they allowed using. This allows filters to self-select what types of cell
sets they support (beyond simply just structured or unstructured). To
support this, the lists `SupportedCellSets`, `SupportedStructuredCellSets`,
and `SupportedUnstructuredCellSets` have been added to `Filter`. When you
apply a policy to a cell set, you now have to also provide the filter.
To simplify reproducing docker based CI workers locally, VTK-m has python program that handles all the
work automatically for you.
The program is located in `[Utilities/CI/reproduce_ci_env.py ]` and requires python3 and pyyaml.
To use the program is really easy! The following two commands will create the `build:rhel8` gitlab-ci
worker as a docker image and setup a container just as how gitlab-ci would be before the actual
compilation of VTK-m. Instead of doing the compilation, instead you will be given an interactive shell.
```
./reproduce_ci_env.py create rhel8
./reproduce_ci_env.py run rhel8
```
To compile VTK-m from the the interactive shell you would do the following:
```
> src]# cd build/
> build]# cmake --build .
```
Previously, the PolicyDefault used to compile all the filters was hard-
coded. The problem was that if any external project that depends on VTK-
m needs a different policy, it had to recompile everything in its own
translation units with a custom policy.
This change allows an external project provide a simple header file that
changes the type lists used in the default policy. That allows VTK-m to
compile the filters exactly as specified by the external project.
Currently, VTK-m is using C++11. However, it is often useful to use
features in the `std` namespace that are defined for C++14 or later. We
can provide our own versions (sometimes), but it is preferable to use
the version provided by the compiler if available.
There were already some examples of defining portable versions of C++14
and C++17 classes in a `vtkmstd` namespace, but these were sprinkled
around the source code.
There is now a top level `vtkmstd` directory and in it are header files
that provide portable versions of these future C++ classes. In each
case, preprocessor macros are used to select which version of the class
to use.
Previously, when a ReadPortal or a WritePortal was returned from an
ArrayHandle, it had wrapped in it a Token that was attached to the
ArrayHandle. This Token would prevent other reads and writes from the
ArrayHandle.
This added safety in the form of making sure that the ArrayPortal was
always valid. Unfortunately, it also made deadlocks very easy. They
happened when an ArrayPortal did not leave scope immediately after use
(which is not all that uncommon).
Now, the ArrayPortal no longer locks up the ArrayHandle. Instead, when
an access happens on the ArrayPortal, it checks to make sure that
nothing has happened to the data being accessed. If it has, a fatal
error is reported to the log.
This commit also:
- Removes a corner case not longer used at ArrayPortalGroupVecVariable::get
- Changes doc regarding the number of offset elements in the input
array handler of ConvertNumComponentsToOffsets.
- Updates invokation of make_ArrayGroupVectVariable in multiple files
- Adds its corresponding changelog entry
To get a portal to access ArrayHandle values in the control
environment, you now use the ReadPortal and WritePortal methods.
The portals returned are wrapped in an ArrayPortalToken object
so that the data between the portal and the ArrayHandle are
guaranteed to be consistent.
Also discovered that many C++ compilers have trouble giving warnings
for partial specialization of classes marked as deprecated. Fix
the problem by instead deprecating the items in the class.
The convenience functions `ArrayPortalToIteratorBegin()` and
`ArrayPortalToIteratorEnd()` wouldn't detect specializations of
`ArrayPortalToIterators<PortalType>` since the specializations aren't
visible when the `Begin`/`End` functions are declared.
Since the CUDA iterators rely on a specialization, the convenience
functions would not compile on CUDA.
Now, instead of specializing `ArrayPortalToIterators` to provide custom
iterators for a particular portal, the portal may advertise custom
iterators by defining `IteratorType`, `GetIteratorBegin()`, and
`GetIteratorEnd()`. `ArrayPortalToIterators` will detect such portals
and automatically switch to using the specialized portals.
This eliminates the need for the specializations to be visible to the
convenience functions and allows them to be usable on CUDA.
Previously we relied on CMake's compiler detection module to build the
macros for using the deprecated attribute. However, CMake created macros
for pre-C++14 versions of the feature, which do not work in all cases.
Also, we have the need to be able to suppress deprecation warnings when
we are implementing a deprecated thing. Since we have to query compilers
ourself, we might as well figure out if the deprecated attribute we want
is supported.
Worst case is that we won't support deprecation warnings everywhere we
could. That will not create incorrect code and we can always add that
later.
The `VTKM_DEPRECATED` macro allows us to remove (and usually replace)
features from VTK-m in minor releases while still following the conventions
of semantic versioning. The idea is that when we want to remove or replace
a feature, we first mark the old feature as deprecated. The old feature
will continue to work, but compilers that support it will start to issue a
warning that the use is deprecated and should stop being used. The
deprecated features should remain viable until at least the next major
version. At the next major version, deprecated features from the previous
version may be removed.
If a worklet doesn't explicitly state an ExecutionSignature, VTK-m
assumes the worklet has no return value, and each ControlSignature
argument is passed to the worklet in the same order.
For example if we had this worklet:
```cxx
struct DotProduct : public vtkm::worklet::WorkletMapField
{
using ControlSignature = void(FieldIn, FieldIn, FieldOut);
using ExecutionSignature = void(_1, _2, _3);
template <typename T, vtkm::IdComponent Size>
VTKM_EXEC void operator()(const vtkm::Vec<T, Size>& v1,
const vtkm::Vec<T, Size>& v2,
T& outValue) const
{
outValue = vtkm::Dot(v1, v2);
}
};
```
It can be simplified to be:
```cxx
struct DotProduct : public vtkm::worklet::WorkletMapField
{
using ControlSignature = void(FieldIn, FieldIn, FieldOut);
template <typename T, vtkm::IdComponent Size>
VTKM_EXEC void operator()(const vtkm::Vec<T, Size>& v1,
const vtkm::Vec<T, Size>& v2,
T& outValue) const
{
outValue = vtkm::Dot(v1, v2);
}
};
VTK-m now provides the following filters with the default policy
as part of the vtkm_filter library:
- CellAverage
- CleanGrid
- ClipWithField
- ClipWithImplicitFunction
- Contour
- ExternalFaces
- ExtractStructured
- PointAverage
- Threshold
- VectorMagnitude
By building these as a library we hope to provide faster compile
times for consumers of VTK-m when using common configurations.
05b679250 Merge branch 'master' into tangle_source
a6c044df9 added changelog
0d818701c put VTKM_SOURCE_EXPORT in .cxx
e0aae7d86 remove EXPORT from .cxx
4d7de67ee use VTKM_SOURCE_EXPORT
cd49136d5 add copyright notice, add installation of header
0c094a568 used the Tangle source
aeb8877a9 add newline at EOF, minor change based review
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1804
Consumers of VTK-m when enabling of dropping of unused functions
will see VTK-m functions dropped. Previously this didn't happen
as VTK-m didn't build object files with the correct flags for this.
By allowing the linker to remove unused symbols we see a significant
saving the file size of VTK-m tests, examples, and benchmarks.
An OpenMP build of the tests and benchmarks goes from 168MB to
141MB which is roughly a 16% filesize reduction.
Initially I had presumed that these changes would increase link times.
But in measurements the total wall time for compilation of VTK-m has
stayed about the same ( seeing a decrease of 1.5% ). Presumably the
increased computation is offset by the reduction in file writing.
Instead of having to be called for each target you can now
pass multiple targets at once. Do note that if you pass multiple
targets you will need to pass all the sources from those targets
that need to be 'device' compiled.
Rather than do a CastAndCall on all possible field types when calling a
worklet with two fields (where they all typically get cast to the same
type as the primary field), use the new mechanism with
ArrayHandleMultiplexer to create one code path.
Also update the ApplyPolicy to accept the Field type, which is used to
determine any additional storage types to support.
This is done through a new version of ApplyPolicy. This version takes
a type of the array as its first template argument, which must be
specified.
This requires having a list of potential storage to try. It will use
that to construct an ArrayHandleMultiplexer containing all potential
types. This list of storages comes from the policy. A StorageList
item was added to the policy.
Types are automatically converted. So if you ask for a vtkm::Float64 and
field contains a vtkm::Float32, it will the array wrapped in an
ArrayHandleCast to give the expected type.
Added a BaseComponentType to VecTraits that recursively finds the base
(non-Vec) type of a Vec. This is useful when dealing with potentially
nested Vec's (e.g. Vec<Vec<T, M>, N>) and you need to keep the structure
but know the base type.
Also added a couple of templates for keeping the structure but changing
the type. These are ReplaceComponentType and ReplaceBaseComponentType.
These allow you to create new Vec's with the same structure as the query
Vec but with differen component types.
This behaves just like `ScanExclusive`, but rather than returning the
total sum, it is appended to the end of the output array.
This is in preparation for the CellSetExplicit refactoring described in
issue #408.
The `MultiBlock` class has been renamed to `PartitionedDataSet`, and its API
has been refactored to refer to "partitions", rather than "blocks".
Additionally, the `AddBlocks` method has been changed to `AppendPartitions` to
more accurately reflect the operation performed. The associated
`AssignerMultiBlock` class has also been renamed to
`AssignerPartitionedDataSet`.
This change is motivated towards unifying VTK-m's data model with VTK. VTK has
started to move away from `vtkMultiBlockDataSet`, which is a hierarchical tree
of nested datasets, to `vtkPartitionedDataSet`, which is always a flat vector
of datasets used to assist geometry distribution in multi-process environments.
This simplifies traversal during processing and clarifies the intent of the
container: The component datasets are partitions for distribution, not
organizational groupings (e.g. materials).
Ref #405
By removing the ability to have multiple CellSets in a DataSet
we can simplify the following things:
- Cell Fields now don't require a CellSet name when being constructed
- Filters don't need to manage what the active cellset is
For polygon cell shapes (that are not triangles or quadrilaterals),
interpolations are done by finding the center point and creating a
triangle fan around that point. Previously, the gradient was computed in
the same way as interpolation: identifying the correct triangle and
computing the gradient for that triangle.
The problem with that approach is that makes the gradient discontinuous
at the boundaries of this implicit triangle fan. To make things worse,
this discontinuity happens right at each vertex where gradient
calculations happen frequently. This means that when you ask for the
gradient at the vertex, you might get wildly different answers based on
floating point imprecision.
Get around this problem by creating a small triangle around the point in
question, interpolating values to that triangle, and use that for the
gradient. This makes for a smoother gradient transition around these
internal boundaries.
This allows for developers to do things such as the following
as constexpr's:
```cxx
constexpr vtkm::Id2 dims(16,16);
constexpr vtkm::Float64 dx = vtkm::Float64(4.0 * vtkm::Pi()) / vtkm::Float64(dims[0] - 1);
```
Perhaps a better title for this change would be "Make the Threshold
filter not totally useless."
A long standing issue with the Threshold filter is that its output
CellSet was stored in a CellSetPermutation. This made Threshold hyper-
efficient because it required hardly any data movement to implement.
However, the problem was that any other unit that had to use the CellSet
failed. To have VTK-m handle that output correctly in other filters and
writers, they all would have to check for the existance of
CellSetPermutation. And CellSetPermutation is templated on the CellSet
type it is permuting, so all units would have to compile special cases
for all these combinations. This is not likely to be feasible in any
static solution.
The simple solution, implemented here, is to deep copy the cells to a
CellSetExplicit, which is a known type that is already used everywhere
in VTK-m. The solution is a bit disappointing since it requires more
memory and time to build. But it is on par with solutions in other
libraries (like VTK). And it really does not matter how efficient the
old solution was if it was useless.
3a7d3cb5f VTK-m filters now launch all worklets via a vtkm::cont::Invoker
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1760
The `From` and `To` nomenclature for topology mapping has been confusing for
both users and developers, especially at lower levels where the intention of
mapping attributes from one element to another is easily conflated with the
concept of mapping indices (which maps in the exact opposite direction).
These identifiers have been renamed to `VisitTopology` and `IncidentTopology`
to clarify the direction of the mapping. The order in which these template
parameters are specified for `WorkletMapTopology` have also been reversed,
since eventually there may be more than one `IncidentTopology`, and having
`IncidentTopology` at the end will allow us to replace it with a variadic
template parameter pack in the future.
Other implementation details supporting these worklets, include `Fetch` tags,
`Connectivity` classes, and methods on the various `CellSet` classes (such as
`PrepareForInput` have also reversed their template arguments. These will need
to be cautiously updated.
The convenience implementations of `WorkletMapTopology` have been renamed for
clarity as follows:
```
WorkletMapPointToCell --> WorkletVisitCellsWithPoints
WorkletMapCellToPoint --> WorkletVisitPointsWithCells
```
The `ControlSignature` tags have been renamed as follows:
```
FieldInTo --> FieldInVisit
FieldInFrom --> FieldInMap
FromCount --> IncidentElementCount
FromIndices --> IncidentElementIndices
```
The Invoker is a control side object that handles the construction
of the relevant worklet dispatcher. Moving it to control makes it
obvious that it isn't an algorithm itself but a way to launch
worklets.
There was a special case for ArrayHandleMultiplexer where if you gave it
just one type it would treat that as a value type rather than an array
to support and instead provide a default list of types. However, GCC 4.8
is having trouble compiling the code to create the default list, the
semantics are confusing, and the more I think about it the less likely I
think we will need this functionality. So, just getting rid of that.
6592e5288 Fix IsWritableArrayHandle for portals that exist but cannot be written
0e15a1116 Enable writing to ArrayHandleCast
6d37ce945 Remove invalid PortalType
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1731
Previously, `ArrayHandleCast` was considered a read-only array handle.
However, it is trivial to reverse the cast (now that `ArrayHandleTransform`
supports an inverse transform). So now you can write to a cast array
(assuming the underlying array is writable).
One trivial consequence of this change is that you can no longer make a
cast that cannot be reversed. For example, it was possible to cast a simple
scalar to a `Vec` even though it is not possible to convert a `Vec` to a
scalar value. This was of dubious correctness (it is more of a construction
than a cast) and is easy to recreate with `ArrayHandleTransform`.
The ScopedRuntimeDeviceTracker now can force, enable, or disable
devices. Additionally the ScopedRuntimeDeviceTracker and the
RuntimeDeviceTracker handle the DeviceAdapterTagAny robustly
across all methods.
As the RuntimeDeviceTracker is a per thread construct we now make
it explicit that you can only get a reference to the per-thread
version and can't copy it.
VTK-m now offers a more GPU aware set of defaults for kernel scheduling.
When VTK-m first launches a kernel we do system introspection and determine
what GPU's are on the machine and than match this information to a preset
table of values. The implementation is designed in a way that allows for
VTK-m to offer both specific presets for a given GPU ( V100 ) or for
an entire generation of cards ( Pascal ).
Currently VTK-m offers preset tables for the following GPU's:
- Tesla V100
- Tesla P100
If the hardware doesn't match a specific GPU card we than try to find the
nearest know hardware generation and use those defaults. Currently we offer
defaults for
- Older than Pascal Hardware
- Pascal Hardware
- Volta+ Hardware
Some users have workloads that don't align with the defaults provided by
VTK-m. When that is the cause, it is possible to override the defaults
by binding a custom function to `vtkm::cont::cuda::InitScheduleParameters`.
As shown below:
```cpp
ScheduleParameters CustomScheduleValues(char const* name,
int major,
int minor,
int multiProcessorCount,
int maxThreadsPerMultiProcessor,
int maxThreadsPerBlock)
{
ScheduleParameters params {
64 * multiProcessorCount, //1d blocks
64, //1d threads per block
64 * multiProcessorCount, //2d blocks
{ 8, 8, 1 }, //2d threads per block
64 * multiProcessorCount, //3d blocks
{ 4, 4, 4 } }; //3d threads per block
return params;
}
vtkm::cont::cuda::InitScheduleParameters(&CustomScheduleValues);
```
661fb64de AtomicInterfaceControl functions are marked with VTKM_SUPPRESS_EXEC_WARNINGS
0c70f9b9a Add BitFieldIn/Out/InOut worklet signature tags.
a66510e81 Add ArrayHandleBitField, a boolean-valued AH backed by a BitField.
56cc5c3d3 Add support for BitFields.
d01b97382 Allow VTKM_SUPPRESS_EXEC_WARNINGS to be used inside macros.
2f2ca9370 Add bit operations FindFirstSetBit and CountSetBits to Math.h.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1629
BitFields are:
- Stored in memory using a contiguous buffer of bits.
- Accessible via portals, a la ArrayHandle.
- Portals operate on individual bits or words.
- Operations may be atomic for safe use from concurrent kernels.
The new BitFieldToUnorderedSet device algorithm produces an ArrayHandle
containing the indices of all set bits, in no particular order.
The new AtomicInterface classes provide an abstraction into bitwise
atomic operations across control and execution environments and are used
to implement the BitPortals.
When reducing an input type that differs from the output type
you need to write a custom binary operator that also implements
how to do the unary transformation.
566e220ea Suppress dashboard warnings
f20d7e788 Document the changes that are part of this MR.
f78e763be Add CellLocatorGeneral
c6bead838 Rename CellLocatorTwoLevelUniformGrid to CellLocatorUniformBins
ee838b829 Stylistic changes to CellLocators to match VTK-m
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1615
This adds an ExecutionSignature tag named Device that passes the
DeviceAdapterTag as an argument to the worklet's operator(). This allows
worklets to specialize their code based on the device.
Previously we just took the optionparser.h file and stuck it right in
our source code. That was problematic for a variety of reasons.
1. It incorrectly assigned our license to external code.
2. It made lots of unnecessary changes to the original source (like
reformatting).
3. It made it near impossible to track patches we make and updates to
the original software.
Instead, use the third-party system to track changes to optionparser.h
in a different repository and then pull that into ours.
When a library requires reading some command line arguments through a
function like Initialize, it is typical that it will parse through
arguments it supports and then remove those arguments from argc and argv
so that the remaining arguments can be parsed by the calling program.
VTK-m's initialize did not do that, so add that functionality.
The RuntimeDeviceTracker had grown organically to handle multiple
different roles inside VTK-m. Now that we have device tags
that can be passed around at runtime, large portions of
the RuntimeDeviceTracker API aren't needed.
Additionally the RuntimeDeviceTracker had a dependency on knowing
the names of each device, and this wasn't possible
as that information was part of its self. Now we have moved that
information into RuntimeDeviceInformation and have broken
the recursion.
e1f5c4dd9 Modify VariantAH::AsVirtual to cast to new ValueType if needed.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1585
E.g:
```
ArrayHandle<Float64> doubleArray;
VariantArrayHandle varHandle{doubleArray};
ArrayHandleVirtual<Float32> = varHandle.AsVirtual<Float32>();
```
If there is a loss in range and/or precision, a warning is logged. If
the ValueTypes are Vecs with mismatched widths, an ErrorBadType is thrown.
Internally, an ArrayHandleCast is used between the VariantArrayHandle's
stored array and the ArrayHandleVirtual.
These changes caused some warnings in clang to show up based on virtual
methods in other cell locators. Hence, the rest of the cell locators
have also had some of their code moved to vtkm_cont.
All of the methods in CellLocatorBoundingIntervalHierarchy were listed in
header files. This is sometimes problematic with virtual methods. Since
everything implemented in it can just be embedded in a library, move the
code into the vtkm_cont library.
Previously, ArrayHandleVirtual was defined as a specialization of
ArrayHandle with the virtual storage tag. This was because the storage
object was polymorphic and needed to be handled special. These changes
moved the existing storage definition to an internal class, and then
managed the pointer to that implementation class in a Storage object
that can be managed like any other storage object.
Also moved the implementation of StorageAny into the implementation of
the internal storage object.
VTK-m has been updated to replace old per device benchmark executables with a device
dependent shared library so that it's able to accept a device adapter at runtime through
the "--device=" argument.
Mask objects allow you to specify which output values should be
generated when a worklet is run. That is, the Mask allows you to skip
the invocation of a worklet for any number of outputs.
The ArrayPortalValueReference is supposed to behave just like the value
it encapsulates and does so by automatically converting to the base type
when necessary. However, when it is possible to convert that to
something else, it is possible to get errors about ambiguous overloads.
To avoid these, add specialized versions of the operators to specify
which ones should be used.
Also consolidated the CUDA version of an ArrayPortalValueReference to the
standard one. The two implementations were equivalent and we would like
changes to apply to both.
The timer class now is asynchronous and device independent. it's using an
similiar API as vtkOpenGLRenderTimer with Start(), Stop(), Reset(), Ready(),
and GetElapsedTime() function. For convenience and backward compability, Each
Start() function call will call Reset() internally and each GetElapsedTime()
function call will call Stop() function if it hasn't been called yet for keeping
backward compatibility purpose.
Bascially it can be used in two modes:
* Create a Timer without any device info. vtkm::cont::Timer time;
* It would enable timers for all enabled devices on the machine. Users can get a
specific elapsed time by passing a device id into the GetElapsedtime function.
If no device is provided, it would pick the maximum of all timer results - the
logic behind this decision is that if cuda is disabled, openmp, serial and tbb
roughly give the same results; if cuda is enabled it's safe to return the
maximum elapsed time since users are more interested in the device execution
time rather than the kernal launch time. The Ready function can be handy here
to query the status of the timer.
* Create a Timer with a device id. vtkm::cont::Timer time((vtkm::cont::DeviceAdapterTagCuda()));
* It works as the old timer that times for a specific device id.
The kernel launch component of the runtime device adapter is fairly
pointless. If the hardware supports CUDA we should expect that
VTK-m has the correct kernel versions.
Plus in the original version if the CUDA device was being used
and the kernel launch returns cudaErrorDevicesUnavailable it
was never possible to restore CUDA support. Now what happens
is that the runtime tracker is marked as failed, but the
calling code can always go back and trying the device again.
When you call VariantArrayHandle::CastAndCall, it now tries both basic
storage and virtual storage. You can modify the types of storages tried
by giving a type list of storage tags as the first argument.
The script fixed up most of the issues. However, there were some
instances that the script was not able to pick up on. There were
also some instances that still needed a means to select types.
Previously you had to exactly match the case of a device adapter's name to
construct it, which was a source of lots of problems ( OpenMP versus OPENMP, CUDA or Cuda ).
Now `vtkm::cont::make_DeviceAdapterId` and `vtkm::cont::RuntimeDeviceTracker` support
case-insensitive device construction.
Also
- Renamed vtkm::cont::make_DeviceAdapterIdFromName to just overload
make_DeviceAdapterId.
- Refactored CMake logic for unit tests
- Since we're now querying the device tracker for the names, they
cannot be all caps.
- Updated usages of InitLogging to use Initialize instead.
- Added changelog.
VTK-m has been updated to replace old per device worklet testing executables with a device
dependent shared library so that it's able to accept a device adapter
at runtime.
Meanwhile, it updates the testing infrastructure APIs. vtkm::cont::testing::Run
function would call ForceDevice when needed and if users need the device
adapter info at runtime, RunOnDevice function would pass the adapter into the functor.
Optional Parser is bumped from 1.3 to 1.7.
By making RuntimeDeviceInformation class template independent, vtkm is
able to detect
device info at runtime with a runtime specified deviceId. In the past
it's impossible
because the CRTP pattern does not allow function overloading(compiler
would complain
that DeviceAdapterRuntimeDetector does not have Exists() function
defined).
It's a filter that Split sharp manifold edges where the feature angle
between the adjacent surfaces are larger than the threshold value.
When an edge is split, it would add a new point to the coordinates
and update the connectivity of an adjacent surface.
Ex. there are two adjacent triangles(0,1,2) and (2,1,3). Edge (1,2) needs
to be split. Two new points 4(duplication of point 1) an 5(duplication of point 2)
would be added and the later triangle's connectivity would be changed
to (5,4,3).
By default, all old point's fields would be copied to the new point.
Use with caution.
The VTKM_TEST_ASSERT macro is a very useful tool for performing checks
in tests. However, it is rather annoying to have to always specify a
message for the assert. Often the failure is self evident from the
condition (which is already printed out), and specifying a message is
both repetative and annoying.
Also, it is often equally annoying to print out additional information
in the case of an assertion failure. In that case, you have to either
attach a debugger or add a printf, see the problem, and remove the
printf.
This change solves both of these problems. VTKM_TEST_ASSERT now takes a
condition and a variable number of message arguments. If no message
arguments are given, then a default message (along with the condition)
are output. If multiple message arguments are given, they are appended
together in the result. The messages do not have to be strings. Any
object that can be sent to a stream will be printed correctly. This
allows you to print out the values that caused the issue.
This is a subclass of ExecutionObject and a superset of its
functionality. In addition to having a PrepareForExecution method, it
also has a PrepareForControl method that gets an object appropriate for
the control environment. This is helpful for situations where you need
code to work in both environments, such as the functor in an
ArrayHandleTransform.
Also added several runtime checks for execution objects and execution
and cotnrol objects.
This change allows you to set a subclass of
vtkm::cont::ExecutionObjectBase as a functor
used in ArrayHandleTransform. This latter class will then detect that
the functor is an ExecObject and will call PrepareForExecution with the
appropriate device to get the actual Functor object.
This change allows you to use virtual objects and other device dependent
objects as functors for ArrayHandleTransform without knowing a priori
what device the portal will be used on.
Rather than force all dispatchers to be templated on a device adapter,
instead use a TryExecute internally within the invoke to select a device
adapter.
Because this removes the need to declare a device when invoking a
worklet, this commit also removes the need to declare a device in
several other areas of the code.
59c8bd28a vtkm::cont::Algorithm now can be told which device to run on at runtime
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1365
084b4d760 Document some of the VTK-m coding style such as single line loops
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1366
e34301eca Allow disabling/enabling of CUDA managed memory via an env variable
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1359
By setting the environment variable "VTKM_MANAGEDMEMO_DISABLED" to be 1,
users are able to disable CUDA managed memory even though the hardware is
capable of doing so.
The auto code formatting doesn't capture the entire VTK-m style
and developers should know the style before they open a merge request
and get a billion warnings.
The changelog is updated to replace initializer list changes with
variadic constructors and constexpr changes.
On OSX and Linux:
| Type |Constructed with| Compile time | Run Time
|-----|---|---|---|
|scalar(1~4) | () | Yes | |
|scalar(1~4) | {} | Yes | |
|scalar(>4) | () | Yes | |
|scalar(>4) | {} | | Yes |
|nested type(1~4) | () | Yes | |
|nested type(1~4) | {} | Yes | |
|nested type(>4) | () | ERROR | ERROR|
|nested type(>4) | {} | | Yes |
Only on windows with a compiler older than Visual Studio 2017
version 15.0 is vec with size>4 constructed at compile time.
The added files provide support for Lagrangian analysis of velocity fields of time-varying data. Examples show how to use the filter to generate data and a second example demonstrates consuming generated information to calculate new particle trajectories.
554bc3d36 At runtime TryExecute supports a specific deviceId to execute on.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !1334
std::random_shuffle is deprecated in C++14 because it's using std::rand
which uses a non uniform distribution and the underlying algorithm is
unspecified. Using std::shuffle can provide a reliable result in a 64
bit version.
It will reduce the cost of getting the thread runtime device tracker,
and will have a better runtime overhead if user constructs a lot of
short lived threads that use VTK-m.
The oscillator is a simple analytical source of time-varying data.
It provides a function value at each point that is computed as a
sum of Gaussian kernels -- each with a specified position, amplitude,
frequency, and phase.
This commit adds a worklet as well as a filter that modify point coordinates by moving points
along point normals by the scalar amount times the scalar factor.
It's a simpified version of the vtkWarpScalar class in VTK. Additionally the filter doesn't
modify the point coordinates, but creates a new point coordinates that have been warped.
This commit adds a worklet as well as a filter that modify point coordinates by moving points
along a vector by a certain scalar amount.
It's a simpified version of the vtkWarpVector in VTK.
It doesn't modify the point coordinates, but creates a new point coordinates that have been warped.
The original design of invoke and the transport infrastructure
relied on the implementation behavior of vtkm::cont types
such as ArrayHandle that used an internal shared_ptr to managed
state. This allowed passing by value instead of passing by
non-const ref when needing to transfer information to the device.
As VTK-m adds support for classes that use virtuals the ability
to pass by base pointer type allows for us to invoke worklets
using a base type without the risk of type slicing.
Additional by moving over to a non-const ref Invocation we
can update all transports that have 'output' to now be
by ref and therefore support types that can't be copied while
being 'more' correct.
Most of the arguments given to device adapter algorithms are actually
control-side arguments that get converted to execution objects internally
(usually a `vtkm::cont::ArrayHandle`). However, some of the algorithms,
take an argument that is passed directly to the execution environment, such
as the predicate argument of `Sort`. If the argument is a plain-old-data
(POD) type, which is common enough, then you can just pass the object
straight through. However, if the object has any special elements that have
to be transferred to the execution environment, such as internal arrays,
passing this to the `vtkm::cont::Algorithm` functions becomes
problematic.
To cover this use case, all the `vtkm::cont::Algorithm` functions now
support automatically transferring objects that support the `ExecObject`
worklet convention. If any argument to any of the `vtkm::cont::Algorithm`
functions inherits from `vtkm::cont::ExecutionObjectBase`, then the
`PrepareForExecution` method is called with the device the algorithm is
running on, which allows these device-specific objects to be used without
the hassle of creating a `TryExecute`.
1. Have a per-thread pinned array for cuda errors
2. Check for errors before scheduling new tasks and at explicit sync points
3. Remove explicit synchronizations from most places
Addresses part 2 of #168
880d8a989 Add `vtkm/Geometry.h` and test it.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1262
This commit adds several geometric constructs to vtk-m
in the `vtkm/Geometry.h` header. They may be used from
both the execution and control environments.
We also add methods to perform projection and Gram-Schmidt
orthonormalization to `vtkm/VectorAnalysis.h`.
See `docs/changelog/geometry.md` included in this commit
for more information.
Found via `codespell` and `grep`
more typos
includes source typo change and a typo that needs further review
follow-up typos
Follow-up typos
Revert a commit
157894436 Enable Vec construction with intializer_list of single value
79afe2a16 Simplify make_ArrayHandleSwizzle
ae8d994d2 Add support for initializer lists in Vec
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1239
Previously, the initilaizer_list of Vec had to be the exact
same number of values as the size of the Vec. Now, we also
accept a single value that is duplicated for all values in
the Vec. That allows you to more easily construct lists
of lists with repeated values.
Calling std::swap isn't legal from CUDA code, but the new vtkm::Swap
method is safe. It currently does a naive swap when compiling CUDA
code, and falls back to an ADL swap
183bcf109 Add initial version of an OpenMP backend.
7b5ad3e80 Expand device scheduler test to check for overlap.
e621b6ba3 Generalize the TBB radix sort implementation.
d60278434 Specialize swap for ArrayPortalValueReference types.
761f8986f Cache inputs to SSI::Unique benchmark.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1099
Add constructors to the `vtkm::Vec` classes that accept
`std::initializer_list`. The main advantage of this addition is that it
makes it much easier to initialize `Vec`s of arbitrary length.
245d27bd Add changelog documents for major changes I made to VTK-m since 1.2
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Allison Vacanti <allison.vacanti@kitware.com>
Merge-request: !1219
This commit adds a worklet and a filter to compute
signed integral measures of cells.
It also begins the practic of adding a markdown
changelog entry in `doc/changelog`. See the
changelog entry for details.
Reformat to look correct in 100 column display.
Update the copyright.
Remove and update some outdated specifications.
Remove description of formatting guidelines. There are done
automatically now.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.