Somehow deducing the parameters to a static function was causing
gcc to segfault and crash. Changed the code to use a free function,
and everything now works properly.
Previously, when a Worklet needed a scatter, the scatter object was
stored in the Worklet object. That was problematic because that means
the Scatter, which is a control object, was shoved into the execution
environment.
To prevent that, move the Scatter into the Dispatcher object. The
worklet still declares a ScatterType alias, but no longer has a
GetScatter method. Instead, the Dispatcher now takes a Scatter object in
its constructor. If using the default scatter (ScatterIdentity), the
default constructor is used. If using another type of Scatter that
requires data to set up its state, then the caller of the worklet needs
to provide that to the dispatcher. For convenience, worklets are
encouraged to have a MakeScatter method to help construct a proper
scatter object.
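The arrangement described above can be sketched in standalone C++. This
is only an illustration: DispatcherSketch, ScatterRepeat, NegateWorklet,
and RepeatWorklet are hypothetical stand-ins, not the VTK-m classes, and
the scatter interface (OutputSize, OutputToInput) is an assumption made
for the sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the VTK-m classes.
struct ScatterIdentity
{
  std::size_t OutputSize(std::size_t inputSize) const { return inputSize; }
  std::size_t OutputToInput(std::size_t outputIndex) const { return outputIndex; }
};

// A scatter with state: every input value produces Count outputs.
struct ScatterRepeat
{
  explicit ScatterRepeat(std::size_t count) : Count(count) {}
  std::size_t OutputSize(std::size_t inputSize) const { return inputSize * Count; }
  std::size_t OutputToInput(std::size_t outputIndex) const { return outputIndex / Count; }
  std::size_t Count;
};

// Worklets only name their scatter type; they hold no scatter object.
struct NegateWorklet
{
  using ScatterType = ScatterIdentity;
  int operator()(int value) const { return -value; }
};

struct RepeatWorklet
{
  using ScatterType = ScatterRepeat;
  // Convenience helper so callers need not know how to build the scatter.
  static ScatterType MakeScatter(std::size_t count) { return ScatterType(count); }
  int operator()(int value) const { return value; }
};

// The dispatcher (control side) owns the scatter.
template <typename WorkletType>
class DispatcherSketch
{
public:
  using ScatterType = typename WorkletType::ScatterType;

  // Usable only when ScatterType is default constructible.
  DispatcherSketch() = default;
  explicit DispatcherSketch(const ScatterType& scatter) : Scatter(scatter) {}

  std::vector<int> Invoke(const std::vector<int>& input) const
  {
    std::vector<int> output(this->Scatter.OutputSize(input.size()));
    for (std::size_t i = 0; i < output.size(); ++i)
    {
      output[i] = this->Worklet(input[this->Scatter.OutputToInput(i)]);
    }
    return output;
  }

private:
  ScatterType Scatter;
  WorkletType Worklet;
};
```

With these stand-ins, `DispatcherSketch<NegateWorklet>` can be default
constructed, while `DispatcherSketch<RepeatWorklet>` has to be given
the result of `RepeatWorklet::MakeScatter(n)`.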
`vtkm_unit_tests` now supports an MPI option that can be used to add
tests that run with MPI. Adding `UnitTestFieldRangeGlobalCompute` to test
global ranges for fields.
Removing MultiBlock::GetGlobalRange API to keep things consistent with
DataSet API. Instead, one should use `FieldRangeCompute` or
`FieldRangeGlobalCompute` as appropriate.
Previously memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. By extending the ArrayHandle constructors
to support a Storage object that is being moved, we can clearly express that
the ArrayHandle now owns memory it didn't allocate.
Here is an example of how this is done:
```cpp
T* buffer = new T[100];
auto user_free_function = [](void* ptr) { delete[] static_cast<T*>(ptr); };
vtkm::cont::internal::Storage<T, vtkm::cont::StorageTagBasic>
storage(buffer, 100, user_free_function);
vtkm::cont::ArrayHandle<T> arrayHandle(std::move(storage));
```
This fixes the three following issues with StorageBasic.
1. Memory that was allocated by VTK-m and stolen by the user needed the
proper free function called, which is generally StorageBasicAllocator::deallocate.
But that was hard for the user to hold onto, so now we provide a function
pointer to the correct free function.
2. Memory that was allocated outside of VTK-m was impossible to transfer to
VTK-m as we didn't know how to free it. This is now resolved by allowing the
user to specify a free function to be called on release.
3. When the CUDA backend allocates memory for an ArrayHandle that has no
control representation, and the location we are running on supports concurrent
managed access, we want to use that CUDA managed memory as the host memory too.
This requires that StorageBasic be able to call an arbitrary new/delete function
that is chosen at runtime.
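The ownership idea behind these three points can be sketched without
VTK-m. OwnedBufferSketch below is a hypothetical class, not the real
Storage implementation: it pairs a pointer with a free function chosen
at runtime, transfers ownership on move, and exposes the free function
so that whoever takes the memory back knows how to release it.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical owning buffer for illustration; not the VTK-m Storage class.
class OwnedBufferSketch
{
public:
  using DeleteFunction = void (*)(void*);

  OwnedBufferSketch(void* data, std::size_t size, DeleteFunction deleter)
    : Data(data), Size(size), Deleter(deleter)
  {
  }

  // Moving transfers ownership; the source is left empty.
  OwnedBufferSketch(OwnedBufferSketch&& other) noexcept
    : Data(other.Data), Size(other.Size), Deleter(other.Deleter)
  {
    other.Data = nullptr;
    other.Size = 0;
  }

  OwnedBufferSketch(const OwnedBufferSketch&) = delete;
  OwnedBufferSketch& operator=(const OwnedBufferSketch&) = delete;
  OwnedBufferSketch& operator=(OwnedBufferSketch&&) = delete;

  // The free function supplied at construction is called exactly once.
  ~OwnedBufferSketch()
  {
    if (this->Data != nullptr && this->Deleter != nullptr)
    {
      this->Deleter(this->Data);
    }
  }

  void* GetData() const { return this->Data; }
  std::size_t GetSize() const { return this->Size; }
  // Exposed so that code taking the memory back knows how to free it.
  DeleteFunction GetDeleteFunction() const { return this->Deleter; }

private:
  void* Data;
  std::size_t Size;
  DeleteFunction Deleter;
};
```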
Changed the "default" ColorTable preset from "cool to warm" to
"viridis." Also made a default constructor for ColorTable that sets it
to this default preset.
The main reason to change to viridis for the default is that it is in
LAB space. We are concerned that a default ColorTable preset in the
diverging color space could lead users to use that color space
inappropriately.
The problem is that there is no good "default" constructor for
ColorTable. The previous default constructor created an empty color
table, but that would be confusing if someone actually tried to use it.
We could set it to the default preset, but the default preset uses the
diverging color map, which could foul people up if they actually want to
edit or create their own color map. Instead, force the declaration of
ColorTable to indicate what you plan to do with it.
The new and improved vtkm::cont::ColorTable provides a more feature complete
color table implementation that is modeled after
vtkDiscretizableColorTransferFunction. This class therefore supports different
color spaces ( rgb, lab, hsv, diverging ) and supports execution across all
device adapters.
DIY now depends on MPI optionally. Hence we no longer need to depend on
DIY optionally based on whether MPI was enabled. Update cmake and c++
code to always use DIY-based components.
DIY is built with MPI support if VTKm_ENABLE_MPI is ON.
By hard coding the PrepareForDevice to know about all the different VTK-m
devices, we can have a single base class do the execution allocation, and not
have that logic repeated in each child class.
When using vtkm::dot on narrow types, the values easily roll over.
Instead, the result type of vtkm::dot should be wide enough (32 bits)
to store the result when this occurs.
Fixes #193
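A minimal sketch of the widening idea, assuming a trait that maps 8-bit
and 16-bit integers to a 32-bit result type. DotResult and Dot are
hypothetical names used only for this illustration; they are not the
vtkm::dot implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical trait: choose a result type wide enough for sums of products.
template <typename T>
struct DotResult
{
  using type = T;
};
template <>
struct DotResult<std::uint8_t>
{
  using type = std::uint32_t;
};
template <>
struct DotResult<std::int8_t>
{
  using type = std::int32_t;
};
template <>
struct DotResult<std::uint16_t>
{
  using type = std::uint32_t;
};
template <>
struct DotResult<std::int16_t>
{
  using type = std::int32_t;
};

template <typename T, std::size_t N>
typename DotResult<T>::type Dot(const T (&a)[N], const T (&b)[N])
{
  using ResultType = typename DotResult<T>::type;
  ResultType sum = ResultType(0);
  for (std::size_t i = 0; i < N; ++i)
  {
    // Widen before multiplying so the product is not truncated to T.
    sum = static_cast<ResultType>(sum + static_cast<ResultType>(a[i]) * static_cast<ResultType>(b[i]));
  }
  return sum;
}
```

For example, the dot product of {200, 200, 200} and {2, 2, 2} stored as
8-bit unsigned integers is 1200, which does not fit in 8 bits but is
returned intact as a 32-bit value.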
1. Add option to copy user supplied array in make_ArrayHandle.
2. Replace Field constructors that take user supplied arrays with make_Field.
3. Replace CoordinateSystem constructors that take user supplied arrays with
make_CoordinateSystem.
Updating MultiBlock to use `diy` for computing block summaries like
ranges, bounds, etc. This makes it possible for MultiBlock to
work in distributed operations without explicit logic.
Previously we allowed a const ref because we would make a copy. This only
works because it relies on RuntimeDeviceTracker implementing state through a
shared_ptr. If we instead require modifiable types only, we can make TryExecute
more efficient and clearer about what it does.
By using perfect forwarding we can reduce not only the amount of TryExecute
signatures, but we can enable the ability to pass temporary functors to
TryExecute.
At the same time we have optimized TryExecute by moving the string generation
code into a single function that is compiled into the vtkm_cont library.
The end result is that the vtkm_rendering library size has been reduced from
12MB to 11MB, and we shave off about 5% of our build time.
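The forwarding-reference signature mentioned above can be illustrated
with a small standalone sketch. TryExecuteSketch, DeviceSerial, and
CountingFunctor are hypothetical names; the real TryExecute also
consults a device list and tracker, which is left out here.

```cpp
#include <utility>

// Hypothetical stand-in for a device adapter tag.
struct DeviceSerial
{
};

// One signature with forwarding references accepts lvalue functors (whose
// state the caller can inspect afterwards) and temporary functors alike.
template <typename Functor, typename... Args>
bool TryExecuteSketch(Functor&& functor, Args&&... args)
{
  // Only one device exists in this sketch; extra arguments are forwarded
  // to the functor unchanged.
  return functor(DeviceSerial{}, std::forward<Args>(args)...);
}

// A functor with modifiable state. No copy is made when it is passed as an
// lvalue, so the caller sees the updated count.
struct CountingFunctor
{
  int Calls = 0;

  bool operator()(DeviceSerial, int value)
  {
    ++this->Calls;
    return value > 0;
  }
};
```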
The implementation of ScanExclusiveByKey in
DeviceAdapterAlgorithmGeneral works by shifting values in the input
values array and then calling ScanInclusiveByKey. However, the temporary
shifted values array was created using the key type instead of the
values type. This caused a compile error when the keys and values had
different types.
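The shift-then-inclusive-scan approach can be sketched serially with
std::vector. This is an illustration under simplifying assumptions (sum
as the only operator, keys already grouped), not the
DeviceAdapterAlgorithmGeneral code. Note that the temporary is declared
with the value type.

```cpp
#include <cstddef>
#include <vector>

template <typename KeyType, typename ValueType>
std::vector<ValueType> ScanInclusiveByKey(const std::vector<KeyType>& keys,
                                          const std::vector<ValueType>& values)
{
  std::vector<ValueType> out(values.size());
  for (std::size_t i = 0; i < values.size(); ++i)
  {
    // A new segment starts whenever the key changes.
    const bool startsSegment = (i == 0) || !(keys[i] == keys[i - 1]);
    out[i] = startsSegment ? values[i] : static_cast<ValueType>(out[i - 1] + values[i]);
  }
  return out;
}

template <typename KeyType, typename ValueType>
std::vector<ValueType> ScanExclusiveByKey(const std::vector<KeyType>& keys,
                                          const std::vector<ValueType>& values,
                                          const ValueType& initial)
{
  // The temporary holds values, so its element type is ValueType, not KeyType.
  std::vector<ValueType> shifted(values.size());
  for (std::size_t i = 0; i < values.size(); ++i)
  {
    const bool startsSegment = (i == 0) || !(keys[i] == keys[i - 1]);
    // Shift each segment right by one and seed it with the initial value.
    shifted[i] = startsSegment ? initial : values[i - 1];
  }
  return ScanInclusiveByKey(keys, shifted);
}
```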
For std::copy to optimize a copy to memcpy, the value type must be both
trivially constructible and trivially copyable.
The new copy benchmarks highlighted an issue: std::copy of Pairs and
Vecs was not being optimized to memcpy. For a 256 MiB buffer on my
laptop with GCC, the serial copy speeds were:
UInt8:                 10.10 GiB/s
Vec<UInt8, 2>:          3.12 GiB/s
Pair<UInt32, Float32>:  6.92 GiB/s
After this patch, the optimization occurs and a bitwise copy occurs:
UInt8:                 10.12 GiB/s
Vec<UInt8, 2>:          9.66 GiB/s
Pair<UInt32, Float32>:  9.88 GiB/s
Checks were also added to the Vec and Pair unit tests to ensure that
these classes continue to be trivial.
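A check of this kind can be written with the standard type traits. The
sketch below uses a hypothetical PairSketch type rather than vtkm::Pair;
the point is that defaulted special members keep a type trivial, while a
user-provided default constructor does not.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical Pair-like type. Defaulting the default constructor (instead
// of writing an empty body) keeps the type trivial.
template <typename T1, typename T2>
struct PairSketch
{
  T1 first;
  T2 second;

  PairSketch() = default;
  PairSketch(const T1& f, const T2& s) : first(f), second(s) {}
};

// A user-provided default constructor is enough to lose trivial default
// constructibility, even though the type stays trivially copyable.
struct NonTrivialSketch
{
  std::uint8_t value;
  NonTrivialSketch() : value(0) {}
};

using TestPair = PairSketch<std::uint32_t, float>;

static_assert(std::is_trivially_copyable<TestPair>::value, "PairSketch should be trivially copyable");
static_assert(std::is_trivially_default_constructible<TestPair>::value,
              "PairSketch should be trivially default constructible");
static_assert(std::is_trivial<TestPair>::value, "PairSketch should be trivial");
static_assert(!std::is_trivially_default_constructible<NonTrivialSketch>::value,
              "a user-provided default constructor is not trivial");
```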
The ArrayHandleSwizzle test was refactored a bit to eliminate a new
'possibly uninitialized memory' warning introduced with the default
Vec ctors.
In generic code, it's a pain to use the equality operators since they
require the ValueType and Storage to match, else the operator is undefined.
This commit adds operators for such comparisons, as well as a unit test.
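One plausible shape for such operators is sketched below with a
hypothetical HandleSketch class. It assumes that handles of different
ValueType or Storage simply compare unequal, and that handles of the
same type are equal when they share a buffer; the actual semantics in
ArrayHandle may differ.

```cpp
#include <memory>
#include <vector>

struct StorageTagBasicSketch
{
};
struct StorageTagOtherSketch
{
};

// Hypothetical handle with shared state; not vtkm::cont::ArrayHandle.
template <typename T, typename StorageTag = StorageTagBasicSketch>
class HandleSketch
{
public:
  HandleSketch() : Data(std::make_shared<std::vector<T>>()) {}

  // Same types: equal when both handles share the same underlying buffer.
  bool operator==(const HandleSketch& rhs) const { return this->Data == rhs.Data; }
  bool operator!=(const HandleSketch& rhs) const { return !(*this == rhs); }

  // Different value type and/or storage: assumed never to be the same array.
  template <typename T2, typename S2>
  bool operator==(const HandleSketch<T2, S2>&) const
  {
    return false;
  }
  template <typename T2, typename S2>
  bool operator!=(const HandleSketch<T2, S2>&) const
  {
    return true;
  }

private:
  std::shared_ptr<std::vector<T>> Data;
};
```

Generic code can then write `a == b` for any two handles without first
checking that their template arguments match.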
8fabece1 Use median point from cluster as representative vertex.
c7bf0c95 Compute PointIdMap while reducing cluster ids.
5dee7c6a Select input point from cluster rather than averaging.
28e76ddb Update vertex clustering benchmarking code.
e3c9e7bb Optimize cell map computation.
d7669650 Use requested grid in VertexClustering worklet.
0472dc11 Fix warning on Cuda.
3f4e17e2 Add field mapping to VertexClustering.
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !960
The idea of the test was to turn off the "default" storage to ensure
that the fancy array was not making assumptions about the storage of its
delegate array. But there is lots of code elsewhere that uses the
default storage (rightly so) to create intermediate arrays, which will
fail if you disable the default storage. This was causing a test to
fail, so turn default storage back on for this case.
This is a convenience method to do a deep copy of an array. This comes
up a lot, but can be a pain if you don't have a specific device adapter
on which to do the copy.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
Previously, ConvertNumComponentsToOffsets always used TryExecute on the
global set of runtime devices. That is still the default behavior, but
now you are able to specify your own runtime tracker. Also, there are
now versions of ConvertNumComponentsToOffsets that take a device adapter
tag.
Previously, once an ArrayHandle was stolen it was placed in an invalid state
where it could not be used again by VTK-m. Now, after being stolen, it
is placed into a state where it is identical to memory allocated outside
of VTK-m and passed in.
75517554 Move check for cell variables to it gets executed.
147247e8 Code formatting changes and compiler warning fixes.
a3fd135b Fix errors and warnings on Mac and Windows
347af497 Poly Data for External Faces
aeed7a07 Cell variables for External Faces
ad13e9b4 Merge branch 'master' into external-faces-production
ab25c160 External Faces Uniform and Rectilear grids
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !860
The external faces filter and worklet now pass input
PolyData (0D, 1D, and 2D) cells to the output. The external
faces filter has a flag to control this output (PassPolyData).
Added tests to the external faces filter and worklet.
This is part of #43, which will ultimately simplify the
ArrayHandleCompositeVector to a new implementation that can be easily
written to. Part of this effort will remove the ability to pull a single
component from a vector-typed input ArrayHandle for use in the
CompositeVector, and this new class makes sure we can still support that
usecase.
The old templated array transfer mechanism generated a lot of code
that ended up doing a simple, type-agnostic memcpy for most devices.
This patch specialized array handles for basic storage and uses a
fast-path array transfer implementation. This reduces the size of the
vtkm_cont library by 27% on gcc (from 6.2MB to 4.5MB).
5226fa8b add read only (for the moment) test and implementation of ArrayHandleReverse (a.k.a reverse iterator)
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Kenneth Moreland <kmorel@sandia.gov>
Merge-request: !763
ec6589d3 Only enable -fPIC on component static libraries when necessary.
cbfe5fdd Fix up various issues with ArrayHandles in vtkm_cont.
355eea88 Get the vtkm cont cuda object to compile properly.
6ecc22bb First pass at compiling ArrayHandle into vtkm_cont.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !715
windows.h was only being included for MSVC, while in UnitTestTimer.cxx, the
Windows function Sleep was being called after a check for _WIN32. This was
causing a compilation failure with MinGW.
Fixes #122
Following what was done with ArrayRangeCompute, the GetRange and
GetBounds methods are embedded into the vtkm_cont library for the most
common type lists.
Also, and probably more importantly, the device adapter is no longer one
of the arguments for either of these methods. It is no longer needed as
ArrayRangeCompute no longer needs it.
Most uses of ArrayRangeCompute just want to get the range of the data
and probably don't have a particular device in mind. Thus, it is better
to use TryExecute internally so that whatever devices are available are used.
Note that when using TryExecute, the calling code is expected to be able
to support all devices. That might not always be the case. Thus, I am
experimenting a bit with how we incorporate this in a library. The
advantage of having the code compiled in a library is that you only have
to compile it once and the calling code does not need to worry about
CUDA, etc.
However, because ArrayRangeCompute is templated, we can only pre-compile
some subset of array handle types. The most common are compiled into the
code (matching all the predefined ArrayHandles as well as some special
cases). If the code wants to use some other type, it has to include
ArrayRangeCompute.hxx. The only place where this is necessary is a test
that intentionally tries to find the range on an uncommon type.
If array portals were to support virtual methods, then we should be able
to modify this code so that we could precompile for all array handle
types.
There were some issues for device adapter algorithms (like scan and
reduce) for empty arrays or arrays of size 1. This adds tests for these
short arrays to the device adapter algorithm tests and fixes these
problems.
17ed7a36 Remove typedef that is no longer used
364f4175 Only print cell arrays that are valid
5b8389f9 Use printSummary_ArrayHandle when testing fancy arrays
873ceefc Implement ArrayHandleGroupVecVariable::GetPortalConst
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !695
dcbbb727 Merge branch 'master' into external-face-more-generic
0703139a Make Keys class do in-place sort
059c7f6d Fix issue with ExternalFaces on CellSetSingleType
876514ba Add better test for external faces
53679dfc Update ExternalFaces to support mixed face types
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Brent Lessley <blessley@cs.uoregon.edu>
Merge-request: !662
It was a typedef for a Portal. Instead of setting the portal directly,
the portal is just sent to a function, so we don't directly use this
type anymore.
Not a big deal, but it could cause compiler warnings.
This makes it easier to see what is going on in the fancy arrays and do
diagnostics.
This change required some changes to printSummary_ArrayHandle to support
more array types.
Previously if you constructed an array handle without allocating it, you
would get an error if you tried to use the array as input. This
conflicted with some recent changes to accept empty vectors.
Now when you try to use an unallocated ArrayHandle as input (calling
PrepareForInput or PrepareForInPlace), it internally calls Allocate(0)
(to establish internal state) and sets up a valid execution ArrayPortal
of size 0.
4fc6a6a4 Updating formatting and fixing compiler warnings
00b73b63 Updating formatting and fixing compiler warnings
eee9edde Updating formatting and fixing compiler warnings
6cac1843 Updating formatting and fixing compiler warnings
48d22460 Updating formatting and fixing compiler warnings
19b61a53 Merge branch 'master' into DataSetAddUniform-test-enhance
cefc333a Adding fields to data set builder unit tests to test fields.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !681
ArrayHandleDiscard is intended to be used for worklets that produce
multiple output arrays when one or more outputs is not needed. It
does not allocate space for its data and the Set method is a no-op,
allowing the compiler to prune unnecessary instructions.
Reading from the array handle is not allowed.
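The idea can be sketched with a write-only portal whose Set does
nothing. DiscardPortalSketch, VectorPortalSketch, and MinAndMax are
hypothetical names for this illustration and do not reflect the actual
ArrayHandleDiscard implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical write-only portal: stores nothing and offers no Get.
template <typename T>
class DiscardPortalSketch
{
public:
  explicit DiscardPortalSketch(std::size_t numberOfValues = 0)
    : NumberOfValues(numberOfValues)
  {
  }

  std::size_t GetNumberOfValues() const { return this->NumberOfValues; }

  // No-op: the value is dropped, so the compiler is free to prune the work
  // that produced it.
  void Set(std::size_t, const T&) const {}

private:
  std::size_t NumberOfValues;
};

// An ordinary portal that writes into a std::vector owned by the caller.
template <typename T>
class VectorPortalSketch
{
public:
  explicit VectorPortalSketch(std::vector<T>& data) : Data(&data) {}
  std::size_t GetNumberOfValues() const { return this->Data->size(); }
  void Set(std::size_t index, const T& value) const { (*this->Data)[index] = value; }

private:
  std::vector<T>* Data;
};

// A worklet-like routine with two outputs; b must be at least as long as a.
template <typename MinPortal, typename MaxPortal>
void MinAndMax(const std::vector<int>& a, const std::vector<int>& b,
               const MinPortal& minima, const MaxPortal& maxima)
{
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    minima.Set(i, std::min(a[i], b[i]));
    maxima.Set(i, std::max(a[i], b[i]));
  }
}
```

A caller that only needs the maxima passes a DiscardPortalSketch for
the minima and allocates no memory for them.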
The current design for ArrayPortalVirtual makes it a requirement for all
array portals (that it wraps) to have Set defined. Thus, make sure Set is
defined for all ArrayPortal. Where Set is invalid, an assert is thrown if
something calls it at runtime.
The implementation was calling PrepareForOutput on the delegate arrays
rather than PrepareForInPlace, so when used with CUDA you did not get
the data on the device.
Also added a regression test to check this.
The CellSetExplicit and CellSetSingleType classes have an ivar that
marks the number of points. There were several instances of code
creating cell sets without specifying the number of points. This can be
very bad if subsequent code needs that information.
This will expose bugs inside the tbb backend. We had to use heap allocations
for test arrays instead of the stack, as the increased array sizes started to
cause stack overflows on Windows.
While writing a test I noticed that some of the MakeTestDataSet
hexahedrons had improper point ordering. It was close but backwards so
that all the faces pointed in instead of out.
8a93ecc4 code alignment tweaks.
6fa448b5 Remove the 1D camera. 1D plots will use a 2D camera.
52aa9b9a Fix some compile errors.
23d8d585 Add explicit 1D rendering. Also added some data model suport.
db522c4c Add GetCanvas calls to the mapper classes.
87b1cdca cleanup. Remove some compiler warnings.
d38e6270 Support for 1D rendering.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !616
b97b4cc7 Allow thrust::reduce to work when iterator and initial value types differ.
64bcc343 Refactor MinAndMax to use vtkm::Vec<T,2> instead of Pair.
8d60ed57 Refactor MinAndMax to be a shared binary operator.
18375b54 Update Bound computations to always use a single Reduce call
2cfc9743 Reduce can support reduce to a T type that isn't the arrayhandles T type.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !614
We previously included windows.h in numerous locations using different
techniques to guard against bringing in parts of the file that are bad
(min/max macros, etc). This solves the problem by consistently using
vtkm/internal/Windows.h to set up everything.
CUDA has some strange rules about using private classes and anonymous
namespaces. For whatever reason, recent changes have introduced such an
issue. When compiling on CUDA, expose the problematic class. It is
testing code, so it does not matter much.
This is a fancy array handle that can group entries in another array by
arbitrary amounts. This allows us to implement input and output arrays
with a different sized Vec for each instance. This is necessary for
generating new topologies with cells of different types.
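A minimal sketch of grouping by arbitrary amounts, assuming the groups
are described by an array of start offsets. GroupVariableSketch is a
hypothetical read-only view over std::vector, not the fancy array
handle itself.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical read-only view. It does not copy; the caller must keep both
// vectors alive for as long as the view is used.
template <typename T>
class GroupVariableSketch
{
public:
  // offsets[i] is the index in source where group i starts. The last group
  // runs to the end of source.
  GroupVariableSketch(const std::vector<T>& source, const std::vector<std::size_t>& offsets)
    : Source(&source), Offsets(&offsets)
  {
  }

  std::size_t GetNumberOfGroups() const { return this->Offsets->size(); }

  std::size_t GetGroupSize(std::size_t group) const
  {
    const std::size_t begin = (*this->Offsets)[group];
    const std::size_t end = (group + 1 < this->Offsets->size()) ? (*this->Offsets)[group + 1]
                                                                : this->Source->size();
    return end - begin;
  }

  const T& Get(std::size_t group, std::size_t component) const
  {
    return (*this->Source)[(*this->Offsets)[group] + component];
  }

private:
  const std::vector<T>* Source;
  const std::vector<std::size_t>* Offsets;
};
```

For a connectivity array holding a triangle, a quad, and a line, the
offsets {0, 3, 7} yield groups of size 3, 4, and 2.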
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifiers, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.
There were many tests that created code paths for every base and Vec
type that VTK-m supports (up to 4 components). Although this is
admirable, it is also excessive, and our compile times for the tests are
very long.
To shorten compile times, remove the TryAllTypes method. Replace it with
a version of TryTypes that uses a default "exemplar" list of
integers, floats, and Vecs.
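The pattern of a TryTypes overload with a default type list can be
sketched as follows. TypeListSketch, ExemplarTypes, and CountTypes are
hypothetical, and the four types chosen for the exemplar list are an
assumption; the list used by the tests may differ.

```cpp
#include <cstdint>

template <typename... Ts>
struct TypeListSketch
{
};

// Assumed exemplar set for this sketch only.
using ExemplarTypes = TypeListSketch<std::int32_t, std::int64_t, float, double>;

// Call the functor once with a default-constructed value of each listed type.
template <typename Functor, typename... Ts>
void TryTypes(Functor&& functor, TypeListSketch<Ts...>)
{
  // Expanding the pack inside a braced list evaluates the calls in order.
  const int expand[] = { 0, (functor(Ts{}), 0)... };
  (void)expand;
}

// Without an explicit list, fall back to the short exemplar list.
template <typename Functor>
void TryTypes(Functor&& functor)
{
  TryTypes(functor, ExemplarTypes{});
}

// Example functor: one instantiation of operator() is compiled per type.
struct CountTypes
{
  int Count = 0;

  template <typename T>
  void operator()(T)
  {
    ++this->Count;
  }
};
```

Each type in the list costs one instantiation of the test functor,
which is why a shorter default list shortens compile times.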
By far the source file that was taking the longest to compile was that
for the fancy array handles. This is because this test was being
pedantic about all the different types it was testing. This change
should drastically reduce the types actually compiled for and,
therefore, also drastically reduce the compile time for this test.
This allows callers to copy a subsection of an array into another array,
without clearing the contents of the destination array if a resize
is required.
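The intended behavior can be sketched on std::vector, whose resize
keeps existing elements. The CopySubRange signature below is an
assumption made for this illustration and is not the device adapter
algorithm's interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Copy count values starting at input[inputStart] into output, beginning at
// output[outputStart]. The destination grows if needed and keeps its contents.
template <typename T>
bool CopySubRange(const std::vector<T>& input, std::size_t inputStart, std::size_t count,
                  std::vector<T>& output, std::size_t outputStart = 0)
{
  if (inputStart >= input.size())
  {
    return false; // nothing to copy
  }

  // Clamp the request to what the input actually holds.
  count = std::min(count, input.size() - inputStart);

  if (output.size() < outputStart + count)
  {
    // std::vector::resize preserves the elements that are already there.
    output.resize(outputStart + count);
  }

  for (std::size_t i = 0; i < count; ++i)
  {
    output[outputStart + i] = input[inputStart + i];
  }
  return true;
}
```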
d677d0d1 small tweaks
816364d2 in an effort to get rid of a warning
778da350 In attempt to fix errors and warnings
bb450c51 fix a warning
49e56b61 two new wavelet filters, HAAR and CDF8/4 supported now
767356bc working on even length filters; need ASYM* support in Extend1D()
a6efad04 half done even length filters implementation
ee32ea4c took off timing code
...
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !482
There are various reasons why you might want to execute something but
not have a specific device to execute on. To manage this, add a general
function that takes a list of devices and attempts to run on each of
them in order.
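A minimal sketch of such a function, assuming the functor returns true
on success and that a device which throws is skipped on later attempts.
DeviceId, DeviceTrackerSketch, and TryDevicesInOrder are hypothetical
names, and the failure policy is an assumption of this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

enum class DeviceId
{
  Cuda = 0,
  Tbb = 1,
  Serial = 2
};

// Hypothetical tracker remembering which devices have failed.
class DeviceTrackerSketch
{
public:
  bool CanRunOn(DeviceId device) const { return !this->Disabled[Index(device)]; }
  void ReportFailure(DeviceId device) { this->Disabled[Index(device)] = true; }

private:
  static std::size_t Index(DeviceId device) { return static_cast<std::size_t>(device); }
  bool Disabled[3] = { false, false, false };
};

// Try each device in the given order and stop at the first success.
template <typename Functor>
bool TryDevicesInOrder(Functor&& functor, const std::vector<DeviceId>& devices,
                       DeviceTrackerSketch& tracker)
{
  for (DeviceId device : devices)
  {
    if (!tracker.CanRunOn(device))
    {
      continue;
    }
    try
    {
      if (functor(device))
      {
        return true;
      }
    }
    catch (const std::exception&)
    {
      // Assumption of this sketch: a device that throws is not tried again.
      tracker.ReportFailure(device);
    }
  }
  return false;
}
```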