These changes now allow VTK-m to compile on CUDA 7.5 by using const arrays,
when compiling with CUDA 8+ support we upgrade to static const arrays, and
lastly when CUDA is disabled we fully elevate to static constexpr.
There was a bug where if you included an empty range into another range,
the range suddenly became infinite. This is because empty ranges have
the max at -infinity and the min at infinity. If you just include the
endpoints independently (which is what include does), then the min gets
-infinity and the max gets infinity. This gets a check to make sure that
empty ranges are ignored.
When using vtkm::dot on narrow types you easily rollover the values.
Instead the result type of vtkm::dot should be wide enough to store the results
(32bits) when this occurs.
Fixes#193
The intel compiler could not generate code in a timely manner ( 12+ hours ) when
asked to produce a cross product of very long lists. By moving to a lazy
evaluation scheme we now have all compilers product a cross product in a
reasonable amount of time ( 2-4 seconds ).
This resolves Issues:
- https://gitlab.kitware.com/vtk/vtk-m/issues/190
- https://gitlab.kitware.com/vtk/vtk/issues/17196
If a global static array is declared with VTKM_EXEC_CONSTANT and the code
is compiled by nvcc (for multibackend code) then the array is only accesible
on the GPU. If for some reason a worklet fails on the cuda backend and it is
re-executed on any of the CPU backends, it will continue to fail.
We couldn't find a simple way to declare the array once and have it available
on both CPU and GPU. The approach we are using here is to declare the arrays
as static inside some "Get" function which is marked as VTKM_EXEC_CONT.
Rather than requiring all the arguments to be placed as member variables to
the functor you can now pass extra arguments that will be added to the functor
call signature.
So for example:
vtkm::ForEach(functor, vtkm::TypeListTagCommon(), double{42.0}, int{42});
will be converted into:
functor(vtkm::Int32, double, int)
functor(vtkm::Int64, double, int)
functor(vtkm::Float32, double, int)
functor(vtkm::Float64, double, int)
...
For std::copy to optimize a copy to memcpy, the valuetype must be both
trivially constructable and trivially copyable.
The new copy benchmarks highlighted an issue that std::copy'ing pairs
and vecs were not optimized to memcpy. For a 256 MiB buffer on my
laptop w/ GCC, the serial copy speeds were:
UInt8: 10.10 GiB/s
Vec<UInt8, 2> 3.12 GiB/s
Pair<UInt32, Float32> 6.92 GiB/s
After this patch, the optimization occurs and a bitwise copy occurs:
UInt8: 10.12 GiB/s
Vec<UInt8, 2> 9.66 GiB/s
Pair<UInt32, Float32> 9.88 GiB/s
Check were also added to the Vec and Pair unit tests to ensure that
this classes continue to be trivial.
The ArrayHandleSwizzle test was refactored a bit to eliminate a new
'possibly uninitialized memory' warning introduced with the default
Vec ctors.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
1d5a4d47 Update ExternalFaces worklet to use hashes
fc7b90ac Add hash function
b1e6c1e3 Add method for getting a cononical edge id
8e72dc73 Add method for getting a cononical face id
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !900
All types of cell sets should have a consistent interface. This is
tested (and fixed) for explicit and permutation cell sets. However, an
unfixed bug that has been identified is that permutation cell sets only
work for point to cell topologies. All others are not supported
correctly.
- Exception classes cannot be exported due to MSVC's design decisions.
See http://stackoverflow.com/questions/24511376. We must leave these
classes as header only and silence the warnings.
- TransferResource in BufferState.h must remain a header-only class since
there is no vtkm_interop library to compile the class into.
- The VTKDataSetReader hierarchy must similarly remain header-only since
there is no vtkm_io library.
- The OptionParser Action classes are part of a header-only utility and
cannot be easily compiled into a library.
-
2e05c1dd Support derivatives of vectors
dd45b568 Vec of Vec fixes
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !630
Fix a couple of places where having a Vec of a Vec (for example
vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 3>) did not work as it should.
One major place was in the test_equal function used in the testing
framework. If given a Vec of Vec, it would try to convert the Vec to
Float64 and fail. Now the comparison decends as many levels as need be.
The second place was in TypeTraits::ZeroInitialization() for all Vec
types. Originally it initialized the Vec using the default constructor
of the component type. This works fine when the component type is a
standard type, but if the component type is itself a Vec, then the
default constructor does not actually set any of the sub components.
Solved the issue by using the result of ZeroInitialization for the
component type, which will work for basic types and Vecs and anything
else supported by TypeTraits.
These structs behave much like Vec except that they work on a short C
array given to them rather than having the statically sized short array
defined within.
I expect to use this in the short term to help implement cell face
classes, but there are probably many other uses.
I noticed the UnitTestTransform3D test failed using the random seed
1480544620. On closer inspection, I found that the issue was with the
comparison of two numbers close to 0. The numbers were just above the
threshold, but their difference was not quite enough to make the ratio
below the threshold.
After reviewing some other floating point comparisons, they seem to be
more forgiving of numbers close to 0. Thus, I changed this comparison to
pass if the difference between the numbers was below the threshold.
Because this makes the comparison a lot more forgiving for small
numbers, I lowered the default threshold by an order of magnitude. So
far it looks like the tests are passing, but we should look out for
occasional failures.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.
When the intel compiler has vectorization enabled ( -O2/-O3 ) it converts the
`adjacent/hypotenuse` divide operation into reciprocal (rcpps) and
multiply (mulps) operations. This causes a change in the expected result that
is larger than the default tolerance of test_equal. So to resolve the problem
we increase the tolerance for acos when using icc.
When the intel compiler has vectorization enabled ( -O2/-O3 ) it converts the
`adjacent/hypotenuse` divide operation into reciprocal (rcpps) and
multiply (mulps) operations. This causes a change in the expected result that
is larger than the tolerance of test_equal.