We need call PrepareForInput on the input argument before invoking a function.
The order of execution of parameters of a function is undefined, so we need to
make sure input is called before output, or else in-place use case breaks.
Instead of having a single specialization for sort and zip handles,
we know handle any fancy handles being passed to the cuda device adapter.
This was done by reworking how we represent fancy iterators inside thrust,
and instead of using a transform iterator + counting iterator we just use
a iterator_facade.
This is required so that we can use ArrayHandleZip with Sort/Reduce and
custom comparison operators. ArrayZip and PortalValue don't combine
well together, when used with DeviceAlgorithm that has a custom operator.
The custom operator is actually passed the PortalValue instead of
the real values, and by that point we can't fix anything since we
don't know what the original operator is.
Fix compile warnings that come up with the flags
-Wconversion -Wno-sign-conversion
This catches several instances (mostly in the testing framework) where
types are implicitly converted. I expect these changes to fix some of
the warnings we are seeing in MSVC.
I was going to add these flags to the list of extra warning flags, but
unfortunately the Thrust library has several warnings of these types,
and I don't know a good way to turn on the warnings for our code but
turn them off for Thrust.
Per a discussion with Robert Maynard, we no longer have plans to bind
textures or use texture objects. Instead, we are now using the __ldg
command to load from global memory inside the execution portal, which
gives us most of the benefits of textures, but doesn't incur any
bookkeeping.
Previously ArrayTransfer and ArrayManagerExecution received a reference
to a Storage class in their constructor and held the reference as an
ivar. In retrospect, this is just asking for trouble. First, it is way
too easy to pass by value when you mean to pass by reference. Second, if
there ever is a bug where the Storage goes out of scope before the
classes holding a reference, it is that much harder to debug.
This includes changing methods like LoadDataForInput to PrepareForInput.
It also changed the interface a bit to save a reference to the storage
object. (Maybe it would be better to save a pointer?) These changes also
extend up to the ArrayManagerExecution class, so it can effect device
adapter implementations.
This API change effects both ArrayTransfer and ArrayManagerExecution.
This is in preparation for a future change to make the API more
consistent with ArrayHandle.
Our approach of using the underlying allocator inside thrust was a bad approach,
for some reason it fails to properly allocate uint8's or int8's on the correct
boundaries. I expect that this logic is somewhere else in the code and
instead we should use thrust::system::cuda::vector which does this properly.
The namespaces need to be different for each test, or else only the first
implementation of the function will be used for all tests that call that
function.
Also updated the test to verify that we can count starting from a non zero
number.
Porting the dax device adapter over to vtkm. Unlike the dax version, doesn't
use the thrust::device_vector, but instead uses thrust::system calls so that
we can support multiple thrust based backends.
Also this has Texture Memory support for input array handles. Some more work
will need to be done to ArrayHandle so that everything works when using an
ArrayHandle inplace with texture memory bindings.