The Invoker is a control side object that handles the construction
of the relevant worklet dispatcher. Moving it to control makes it
obvious that it isn't an algorithm itself but a way to launch
worklets.
how did any of this work?
match other CellSet file layouts.
???
compile in CUDA.
unit tests.
also only serial.
make error message accurate
Well, this compiles and works now.
Did it ever?
use CellShapeTagGeneric
UnitTest matches previous changes.
whoops
Fix linking problems.
Need the same interface
as other ThreadIndices.
add filter test
okay, let's try duplicating CellSetStructure.
okay
inching...
change to wedge in CellSetListTag
Means changing these to support it.
switch back to wedge from generic
compiles and runs
remove ExtrudedType
need vtkm_worklet
vtkm_worklet needs to be included
fix segment count for wedge specialization
need to actually save the index
for the other constructor.
specialize on Explicit
clean up warning
angled brackets not quotes.
formatting
This adds an ExecutionSignature tag named Device that passes the
DeviceAdapterTag as an argument to the worklet's operator(). This allows
worklets to specialize their code based on the device.
The timer class now is asynchronous and device independent. it's using an
similiar API as vtkOpenGLRenderTimer with Start(), Stop(), Reset(), Ready(),
and GetElapsedTime() function. For convenience and backward compability, Each
Start() function call will call Reset() internally and each GetElapsedTime()
function call will call Stop() function if it hasn't been called yet for keeping
backward compatibility purpose.
Bascially it can be used in two modes:
* Create a Timer without any device info. vtkm::cont::Timer time;
* It would enable timers for all enabled devices on the machine. Users can get a
specific elapsed time by passing a device id into the GetElapsedtime function.
If no device is provided, it would pick the maximum of all timer results - the
logic behind this decision is that if cuda is disabled, openmp, serial and tbb
roughly give the same results; if cuda is enabled it's safe to return the
maximum elapsed time since users are more interested in the device execution
time rather than the kernal launch time. The Ready function can be handy here
to query the status of the timer.
* Create a Timer with a device id. vtkm::cont::Timer time((vtkm::cont::DeviceAdapterTagCuda()));
* It works as the old timer that times for a specific device id.
The script fixed up most of the issues. However, there were some
instances that the script was not able to pick up on. There were
also some instances that still needed a means to select types.
Rather than force all dispatchers to be templated on a device adapter,
instead use a TryExecute internally within the invoke to select a device
adapter.
Because this removes the need to declare a device when invoking a
worklet, this commit also removes the need to declare a device in
several other areas of the code.
Found via `codespell` and `grep`
more typos
includes source typo change and a typo that needs further review
follow-up typos
Follow-up typos
Revert a commit
183bcf109 Add initial version of an OpenMP backend.
7b5ad3e80 Expand device scheduler test to check for overlap.
e621b6ba3 Generalize the TBB radix sort implementation.
d60278434 Specialize swap for ArrayPortalValueReference types.
761f8986f Cache inputs to SSI::Unique benchmark.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Merge-request: !1099
- Use tao::tuple instead of FunctionInterface to hold array/portal
collections.
- Type signatures are simplified. Now just use:
- ArrayHandleCompositeVector<ArrayT1, ArrayT2, ...>
- make_ArrayHandleCompositeVector(array1, array2, ...)
instead of relying on helper structs to determine types.
- No longer support component selection from an input array. All
input arrays must have the same ValueType (See ArrayHandleSwizzle
and ArrayHandleExtractComponent as the replacements for these
usecases.
0da55730 Fix scaling of x/y field of views
39b347db Fix raytrace using wrong fov with standard camera
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1185
Prevously, the x field of view was computed by scaling the y field of
view by the aspect ratio of the canvas. However, this is wrong because
the field of view angles do not scale linearly with the aspect ratio.
Fix this issue by using trig functions to convert the angle to distance,
scale the distance, and then convert back to angle.
The raytracing code has its own version of camera that maintains two
field of view (fov) parameters: one for the x direction and one for the
y. The standard vtkm::rendering::Camera contains only one fov. As is
consistent with OpenGL's gluPerspective and VTK's camera, the fov is
specified in the y direction. However, the raytracing code was
incorrectly using it in the x direction. That caused it to do a weird
rescaling when the aspect ratio was not 1.
The new and improved vtkm::cont::ColorTable provides a more feature complete
color table implementation that is modeled after
vtkDiscretizableColorTransferFunction. This class therefore supports different
color spaces ( rgb, lab, hsv, diverging ) and supports execution across all
device adapters.
These changes now allow VTK-m to compile on CUDA 7.5 by using const arrays,
when compiling with CUDA 8+ support we upgrade to static const arrays, and
lastly when CUDA is disabled we fully elevate to static constexpr.
c6565cde correcting compile error with CanvasGL
c57fad48 Depth no considered with annotations. Fix for volume renderer. Consistent color blending.
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !1068
Visual Studio default toolset uses 32bit executables for compiling which
means that if it uses more than 4GB of memory per instance it crashes.
By moving the ConnectivityTracer into a separate compilation unit we
can help out the compiler.
This also improved compilation times and library size:
Old:
build time (j8): 48.62 real
lib size: 6.2MB
New:
build time (j8): 41.31 real
lib size: 5.0MB
If a global static array is declared with VTKM_EXEC_CONSTANT and the code
is compiled by nvcc (for multibackend code) then the array is only accesible
on the GPU. If for some reason a worklet fails on the cuda backend and it is
re-executed on any of the CPU backends, it will continue to fail.
We couldn't find a simple way to declare the array once and have it available
on both CPU and GPU. The approach we are using here is to declare the arrays
as static inside some "Get" function which is marked as VTKM_EXEC_CONT.
Sandia National Laboratories recently changed management from the
Sandia Corporation to the National Technology & Engineering Solutions
of Sandia, LLC (NTESS). The copyright statements need to be updated
accordingly.
5429f120 fixed energy error and fix for ray camera
ac784330 Merge remote-tracking branch 'upstream/master' into x_ray_add_emmission
521e445b emission now functioning
52959d91 testing emission
5fc80517 initial version on absorption and emission integration
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !893
There was a case of bad casting where a Float32 that should be in the
range between 0 and 1 was cast to an Id and then multiplied by the size
of an array. The intention is to get a proportional index into the
array, but because the float was cast to an int first, you get either
the first or last index.
Instead, first cast the size of the array to a Float32, multiply it by
the fraction, and then cast back to an Id. That should give the desired
effect.
Change the VTKM_CONT_EXPORT to VTKM_CONT. (Likewise for EXEC and
EXEC_CONT.) Remove the inline from these macros so that they can be
applied to everything, including implementations in a library.
Because inline is not declared in these modifies, you have to add the
keyword to functions and methods where the implementation is not inlined
in the class.