Quite simple fix for now which only deals with this case. Maybe we want to do
some "clipping" on image load time so regular textures wouldn't give NaN as
well.
This allows us to use faster math and still have reliable
isnan/isfinite tests.
Only do it for host side, kernels stays unchanged.
Thanks Lukas Stockner for the tip!
The issue was caused by usage of non-initialized image user, which
could have different settings, causing some random image being loaded
or not loaded at all.
This caused non-deterministic behavior of Cycles image loading because
it was querying image information from several places.
This fixes crash reported in T50616, but it's not a complete fix
because preview rendering in material is wrong (same wrong as in
2.78a release).
We now assert that we now file version of libraries (needed for
do_version after linking step), so for missing libraries, set dummy
numbers (using version of main .blend file actually).
Works similar to regular Cycles tests, just does OpenGL render to
get output image.
Seems to work fine with the only funny effect: Blender window will
pop up for each of the tests. This is current limitation of our
OpenGL context. Might be changed in the future.
Seems CUDA failed to de-duplicate the array across multiple inlined
versions of the shadow_blocked(). Helped it a bit with that now.
Gives about 100MB memory improvement on a scenes after previous
commit and brings up memory "regression" to only 100MB comparing to
the master branch now.
This commit enables record-all behavior of transparent shadows
rays.
Render times difference goes as following:
GTX 1080 render time
BMW -0.5%
Fishy Cat -0.0%
Pabellon Barcelona -11.6%
Classroom +1.2%
Koro -58.6%
Kernel will now use some extra VRAM memory to store the intersection
array (200MB on my configuration). This we can optimize out with some
further commits.
The idea is to record all possible transparent intersections when
shooting transparent ray on GPU (similar to what we were doing on
CPU already).
This avoids need of doing whole ray-to-scene intersections queries
for each intersection and speeds up a lot cases like transparent
hair in the cost of extra memory.
This commit is a base ground for now and this feature is kept
disabled for until some further tweaks.
Now we break the traversal cycle and then perform volume attenuation
and check with zero throughput. Not sure it makes any measurable sense
at this moment, but in the future it might help de-duplicating some
extra logic here.
Removed unnecessary call to DM_update_tessface_data(). This call is
already performed by DM_ensure_tessface(dm). The call being performed
twice caused a failing BLI_assert().
Reviewed by: Kévin Dietrich
With the new names the arguments (yup, zup) are in the same order as
they appear in the function name. The old names used copy_src_dst(dst,
src), which I found very confusing. Furthermore, now it is clear from
where to where the copy is made.
This makes the function names a little bit longer, though. If that is
a real issue, we can just name them zup_from_yup(zup, yup).
Reviewed by: Kévin Dietrich
Speedup is mainly gained by multi-threading. Gives about 3x
fps gain on an edit shot file.
There is still some room for improvements, will happen in one
of the upcoming commits.