The title says it all actually. Gives up to 10% speedup on test scenes here
on i7-6800K.
Render times on GPU are unreliable here, but there might be some slowdown
caused by watertight nature of intersections.
This is a preparation work for the followup commit which wil l move
remaining parts of Woop intersection logic to an utility file.
Doing it as a separate commit to keep changes more atomic and easier
to bisect when/if needed.
Decoupled ray marching is not supported yet.
Transparent shadows are always enabled for volume rendering.
Changes in kernel/bvh and kernel/geom are from Sergey.
This simiplifies code significantly, and prepares it for
record-all transparent shadow function in split kernel.
We started to run out of bits there, so now we separate flags
which came from __object_flags and which are either runtime or
coming from __shader_flags.
Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.
There should be no user-visible changes, and time difference
should be minimal. In fact, from tests here can only see hardly
measurable difference and sometimes the new code is somewhat
faster (all within a noise floor, so hard to tell for sure).
Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself
Differential Revision: https://developer.blender.org/D2428
All the changes are mainly giving explicit tips on inlining functions,
so they match how inlining worked with previous toolkit.
This make kernel compiled by CUDA 8 render in average with same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But slowdown is within 1% so far.
On a positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
While they prevent legit write past the array boundary error
those fixes introduced regression in behavior when having exact
max_hits transparent intersections and nothing else.
Previous code would have considered such case a totally opaque,
but it's not correct.
Fixes T48941: Some materials don't get transparent shadows anymore
It was possible to miss bounces termination criteria in this functions,
mainly when max_hits was set to 0.
Made the check more robust in traversal functions (which should not
affect performance, it's an operation of same complexity AFAIK).
Also avoid doing ray-scene intersection from shadow_blocked when
limit of transparent bounces was already reached.
Using camel case for variables is something what didn't came from our original
code, but rather from third party libraries. Let's avoid those as much as possible.
BVH traversal is not really that much a geometry and we've got
quite some traversals now. Makes sense to keep them separate in
the name of source structure clarity.