Commit Graph

12 Commits

Author SHA1 Message Date
Sergey Sharybin
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of inefficient memory usage caused by BVH boundbox
part being unused by leaf nodes but still being allocated for them. Doing
such split allows to save 6 of float4 values for QBVH per leaf node and 3
of float4 values for regular BVH per leaf node.

This translates into following memory save using 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00
Sergey Sharybin
e2354e64d2 Cycles: Cleanup, spaces around assignment operator
Did some bad spacing in recent commits, better to get rid of those so
they does not confuse those who're working on sources.
2015-04-07 00:25:54 +05:00
Sergey Sharybin
394b947a50 Cycles: Remove unused direction from triangle intersection functions
This argument was unused and got nicely optimized out. But once it
starts to be using registers are getting stressed really crazy,
causing slow down of render.
2015-04-01 21:08:12 +05:00
Sergey Sharybin
7da4c2637d Cycles: Fix typo in distance heuristic for shadow rays
It's not that bad because this typo could only caused not really
efficient BVH traversal, causing higher render times. Not as if
it was causing render artifacts.
2015-03-31 19:52:14 +05:00
Sergey Sharybin
298d8681a0 Fix T43596: Refraction BSDF crashes blender on pre-sse4 CPU
This is the same issue T43475: SSE4 code is more robust to non-finite values
in the ray origin/direction. So for now added a check before doing BVH traversal
for pre-SSE4 CPUs.

For sure actual root of the issue is a bit different and much more tricky to
solve, especially without disturbing render results too much. Still looking
into this.

In any case, it's kinda fine to have such a check, we might later make it to be
a kernel_assert() instead of just a return.
2015-02-10 17:36:05 +05:00
Sergey Sharybin
5719ed1225 Cycles: Add leaf primitives sanity check asserts to the kernel
This way we'll notice that leaf splitting didn't happen correct pretty easily
in debug builds.

There'll be absolutely no impact on release builds.
2015-01-12 15:05:14 +05:00
Sergey Sharybin
bc7ff3c2b4 Cycles: Enable leaf split by primitive type and adopt BVH traversal for this
This commit enables BVH leaf nodes split by the primitive type and makes it
so BVH traversal code is now aware and benefits from this.

As was mentioned in original commit, this change is crucial to be able to do
single ray to multiple triangle intersection. But it also appears to give
barely visible speedup in some scene.

In any case there should be no noticeable slowdown, and this change is what
we need to have anyway.
2015-01-12 15:04:52 +05:00
Sergey Sharybin
4088fad6dd Cycles: Add asserts around BVH stack pushes
This way we're kind of safer to troubleshoot possible stack overflow issues.
2014-12-29 14:02:15 +05:00
Sergey Sharybin
40517283ca Cycles: Bump stack size for QBVH traversal code
Traversal now can push up to 2x of nodes to the stack, so need some tweaks
to the stack size.
2014-12-29 13:37:18 +05:00
Sergey Sharybin
91bbaaa271 Cycles: Fix visibility check for instanced nodes
The issue is that only instance node contains proper visibility flags,
nodes from instanced BVH are not correct.
2014-12-27 23:33:50 +05:00
Sergey Sharybin
cd095aae13 Cycles: Distance optimization for QBVH
This commit implements heuristic which allows to skip nodes pushed to the stack
from intersection if distance to them is larger than the distance to the current
intersection.

This should solve speed regression which i didn't notice in the original QBVH
commit (which could have because i had WIP version of this patch applied in my
local branch).

From quick tests speed seems to be much closer to what is was with regular BVH.

There's still some possible code cleanup, but they'll need a bit of assembly
code check and now i want to make it so artists can happily use Cycles over the
holidays.
2014-12-25 22:40:02 +05:00
Sergey Sharybin
03f28553ff Cycles: Implement QBVH tree traversal
This commit implements traversal for QBVH tree, which is based on the old loop
code for traversal itself and Embree for node intersection.

This commit also does some changes to the loop inspired by Embree:

- Visibility flags are only checked for primitives.

  Doing visibility check for every node cost quite reasonable amount of time
  and in most cases those checks are true-positive.

  Other idea here would be to do visibility checks for leaf nodes only, but
  this would need to be investigated further.

- For minimum hair width we extend all the nodes' bounding boxes.

  Again doing curve visibility check is quite costly for each of the nodes and
  those checks returns truth for most of the hierarchy anyway.

There are number of possible optimization still, but current state is good
enough in terms it makes rendering faster a little bit after recent watertight
commit.

Currently QBVH is only implemented for CPU with SSE2 support at least. All
other devices would need to be supported later (if that'd make sense from
performance point of view).

The code is enabled for compilation in kernel. but blender wouldn't use it
still.
2014-12-25 02:50:49 +05:00