83 Commits

Author SHA1 Message Date
Sergey Sharybin
c8548871ac Cycles: Use more explicit and commonly used names for BVH structures
This renames BinaryBVH to BVH2 and QBVH to BVH4. There is no user measurable
difference, but it allows us to add more types of BVH trees such as BVH8.
2017-04-13 10:29:14 +02:00
Sergey Sharybin
9b1564a862 Cycles: Cleanup, rename RegularBVH to BinaryBVH
Makes it more explicit what the structure is from its name.
2017-03-30 09:47:27 +02:00
Sergey Sharybin
270df9a60f Cycles: Cleanup, don't use m_ prefix for public properties 2017-03-29 14:45:49 +02:00
Sergey Sharybin
0579eaae1f Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.

For example, it was not obvious whether bvh.h was referring to builder
or traversal, whether node.h is a generic graph node or a shader node
and cases like that.

Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.

This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.

Please note that this patch is lacking changes related to GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.

Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner

Reviewed By: lukasstockner97, maiself, nirved, dingto

Subscribers: brecht

Differential Revision: https://developer.blender.org/D2586
2017-03-29 13:41:11 +02:00
Sergey Sharybin
8b8c0d0049 Cycles: Don't calculate primitive time if BVH motion steps are not used
Solves memory regression in the default configuration.
2017-02-15 12:59:31 +01:00
Sergey Sharybin
dc7bbd731a Cycles: Fix wrong hair render results when using BVH motion steps
The issue here was mainly coming from minimal pixel width feature
which is quite commonly enabled in production shots.

This feature will use some probabilistic heuristic in the curve
intersection function to check whether we need to return intersection
or not. This probability is calculated for every intersection check.
Now, when we use multiple BVH nodes for curve primitives we increase
probability of that primitive to be considered a good intersection
for us. This is similar to increasing minimal width of curve.

What is worse here is that change in the intersection probability
fully depends on exact layout of BVH, meaning probability might
change differently depending on a view angle, the way how builder
binned the primitives and such. This makes it impossible to do
simple check like dividing probability by number of BVH steps.

Other solution might have been to split BVH into fully independent
trees, but that will increase memory usage of all the static
objects in the scenes, which is also not something desirable.

For now used most simple but robust approach: store BVH primitives
time and test it in curve intersection functions. This solves the
regression, but has two downsides:

- Uses more memory.

  which isn't surprising, and ANY solution to this problem will
  use more memory.

  What we still have to do is to avoid this memory increase for
  cases when we don't use BVH motion steps.

- Reduces number of maximum available textures on pre-kepler cards.

  There is not much we can do here, hardware gets old but we need
  to move forward on more modern hardware..
2017-02-15 12:45:04 +01:00
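A minimal C++ sketch of why storing and testing primitive time helps, under assumed names (PrimRef and count_candidates are hypothetical and not taken from the Cycles source). With several BVH motion steps the same curve can be referenced once per step, so without a time test each reference gets its own chance in the probabilistic minimal width check. The sketch counts the worst case where a ray reaches every reference.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reference to a curve primitive stored for one BVH motion step.
struct PrimRef {
    int prim;         // primitive index
    float time_from;  // start of the time range covered by this reference
    float time_to;    // end of the time range covered by this reference
};

// Counts how many references to `prim` would reach the curve intersection
// test for a ray sampled at `ray_time`.
inline int count_candidates(const std::vector<PrimRef> &refs,
                            int prim,
                            float ray_time,
                            bool use_time_test)
{
    int count = 0;
    for (const PrimRef &ref : refs) {
        assert(ref.time_from <= ref.time_to);
        if (ref.prim != prim) {
            continue;
        }
        // The time test rejects references whose range does not contain the ray time.
        if (use_time_test && (ray_time < ref.time_from || ray_time > ref.time_to)) {
            continue;
        }
        ++count;
    }
    return count;
}
```

With three steps and the time test enabled, only the reference whose range contains the ray time is left, which matches the single-step behavior.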
Sergey Sharybin
1ad04c7d65 Cycles: Store time in BVH nodes
This way we can stop traversing BVH node early on.

Gives about 2-2.5x render time improvement with 3 BVH steps.
Hopefully this gives no measurable performance loss for scenes with
single BVH step.

Traversal is currently only implemented for QBVH, meaning old CPUs
and GPU do not benefit from this change.
2017-01-20 12:46:18 +01:00
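The early-out described above can be sketched as follows. This is an illustration only: TimedNode and can_skip_node are hypothetical names, and the real node layout in Cycles is not shown in this log.

```cpp
#include <cassert>

// Hypothetical node that stores the time interval covered by everything below it.
struct TimedNode {
    float time_from;
    float time_to;
};

// Returns true when traversal can stop at this node for a ray sampled at ray_time,
// because no primitive below the node exists at that time.
inline bool can_skip_node(const TimedNode &node, float ray_time)
{
    assert(node.time_from <= node.time_to);
    return ray_time < node.time_from || ray_time > node.time_to;
}
```

A tree built with a single BVH step would store the full shutter interval in every node, so the test never rejects anything there and only costs one comparison pair per node.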
Sergey Sharybin
48997d2e40 Cycles: Cleanup, style 2016-10-24 12:26:12 +02:00
Sergey Sharybin
1e1811357d Cycles: Cleanup, spaces 2016-10-24 11:47:32 +02:00
Sergey Sharybin
ec54a08d30 Revert "Cycles: Tweak empty boundbox children"
This reverts commit ecbfa31caaadb03c53c0fe1459718b99613c8804.

Original commit broke logic in nodes re-fitting. That area can
access non-existing children momentarily. Not sure what would
be best solution here, for now simply reverting the change.
2016-09-15 09:39:33 +02:00
Sergey Sharybin
ecbfa31caa Cycles: Tweak empty boundbox children
The idea here is to make the assert fail sooner on an incorrect
node address rather than later with stack overflow.
2016-09-13 11:05:11 +02:00
Sergey Sharybin
52038fd8c7 Fix T49290: Specific .blend with hair crashes in MacOS 2.78 RC1 on render
The issue was caused by some false-positive empty non-AABB intersection.
Tried to tweak it a bit so it does not record intersection anymore.

Hopefully will work for all platforms. Tested here on iMac and Debian.
2016-09-13 10:59:48 +02:00
Sergey Sharybin
70e7c0829e Cycles: Deduplicate QBVH node packing across BVH build and refit 2016-09-09 11:32:05 +02:00
Sergey Sharybin
6de08f6cd1 Cycles: Fix regular BVH nodes refit
For proper indexing to work we need to use unaligned node with
identity transform instead of aligned nodes when doing refit.

To be backported to 2.78 release.
2016-09-08 15:08:35 +02:00
Sergey Sharybin
27e2317513 Cycles: Add asserts to BVH node packing 2016-09-08 15:03:55 +02:00
Sergey Sharybin
3598a3d1d5 Cycles: Cleanup: line wrapping 2016-09-08 14:26:10 +02:00
Sergey Sharybin
ac061de20d Cycles: Fix refitting of regular BVH
Was causing CUDA issues on viewport edits.
2016-07-15 18:12:34 +02:00
Sergey Sharybin
b03e66e75f Cycles: Implement unaligned nodes BVH builder
This is a special builder type which is allowed to orient nodes to
strands direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.

Implementation of BVH builder is based on Embree, and generally idea
there is to calculate axis-aligned SAH and oriented SAH and if SAH
of oriented node is smaller than axis-aligned SAH we create unaligned
node.

We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't have
any extra calculations needed to set up hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH nodes type.

This new builder is currently not in use, still need to make BVH
traversal code aware of unaligned nodes.
2016-07-07 17:25:48 +02:00
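A rough C++ sketch of the aligned-versus-oriented decision. It is a simplification: a full SAH cost also accounts for primitive counts and traversal cost, while this sketch compares only the surface area term. Extent, half_area and prefer_unaligned are hypothetical names.

```cpp
#include <cassert>

// Side lengths of a bounding box, measured in the box's own frame.
struct Extent {
    float x, y, z;
};

// Half of the box surface area, which is the geometric term used by SAH.
inline float half_area(const Extent &e)
{
    assert(e.x >= 0.0f && e.y >= 0.0f && e.z >= 0.0f);
    return e.x * e.y + e.y * e.z + e.z * e.x;
}

// An unaligned node is only worth creating when the oriented box is smaller.
inline bool prefer_unaligned(const Extent &aligned, const Extent &oriented)
{
    return half_area(oriented) < half_area(aligned);
}
```

For a thin strand running along the diagonal of a unit cube, the axis-aligned box has a half area of 3, while a box oriented along the strand is long in one direction and nearly flat in the other two, so its area is far smaller.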
Sergey Sharybin
1a2012145d Cycles: Switch node address to absolute values in BVH tree
This seems to be straightforward way to support heterogeneous nodes
in the same tree.

There is some penalty related to the 4gig limit of the address space now,
but here's the thing:

Traversal code was already using ints to store final offset, so
there can't be regressions really.

This is a required commit to make it possible to encode both aligned
and unaligned nodes in the same array. Also, in the future we can use
this to get rid of __leaf_nodes array (which is a bit tricky to do because of
trickery in pack_instances()).
2016-07-07 17:25:48 +02:00
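To illustrate why absolute addresses suit heterogeneous nodes, here is a small C++ sketch. The node sizes and the append_node helper are assumptions made for the example, not values taken from the Cycles source. Once nodes of different sizes share one array, "node index times node size" no longer locates a node, but the absolute offset recorded at append time does.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical node sizes, in array elements.
constexpr std::size_t ALIGNED_NODE_SIZE = 4;
constexpr std::size_t UNALIGNED_NODE_SIZE = 7;

// Appends a zero-filled node and returns its absolute address in the array.
inline int append_node(std::vector<float> &nodes, bool unaligned)
{
    const std::size_t size = unaligned ? UNALIGNED_NODE_SIZE : ALIGNED_NODE_SIZE;
    // Addresses are stored as int, which is where the address space limit comes from.
    assert(nodes.size() + size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    const int address = static_cast<int>(nodes.size());
    nodes.resize(nodes.size() + size, 0.0f);
    return address;
}
```

Appending an aligned, an unaligned and another aligned node yields addresses 0, 4 and 11 with these sizes; the third address is not a multiple of either node size.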
Sergey Sharybin
17e7454263 Cycles: Reduce memory usage by de-duplicating triangle storage
There are several internal changes for this:

First idea is to make __tri_verts to behave similar to __tri_storage,
meaning, __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookup when reading
triangle coordinates in functions like triangle_normal().

In order to make it efficient needed to store global triangle offset
somewhere. So now __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.

Additionally, the order of vertices in that array is aligned with
primitives from BVH. This is needed to keep cache as much coherent as
possible for BVH traversal. This causes some extra tricks needed to
fill the array in and deal with True Displacement but that trickery
is fully required to prevent noticeable slowdown.

Next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. Mainly this loss is caused by extra lookup
happening to access vertex coordinate.

Fortunately, tricks here and there (i.e. some type changes to avoid
casts which are not really coming for free) reduce those losses to
an acceptable level. So now they are within a couple of percent only.

On the positive side we've achieved:

- Few percent of memory save with triangle-only scenes. Actual save
  in this case is close to size of all vertices.

  On a more fine-subdivided scenes this benefit might become more
  obvious.

- Huge memory save of hairy scenes. For example, on koro.blend
  there is about 20% memory save. Similar figure for bunny.blend.

This memory save was the main goal of this commit to move forward
with Hair BVH which required more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
by upcoming Hair BVH work.

But again on a positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
2016-07-07 17:25:48 +02:00
Sergey Sharybin
1eacbf47e3 Cycles: Support visibility check for inner nodes of QBVH
It was initially unsupported because initial idea of checking visibility
of all children was slowing scenes down a lot. Now the idea has changed
and we only perform visibility check of current node. This avoids huge
slowdown (from tests here it seems to be within 1-2%, but more tests
would never hurt) and gives nice speedup of ray traversal for complex
scenes which utilized ray visibility.

Here's timing of koro.blend:

                    Without visibility check    With visibility check
Original file       4min 20sec                  4min 23sec
Camera rays only    1min 43sec                  55sec

Unfortunately, this doesn't come for free and requires extra data in
BVH node, which increases memory usage of BVH nodes by 15%. This we
can solve with some future trickery of avoiding __tri_storage created
for curve segments.
2016-07-07 17:25:48 +02:00
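A minimal C++ sketch of a per-node visibility check, assuming each inner node stores one mask that combines the visibility flags of everything below it. InnerNode, node_visible and the flag values are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical ray visibility flags.
constexpr std::uint32_t VIS_CAMERA = 1u << 0;
constexpr std::uint32_t VIS_SHADOW = 1u << 1;

// Hypothetical inner node carrying the union of its subtree's visibility flags.
struct InnerNode {
    std::uint32_t visibility;
};

// Only the current node's mask is tested, not the masks of all its children.
inline bool node_visible(const InnerNode &node, std::uint32_t ray_visibility)
{
    assert(ray_visibility != 0u);
    return (node.visibility & ray_visibility) != 0u;
}
```

A subtree that holds only objects hidden from camera rays is rejected with one bitwise AND, which is consistent with the "Camera rays only" case above being the one that speeds up.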
c96a4c8a2a Code refactor: modify mesh storage to use arrays rather than vectors, separate some arrays.
Differential Revision: https://developer.blender.org/D2016
2016-05-28 18:31:00 +02:00
Sergey Sharybin
6a7378f50f Cycles: Proper pack of leaves which are bigger than single float4 2016-04-25 18:57:37 +02:00
Sergey Sharybin
e4cdda548a Cycles: Remove unused SAH from BVH pack 2016-04-11 17:18:14 +02:00
Sergey Sharybin
6cd13a221f Cycles: Rename tri_woop to tri_storage
It's no longer a pre-computed data and just a storage of triangle
coordinates which are faster to access.
2016-04-11 17:18:14 +02:00
Sergey Sharybin
0e47e0cc9e Cycles: Use dedicated BVH for subsurface ray casting
This commit makes it so casting subsurface rays will totally ignore all
the BVH nodes and primitives which do not belong to a current object,
making for much simpler traversal code and reducing the number of intersection
tests.

Reviewers: brecht, juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1823
2016-03-25 13:42:13 +01:00
Thomas Dinges
97a3fa17d6 Cleanup: Remove some more BVH cache code, for reading/writing the cache. 2015-09-24 16:49:10 +02:00
Thomas Dinges
dfadf18659 Cleanup: Remove some underlying code for the BVH disk cache.
Notes:
- There is still some bvh cache code, but that is from the engines initial commit, we might clean this up further or keep it.
- Changes in util_cache.h/.c are kept, this might be re-used in the future.
2015-09-24 15:47:27 +02:00
Sergey Sharybin
68478aea01 Cycles: Avoid having duplication of BVH arrays during build
Previous idea behind having vector during building and array for actual storage
was needed in order to minimize amount of re-allocations happening during the
build, but it led to double memory overhead used by those arrays at the vector
to array conversion stage.

Issue with such approach was that for BVH without spatial split size of arrays
is known in advance and it never changes, which made vector to array conversion
totally redundant.

Also after testing with several rather complex from spatial split scenes (such
as trees) it seems even conservative approach of reallocation (when we perform
re-allocation when leaf does not fit into the memory) doesn't give measurable
difference in time.

This makes it so we can switch to array, which will avoid unneeded memory
re-allocations when spatial split is disabled without harming other cases.

It's a bit difficult to measure exact benefit of this change on our production
files here, but depending on the scene it might give quite reasonable memory
save.
2015-06-28 18:15:25 +02:00
Sergey Sharybin
2e91bcfb9d Fix T44544: Cached BVH is broken since BVH leaf split
Still need to solve issues with reading old cache with new builds.
2015-04-29 15:38:07 +05:00
Sergey Sharybin
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of inefficient memory usage caused by BVH boundbox
part being unused by leaf nodes but still being allocated for them. Doing
such split allows to save 6 of float4 values for QBVH per leaf node and 3
of float4 values for regular BVH per leaf node.

This translates into following memory save using 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00
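The per-leaf saving quoted above can be turned into bytes with a little arithmetic. This C++ sketch assumes a float4 is four 32-bit floats; leaf_bytes_saved is a hypothetical helper and the leaf count in the usage note is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

// Size of one float4 element in bytes.
constexpr std::size_t FLOAT4_BYTES = 4 * sizeof(float);

// Bytes saved when every leaf node drops `float4_saved_per_leaf` float4 values.
inline std::size_t leaf_bytes_saved(std::size_t leaf_count, std::size_t float4_saved_per_leaf)
{
    assert(float4_saved_per_leaf > 0);
    return leaf_count * float4_saved_per_leaf * FLOAT4_BYTES;
}
```

For example, one million leaf nodes would save 96,000,000 bytes with QBVH (6 float4 values each) and half of that with regular BVH (3 float4 values each).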
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn't cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
d4c1e98dd4 Fix T43484: Motion blur fails in certain circumstances
The issue was caused by mismatch in how aligned triangles storage was
filled in during BVH construction and how it was used during rendering.

Basically, it was leaving uninitialized storage for triangles when
there was deformation motion blur detected for the mesh. Was likely
some sort of optimization, but in fact it's still possible that regular
triangles would be needed for rendering.

So now we're storing aligned storage for all triangle primitives and
only skipping motion triangles (the deformation motion blur flag from
mesh is now ignored).
2015-03-09 14:15:35 +05:00
Sergey Sharybin
3bc9ac19f5 Cycles: Free memory used by intermediate BVH vectors earlier
Ideally we should get rid of those temporary vectors anyway, but
it's not so trivial because of the alignment. Until then we'll
just have a slightly worse solution. This part of code is not the root
of the issue of memory spikes for now anyway.

But since we're getting rid of temporary memory earlier, the actual spike
is a bit smaller now. For example in franck_sheep file it's now
5489.69MB vs. previously 5599.90MB.
2015-02-19 18:58:21 +05:00
Thomas Dinges
bd92168643 Cycles / BVH: Remove unused temp copy of prim_object.
This will save some memory during BVH Build.
2015-02-18 01:14:59 +01:00
Sergey Sharybin
cb2007906f Cycles: Use bool for is_leaf array
This way we save 3 bytes per BVH node while building BVH, which overall
gives 100Mb memory save when preparing Frank for render.

It's not really much comparing to overall memory usage (which is 11Gb
during scene preparation here) but still doesn't harm to have solved.
2015-01-31 01:49:41 +05:00
Sergey Sharybin
1841b12900 Cycles: Add assert check to triangle packing
Handy for troubleshooting.
2015-01-22 14:27:13 +05:00
Sergey Sharybin
e6c79b7369 Cycles: Fix QBVH refit nodes not setting primitive type properly 2015-01-14 02:17:28 +05:00
Sergey Sharybin
51779d9407 Cycles: Fix crash after recent BVH changes on empty BVH trees
It's apparently not nice to access 0th element of zero-size vector in C++.
2015-01-12 19:11:32 +05:00
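For context, indexing element 0 of an empty std::vector is undefined behavior in C++. A hedged sketch of two ways to guard against it follows; both helpers are hypothetical and are not the fix applied in this commit.

```cpp
#include <cassert>
#include <vector>

// Returns a pointer to the first element, or nullptr for an empty vector,
// instead of evaluating v[0] unconditionally.
template<typename T>
inline const T *first_or_null(const std::vector<T> &v)
{
    return v.empty() ? nullptr : &v[0];
}

// Variant for callers that require a non-empty vector; fails loudly in debug builds.
template<typename T>
inline const T &first_checked(const std::vector<T> &v)
{
    assert(!v.empty());
    return v[0];
}
```

std::vector::data() is another option when only a pointer is needed, since it is valid to call on an empty vector, although the pointer it returns must not be dereferenced in that case.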
Sergey Sharybin
b56f5900dc Cycles: BVH params option to split leaf node by primitive types
The idea of this change is make it possible to split leaf nodes by primitive
type, making leaf containing primitives of the same type.

This would become handy when working on a single ray to multiple triangles
intersection code, plus with careful implementation it might give some extra
benefits on BVH traversal code by avoiding primitive type fetch and check for
each primitive in the node. But that's a bit tricky to have benefits on this
change only because depth of BVH increases.

This option is not exposed to the interface at all and not used even secretly,
the commit is only needed to help working further in this direction without
messing around with local patches and worrying of them running out of date.
2015-01-12 14:49:56 +05:00
Campbell Barton
4abe548527 cleanup: style 2015-01-02 19:29:00 +11:00
Sergey Sharybin
b11a2f7075 Cycles: Mark visibility TODO as resolved 2014-12-27 23:38:29 +05:00
Thomas Dinges
4ab821c675 Cleanup: Typo fixes for comments. 2014-12-25 02:42:06 +01:00
Sergey Sharybin
deb06c457d Cycles: Correction for node tail copy on packing BVH
This is harmless for now because tail of the node is zero in there, but better
to fix it early so in the case of extending BVH nodes this code doesn't give
issues.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
03f28553ff Cycles: Implement QBVH tree traversal
This commit implements traversal for QBVH tree, which is based on the old loop
code for traversal itself and Embree for node intersection.

This commit also does some changes to the loop inspired by Embree:

- Visibility flags are only checked for primitives.

  Doing visibility check for every node cost quite reasonable amount of time
  and in most cases those checks are true-positive.

  Other idea here would be to do visibility checks for leaf nodes only, but
  this would need to be investigated further.

- For minimum hair width we extend all the nodes' bounding boxes.

  Again doing curve visibility check is quite costly for each of the nodes and
  those checks return true for most of the hierarchy anyway.

There are number of possible optimization still, but current state is good
enough in terms it makes rendering faster a little bit after recent watertight
commit.

Currently QBVH is only implemented for CPU with SSE2 support at least. All
other devices would need to be supported later (if that'd make sense from
performance point of view).

The code is enabled for compilation in kernel, but Blender wouldn't use it
still.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
788fb8321a Cycles: Store proper empty boundbox for missing child nodes in QBVH
The idea is to make sure those children would never be intersected with a ray
in order to make it so kernel never worries about number of child nodes.
2014-12-25 02:50:49 +05:00
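One way to get a box that no ray can hit is to store inverted bounds. The C++ sketch below assumes a slab test that picks the near and far planes from the sign of the ray direction, and it assumes non-zero direction components to keep the example short. Box, empty_box and ray_hits_box are hypothetical names, and whether Cycles uses exactly these values is not stated in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>

struct Box {
    float min[3];
    float max[3];
};

// Inverted bounds: min is larger than max on every axis.
inline Box empty_box()
{
    return Box{{FLT_MAX, FLT_MAX, FLT_MAX}, {-FLT_MAX, -FLT_MAX, -FLT_MAX}};
}

// Slab test choosing the near/far plane per axis from the direction sign.
inline bool ray_hits_box(const Box &box, const float origin[3], const float dir[3], float tmax)
{
    float tnear = 0.0f;
    float tfar = tmax;
    for (int i = 0; i < 3; ++i) {
        assert(dir[i] != 0.0f);  // simplification for this sketch
        const float inv = 1.0f / dir[i];
        const float near_plane = (inv >= 0.0f) ? box.min[i] : box.max[i];
        const float far_plane = (inv >= 0.0f) ? box.max[i] : box.min[i];
        tnear = std::max(tnear, (near_plane - origin[i]) * inv);
        tfar = std::min(tfar, (far_plane - origin[i]) * inv);
    }
    return tnear <= tfar;
}
```

With inverted bounds the near distance always ends up far larger than the far distance, so the traversal loop can test all child slots without first checking how many children a node really has. A slab test that sorts the two plane distances with min/max would not have this property.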
Sergey Sharybin
f770bc4757 Cycles: Implement watertight ray/triangle intersection
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection

  http://jcgt.org/published/0002/01/05/paper.pdf

This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.

Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.

But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and its watertight single ray intersection is not any faster than this.

Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.

Reviewers: brecht, juicyfruit

Subscribers: dingto, ton

Differential Revision: https://developer.blender.org/D819
2014-12-25 02:50:49 +05:00
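A compact C++ sketch of the single-precision core of the test described in the cited paper. It is not the Cycles implementation: it omits the paper's double-precision fallback for edge values that evaluate to exactly zero, does no backface culling, and uses a division where optimized versions avoid one. The function name and signature are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Returns true and writes the hit distance to *t_out when the ray hits the triangle.
inline bool ray_triangle_watertight(const float org[3], const float dir[3], float tmax,
                                    const float v0[3], const float v1[3], const float v2[3],
                                    float *t_out)
{
    // Choose kz as the axis where the direction is largest, then kx and ky.
    int kz = 0;
    if (std::fabs(dir[1]) > std::fabs(dir[kz])) kz = 1;
    if (std::fabs(dir[2]) > std::fabs(dir[kz])) kz = 2;
    assert(dir[kz] != 0.0f);
    int kx = (kz + 1) % 3;
    int ky = (kx + 1) % 3;
    if (dir[kz] < 0.0f) std::swap(kx, ky);  // preserve triangle winding

    // Shear constants that map the ray onto the +kz axis.
    const float Sx = dir[kx] / dir[kz];
    const float Sy = dir[ky] / dir[kz];
    const float Sz = 1.0f / dir[kz];

    // Vertices relative to the ray origin.
    float A[3], B[3], C[3];
    for (int i = 0; i < 3; ++i) {
        A[i] = v0[i] - org[i];
        B[i] = v1[i] - org[i];
        C[i] = v2[i] - org[i];
    }

    // Sheared 2D coordinates of the vertices.
    const float Ax = A[kx] - Sx * A[kz], Ay = A[ky] - Sy * A[kz];
    const float Bx = B[kx] - Sx * B[kz], By = B[ky] - Sy * B[kz];
    const float Cx = C[kx] - Sx * C[kz], Cy = C[ky] - Sy * C[kz];

    // Scaled barycentric coordinates from 2D edge functions.
    const float U = Cx * By - Cy * Bx;
    const float V = Ax * Cy - Ay * Cx;
    const float W = Bx * Ay - By * Ax;

    // Mixed signs mean the origin of the sheared plane lies outside the triangle.
    if ((U < 0.0f || V < 0.0f || W < 0.0f) && (U > 0.0f || V > 0.0f || W > 0.0f)) {
        return false;
    }
    const float det = U + V + W;
    if (det == 0.0f) {
        return false;
    }

    // Scaled hit distance from the sheared z coordinates.
    const float T = U * (Sz * A[kz]) + V * (Sz * B[kz]) + W * (Sz * C[kz]);
    const float t = T / det;
    if (t < 0.0f || t > tmax) {
        return false;
    }
    *t_out = t;
    return true;
}
```

The edge functions for two triangles sharing an edge are evaluated from the same sheared vertex coordinates, which is what keeps rays from slipping through the shared edge.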
Sergey Sharybin
57d235d9f4 Cycles: Optimize storage of QBVH node by one float4
The idea is to store visibility flags for leaf nodes only since visibility check
for inner nodes costs too much for QBVH hence it is not optimal to perform.

Leaf QBVH nodes have plenty of space to store all sort of flags, so we can make
nodes one element smaller, saving noticeable amount of memory.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
144096faad Cycles: Make it more clear offsets in BVH construction
Previously offsets were calculated based on the BVH node size,
which is wrong and real PITA in cases when some extra data is
to be added into (or removed from) the node.

Now use offsets which are not calculated from the node size.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f27d87d300 Cycles: Replace magic constant in the code with actual node size 2014-12-25 02:50:49 +05:00