Commit Graph

106 Commits

Author SHA1 Message Date
Thomas Dinges
97a3fa17d6 Cleanup: Remove some more BVH cache code, for reading/writing the cache. 2015-09-24 16:49:10 +02:00
Thomas Dinges
dfadf18659 Cleanup: Remove some underlying code for the BVH disk cache.
Notes:
- There is still some bvh cache code, but that is from the engines initial commit, we might clean this up further or keep it.
- Changes in util_cache.h/.c are kept, this might be re-used in the future.
2015-09-24 15:47:27 +02:00
Sergey Sharybin
4ac5859f05 Cycles: Avoid copying objects in some places of BVH build
Gives barely measurable speedup of Spatial Split BVH build.
2015-08-30 16:48:16 +02:00
Sergey Sharybin
2960630b7b Cycles: Use better policy for primitive array resize for spatial split
Gives around 50% of spatial split BVH build speedup with grass field from
cassette player shot from Gooseberry.
2015-08-24 09:46:41 +02:00
Sergey Sharybin
2230130099 Cycles: Speedup of Spatial BVH split code
Avoid memmove() happening on every insert of duplicated node to the references
list. Temporary pre-allocated vector is used for new references which is then
being inserted into actual array in one go later.

Gives around 4x speedup building spatially split BVH for the grass field in the
cassette player shot from Gooseberry.
2015-08-24 09:46:40 +02:00
Sergey Sharybin
334208e670 Cycles: Implementation of object reference nodes spatial split
This commit implements object reference node spatial split making it possible
to use spatial split for top-level BVH.

The code is not in use yet because enabling spatial split on top level BVH is
not coming for free and it needs to be investigated if it's worth in terms of
improved render times.
2015-08-24 09:46:40 +02:00
Sergey Sharybin
c88c5db360 Cycles: Make primitive split code easier for re-use by reference splitting function 2015-08-24 09:46:40 +02:00
Sergey Sharybin
99c1870ad5 Cycles: Move primitive splitting into own functions
This way it's easy to add more reference types allowed for splitting to the
BVH reference split function without making this function too much big. This
way it's possible to experiment with such features as splitting object instance
references.

So far should not be any functional changes.
2015-08-24 09:46:40 +02:00
Sergey Sharybin
9b57d70f3b Cycles: BVH statistics print was missing for spatially split BVH tree 2015-08-24 09:46:40 +02:00
Sergey Sharybin
68478aea01 Cycles: Avoid having duplication of BVH arrays during build
Previous idea behind having vector during building and array for actual storage
was needed in order to minimize amount of re-allocations happening during the
build, but it lead to double memory overhead used by those arrays at the vector
to array conversion stage.

Issue with such approach was that for BVH without spatial split size of arrays
is known in advance and it never changes, which made vector to array conversion
totally redundant.

Also after testing with several rather complex from spatial split scenes (such
as trees) it seems even conservative approach of reallocation (when we perform
re-allocation when leaf does not fit into the memory) doesn't give measurable
difference in time.

This makes it so we can switch to array, which will avoid unneeded memory
re-allocations when spatial split is disabled without harming other cases.

it's a bit difficult to measure exact benefit of this change on our production
files here, but depending on the scene it might give quite reasonable memory
save.
2015-06-28 18:15:25 +02:00
Campbell Barton
e59bd19fa7 Cleanup: style & const's 2015-05-05 05:19:49 +10:00
Sergey Sharybin
2e91bcfb9d Fix T44544: Cached BVH is broken since BVH leaf split
Still need to solve issues with reading old cache with new builds.
2015-04-29 15:38:07 +05:00
Sergey Sharybin
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of inefficient memory usage caused by BVH boundbox
part being unused by leaf nodes but still being allocated for them. Doing
such split allows to save 6 of float4 values for QBVH per leaf node and 3
of float4 values for regular BVH per leaf node.

This translates into following memory save using 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00
Sergey Sharybin
dd0604c606 Fix T44193: Hair intersection with duplis causes flickering
It was an issue with what bounds to use for BVH node during construction.

Also corrected case when there are all 4 primitive types in the range and
also there're objects in the same range.
2015-03-31 00:24:43 +05:00
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn;t cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
585dd26120 Cycles: Code cleanup, prepare for strict C++ flags 2015-03-27 18:23:31 +05:00
Sergey Sharybin
919a665497 Cycles: Avoid memcpy of intersecting memory
Could happen when assignment happens to self during sorting.
2015-03-20 21:14:50 +05:00
Sergey Sharybin
d4c1e98dd4 Fix T43484: Motion blur fails in certain circumstances
The issue was caused by mismatch in how aligned triangles storage was
filled in during BVH construction and how it was used during rendering.

Basically, i  was leaving uninitialized storage for triangles when
there was deformation motion blur detected for the mesh. Was likely
some sort of optimization, but in fact it's still possible that regular
triangles would be needed for rendering.

So now we're storing aligned storage for all triangle primitives and
only skipping motion triangles (the deformation motion blur flag from
mesh is now ignored).
2015-03-09 14:15:35 +05:00
Sergey Sharybin
3bc9ac19f5 Cycles: Free memory used by intermediate BVH vectors earlier
Ideally we should get rid of those temporary vectors anyway, but
it's not so trivial because of the alignment. For untl then we'll
just have a bit worse solution. This part of code is not the root
of the issue of memory spikes for now anyway.

But since we're getting rid of temporary memory earlier actual spike
is a bit smaller as now. For example in franck_sheep file it's now
5489.69MB vs. previously 5599.90MB.
2015-02-19 18:58:21 +05:00
Thomas Dinges
7f7413bce2 Cleanup: Use bools in BVHParams class. 2015-02-18 12:05:59 +01:00
Thomas Dinges
bd92168643 Cycles / BVH: Remove unused temp copy of prim_object.
This will save some memory during BVH Build.
2015-02-18 01:14:59 +01:00
Sergey Sharybin
cb2007906f Cycles: Use bool for is_lead array
This way we save 3 bytes per BVH node while building BVH, which overall
gives 100Mb memory save when preparing Frank for render.

It's not really much comparing to overall memory usage (which is 11Gb
during scene preparation here) but still doesn't harm to have solved.
2015-01-31 01:49:41 +05:00
Sergey Sharybin
fa46f5a289 Fix T43357: Cycles crash with spatial splits after recent changes
When doing BVH leaf node split we can't rely on leaf size limit from
BVH parameters in case there's spatial split enabled.

This commit basically reverts previous optimization change here which
used stack-allocated memory and uses heap-allocated vector now.

It's possible to boost this code up again by using own allocator.
2015-01-22 14:56:00 +05:00
Sergey Sharybin
1841b12900 Cycles: Add assert check to triangle packing
Handy for troubleshooting.
2015-01-22 14:27:13 +05:00
Sergey Sharybin
2f4aef9f3b Cycles: Avoid crash in statistics when canceling BVH build
Also add missing render_time initialization in progress.
2015-01-19 13:39:35 +05:00
Sergey Sharybin
5d5077957e Cycles; Correction to previous debug print to survive prints from multiple threads
This commit basically makes it so statistics print from different BVH trees are not
being interleaved with each other. Glog ensures this when debug print is done as a
single put to stream operator.
2015-01-16 16:39:02 +05:00
Sergey Sharybin
5684ad8072 Cycles: Report BVH statistics after build 2015-01-16 15:05:53 +05:00
Sergey Sharybin
3f60d665bb Cycles: Fix stupid typo in the previous commit 2015-01-16 02:21:35 +05:00
Sergey Sharybin
146eb7947e Cycles: Tweak to leaf creation criteria in all BVH types
Since leaf node gets split further into per-primitive type leaves old check
for number of curves became a bit ridiculous -- it might lead to two leaf nodes
each of which would contain only one curve primitive (one motion curve and one
regular curve).

This lead to quite dramatic slowdown for Victor model -- around 40%, which is
totally unacceptable.

This commit is aimed to prevent such situation and from quick render test it
seems victor is now back to normal render time. Further testing is needed tho.

There are also other ideas about splitting the node, will need to look into
them next.
2015-01-16 01:42:58 +05:00
Sergey Sharybin
e6c79b7369 Cycles: Fix QBVH refit nodes not setting primitive type properly 2015-01-14 02:17:28 +05:00
Sergey Sharybin
51779d9407 Cycles: Fix crash after recent BVH changes on empty BVH trees
It's apparently not nice to access 0th element of zero-size vector in C++.
2015-01-12 19:11:32 +05:00
Sergey Sharybin
e8730af87f Cycles: Fix compilation error on platforms without SSE support
Overview this in one of the previous BVH commits.
2015-01-12 17:14:40 +05:00
Sergey Sharybin
bc7ff3c2b4 Cycles: Enable leaf split by primitive type and adopt BVH traversal for this
This commit enables BVH leaf nodes split by the primitive type and makes it
so BVH traversal code is now aware and benefits from this.

As was mentioned in original commit, this change is crucial to be able to do
single ray to multiple triangle intersection. But it also appears to give
barely visible speedup in some scene.

In any case there should be no noticeable slowdown, and this change is what
we need to have anyway.
2015-01-12 15:04:52 +05:00
Sergey Sharybin
c707b91ce6 Cycles: Optimize leaf splitting code by avoid vector allocation
Use variables allocated in the stack and avoid heap allocation which should make
leaf splitting code a bit faster.
2015-01-12 14:49:59 +05:00
Sergey Sharybin
b56f5900dc Cycles: BVH params option to split leaf node by primitive types
The idea of this change is make it possible to split leaf nodes by primitive
type, making leaf containing primitives of the same type.

This would become handy when working on a single ray to multiple triangles
intersection code, plus with careful implementation it might give some extra
benefits on BVH traversal code by avoiding primitive type fetch and check for
each primitive in the node. But that's a bit tricky to have benefits on this
change only because depth of BVH increases.

This option is not exposed to the interface at all and not used even secretly,
the commit is only needed to help working further in this direction without
messing around with local patches and worrying of them running out of date.
2015-01-12 14:49:56 +05:00
Campbell Barton
4abe548527 cleanup: style 2015-01-02 19:29:00 +11:00
Sergey Sharybin
b11a2f7075 Cycles: Mark visibility TODO as resolved 2014-12-27 23:38:29 +05:00
Thomas Dinges
4ab821c675 Cleanup: Typo fixes for comments. 2014-12-25 02:42:06 +01:00
Sergey Sharybin
deb06c457d Cycles: Correction for node tail copy on packing BVH
This is harmless for now because tail of the node is zero in there, but better
to fix it early so in the case of extending BVH nodes this code doesn't give
issues.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
03f28553ff Cycles: Implement QBVH tree traversal
This commit implements traversal for QBVH tree, which is based on the old loop
code for traversal itself and Embree for node intersection.

This commit also does some changes to the loop inspired by Embree:

- Visibility flags are only checked for primitives.

  Doing visibility check for every node cost quite reasonable amount of time
  and in most cases those checks are true-positive.

  Other idea here would be to do visibility checks for leaf nodes only, but
  this would need to be investigated further.

- For minimum hair width we extend all the nodes' bounding boxes.

  Again doing curve visibility check is quite costly for each of the nodes and
  those checks returns truth for most of the hierarchy anyway.

There are number of possible optimization still, but current state is good
enough in terms it makes rendering faster a little bit after recent watertight
commit.

Currently QBVH is only implemented for CPU with SSE2 support at least. All
other devices would need to be supported later (if that'd make sense from
performance point of view).

The code is enabled for compilation in kernel. but blender wouldn't use it
still.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
788fb8321a Cycles: Store proper empty boundbox for missing child nodes in QBVH
The idea is to make sure those childs would never be intersected with a ray
in order to make it so kernel never worries about number of child nodes.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f770bc4757 Cycles: Implement watertight ray/triangle intersection
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection

  http://jcgt.org/published/0002/01/05/paper.pdf

This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.

Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.

But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and it's watertight single ray intersection is not any faster that this.

Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.

Reviewers: brecht, juicyfruit

Subscribers: dingto, ton

Differential Revision: https://developer.blender.org/D819
2014-12-25 02:50:49 +05:00
Sergey Sharybin
57d235d9f4 Cycles: Optimize storage of QBVH node by one float4
The idea is to store visibility flags for leaf nodes only since visibility check
for inner nodes costs too much for QBVH hence it is not optimal to perform.

Leaf QBVH nodes have plenty of space to store all sort of flags, so we can make
nodes one element smaller, saving noticeable amount of memory.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
144096faad Cycles: Make it more clear offsets in BVH construction
Previously offsets were calculated based on the BVH node size,
which is wrong and real PITA in cases when some extra data is
to be added into (or removed from) the node.

Now use offsets which are not calculated form the node size.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f27d87d300 Cycles: Replace magic constant in the code with actual node size 2014-12-25 02:50:49 +05:00
Sergey Sharybin
f4a959f734 Cycles: Avoid over-allocation in packing BVH instances
This solves quite an over-allocation in BVH instances packing code,
unfortunately, it's not a magic bullet to solve memory bump caused
by the recent QBVH changes.

For that we'll likely need to decouple storage for leaf and inner
nodes. However, it's not really clear for now if it's something
important since that'd still be just a fraction of memory comparing
to all the hi-res textures.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
8cfac731a5 Cycles: Implement refit_nodes for QBVH
Title says it all, quite straightforward implementation.

Would only mention that there's a bit of code duplication around packing node
into pack.nodes. Trying to de-duplicate it ends up in quite hairy code (like
functions with loads of arguments some of which could be NULL in certain
circumstances etc..). Leaving solving this duplication for later.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
fe4905288d Cycles: Use proper node counter to allocate QBVH nodes
Before all the nodes were counted and allocated, leading to situations when
bunch of allocated memory is not used because reasonable amount of nodes are
simply ignored.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
e77b25fabb Cycles: Code cleanup, typo 2014-12-10 00:08:33 +05:00
Sergey Sharybin
9311a5be04 Cycles: Speedup BVH build for certain compilers
The issue was noticed with gcc-4.7 (used by the release build environment)
which didn't generate optimal enough code for BVH references swap. Seems it
looked up for the assign operator for each of the reference structure members
even though nothing special was required for assignment.

Forcing compiler to use simple memory copy gives speedup of like 2-3 times.

The issue doesn't happen with OSX's clang and new gcc-4.9, but since we're
gonna to stick to gcc-4.7 for official releases for quite some time still it's
nice to have performance issues resolved for all the compilers.

Didn't put the code into #ifdef so if in the future some issues appears with
alignment or assignment which need to happen as an operator we notice this
earlier.
2014-11-24 18:50:46 +05:00