Commit Graph

24 Commits

Author SHA1 Message Date
Sergey Sharybin
70e7c0829e Cycles: Deduplicate QBVH node packing across BVH build and refit 2016-09-09 11:32:05 +02:00
Sergey Sharybin
b03e66e75f Cycles: Implement unaligned nodes BVH builder
This is a special builder type which is allowed to orient nodes to
strands direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.

Implementation of BVH builder is based on Embree, and generally idea
there is to calculate axis-aligned SAH and oriented SAH and if SAH
of oriented node is smaller than axis-aligned SAH we create unaligned
node.

We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't have
any any extra calculations needed to set up hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH nodes type.

This new builder is currently not in use, still need to make BVH
traversal code aware of unaligned nodes.
2016-07-07 17:25:48 +02:00
Sergey Sharybin
17e7454263 Cycles: Reduce memory usage by de-duplicating triangle storage
There are several internal changes for this:

First idea is to make __tri_verts to behave similar to __tri_storage,
meaning, __tri_verts array now contains all vertices of all triangles
instead of just mesh vertices. This saves some lookup when reading
triangle coordinates in functions like triangle_normal().

In order to make it efficient needed to store global triangle offset
somewhere. So no __tri_vindex.w contains a global triangle index which
can be used to read triangle vertices.

Additionally, the order of vertices in that array is aligned with
primitives from BVH. This is needed to keep cache as much coherent as
possible for BVH traversal. This causes some extra tricks needed to
fill the array in and deal with True Displacement but those trickery
is fully required to prevent noticeable slowdown.

Next idea was to use this __tri_verts instead of __tri_storage in
intersection code. Unfortunately, this is quite tricky to do without
noticeable speed loss. Mainly this loss is caused by extra lookup
happening to access vertex coordinate.

Fortunately, tricks here and there (i,e, some types changes to avoid
casts which are not really coming for free) reduces those losses to
an acceptable level. So now they are within couple of percent only,

On a positive site we've achieved:

- Few percent of memory save with triangle-only scenes. Actual save
  in this case is close to size of all vertices.

  On a more fine-subdivided scenes this benefit might become more
  obvious.

- Huge memory save of hairy scenes. For example, on koro.blend
  there is about 20% memory save. Similar figure for bunny.blend.

This memory save was the main goal of this commit to move forward
with Hair BVH which required more memory per BVH node. So while
this sounds exciting, this memory optimization will become invisible
by upcoming Hair BVH work.

But again on a positive side, we can add an option to NOT use Hair
BVH and then we'll have same-ish render times as we've got currently
but will have this 20% memory benefit on hairy scenes.
2016-07-07 17:25:48 +02:00
Sergey Sharybin
1eacbf47e3 Cycles: Support visibility check for inner nodes of QBVH
It was initially unsupported because initial idea of checking visibility
of all children was slowing scenes down a lot. Now the idea has changed
and we only perform visibility check of current node. This avoids huge
slowdown (from tests here it seems to be withing 1-2%, but more tests
would never hurt) and gives nice speedup of ray traversal for complex
scenes which utilized ray visibility.

Here's timing of koro.blend:

                  Without visibility check         With visibility check
Original file           4min 20sec                      4min 23sec
Camera rays only        1min 43 sec                       55sec

Unfortunately, this doesn't come for free and requires extra data in
BVH node, which increases memory usage of BVH nodes by 15%. This we
can solve with some future trickery of avoiding __tri_storage created
for curve segments.
2016-07-07 17:25:48 +02:00
Sergey Sharybin
e4cdda548a Cycles: Remove unused SAH from BVH pack 2016-04-11 17:18:14 +02:00
Sergey Sharybin
6cd13a221f Cycles: Rename tri_woop to tri_storage
It's no longer a pre-computed data and just a storage of triangle
coordinates which are faster to access to.
2016-04-11 17:18:14 +02:00
Thomas Dinges
97a3fa17d6 Cleanup: Remove some more BVH cache code, for reading/writing the cache. 2015-09-24 16:49:10 +02:00
Thomas Dinges
dfadf18659 Cleanup: Remove some underlying code for the BVH disk cache.
Notes:
- There is still some bvh cache code, but that is from the engines initial commit, we might clean this up further or keep it.
- Changes in util_cache.h/.c are kept, this might be re-used in the future.
2015-09-24 15:47:27 +02:00
Sergey Sharybin
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of inefficient memory usage caused by BVH boundbox
part being unused by leaf nodes but still being allocated for them. Doing
such split allows to save 6 of float4 values for QBVH per leaf node and 3
of float4 values for regular BVH per leaf node.

This translates into following memory save using 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00
Thomas Dinges
bd92168643 Cycles / BVH: Remove unused temp copy of prim_object.
This will save some memory during BVH Build.
2015-02-18 01:14:59 +01:00
Sergey Sharybin
cb2007906f Cycles: Use bool for is_lead array
This way we save 3 bytes per BVH node while building BVH, which overall
gives 100Mb memory save when preparing Frank for render.

It's not really much comparing to overall memory usage (which is 11Gb
during scene preparation here) but still doesn't harm to have solved.
2015-01-31 01:49:41 +05:00
Sergey Sharybin
f770bc4757 Cycles: Implement watertight ray/triangle intersection
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection

  http://jcgt.org/published/0002/01/05/paper.pdf

This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.

Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.

But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and it's watertight single ray intersection is not any faster that this.

Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.

Reviewers: brecht, juicyfruit

Subscribers: dingto, ton

Differential Revision: https://developer.blender.org/D819
2014-12-25 02:50:49 +05:00
Sergey Sharybin
57d235d9f4 Cycles: Optimize storage of QBVH node by one float4
The idea is to store visibility flags for leaf nodes only since visibility check
for inner nodes costs too much for QBVH hence it is not optimal to perform.

Leaf QBVH nodes have plenty of space to store all sort of flags, so we can make
nodes one element smaller, saving noticeable amount of memory.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
8cfac731a5 Cycles: Implement refit_nodes for QBVH
Title says it all, quite straightforward implementation.

Would only mention that there's a bit of code duplication around packing node
into pack.nodes. Trying to de-duplicate it ends up in quite hairy code (like
functions with loads of arguments some of which could be NULL in certain
circumstances etc..). Leaving solving this duplication for later.
2014-12-25 02:50:49 +05:00
0509553b5e Cycles code refactor: changes to make adding new primitive types easier. 2014-03-29 13:03:46 +01:00
Thomas Dinges
5ce3588c6c Cycles:
* Increase the maximum amount of closures per shader from 16 to 64, so more complex closure trees can be rendered.

I measured performance on CPU and GPU (Geforce 540M) and couldn't find a performance impact, but if someone encounters a noticeable impact on his system, please report.
2013-07-30 09:26:45 +00:00
Brecht Van Lommel
bf25f1ea96 Cycles Hair: refactoring to store curves with the index of the first key and the
number of keys in the curve, rather than curve segments with the indices of two
keys. ShaderData.segment now stores the segment number in the curve.
2013-01-03 12:09:09 +00:00
Stuart Broadfoot
e9ba345c46 New feature
Patch [#33445] - Experimental Cycles Hair Rendering (CPU only)

This patch allows hair data to be exported to cycles and introduces a new line segment primitive to render with.

The UI appears under the particle tab and there is a new hair info node available.

It is only available under the experimental feature set and for cpu rendering.
2012-12-28 14:21:30 +00:00
Campbell Barton
0fbb6bff27 style cleanup: block comments 2012-06-09 17:22:52 +00:00
Brecht Van Lommel
92764260d7 Cycles: add option to cache BVH's between subsequent renders, storing the BVH on
disk to be reused by the next render.

This is useful for rendering animations where only the camera or materials change.
Note that saving the BVH to disk only to be removed for the next frame is slower
if this is not the case and the meshes do actually change.

For a render, it will save bvh files to the cache user directory, and remove all
cache files from other renders. The files are named using a MD5 hash based on the
mesh, to verify if the meshes are still the same.
2012-01-16 13:13:37 +00:00
Brecht Van Lommel
df625253ac Cycles:
* Add max diffuse/glossy/transmission bounces
* Add separate min/max for transparent depth
* Updated/added some presets that use these options
* Add ray visibility options for objects, to hide them from
  camera/diffuse/glossy/transmission/shadow rays
* Is singular ray output for light path node

Details here:
http://wiki.blender.org/index.php/Dev:2.5/Source/Render/Cycles/LightPaths
2011-09-01 15:53:36 +00:00
Brecht Van Lommel
18d709022e Cycles: clang build fixes. 2011-08-10 19:45:08 +00:00
Brecht Van Lommel
2996f08f84 Cycles: first batch of windows build fixes, not quite there yet. 2011-05-03 18:29:11 +00:00
Ton Roosendaal
da376e0237 Cycles render engine, initial commit. This is the engine itself, blender modifications and build instructions will follow later.
Cycles uses code from some great open source projects, many thanks them:

* BVH building and traversal code from NVidia's "Understanding the Efficiency of Ray Traversal on GPUs":
http://code.google.com/p/understanding-the-efficiency-of-ray-traversal-on-gpus/
* Open Shading Language for a large part of the shading system:
http://code.google.com/p/openshadinglanguage/
* Blender for procedural textures and a few other nodes.
* Approximate Catmull Clark subdivision from NVidia Mesh tools:
http://code.google.com/p/nvidia-mesh-tools/
* Sobol direction vectors from:
http://web.maths.unsw.edu.au/~fkuo/sobol/
* Film response functions from:
http://www.cs.columbia.edu/CAVE/software/softlib/dorf.php
2011-04-27 11:58:34 +00:00