This way we can get rid of inefficient memory usage caused by BVH boundbox
part being unused by leaf nodes but still being allocated for them. Doing
such split allows to save 6 of float4 values for QBVH per leaf node and 3
of float4 values for regular BVH per leaf node.
This translates into following memory save using 01.01.01.G rendered
without hair:
Device memory size Device memory peak Global memory peak
Before the patch: 4957 5051 7668
With the patch: 4467 4562 7332
The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.
Reviewers: brecht, juicyfruit
Subscribers: venomgfx, eyecandy
Differential Revision: https://developer.blender.org/D1236
This commit implements traversal for QBVH tree, which is based on the old loop
code for traversal itself and Embree for node intersection.
This commit also does some changes to the loop inspired by Embree:
- Visibility flags are only checked for primitives.
Doing visibility check for every node cost quite reasonable amount of time
and in most cases those checks are true-positive.
Other idea here would be to do visibility checks for leaf nodes only, but
this would need to be investigated further.
- For minimum hair width we extend all the nodes' bounding boxes.
Again doing curve visibility check is quite costly for each of the nodes and
those checks returns truth for most of the hierarchy anyway.
There are number of possible optimization still, but current state is good
enough in terms it makes rendering faster a little bit after recent watertight
commit.
Currently QBVH is only implemented for CPU with SSE2 support at least. All
other devices would need to be supported later (if that'd make sense from
performance point of view).
The code is enabled for compilation in kernel. but blender wouldn't use it
still.
This way extending intersection routines with some pre-calculation step wouldn't
explode the single file size, hopefully keeping them all in a nice maintainable
state.
These are internally stored as a 3D image textures, but accessible like e.g.
UV coordinates though the attribute node and getattribute().
This is convenient for rendering e.g. smoke objects where data like density is
really a property of the mesh, and it avoids having to specify the smoke object
in a texture node, instead the material will work with any smoke domain.