blender/intern/cycles/bvh/bvh.h

205 lines
5.8 KiB
C
Raw Normal View History

/*
* Adapted from code copyright 2009-2010 NVIDIA Corporation
* Modifications Copyright 2011, Blender Foundation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#ifndef __BVH_H__
#define __BVH_H__
#include "bvh/bvh_params.h"
#include "util/util_types.h"
#include "util/util_vector.h"
CCL_NAMESPACE_BEGIN
class BVHNode;
struct BVHStackEntry;
class BVHParams;
class BoundBox;
class LeafNode;
class Object;
class Progress;
#define BVH_NODE_SIZE 4
#define BVH_NODE_LEAF_SIZE 1
#define BVH_QNODE_SIZE 8
#define BVH_QNODE_LEAF_SIZE 1
#define BVH_ALIGN 4096
#define TRI_NODE_SIZE 3
#define BVH_UNALIGNED_NODE_SIZE 7
#define BVH_UNALIGNED_QNODE_SIZE 14
/* Packed BVH
*
* BVH stored as it will be used for traversal on the rendering device. */
struct PackedBVH {
/* BVH nodes storage, one node is 4x int4, and contains two bounding boxes,
* and child, triangle or object indexes depending on the node type */
array<int4> nodes;
/* BVH leaf nodes storage. */
array<int4> leaf_nodes;
/* object index to BVH node index mapping for instances */
array<int> object_node;
Cycles: Reduce memory usage by de-duplicating triangle storage There are several internal changes for this: First idea is to make __tri_verts to behave similar to __tri_storage, meaning, __tri_verts array now contains all vertices of all triangles instead of just mesh vertices. This saves some lookup when reading triangle coordinates in functions like triangle_normal(). In order to make it efficient needed to store global triangle offset somewhere. So no __tri_vindex.w contains a global triangle index which can be used to read triangle vertices. Additionally, the order of vertices in that array is aligned with primitives from BVH. This is needed to keep cache as much coherent as possible for BVH traversal. This causes some extra tricks needed to fill the array in and deal with True Displacement but those trickery is fully required to prevent noticeable slowdown. Next idea was to use this __tri_verts instead of __tri_storage in intersection code. Unfortunately, this is quite tricky to do without noticeable speed loss. Mainly this loss is caused by extra lookup happening to access vertex coordinate. Fortunately, tricks here and there (i,e, some types changes to avoid casts which are not really coming for free) reduces those losses to an acceptable level. So now they are within couple of percent only, On a positive site we've achieved: - Few percent of memory save with triangle-only scenes. Actual save in this case is close to size of all vertices. On a more fine-subdivided scenes this benefit might become more obvious. - Huge memory save of hairy scenes. For example, on koro.blend there is about 20% memory save. Similar figure for bunny.blend. This memory save was the main goal of this commit to move forward with Hair BVH which required more memory per BVH node. So while this sounds exciting, this memory optimization will become invisible by upcoming Hair BVH work. But again on a positive side, we can add an option to NOT use Hair BVH and then we'll have same-ish render times as we've got currently but will have this 20% memory benefit on hairy scenes.
2016-06-10 14:13:50 +00:00
/* Mapping from primitive index to index in triangle array. */
array<uint> prim_tri_index;
/* Continuous storage of triangle vertices. */
array<float4> prim_tri_verts;
/* primitive type - triangle or strand */
array<int> prim_type;
/* visibility visibilitys for primitives */
array<uint> prim_visibility;
/* mapping from BVH primitive index to true primitive index, as primitives
2012-06-09 17:22:52 +00:00
* may be duplicated due to spatial splits. -1 for instances. */
array<int> prim_index;
/* mapping from BVH primitive index, to the object id of that primitive. */
array<int> prim_object;
Cycles: Fix wrong hair render results when using BVH motion steps The issue here was mainly coming from minimal pixel width feature which is quite commonly enabled in production shots. This feature will use some probabilistic heuristic in the curve intersection function to check whether we need to return intersection or not. This probability is calculated for every intersection check. Now, when we use multiple BVH nodes for curve primitives we increase probability of that primitive to be considered a good intersection for us. This is similar to increasing minimal width of curve. What is worst here is that change in the intersection probability fully depends on exact layout of BVH, meaning probability might change differently depending on a view angle, the way how builder binned the primitives and such. This makes it impossible to do simple check like dividing probability by number of BVH steps. Other solution might have been to split BVH into fully independent trees, but that will increase memory usage of all the static objects in the scenes, which is also not something desirable. For now used most simple but robust approach: store BVH primitives time and test it in curve intersection functions. This solves the regression, but has two downsides: - Uses more memory. which isn't surprising, and ANY solution to this problem will use more memory. What we still have to do is to avoid this memory increase for cases when we don't use BVH motion steps. - Reduces number of maximum available textures on pre-kepler cards. There is not much we can do here, hardware gets old but we need to move forward on more modern hardware..
2017-02-15 09:56:54 +00:00
/* Time range of BVH primitive. */
array<float2> prim_time;
/* index of the root node. */
int root_index;
PackedBVH()
{
root_index = 0;
}
};
/* BVH */
class BVH
{
public:
PackedBVH pack;
BVHParams params;
vector<Object*> objects;
static BVH *create(const BVHParams& params, const vector<Object*>& objects);
2011-08-10 19:45:08 +00:00
virtual ~BVH() {}
void build(Progress& progress);
void refit(Progress& progress);
protected:
BVH(const BVHParams& params, const vector<Object*>& objects);
Cycles: Reduce memory usage by de-duplicating triangle storage There are several internal changes for this: First idea is to make __tri_verts to behave similar to __tri_storage, meaning, __tri_verts array now contains all vertices of all triangles instead of just mesh vertices. This saves some lookup when reading triangle coordinates in functions like triangle_normal(). In order to make it efficient needed to store global triangle offset somewhere. So no __tri_vindex.w contains a global triangle index which can be used to read triangle vertices. Additionally, the order of vertices in that array is aligned with primitives from BVH. This is needed to keep cache as much coherent as possible for BVH traversal. This causes some extra tricks needed to fill the array in and deal with True Displacement but those trickery is fully required to prevent noticeable slowdown. Next idea was to use this __tri_verts instead of __tri_storage in intersection code. Unfortunately, this is quite tricky to do without noticeable speed loss. Mainly this loss is caused by extra lookup happening to access vertex coordinate. Fortunately, tricks here and there (i,e, some types changes to avoid casts which are not really coming for free) reduces those losses to an acceptable level. So now they are within couple of percent only, On a positive site we've achieved: - Few percent of memory save with triangle-only scenes. Actual save in this case is close to size of all vertices. On a more fine-subdivided scenes this benefit might become more obvious. - Huge memory save of hairy scenes. For example, on koro.blend there is about 20% memory save. Similar figure for bunny.blend. This memory save was the main goal of this commit to move forward with Hair BVH which required more memory per BVH node. So while this sounds exciting, this memory optimization will become invisible by upcoming Hair BVH work. But again on a positive side, we can add an option to NOT use Hair BVH and then we'll have same-ish render times as we've got currently but will have this 20% memory benefit on hairy scenes.
2016-06-10 14:13:50 +00:00
/* triangles and strands */
void pack_primitives();
void pack_triangle(int idx, float4 storage[3]);
/* merge instance BVH's */
void pack_instances(size_t nodes_size, size_t leaf_nodes_size);
/* for subclasses to implement */
virtual void pack_nodes(const BVHNode *root) = 0;
virtual void refit_nodes() = 0;
};
/* Binary BVH
*
* Typical BVH with each node having two children. */
class BinaryBVH : public BVH {
protected:
/* constructor */
friend class BVH;
BinaryBVH(const BVHParams& params, const vector<Object*>& objects);
/* pack */
void pack_nodes(const BVHNode *root);
void pack_leaf(const BVHStackEntry& e,
const LeafNode *leaf);
void pack_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1);
void pack_aligned_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1);
void pack_aligned_node(int idx,
const BoundBox& b0,
const BoundBox& b1,
int c0, int c1,
uint visibility0, uint visibility1);
void pack_unaligned_inner(const BVHStackEntry& e,
const BVHStackEntry& e0,
const BVHStackEntry& e1);
void pack_unaligned_node(int idx,
const Transform& aligned_space0,
const Transform& aligned_space1,
const BoundBox& b0,
const BoundBox& b1,
int c0, int c1,
uint visibility0, uint visibility1);
/* refit */
void refit_nodes();
void refit_node(int idx, bool leaf, BoundBox& bbox, uint& visibility);
};
/* QBVH
*
* Quad BVH, with each node having four children, to use with SIMD instructions. */
class QBVH : public BVH {
protected:
/* constructor */
friend class BVH;
QBVH(const BVHParams& params, const vector<Object*>& objects);
/* pack */
void pack_nodes(const BVHNode *root);
void pack_leaf(const BVHStackEntry& e, const LeafNode *leaf);
void pack_inner(const BVHStackEntry& e, const BVHStackEntry *en, int num);
void pack_aligned_inner(const BVHStackEntry& e,
const BVHStackEntry *en,
int num);
void pack_aligned_node(int idx,
const BoundBox *bounds,
const int *child,
const uint visibility,
const float time_from,
const float time_to,
const int num);
void pack_unaligned_inner(const BVHStackEntry& e,
const BVHStackEntry *en,
int num);
void pack_unaligned_node(int idx,
const Transform *aligned_space,
const BoundBox *bounds,
const int *child,
const uint visibility,
const float time_from,
const float time_to,
const int num);
/* refit */
void refit_nodes();
void refit_node(int idx, bool leaf, BoundBox& bbox, uint& visibility);
};
CCL_NAMESPACE_END
#endif /* __BVH_H__ */