blender/source/gameengine/Rasterizer/RAS_MaterialBucket.cpp

/*
 * $Id$
 * ***** BEGIN GPL LICENSE BLOCK *****
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 *
 * The Original Code is Copyright (C) 2001-2002 by NaN Holding BV.
 * All rights reserved.
 *
 * The Original Code is: all of this file.
 *
 * Contributor(s): none yet.
 *
 * ***** END GPL LICENSE BLOCK *****
 */

/** \file gameengine/Rasterizer/RAS_MaterialBucket.cpp
 *  \ingroup bgerast
 */

#include "RAS_MaterialBucket.h"

#if defined(WIN32) && !defined(FREE_WINDOWS)
#pragma warning (disable:4786)
#endif

#ifdef WIN32
#include <windows.h>
#endif // WIN32

#include "RAS_Polygon.h"
#include "RAS_TexVert.h"
#include "RAS_IRasterizer.h"
#include "RAS_IRenderTools.h"
#include "RAS_MeshObject.h"
#include "RAS_Deformer.h" // __NLA
/* mesh slot */
/*
 * History note (commit of 2009-05-07):
 *
 * BGE performance, 3rd round: culling and rasterizer.
 *
 * This commit extends the technique of dynamic linked lists to the mesh
 * slots so as to eliminate dumb scans or map lookups. It provides a massive
 * performance improvement in the culling and in the rasterizer when the
 * majority of objects are static.
 *
 * Other improvements:
 * - Compute the OpenGL matrix only for objects that are visible.
 * - Simplify the hash function for GEN_HashedPtr.
 * - Scan the light list instead of the general object list to render shadows.
 * - Remove redundant OpenGL calls to set specularity, shininess and diffuse
 *   between each mesh slot.
 * - Cache the GPU material to avoid frequent calls to
 *   GPU_material_from_blender.
 * - Set the fixed elements of a mesh slot only once.
 * - Use more inline functions.
 *
 * The following table shows the performance increase between 2.48, the 1st
 * round and this round of improvement. The test was done with a scene
 * containing 40000 objects, of which approximately 1000 are in the view
 * frustum. The objects are simple textured cubes to make sure the GPU is not
 * the bottleneck. As some of the rasterizer processing time has moved under
 * culling, the figures are the sum of scenegraph (includes culling) and
 * rasterizer time.
 *
 * Scenegraph+rasterizer (ms)          2.48     1st round   3rd round
 * ------------------------------------------------------------------
 * All objects static, all visible,    323.0    86.0        7.2
 *   1000 in the view frustum
 * All objects static, all invisible   219.0    49.7        N/A (*)
 * All objects moving, all visible,    323.0    105.6       34.7
 *   1000 in the view frustum
 * Scene destruction                   40min    40min       4s
 *
 * (*) This time is not representative because the frame rate was at 60fps.
 *     In that case, the GPU holds down the GE by frame sync. By design, the
 *     overhead of the rasterizer is 0 when the objects are invisible.
 *
 * This table shows a global speed-up between 9x and 45x compared to 2.48a
 * for scenegraph, culling and rasterizer overhead. The speed-up goes much
 * higher when objects are invisible.
 *
 * An additional 2-4x speed-up is possible in the scenegraph by upgrading the
 * Moto library to use the Eigen2 BLAS library instead of C++ classes, but the
 * scenegraph is already so fast that it is not a priority right now.
 * Next speed-up in logic: many things to do there...
 */

RAS_MeshSlot::RAS_MeshSlot() : SG_QList()
{
	m_clientObj = NULL;
	m_pDeformer = NULL;
	m_OpenGLMatrix = NULL;
	m_mesh = NULL;
	m_bucket = NULL;
	m_bVisible = false;
	m_bCulled = true;
	m_bObjectColor = false;
	m_RGBAcolor = MT_Vector4(0.0, 0.0, 0.0, 0.0);
	m_DisplayList = NULL;
	m_bDisplayList = true;
	m_joinSlot = NULL;

	/*
	 * History note (commit of 2009-04-21):
	 *
	 * BGE: Support mesh modifiers in the game engine.
	 *
	 * Realtime modifiers applied on mesh objects will be supported in the
	 * game engine with the following limitations:
	 * - Only real time modifiers are supported (basically all of them!)
	 * - Virtual modifiers resulting from parenting are not supported:
	 *   armature, curve, lattice. You can still use these modifiers
	 *   (armature is really not recommended) but in non parent mode.
	 *   The BGE has its own parenting capability for armature.
	 * - Modifiers are computed on the host (using the Blender modifier stack).
	 * - Modifiers are statically evaluated: any possible time dependency in
	 *   the modifiers is not supported (don't know enough about modifiers
	 *   to be more specific).
	 * - Modifiers are reevaluated if the underlying mesh is deformed due to
	 *   shape action or armature action. Beware that this is very CPU
	 *   intensive; modifiers should really be used for static objects only.
	 * - Physics is still based on the original mesh: if you have a mirror
	 *   modifier, the physics shape will be limited to one half of the
	 *   resulting object. Therefore, the modifiers should preferably be
	 *   used on graphic objects.
	 * - Scripts have no access to the modified mesh.
	 * - Modifiers that are based on object interaction (boolean, ...) will
	 *   not be dependent on the object's position in the GE. What you see
	 *   in the 3D view is what you get in the GE regardless of the object
	 *   position, velocity, etc.
	 *
	 * Besides that, the feature is compatible with all the BGE features
	 * that affect meshes: armature action, shape action, replace mesh,
	 * VideoTexture, add object, dupligroup.
	 *
	 * Known problems:
	 * - This feature is a bit hacky: the BGE uses the derived mesh draw
	 *   functions to display the object. This drawing method is a bit slow
	 *   and is not 100% compatible with the BGE. There may be some problems
	 *   in multi-texture mode: the multi-texture coordinates are not sent
	 *   to the GPU. Texface and GLSL on the other hand should be fully
	 *   supported.
	 * - Culling is still based on the extent of the original mesh. If you
	 *   have a modifier that extends the size of the mesh, the object may
	 *   disappear while still in the view frustum.
	 * - The derived mesh is not shared between replicas. The derived mesh
	 *   is allocated and computed for each object with modifiers,
	 *   regardless of whether they are static replicas.
	 * - Display lists are not created on objects with modifiers.
	 *
	 * I should be able to fix the above problems before release. However,
	 * the feature is already useful for game development. Once you are
	 * ready to release the game, you can apply the modifiers to get back
	 * display list support and mesh sharing capability.
	 *
	 * MSVC, scons, CMake, makefile updated.
	 *
	 * Enjoy
	 * /benoit
	 */
	m_pDerivedMesh = NULL;
}

RAS_MeshSlot::~RAS_MeshSlot()
{
	RAS_DisplayArrayList::iterator it;

#ifdef USE_SPLIT
	Split(true);

	while(m_joinedSlots.size())
		m_joinedSlots.front()->Split(true);
#endif

	for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
		(*it)->m_users--;
		if((*it)->m_users == 0)
			delete *it;
	}

	if (m_DisplayList) {
		m_DisplayList->Release();
		m_DisplayList = NULL;
	}
}

RAS_MeshSlot::RAS_MeshSlot(const RAS_MeshSlot& slot) : SG_QList()
{
	RAS_DisplayArrayList::iterator it;

	m_clientObj = NULL;
	m_pDeformer = NULL;
	m_pDerivedMesh = NULL;
	m_OpenGLMatrix = NULL;
	m_mesh = slot.m_mesh;
	m_bucket = slot.m_bucket;
	m_bVisible = slot.m_bVisible;
	m_bCulled = slot.m_bCulled;
	m_bObjectColor = slot.m_bObjectColor;
	m_RGBAcolor = slot.m_RGBAcolor;
	m_DisplayList = NULL;
	m_bDisplayList = slot.m_bDisplayList;
	m_joinSlot = NULL;
	m_currentArray = slot.m_currentArray;
	m_displayArrays = slot.m_displayArrays;
	m_joinedSlots = slot.m_joinedSlots;

	m_startarray = slot.m_startarray;
	m_startvertex = slot.m_startvertex;
	m_startindex = slot.m_startindex;
	m_endarray = slot.m_endarray;
	m_endvertex = slot.m_endvertex;
	m_endindex = slot.m_endindex;

	for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
		// don't copy display arrays for now because it breaks python
		// access to vertices, but we'll need a solution if we want to
		// join display arrays for reducing draw calls.
		//*it = new RAS_DisplayArray(**it);
		//(*it)->m_users = 1;

		(*it)->m_users++;
	}
}
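
// Illustrative sketch, not part of the original file: the copy constructor
// and destructor above share display arrays through a manual user counter
// (m_users). The stand-alone example below mimics that share/release cycle.
// SharedArray, ShareArrays, ReleaseArrays and g_freed are hypothetical names
// introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RAS_DisplayArray: only the user counter matters.
struct SharedArray {
	int m_users;
	SharedArray() : m_users(1) {}
};

typedef std::vector<SharedArray*> SharedArrayList;

// Number of SharedArray objects freed so far (illustration only).
static int g_freed = 0;

// Give a second owner access to every array in 'src' by bumping the
// counters, the way the copy constructor does.
SharedArrayList ShareArrays(const SharedArrayList& src)
{
	SharedArrayList copy(src);
	for (size_t i = 0; i < copy.size(); i++)
		copy[i]->m_users++;
	return copy;
}

// Drop one owner's references and free arrays that have no users left,
// the way the destructor does.
void ReleaseArrays(SharedArrayList& list)
{
	for (size_t i = 0; i < list.size(); i++) {
		list[i]->m_users--;
		if (list[i]->m_users == 0) {
			delete list[i];
			g_freed++;
		}
	}
	list.clear();
}
```

// In this sketch an array is freed only when its last owner releases it,
// so the order in which owners go away does not matter.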

void RAS_MeshSlot::init(RAS_MaterialBucket *bucket, int numverts)
{
	m_bucket = bucket;

	SetDisplayArray(numverts);

	m_startarray = 0;
	m_startvertex = 0;
	m_startindex = 0;
	m_endarray = 0;
	m_endvertex = 0;
	m_endindex = 0;
}

void RAS_MeshSlot::begin(RAS_MeshSlot::iterator& it)
{
	int startvertex, endvertex;
	int startindex, endindex;

	it.array = (m_displayArrays.size() > 0)? m_displayArrays[m_startarray]: NULL;

	if(it.array == NULL || it.array->m_index.size() == 0 || it.array->m_vertex.size() == 0) {
		it.array = NULL;
		it.vertex = NULL;
		it.index = NULL;
		it.startvertex = 0;
		it.endvertex = 0;
		it.totindex = 0;
	}
	else {
		startvertex = m_startvertex;
		endvertex = (m_startarray == m_endarray)? m_endvertex: it.array->m_vertex.size();
		startindex = m_startindex;
		endindex = (m_startarray == m_endarray)? m_endindex: it.array->m_index.size();

		it.vertex = &it.array->m_vertex[0];
		it.index = &it.array->m_index[startindex];
		it.startvertex = startvertex;
		it.endvertex = endvertex;
		it.totindex = endindex-startindex;
		it.arraynum = m_startarray;
	}
}

void RAS_MeshSlot::next(RAS_MeshSlot::iterator& it)
{
	int startvertex, endvertex;
	int startindex, endindex;

	if(it.arraynum == (size_t)m_endarray) {
		it.array = NULL;
		it.vertex = NULL;
		it.index = NULL;
		it.startvertex = 0;
		it.endvertex = 0;
		it.totindex = 0;
	}
	else {
		it.arraynum++;
		it.array = m_displayArrays[it.arraynum];

		startindex = 0;
		endindex = (it.arraynum == (size_t)m_endarray)? m_endindex: it.array->m_index.size();
		startvertex = 0;
		endvertex = (it.arraynum == (size_t)m_endarray)? m_endvertex: it.array->m_vertex.size();

		it.vertex = &it.array->m_vertex[0];
		it.index = &it.array->m_index[startindex];
		it.startvertex = startvertex;
		it.endvertex = endvertex;
		it.totindex = endindex-startindex;
	}
}

bool RAS_MeshSlot::end(RAS_MeshSlot::iterator& it)
{
	return (it.array == NULL);
}
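
// Illustrative sketch, not part of the original file: begin()/next()/end()
// walk a range that may span several display arrays. The first array starts
// at the slot's start offset, the last one stops at the slot's end offset,
// and arrays in between are covered completely. The example below reproduces
// that range arithmetic for the index lists only. MiniSlot and CountIndices
// are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified slot: each inner vector plays the role of one
// display array's index list.
struct MiniSlot {
	std::vector<std::vector<int> > arrays;
	size_t startarray, startindex;
	size_t endarray, endindex;
};

// Count the indices covered by the slot, using the same per-array bounds
// as the iterator: [startindex, size) for the first array, [0, size) for
// middle arrays and [0, endindex) for the last array.
size_t CountIndices(const MiniSlot& slot)
{
	if (slot.arrays.empty())
		return 0;

	size_t total = 0;
	for (size_t a = slot.startarray; a <= slot.endarray; a++) {
		size_t first = (a == slot.startarray) ? slot.startindex : 0;
		size_t last = (a == slot.endarray) ? slot.endindex : slot.arrays[a].size();
		total += last - first;
	}
	return total;
}
```

// With the real class the equivalent loop is the one used in Join() and
// Split(): for(begin(mit); !end(mit); next(mit)) { ... }.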

RAS_DisplayArray *RAS_MeshSlot::CurrentDisplayArray()
{
	return m_currentArray;
}

void RAS_MeshSlot::SetDisplayArray(int numverts)
{
	RAS_DisplayArrayList::iterator it;
	RAS_DisplayArray *darray = NULL;

	for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
		darray = *it;

		if(darray->m_type == numverts) {
			if(darray->m_index.size()+numverts >= RAS_DisplayArray::BUCKET_MAX_INDEX)
				darray = NULL;
			else if(darray->m_vertex.size()+numverts >= RAS_DisplayArray::BUCKET_MAX_VERTEX)
				darray = NULL;
			else
				break;
		}
		else
			darray = NULL;
	}

	if(!darray) {
		darray = new RAS_DisplayArray();
		darray->m_users = 1;

		if(numverts == 2)
			darray->m_type = RAS_DisplayArray::LINE;
		else if(numverts == 3)
			darray->m_type = RAS_DisplayArray::TRIANGLE;
		else
			darray->m_type = RAS_DisplayArray::QUAD;

		m_displayArrays.push_back(darray);

		m_endarray = m_displayArrays.size()-1;
		m_endvertex = 0;
		m_endindex = 0;
	}

	m_currentArray = darray;
}
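
// Illustrative sketch, not part of the original file: SetDisplayArray()
// reuses the first existing array that has the right primitive type and
// still has room, and only allocates a new one when none qualifies. The
// example below isolates that search. MiniArray and FindArray are
// hypothetical names, and the limits are passed as parameters because the
// actual values of BUCKET_MAX_INDEX and BUCKET_MAX_VERTEX are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of one display array: primitive type (number of
// vertices per polygon) and current fill levels.
struct MiniArray {
	int type;
	size_t numIndex;
	size_t numVertex;
};

// Return the position of the first array that matches 'numverts' and can
// take one more polygon, or -1 when a new array would have to be created.
// As in SetDisplayArray(), reaching a limit exactly counts as full (>=).
int FindArray(const std::vector<MiniArray>& arrays, int numverts,
              size_t maxIndex, size_t maxVertex)
{
	for (size_t i = 0; i < arrays.size(); i++) {
		const MiniArray& a = arrays[i];
		if (a.type != numverts)
			continue;
		if (a.numIndex + (size_t)numverts >= maxIndex)
			continue;
		if (a.numVertex + (size_t)numverts >= maxVertex)
			continue;
		return (int)i;
	}
	return -1;
}
```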

void RAS_MeshSlot::AddPolygon(int numverts)
{
	SetDisplayArray(numverts);
}

int RAS_MeshSlot::AddVertex(const RAS_TexVert& tv)
{
	RAS_DisplayArray *darray;
	int offset;

	darray = m_currentArray;
	darray->m_vertex.push_back(tv);
	offset = darray->m_vertex.size()-1;

	if(darray == m_displayArrays[m_endarray])
		m_endvertex++;

	return offset;
}

void RAS_MeshSlot::AddPolygonVertex(int offset)
{
	RAS_DisplayArray *darray;

	darray = m_currentArray;
	darray->m_index.push_back(offset);

	if(darray == m_displayArrays[m_endarray])
		m_endindex++;
}
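
// Illustrative sketch, not part of the original file: AddVertex() appends a
// vertex and returns its offset, and AddPolygonVertex() appends that offset
// to the index list, so several polygons can reference the same vertex. The
// example below shows the pattern on a simplified array. MiniDisplayArray
// and AddQuadAsTriangles are hypothetical names, and a plain float stands in
// for RAS_TexVert.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified display array.
struct MiniDisplayArray {
	std::vector<float> vertex;
	std::vector<unsigned short> index;

	// Append a vertex and return its offset in the vertex list.
	int AddVertex(float v)
	{
		vertex.push_back(v);
		return (int)vertex.size() - 1;
	}

	// Append one polygon corner that refers to an existing vertex.
	void AddPolygonVertex(int offset)
	{
		index.push_back((unsigned short)offset);
	}
};

// Store four corners once and reference them from two triangles.
void AddQuadAsTriangles(MiniDisplayArray& arr, const float corners[4])
{
	int o[4];
	for (int i = 0; i < 4; i++)
		o[i] = arr.AddVertex(corners[i]);

	arr.AddPolygonVertex(o[0]);
	arr.AddPolygonVertex(o[1]);
	arr.AddPolygonVertex(o[2]);

	arr.AddPolygonVertex(o[0]);
	arr.AddPolygonVertex(o[2]);
	arr.AddPolygonVertex(o[3]);
}
```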
void RAS_MeshSlot::SetDeformer(RAS_Deformer* deformer)
{
	if (deformer && m_pDeformer != deformer) {
		RAS_DisplayArrayList::iterator it;
		if (deformer->ShareVertexArray()) {
			// this deformer uses the base vertex array, first release the current ones
			for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
				(*it)->m_users--;
				if((*it)->m_users == 0)
					delete *it;
			}
			m_displayArrays.clear();
			// then hook to the base ones
			RAS_MeshMaterial *mmat = m_mesh->GetMeshMaterial(m_bucket->GetPolyMaterial());
			if (mmat && mmat->m_baseslot) {
				m_displayArrays = mmat->m_baseslot->m_displayArrays;
				for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
					(*it)->m_users++;
				}
			}
		}
		else {
			// no sharing
			// we create local copy of RAS_DisplayArray when we have a deformer:
			// this way we can avoid conflict between the vertex cache of duplicates
			for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
				if (deformer->UseVertexArray()) {
					// the deformer makes use of vertex array, make sure we have our local copy
					if ((*it)->m_users > 1) {
						// only need to copy if there are other users
						// note that this is the usual case as vertex arrays are held by the material base slot
						RAS_DisplayArray *newarray = new RAS_DisplayArray(*(*it));
						newarray->m_users = 1;
						(*it)->m_users--;
						*it = newarray;
					}
				} else {
					// the deformer is not using vertex array (Modifier), release them
					(*it)->m_users--;
					if((*it)->m_users == 0)
						delete *it;
				}
			}
			if (!deformer->UseVertexArray()) {
				m_displayArrays.clear();
				m_startarray = 0;
				m_startvertex = 0;
				m_startindex = 0;
				m_endarray = 0;
				m_endvertex = 0;
				m_endindex = 0;
			}
		}
	}
	m_pDeformer = deformer;
}
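
// Illustrative sketch, not part of the original file: in the "no sharing"
// branch above, a slot whose deformer writes to the vertex array takes a
// private copy only when the array has other users. The example below
// isolates that copy-on-write step. VertexArray and MakeLocal are
// hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RAS_DisplayArray.
struct VertexArray {
	int m_users;
	std::vector<float> data;
};

// Return an array the caller may modify without affecting other users.
// If 'shared' has other users, the caller gives up its reference and gets
// a fresh copy with a counter of 1; otherwise the array is already private.
VertexArray *MakeLocal(VertexArray *shared)
{
	if (shared->m_users > 1) {
		VertexArray *local = new VertexArray(*shared);
		local->m_users = 1;
		shared->m_users--;
		return local;
	}
	return shared;
}
```

// The copy is skipped for a sole user, which avoids duplicating vertex data
// that nobody else can observe.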

bool RAS_MeshSlot::Equals(RAS_MeshSlot *target)
{
	if(!m_OpenGLMatrix || !target->m_OpenGLMatrix)
		return false;
	if(m_pDeformer || target->m_pDeformer)
		return false;
	if(m_bVisible != target->m_bVisible)
		return false;
	if(m_bObjectColor != target->m_bObjectColor)
		return false;
	if(m_bObjectColor && !(m_RGBAcolor == target->m_RGBAcolor))
		return false;

	return true;
}

bool RAS_MeshSlot::Join(RAS_MeshSlot *target, MT_Scalar distance)
{
	RAS_DisplayArrayList::iterator it;
	iterator mit;
	size_t i;

	// verify if we can join
	if(m_joinSlot || m_joinedSlots.size() || target->m_joinSlot)
		return false;

	if(!Equals(target))
		return false;

	MT_Vector3 co(&m_OpenGLMatrix[12]);
	MT_Vector3 targetco(&target->m_OpenGLMatrix[12]);

	if((co - targetco).length() > distance)
		return false;

	MT_Matrix4x4 mat(m_OpenGLMatrix);
	MT_Matrix4x4 targetmat(target->m_OpenGLMatrix);
	targetmat.invert();

	MT_Matrix4x4 transform = targetmat*mat;

	// m_mesh, clientobj
	m_joinSlot = target;
	m_joinInvTransform = transform;
	m_joinInvTransform.invert();
	target->m_joinedSlots.push_back(this);

	MT_Matrix4x4 ntransform = m_joinInvTransform.transposed();
	ntransform[0][3]= ntransform[1][3]= ntransform[2][3]= 0.0f;

	for(begin(mit); !end(mit); next(mit))
		for(i=mit.startvertex; i<mit.endvertex; i++)
			mit.vertex[i].Transform(transform, ntransform);

	/* We know we'll need a list at least this big, reserve in advance */
	target->m_displayArrays.reserve(target->m_displayArrays.size() + m_displayArrays.size());

	for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
		target->m_displayArrays.push_back(*it);
		target->m_endarray++;
		target->m_endvertex = target->m_displayArrays.back()->m_vertex.size();
		target->m_endindex = target->m_displayArrays.back()->m_index.size();
	}

	if (m_DisplayList) {
		m_DisplayList->Release();
		m_DisplayList = NULL;
	}
	if (target->m_DisplayList) {
		target->m_DisplayList->Release();
		target->m_DisplayList = NULL;
	}

	return true;
#if 0
	return false;
#endif
}
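
// Illustrative sketch, not part of the original file: Join() reads each
// slot's position from elements 12..14 of its OpenGL matrix and refuses to
// join slots that are further apart than 'distance'. The example below
// repeats that test on plain arrays, assuming the usual column-major OpenGL
// layout in which elements 12, 13 and 14 hold the translation.
// OriginDistance and WithinJoinDistance are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Distance between the origins of two column-major 4x4 matrices.
double OriginDistance(const double *a, const double *b)
{
	double dx = a[12] - b[12];
	double dy = a[13] - b[13];
	double dz = a[14] - b[14];
	return std::sqrt(dx*dx + dy*dy + dz*dz);
}

// Same acceptance rule as Join(): reject only when the distance is
// strictly greater than the limit.
bool WithinJoinDistance(const double *a, const double *b, double distance)
{
	return !(OriginDistance(a, b) > distance);
}
```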

bool RAS_MeshSlot::Split(bool force)
{
	list<RAS_MeshSlot*>::iterator jit;
	RAS_MeshSlot *target = m_joinSlot;
	RAS_DisplayArrayList::iterator it, jt;
	iterator mit;
	size_t i, found0 = 0, found1 = 0;

	if(target && (force || !Equals(target))) {
		m_joinSlot = NULL;

		for(jit=target->m_joinedSlots.begin(); jit!=target->m_joinedSlots.end(); jit++) {
			if(*jit == this) {
				target->m_joinedSlots.erase(jit);
				found0 = 1;
				break;
			}
		}

		if(!found0)
			abort();

		for(it=m_displayArrays.begin(); it!=m_displayArrays.end(); it++) {
			found1 = 0;
			for(jt=target->m_displayArrays.begin(); jt!=target->m_displayArrays.end(); jt++) {
				if(*jt == *it) {
					target->m_displayArrays.erase(jt);
					target->m_endarray--;
					found1 = 1;
					break;
				}
			}

			if(!found1)
				abort();
		}

		if(target->m_displayArrays.size()) {
			target->m_endvertex = target->m_displayArrays.back()->m_vertex.size();
			target->m_endindex = target->m_displayArrays.back()->m_index.size();
		}
		else {
			target->m_endvertex = 0;
			target->m_endindex = 0;
		}

		MT_Matrix4x4 ntransform = m_joinInvTransform.inverse().transposed();
		ntransform[0][3]= ntransform[1][3]= ntransform[2][3]= 0.0f;

		for(begin(mit); !end(mit); next(mit))
			for(i=mit.startvertex; i<mit.endvertex; i++)
				mit.vertex[i].Transform(m_joinInvTransform, ntransform);

		if (target->m_DisplayList) {
			target->m_DisplayList->Release();
			target->m_DisplayList = NULL;
		}

		return true;
	}

	return false;
}
#ifdef USE_SPLIT
bool RAS_MeshSlot::IsCulled()
{
	if(m_joinSlot)
		return true;
	if(!m_bCulled)
		return false;
	list<RAS_MeshSlot*>::iterator it;
	for(it=m_joinedSlots.begin(); it!=m_joinedSlots.end(); it++)
		if(!(*it)->m_bCulled)
			return false;
	return true;
}
#endif
/* material bucket sorting */

struct RAS_MaterialBucket::less
{
	bool operator()(const RAS_MaterialBucket* x, const RAS_MaterialBucket* y) const
	{
		return *x->GetPolyMaterial() < *y->GetPolyMaterial();
	}
};
/* material bucket */
RAS_MaterialBucket::RAS_MaterialBucket(RAS_IPolyMaterial* mat)
{
	m_material = mat;
}

RAS_MaterialBucket::~RAS_MaterialBucket()
{
}

RAS_IPolyMaterial* RAS_MaterialBucket::GetPolyMaterial() const
{
	return m_material;
}

bool RAS_MaterialBucket::IsAlpha() const
{
	return (m_material->IsAlpha());
}

bool RAS_MaterialBucket::IsZSort() const
{
	return (m_material->IsZSort());
}

RAS_MeshSlot* RAS_MaterialBucket::AddMesh(int numverts)
{
	RAS_MeshSlot *ms;
	m_meshSlots.push_back(RAS_MeshSlot());
	ms = &m_meshSlots.back();
	ms->init(this, numverts);
	return ms;
}
RAS_MeshSlot* RAS_MaterialBucket::CopyMesh(RAS_MeshSlot *ms)
{
	m_meshSlots.push_back(RAS_MeshSlot(*ms));
	return &m_meshSlots.back();
}

void RAS_MaterialBucket::RemoveMesh(RAS_MeshSlot* ms)
{
	list<RAS_MeshSlot>::iterator it;

	for(it=m_meshSlots.begin(); it!=m_meshSlots.end(); it++) {
		if(&*it == ms) {
			m_meshSlots.erase(it);
			return;
		}
	}
}
list<RAS_MeshSlot>::iterator RAS_MaterialBucket::msBegin()
{
	return m_meshSlots.begin();
}
list<RAS_MeshSlot>::iterator RAS_MaterialBucket::msEnd()
{
	return m_meshSlots.end();
}
bool RAS_MaterialBucket::ActivateMaterial(const MT_Transform& cameratrans, RAS_IRasterizer* rasty,
	RAS_IRenderTools *rendertools)
{
	bool uselights;

	if(!rasty->SetMaterial(*m_material))
		return false;
	uselights= m_material->UsesLighting(rasty);
	rendertools->ProcessLighting(rasty, uselights, cameratrans);

	return true;
}

void RAS_MaterialBucket::RenderMeshSlot(const MT_Transform& cameratrans, RAS_IRasterizer* rasty,
	RAS_IRenderTools* rendertools, RAS_MeshSlot &ms)
{
	m_material->ActivateMeshSlot(ms, rasty);

	if (ms.m_pDeformer)
	{
		ms.m_pDeformer->Apply(m_material);
		// KX_ReInstanceShapeFromMesh(ms.m_mesh); // Recompute the physics mesh. (Can't call KX_* from RAS_)
	}

	if(IsZSort() && rasty->GetDrawingMode() >= RAS_IRasterizer::KX_SOLID)
		ms.m_mesh->SortPolygons(ms, cameratrans*MT_Transform(ms.m_OpenGLMatrix));

	rendertools->PushMatrix();
	if (!ms.m_pDeformer || !ms.m_pDeformer->SkipVertexTransform())
	{
		rendertools->applyTransform(rasty,ms.m_OpenGLMatrix,m_material->GetDrawingMode());
	}

	if(rasty->QueryLists())
		if(ms.m_DisplayList)
			ms.m_DisplayList->SetModified(ms.m_mesh->MeshModified());

	// Check whether a display list can be used. It cannot be used for
	// dynamically deformed objects. Also don't create a new display list
	// while drawing shadow buffers, because it would not have the texture
	// coordinates needed for actual drawing. Z-sorted buckets can't use a
	// display list either, since the polygon order changes all the time.
	if(ms.m_pDeformer && ms.m_pDeformer->IsDynamic())
		ms.m_bDisplayList = false;
	else if(!ms.m_DisplayList && rasty->GetDrawingMode() == RAS_IRasterizer::KX_SHADOW)
		ms.m_bDisplayList = false;
	else if (IsZSort())
		ms.m_bDisplayList = false;
	else if(m_material->UsesObjectColor() && ms.m_bObjectColor)
		ms.m_bDisplayList = false;
	else
		ms.m_bDisplayList = true;

	// for text drawing using faces
	if (m_material->GetDrawingMode() & RAS_IRasterizer::RAS_RENDER_3DPOLYGON_TEXT)
		rasty->IndexPrimitives_3DText(ms, m_material, rendertools);
	// for multitexturing
	else if((m_material->GetFlag() & (RAS_MULTITEX|RAS_BLENDERGLSL)))
		rasty->IndexPrimitivesMulti(ms);
	// use normal IndexPrimitives
	else
		rasty->IndexPrimitives(ms);

	if(rasty->QueryLists())
		if(ms.m_DisplayList)
			ms.m_mesh->SetMeshModified(false);

	rendertools->PopMatrix();
}

void RAS_MaterialBucket::Optimize(MT_Scalar distance)
{
	/* TODO: still have to check the following before this works correctly:
	 * - lightlayer, frontface, text, billboard
	 * - make it work with physics */

#if 0
	list<RAS_MeshSlot>::iterator it;
	list<RAS_MeshSlot>::iterator jt;

	// greedy joining on all following mesh slots
	for(it=m_meshSlots.begin(); it!=m_meshSlots.end(); it++)
		for(jt=it, jt++; jt!=m_meshSlots.end(); jt++)
			jt->Join(&*it, distance);
#endif
}