Commit Graph

14 Commits

Author SHA1 Message Date
Benoit Bolsee
42557f90bd BGE performance, 3rd round: culling and rasterizer.
This commit extends the dynamic linked list technique to the mesh
slots so as to eliminate dumb scans and map lookups. It provides a massive
performance improvement in the culling and in the rasterizer when
the majority of objects are static.

Other improvements:
- Compute the OpenGL matrix only for objects that are visible.
- Simplify hash function for GEN_HashedPtr
- Scan light list instead of general object list to render shadows
- Remove redundant OpenGL calls to set specularity, shininess and diffuse
  between each mesh slot.
- Cache GPU material to avoid frequent calls to GPU_material_from_blender
- Set the fixed elements of a mesh slot only once
- Use more inline functions

The following table shows the performance increase between 2.48, the 1st round
and this round of improvements. The test was done with a scene containing
40000 objects, of which approximately 1000 are in the view frustum. The
objects are simple textured cubes to make sure the GPU is not the bottleneck.
As some of the rasterizer processing time has moved under culling, I present
the sum of scenegraph (includes culling) + rasterizer time.

Scenegraph+rasterizer (ms)      2.48      1st round       3rd round

All objects static,            323.0           86.0             7.2
all visible, 1000 in
the view frustum

All objects static,            219.0           49.7             N/A(*)
all invisible.

All objects moving,            323.0          105.6            34.7
all visible, 1000 in
the view frustum

Scene destruction              40min          40min              4s

(*) : this time is not representative because the frame rate was at 60fps.
      In that case, the GPU holds down the GE by frame sync. By design, the
      overhead of the rasterizer is 0 when the objects are invisible.

This table shows a global speed up between 9x and 45x compared to 2.48a
for scenegraph, culling and rasterizer overhead. The speed up goes much
higher when objects are invisible.

An additional 2-4x speed up is possible in the scenegraph by upgrading
the Moto library to use Eigen2 BLAS library instead of C++ classes but
the scenegraph is already so fast that it is not a priority right now.

Next speed up in logic: many things to do there...
2009-05-07 09:13:01 +00:00
Benoit Bolsee
362202cc14 Fix an undefined variable bug detected by valgrind. 2009-05-05 22:32:15 +00:00
Benoit Bolsee
3abb8e8e68 BGE performance: second round of scenegraph improvement.
Use a dynamic linked list to handle the scenegraph rather than a dumb scan
of the whole tree. The performance improvement depends on the fraction
of moving objects. If most objects are static, the speed up is
considerable. The following table compares the time spent on the
scenegraph before and after this commit on a scene with 10000 objects
in various configurations:

Scenegraph time (ms)              Before         After
(includes culling)

All objects static,               8.8            1.7
all visible but small fraction
in the view frustum

All objects static,               7.5            0.01
all invisible.

All objects moving,               14.1           8.4
all visible but small fraction
in the view frustum

This table shows that static and invisible objects take almost no CPU time
for scenegraph and culling. In the general case, this commit will
speed up the scenegraph between 2x and 5x. Compared to 2.48a, it should
be between 4x and 10x faster. Further speed up is possible by making
the scenegraph cache-friendly.

Next round of performance improvement will be on the rasterizer: use
the same dynamic linked list technique for the mesh slots.
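
The dynamic linked list idea described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the BGE code: `ListLink`, `Node` and `Scene` are hypothetical names, and the "transform" is reduced to a single float. The point is that a node links itself into an update list when it moves, so the update pass visits only those nodes and never scans the static ones.

```cpp
#include <cstddef>

// Hypothetical intrusive, circular, doubly linked list link.
// A link that points to itself is not part of any list.
struct ListLink {
    ListLink* prev;
    ListLink* next;

    ListLink() : prev(this), next(this) {}

    bool linked() const { return next != this; }

    // Remove this link from whatever list it is in (no-op if unlinked).
    void unlink() {
        prev->next = next;
        next->prev = prev;
        prev = next = this;
    }

    // Called on the list head: append item at the tail.
    void push_back(ListLink* item) {
        item->unlink();
        item->prev = prev;
        item->next = this;
        prev->next = item;
        prev = item;
    }
};

// Hypothetical scenegraph node; the link is embedded, so no allocation
// or lookup is needed to schedule it.
struct Node : ListLink {
    float local_x = 0.0f;
    float world_x = 0.0f;
    int updates = 0;  // how many times the world transform was recomputed
};

struct Scene {
    ListLink dirty;  // head of the list of nodes that need an update

    // Moving a node schedules it for update (once).
    void move(Node& n, float x) {
        n.local_x = x;
        if (!n.linked()) dirty.push_back(&n);
    }

    // Visit only the scheduled nodes; returns how many were visited.
    std::size_t update() {
        std::size_t visited = 0;
        while (dirty.linked()) {
            Node* n = static_cast<Node*>(dirty.next);
            n->unlink();
            n->world_x = n->local_x;
            ++n->updates;
            ++visited;
        }
        return visited;
    }
};
```

With this layout the cost of `update()` is proportional to the number of nodes that moved, which matches the observation that static objects cost almost nothing.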
2009-05-03 22:29:00 +00:00
Benoit Bolsee
51b4145841 BGE Scenegraph and View frustum culling improvement.
This commit contains a number of performance improvements for the
BGE in the Scenegraph (parent relation between objects in the
scene) and view frustum culling.

The scenegraph improvement consists of skipping the position update
if the object has not moved since the last update, and of removing
redundant updates and synchronization with the physics engine.

The view frustum culling improvement consists of using the DBVT
broadphase facility of Bullet to build a tree of graphical objects
in the scene. The elements of the tree are AABBs (Axis-Aligned
Bounding Boxes) enclosing the objects. This provides good
precision in closed and open scenes. This new culling system
is enabled by default but, just in case, it can be disabled with
a button in the World settings. There is no do_version in this
commit but it will be added before the 2.49 release. For now you
must manually enable the DBVT culling option in World settings
when you open an old file.

The above improvements speed up scenegraph and culling by up to 5x.
However, this performance improvement is only visible when
you have hundreds or thousands of objects.

The main interest of the DBVT tree is that it allows easy occlusion
culling and an automatic LOD system. This will be the subject of further
improvements.
2009-04-07 22:14:06 +00:00
Campbell Barton
c77af31166 Minor speedups for the BGE
* Where possible use vec.setValue(x,y,z) to assign values to a vector instead of vec= MT_Vector3(x,y,z), for MT_Point and MT_Matrix types too.
* Comparing TexVerts was creating 10 MT_Vector types - instead compare as floats.
* Added SG_Spatial::SetWorldFromLocalTransform() since the local transform is used for world transform in some cases.
* Removed some unneeded vars from UpdateChildCoordinates functions
* Py API - Mouse, Ray, Radar sensors - use PyObjectFrom(vec) rather than filling the lists in each function. Use METH_NOARGS for get*() functions.
2009-02-25 06:43:03 +00:00
Campbell Barton
fc7a83b458 Added access for adjusting timeOffset value at runtime, used for apricot (Franky climbing walls) 2008-06-14 17:12:49 +00:00
Chris Want
5d0a207ecb Patch from GSR that a) fixes a whole bunch of GPL/BL license
blocks that were previously missed; and b) greatly increase my
ohloh stats!
2008-04-16 22:40:48 +00:00
Kester Maddock
7b2567924b Switch fixed time system. Logic updates should now happen at 30Hz, physics at 60Hz. (By default, use Python to set.) Some actuators still run at framerate (IPO, Action) for nice smooth animation, and an excuse to buy high end hardware.
Keyboard sensors can now hook escape key.  Ctrl-Break can be used from within blender if you've forgotten an end game actuator.

Fixed a stupid bug preventing some actuators working (like TrackTo).
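
One way to picture the fixed time system mentioned in this message (logic at 30Hz, physics at 60Hz, independent of the frame rate) is the sketch below. It is an assumption-laden illustration, not the engine's implementation: `FixedRateClock` and its members are hypothetical names, and the step counters stand in for the real logic and physics updates. Integer microseconds are used so that the number of steps due never drifts from rounding.

```cpp
#include <cstdint>

// Hypothetical fixed-rate scheduler: each frame, run as many logic and
// physics steps as are due for the game time elapsed so far.
struct FixedRateClock {
    int logicHz = 30;              // rates taken from the commit message
    int physicsHz = 60;
    std::int64_t elapsedUs = 0;    // total game time in microseconds
    std::int64_t logicDone = 0;    // logic steps executed so far
    std::int64_t physicsDone = 0;  // physics steps executed so far

    // Number of fixed steps that should have run after `us` microseconds.
    static std::int64_t stepsDue(std::int64_t us, int hz) {
        return us * hz / 1000000;
    }

    // Advance game time by one rendered frame of `frameUs` microseconds.
    void advance(std::int64_t frameUs) {
        elapsedUs += frameUs;
        const std::int64_t physicsDue = stepsDue(elapsedUs, physicsHz);
        while (physicsDone < physicsDue) {
            ++physicsDone;  // a real engine would step the physics here
        }
        const std::int64_t logicDue = stepsDue(elapsedUs, logicHz);
        while (logicDone < logicDue) {
            ++logicDone;    // a real engine would run sensors/controllers here
        }
    }
};
```

Anything that should stay smooth at the display rate (the message names IPO and Action actuators) would be updated once per `advance()` call rather than inside the fixed-step loops.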
2004-10-16 11:41:50 +00:00
Kester Maddock
3dd18c5c34 Added an UpdateTransform callback from SceneGraph -> Physics.
Profiling revealed that the SceneGraph updated every physics object, whether it moved or not, even though the physics object was at the right place.  This would cause SOLID to go and update its bounding boxes, overlap tests etc.
This callback handles the special case (parented objects) where the physics scene needs to be informed of changes to the scenegraph.
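
The callback idea above can be sketched as follows. This is a hedged illustration rather than the actual SceneGraph or SOLID interface: `SGNode`, `PhysicsProxy`, `onUpdateTransform` and `sync` are hypothetical names. The node notifies physics only when its transform really changed, so unmoved objects trigger no bounding box refits.

```cpp
#include <functional>

// Hypothetical scenegraph node that remembers whether it was modified.
struct SGNode {
    float pos[3] = {0.0f, 0.0f, 0.0f};
    bool modified = false;
    // Hypothetical hook, invoked only when the transform actually changed.
    std::function<void(const SGNode&)> onUpdateTransform;

    void setPosition(float x, float y, float z) {
        if (x == pos[0] && y == pos[1] && z == pos[2]) return;  // no change
        pos[0] = x; pos[1] = y; pos[2] = z;
        modified = true;
    }

    // Called once per frame by the scenegraph.
    void update() {
        if (!modified) return;  // static node: physics is left alone
        modified = false;
        if (onUpdateTransform) onUpdateTransform(*this);
    }
};

// Hypothetical stand-in for the physics-side object.
struct PhysicsProxy {
    float pos[3] = {0.0f, 0.0f, 0.0f};
    int refits = 0;  // counts the expensive bounding box updates

    void sync(const SGNode& n) {
        pos[0] = n.pos[0]; pos[1] = n.pos[1]; pos[2] = n.pos[2];
        ++refits;
    }
};
```

A caller would wire the two together with something like `node.onUpdateTransform = [&proxy](const SGNode& n) { proxy.sync(n); };`.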

Added Python attributes (mass, parent, visible, position, orientation, scaling) to the KX_GameObject module.
Make KX_GameObject use the KX_PyMath Python <-> Moto conversion.
2004-05-26 12:06:41 +00:00
Kester Maddock
e957b12f0e Frustum sphere culling.
Do a sphere<->camera sphere and a sphere<->frustum before the box<->frustum test.
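
The ordering of tests described here (cheap sphere test first, box test only when needed) might look like the sketch below. It is written under assumptions and is not the engine's code: all names are hypothetical, planes are stored with unit normals pointing into the frustum, the sphere<->camera sphere pre-test is omitted for brevity, and `makeBoxFrustum` builds an axis-aligned cube as a stand-in for a real camera frustum.

```cpp
#include <array>

// Plane with unit normal (nx, ny, nz); a point p is inside when n.p + d >= 0.
struct Plane { float nx, ny, nz, d; };
using Frustum = std::array<Plane, 6>;

enum class Cull { Outside, Intersect, Inside };

struct Box { float min[3]; float max[3]; };

inline float signedDistance(const Plane& pl, float x, float y, float z) {
    return pl.nx * x + pl.ny * y + pl.nz * z + pl.d;
}

// Cheap test: classify a bounding sphere against all six planes.
Cull classifySphere(const Frustum& f, float cx, float cy, float cz, float r) {
    bool straddles = false;
    for (const Plane& pl : f) {
        const float dist = signedDistance(pl, cx, cy, cz);
        if (dist < -r) return Cull::Outside;  // fully behind one plane
        if (dist < r) straddles = true;       // crosses this plane
    }
    return straddles ? Cull::Intersect : Cull::Inside;
}

// More expensive test: the box is outside if, for some plane, even the
// corner furthest along the plane normal lies behind it.
bool boxOutside(const Frustum& f, const Box& b) {
    for (const Plane& pl : f) {
        const float x = pl.nx >= 0.0f ? b.max[0] : b.min[0];
        const float y = pl.ny >= 0.0f ? b.max[1] : b.min[1];
        const float z = pl.nz >= 0.0f ? b.max[2] : b.min[2];
        if (signedDistance(pl, x, y, z) < 0.0f) return true;
    }
    return false;
}

// Sphere first; fall back to the box only when the sphere is inconclusive.
bool isVisible(const Frustum& f, const Box& b,
               float cx, float cy, float cz, float r) {
    const Cull c = classifySphere(f, cx, cy, cz, r);
    if (c == Cull::Outside) return false;
    if (c == Cull::Inside) return true;
    return !boxOutside(f, b);
}

// Stand-in "frustum": the cube [-h, h]^3 (an assumption for illustration).
Frustum makeBoxFrustum(float h) {
    return Frustum{{
        { 1.0f, 0.0f, 0.0f, h}, {-1.0f, 0.0f, 0.0f, h},
        { 0.0f, 1.0f, 0.0f, h}, { 0.0f,-1.0f, 0.0f, h},
        { 0.0f, 0.0f, 1.0f, h}, { 0.0f, 0.0f,-1.0f, h}
    }};
}
```

Most objects are resolved by the sphere test alone; the box test runs only for objects whose bounding sphere crosses a frustum plane.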
2004-05-21 09:21:15 +00:00
Kester Maddock
c50055204d SceneGraph support for bounding boxes 2004-05-16 12:54:44 +00:00
Kester Maddock
63048b6cf4 Synchronise game engine with Tuhopuu2 tree. 2004-04-24 06:40:15 +00:00
Kent Mein
209a2ede2c Last of the config.h mods...
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

added to these files.

Kent
--
mein@cs.umn.edu
2002-11-25 15:29:57 +00:00
Hans Lambermont
12315f4d0e Initial revision 2002-10-12 11:37:38 +00:00