This commit fixes two different issues, which were caused by
how weights are being calculated for relative shapekeys.
Weights for key block used to saved in KeyBlock DNA structure,
which lead to situations when different objects could start
writing to the same weights array if they're sharing the same
key datablock.
Solved this in a way so weights are never stored in KeyBlock
and being passed to shapekeys routines as an array of pointers.
This way weights are still computed run-time (meaning they're
calculated before shapekey evaluation and freed afterwards).
This required some changes to GameEngine as well, to make it
never cache weights in the key blocks.
Another aspect of this commit makes it so weight for a given
vertex group is only computed once. So if multiple key blocks
are using the same influence vertex group, they'll share the
same exact weights array. This gave around 1.7x speedup in
test chinchilla file which is close enough to if we've been
caching weights permanently in DNA (test machine is dual-code
4 threads laptop, speedup measured in depsgraph_mt branch,
trunk might be not so much high speedup).
Some further speed is optimization possible, but it could be
done later as well.
Thanks Brecht for idea of how the things might be solved in
really clear way.
--
svn merge -r58786:58787 ^/branches/soc-2013-depsgraph_mt
* Cleaning up the conversion code to avoid a per-face material conversion. Materials are now stored in buckets and only converted if a new material is found. This replaces some of Campbell's earlier work on the subject. His work wasn't as thorough, but it was much safer for a release.
* Shaders are only compiled for LibLoaded materials once. Before they could be compiled twice, which could really slow things down.
* Refactoring the rasterizer code to use a strategy design pattern to handle different geometry rendering methods such as immediate mode, vertex arrays and vertex buffer objects. VBOs are added, but they will be disabled in a following commit since they are still slower than vertex arrays with display lists. However, VBOs are still useful for mobile, so it's good to keep them around.
* Better multi-uv support. The BGE should now be able to handle more than two UV layers, which should help it better match the viewport.
- ignore MSVC warnings when FREE_WINDOWS is defined to quiet warnings.
- the CMake flags were not being set correctly making blender have weirdo colors (no -funsigned-char).
Even a static mesh can be used as replacement: the mesh
will be instantiated with the soft body settings of the
object. The position and orientation of the soft body
is preserved after the replacement.
Known limitation: the velocity of the soft body is reset
aftet the replacement. This is because soft body don't
have a well defined velocity.
Add support back for reinstancePhysics mesh, a frequently requested feature in the BGE forums.
from what I can tell Sumo supported this but bullet never did.
Currently only accessible via python at the moment.
- rigid body, dynamic, static types work.
- instanced physics meshes are modified too.
- compound shapes are not supported.
Physics mesh can be re-instanced from...
* shape keys & armature deformations
* subsurf (any other modifiers too)
* RAS_TexVert's (can be modified from python)
Moved the reinstancePhysicsMesh functions from RAS_MeshObject into KX_GameObject since the physics data is stored here.
video and blend file demo.
http://www.graphicall.org/ftp/ideasman42/reinstance.ogvhttp://www.graphicall.org/ftp/ideasman42/reinstance_demo.blend
SCA_RandomActuator: The random generator was shared between replicas and not deleted. Added ref counting between replicas to allow deletion at the end.
KX_Camera: The scenegraph node was not deleted for temporary cameras (ImageMirror and shadow), causing 500 bytes leak per frame and per shadow light.
KX_GameActuator: Global dictionary buffer was not deleted after saving.
KX_MotionState: The motion state for compound child was not deleted
KX_ReplaceMeshActuator: The mesh was unnecessarily converted for each actuator and not deleted, causing large memleak.
After these fix, YoFrankie runs without memleak.
This commit extends the technique of dynamic linked list to the logic
system to eliminate as much as possible temporaries, map lookup or
full scan. The logic engine is now free of memory allocation, which is
an important stability factor.
The overhead of the logic system is reduced by a factor between 3 and 6
depending on the logic setup. This is the speed-up you can expect on
a logic setup using simple bricks. Heavy bricks like python controllers
and ray sensors will still take about the same time to execute so the
speed up will be less important.
The core of the logic engine has been much reworked but the functionality
is still the same except for one thing: the priority system on the
execution of controllers. The exact same remark applies to actuators but
I'll explain for controllers only:
Previously, it was possible, with the "executePriority" attribute to set
a controller to run before any other controllers in the game. Other than
that, the sequential execution of controllers, as defined in Blender was
guaranteed by default.
With the new system, the sequential execution of controllers is still
guaranteed but only within the controllers of one object. the user can
no longer set a controller to run before any other controllers in the
game. The "executePriority" attribute controls the execution of controllers
within one object. The priority is a small number starting from 0 for the
first controller and incrementing for each controller.
If this missing feature is a must, a special method can be implemented
to set a controller to run before all other controllers.
Other improvements:
- Systematic use of reference in parameter passing to avoid unnecessary data copy
- Use pre increment in iterator instead of post increment to avoid temporary allocation
- Use const char* instead of STR_String whenever possible to avoid temporary allocation
- Fix reference counting bugs (memory leak)
- Fix a crash in certain cases of state switching and object deletion
- Minor speed up in property sensor
- Removal of objects during the game is a lot faster
This commit extend the technique of dynamic linked list to the mesh
slots so as to eliminate dumb scan or map lookup. It provides massive
performance improvement in the culling and in the rasterizer when
the majority of objects are static.
Other improvements:
- Compute the opengl matrix only for objects that are visible.
- Simplify hash function for GEN_HasedPtr
- Scan light list instead of general object list to render shadows
- Remove redundant opengl calls to set specularity, shinyness and diffuse
between each mesh slots.
- Cache GPU material to avoid frequent call to GPU_material_from_blender
- Only set once the fixed elements of mesh slot
- Use more inline function
The following table shows the performance increase between 2.48, 1st round
and this round of improvement. The test was done with a scene containing
40000 objects, of which 1000 are in the view frustrum approximately. The
object are simple textured cube to make sure the GPU is not the bottleneck.
As some of the rasterizer processing time has moved under culling, I present
the sum of scenegraph(includes culling)+rasterizer time
Scenegraph+rasterizer(ms) 2.48 1st round 3rd round
All objects static, 323.0 86.0 7.2
all visible, 1000 in
the view frustrum
All objects static, 219.0 49.7 N/A(*)
all invisible.
All objects moving, 323.0 105.6 34.7
all visible, 1000 in
the view frustrum
Scene destruction 40min 40min 4s
(*) : this time is not representative because the frame rate was at 60fps.
In that case, the GPU holds down the GE by frame sync. By design, the
overhead of the rasterizer is 0 when the the objects are invisible.
This table shows a global speed up between 9x and 45x compared to 2.48a
for scenegraph, culling and rasterizer overhead. The speed up goes much
higher when objects are invisible.
An additional 2-4x speed up is possible in the scenegraph by upgrading
the Moto library to use Eigen2 BLAS library instead of C++ classes but
the scenegraph is already so fast that it is not a priority right now.
Next speed up in logic: many things to do there...
Realtime modifiers applied on mesh objects will be supported in
the game engine with the following limitations:
- Only real time modifiers are supported (basically all of them!)
- Virtual modifiers resulting from parenting are not supported:
armature, curve, lattice. You can still use these modifiers
(armature is really not recommended) but in non parent mode.
The BGE has it's own parenting capability for armature.
- Modifiers are computed on the host (using blender modifier
stack).
- Modifiers are statically evaluated: any possible time dependency
in the modifiers is not supported (don't know enough about
modifiers to be more specific).
- Modifiers are reevaluated if the underlying mesh is deformed
due to shape action or armature action. Beware that this is
very CPU intensive; modifiers should really be used for static
objects only.
- Physics is still based on the original mesh: if you have a
mirror modifier, the physic shape will be limited to one half
of the resulting object. Therefore, the modifiers should
preferably be used on graphic objects.
- Scripts have no access to the modified mesh.
- Modifiers that are based on objects interaction (boolean,..)
will not be dependent on the objects position in the GE.
What you see in the 3D view is what you get in the GE regardless
on the object position, velocity, etc.
Besides that, the feature is compatible with all the BGE features
that affect meshes: armature action, shape action, relace mesh,
VideoTexture, add object, dupligroup.
Known problems:
- This feature is a bit hacky: the BGE uses the derived mesh draw
functions to display the object. This drawing method is a
bit slow and is not 100% compatible with the BGE. There may
be some problems in multi-texture mode: the multi-texture
coordinates are not sent to the GPU.
Texface and GLSL on the other hand should be fully supported.
- Culling is still based on the extend of the original mesh.
If you have a modifer that extends the size of the mesh,
the object may disappear while still in the view frustrum.
- Derived mesh is not shared between replicas.
The derived mesh is allocated and computed for each object
with modifiers, regardless if they are static replicas.
- Display list are not created on objects with modifiers.
I should be able to fix the above problems before release.
However, the feature is already useful for game development.
Once you are ready to release the game, you can apply the modifiers
to get back display list support and mesh sharing capability.
MSVC, scons, Cmake, makefile updated.
Enjoy
/benoit
Was adding each face with a remove doubles option that made conversion increasingly slower for larger meshes, this would often hang blender when starting with the BGE with larger meshes.
Replace btTriangleMesh()->addTriangle() with btTriangleIndexVertexArray()
YoFrankie level_1_home.blend starts a third faster, level_nut about twice as fast.
- previous commit was also incorrect using the original meshes vert locations rather then the vert locations that came from the derived mesh.
- Softbody is relying on removing doubles at 0.01 to give stable results, this no longer works but seems a bit dodgy anyway. Maybe some post-processing filter could fix up a mesh for bullet softbody.
the features that are needed to run the game. Compile tested with
scons, make, but not cmake, that seems to have an issue not related
to these changes. The changes include:
* GLSL support in the viewport and game engine, enable in the game
menu in textured draw mode.
* Synced and merged part of the duplicated blender and gameengine/
gameplayer drawing code.
* Further refactoring of game engine drawing code, especially mesh
storage changed a lot.
* Optimizations in game engine armatures to avoid recomputations.
* A python function to get the framerate estimate in game.
* An option take object color into account in materials.
* An option to restrict shadow casters to a lamp's layers.
* Increase from 10 to 18 texture slots for materials, lamps, word.
An extra texture slot shows up once the last slot is used.
* Memory limit for undo, not enabled by default yet because it
needs the .B.blend to be changed.
* Multiple undo for image painting.
* An offset for dupligroups, so not all objects in a group have to
be at the origin.
rayCast(to,from,dist,prop,face,xray,poly):
The face paremeter determines the orientation of the normal:
0 or omitted => hit normal is always oriented towards the ray origin (as if you casted the ray from outside)
1 => hit normal is the real face normal (only for mesh object, otherwise face has no effect)
The ray has X-Ray capability if xray parameter is 1, otherwise the first object hit (other than self object) stops the ray.
The prop and xray parameters interact as follow:
prop off, xray off: return closest hit or no hit if there is no object on the full extend of the ray.
prop off, xray on : idem.
prop on, xray off: return closest hit if it matches prop, no hit otherwise.
prop on, xray on : return closest hit matching prop or no hit if there is no object matching prop on the full extend of the ray.
if poly is 0 or omitted, returns a 3-tuple with object reference, hit point and hit normal or (None,None,None) if no hit.
if poly is 1, returns a 4-tuple with in addition a KX_PolyProxy as 4th element.
The KX_PolyProxy object holds information on the polygon hit by the ray: the index of the vertex forming the poylgon, material, etc.
Attributes (read-only):
matname: The name of polygon material, empty if no material.
material: The material of the polygon
texture: The texture name of the polygon.
matid: The material index of the polygon, use this to retrieve vertex proxy from mesh proxy
v1: vertex index of the first vertex of the polygon, use this to retrieve vertex proxy from mesh proxy
v2: vertex index of the second vertex of the polygon, use this to retrieve vertex proxy from mesh proxy
v3: vertex index of the third vertex of the polygon, use this to retrieve vertex proxy from mesh proxy
v4: vertex index of the fourth vertex of the polygon, 0 if polygon has only 3 vertex
use this to retrieve vertex proxy from mesh proxy
visible: visible state of the polygon: 1=visible, 0=invisible
collide: collide state of the polygon: 1=receives collision, 0=collision free.
Methods:
getMaterialName(): Returns the polygon material name with MA prefix
getMaterial(): Returns the polygon material
getTextureName(): Returns the polygon texture name
getMaterialIndex(): Returns the material bucket index of the polygon.
getNumVertex(): Returns the number of vertex of the polygon.
isVisible(): Returns whether the polygon is visible or not
isCollider(): Returns whether the polygon is receives collision or not
getVertexIndex(vertex): Returns the mesh vertex index of a polygon vertex
getMesh(): Returns a mesh proxy
New methods of KX_MeshProxy have been implemented to retrieve KX_PolyProxy objects:
getNumPolygons(): Returns the number of polygon in the mesh.
getPolygon(index): Gets the specified polygon from the mesh.
More details in PyDoc.
=======================================
Alpha blending + sorting was revised, to fix bugs and get it
to work more predictable.
* A new per texture face "Sort" setting defines if the face
is alpha sorted or not, instead of abusing the "ZTransp"
setting as it did before.
* Existing files are converted to hopefully match the old
behavior as much as possible with a version patch.
* On new meshes the Sort flag is disabled by the default, to
avoid unexpected and hard to find slowdowns.
* Alpha sorting for faces was incredibly slow. Sorting faces
in a mesh with 600 faces lowered the framerate from 200 to
70 fps in my test.. the sorting there case goes about 15x
faster now, but it is still advised to use Clip Alpha if
possible instead of regular Alpha.
* There still various limitations in the alpha sorting code,
I've added some comments to the code about this.
Some docs at the bottom of the page:
http://www.blender.org/development/current-projects/changes-since-246/realtime-glsl-materials/
Merged some fixes from the apricot branch, most important
change is that tangents are now exactly the same as the rest
of Blender, instead of being computed in the game engine with a
different algorithm.
Also, the subversion was bumped to 1.
=============================
* Clean up and optimizations in skinned/deformed mesh code.
* Compatibility fixes and clean up in the rasterizer.
* Changes related to GLSL shadow buffers which should have no
effect, to keep the code in sync with apricot.
Shape Action are now supported in the BGE. A new type of actuator "Shape Action" is available on mesh objects. It can be combined with Action actuator on parent armature. Only relative keys are supported. All the usual action options are available: type, blending, priority, Python API. Only actions with shape channels should be specified of course, otherwise the actuator has no effect. Shape action will still work after a mesh replacement provided that the new mesh has compatible shape keys.
GLEW
====
Added the GLEW opengl extension library into extern/, always compiled
into Blender now. This is much nicer than doing this kind of extension
management manually, and will be used in the game engine, for GLSL, and
other opengl extensions.
* According to the GLEW website it works on Windows, Linux, Mac OS X,
FreeBSD, Irix, and Solaris. There might still be platform specific
issues due to this commit, so let me know and I'll look into it.
* This means also that all extensions will now always be compiled in,
regardless of the glext.h on the platform where compilation happens.
Game Engine
===========
Refactoring of the use of opengl extensions and other drawing code
in the game engine, and cleaning up some hacks related to GLSL
integration. These changes will be merged into trunk too after this.
The game engine graphics demos & apricot level survived my tests,
but this could use some good testing of course.
For users: please test with the options "Generate Display Lists" and
"Vertex Arrays" enabled, these should be the fastest and are supposed
to be "unreliable", but if that's the case that's probably due to bugs
that can be fixed.
* The game engine now also uses GLEW for extensions, replacing the
custom opengl extensions code that was there. Removes a lot of
#ifdef's, but the runtime checks stay of course.
* Removed the WITHOUT_GLEXT environment variable. This was added to
work around a specific bug and only disabled multitexturing anyway.
It might also have caused a slowdown since it was retrieving the
environment variable for every vertex in immediate mode (bug #13680).
* Refactored the code to allow drawing skinned meshes with vertex
arrays too, removing some specific immediate mode drawing functions
for this that only did extra normal calculation. Now it always splits
vertices of flat faces instead.
* Refactored normal recalculation with some minor optimizations,
required for the above change.
* Removed some outdated code behind the __NLA_OLDDEFORM #ifdef.
* Fixed various bugs in setting of multitexture coordinates and vertex
attributes for vertex arrays. These were not being enabled/disabled
correct according to the opengl spec, leading to crashes. Also tangent
attributes used an immediate mode call for vertex arrays, which can't
work.
* Fixed use of uninitialized variable in RAS_TexVert.
* Exporting skinned meshes was doing O(n^2) lookups for vertices and
deform weights, now uses same trick as regular meshes.
Depth sorting for Transparent polygons. Use ZTransp in Material buttons to enable.
This will cause an object's polygons to be sorted (back to front for alpha polygons, front to back for solid polygons.)
[SCons] Build with Solid as default when enabling the gameengine in the build process
[SCons] Build solid and qhull from the extern directory and link statically against them
That was about it.
There are a few things that needs double checking:
* Makefiles
* Projectfiles
* All the other systems than Linux and Windows on which the build (with scons) has been successfully tested.
(adding)
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
also the Makefile.in's were from previous patch adding
the system depend stuff to configure.ac
Kent
--
mein@cs.umn.edu