blender/source/kernel/gen_system/GEN_HashedPtr.cpp

/*
 * $Id$
 *
 * ***** BEGIN GPL LICENSE BLOCK *****
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software Foundation,
 * Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
 *
 * The Original Code is Copyright (C) 2001-2002 by NaN Holding BV.
 * All rights reserved.
 *
 * The Original Code is: all of this file.
 *
 * Contributor(s): none yet.
 *
 * ***** END GPL LICENSE BLOCK *****
 *
 */
#include "GEN_HashedPtr.h"

#ifdef HAVE_CONFIG_H
#include <config.h>
#endif

#include "BLO_sys_types.h" // for intptr_t support

//
// Build hash index from pointer.  Even though the final result
// is a 32-bit integer, use all the bits of the pointer as long
// as possible.
//
#if 1
unsigned int GEN_Hash(void * inDWord)
{
	uintptr_t key = (uintptr_t)inDWord;
#if 0
	// this is way too complicated
	key += ~(key << 16);
	key ^=  (key >>  5);
	key +=  (key <<  3);
	key ^=  (key >> 13);
	key += ~(key <<  9);
	key ^=  (key >> 17);

  	return (unsigned int)(key & 0xffffffff);
#else
	return (unsigned int)(key ^ (key>>4));
#endif
}
#endif
Initial revision 2002-10-12 11:37:38 +00:00			`/*`
			`* $Id$`
			`*`
Patch from GSR that a) fixes a whole bunch of GPL/BL license blocks that were previously missed; and b) greatly increase my ohloh stats! 2008-04-16 22:40:48 +00:00			`* *** BEGIN GPL LICENSE BLOCK ***`
Initial revision 2002-10-12 11:37:38 +00:00			`*`
			`* This program is free software; you can redistribute it and/or`
			`* modify it under the terms of the GNU General Public License`
			`* as published by the Free Software Foundation; either version 2`
Patch from GSR that a) fixes a whole bunch of GPL/BL license blocks that were previously missed; and b) greatly increase my ohloh stats! 2008-04-16 22:40:48 +00:00			`* of the License, or (at your option) any later version.`
Initial revision 2002-10-12 11:37:38 +00:00			`*`
			`* This program is distributed in the hope that it will be useful,`
			`* but WITHOUT ANY WARRANTY; without even the implied warranty of`
			`* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
			`* GNU General Public License for more details.`
			`*`
			`* You should have received a copy of the GNU General Public License`
			`* along with this program; if not, write to the Free Software Foundation,`
			`* Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.`
			`*`
			`* The Original Code is Copyright (C) 2001-2002 by NaN Holding BV.`
			`* All rights reserved.`
			`*`
			`* The Original Code is: all of this file.`
			`*`
			`* Contributor(s): none yet.`
			`*`
Patch from GSR that a) fixes a whole bunch of GPL/BL license blocks that were previously missed; and b) greatly increase my ohloh stats! 2008-04-16 22:40:48 +00:00			`* *** END GPL LICENSE BLOCK ***`
Initial revision 2002-10-12 11:37:38 +00:00			`*`
			`*/`
			`#include "GEN_HashedPtr.h"`

Last of the config.h mods... #ifdef HAVE_CONFIG_H #include <config.h> #endif added to these files. Kent -- mein@cs.umn.edu 2002-11-25 15:29:57 +00:00			`#ifdef HAVE_CONFIG_H`
			`#include <config.h>`
			`#endif`

Win64: please check my changes if you ran across them ;) But should be fine since no additional crashes were reported! 2008-08-17 17:08:00 +00:00			`#include "BLO_sys_types.h" // for intptr_t support`

-- Change to make blender with game engine disabled build without errors on 64-bit machines. This code only seems to be used by the game engine anyway; maybe it's only linux which always compiles it regardless of whether game engine is enabled? 2005-11-27 03:47:17 +00:00			`//`
			`// Build hash index from pointer. Even though the final result`
			`// is a 32-bit integer, use all the bits of the pointer as long`
			`// as possible.`
			`//`
BGE performance, 3rd round: culling and rasterizer. This commit extend the technique of dynamic linked list to the mesh slots so as to eliminate dumb scan or map lookup. It provides massive performance improvement in the culling and in the rasterizer when the majority of objects are static. Other improvements: - Compute the opengl matrix only for objects that are visible. - Simplify hash function for GEN_HasedPtr - Scan light list instead of general object list to render shadows - Remove redundant opengl calls to set specularity, shinyness and diffuse between each mesh slots. - Cache GPU material to avoid frequent call to GPU_material_from_blender - Only set once the fixed elements of mesh slot - Use more inline function The following table shows the performance increase between 2.48, 1st round and this round of improvement. The test was done with a scene containing 40000 objects, of which 1000 are in the view frustrum approximately. The object are simple textured cube to make sure the GPU is not the bottleneck. As some of the rasterizer processing time has moved under culling, I present the sum of scenegraph(includes culling)+rasterizer time Scenegraph+rasterizer(ms) 2.48 1st round 3rd round All objects static, 323.0 86.0 7.2 all visible, 1000 in the view frustrum All objects static, 219.0 49.7 N/A() all invisible. All objects moving, 323.0 105.6 34.7 all visible, 1000 in the view frustrum Scene destruction 40min 40min 4s () : this time is not representative because the frame rate was at 60fps. In that case, the GPU holds down the GE by frame sync. By design, the overhead of the rasterizer is 0 when the the objects are invisible. This table shows a global speed up between 9x and 45x compared to 2.48a for scenegraph, culling and rasterizer overhead. The speed up goes much higher when objects are invisible. An additional 2-4x speed up is possible in the scenegraph by upgrading the Moto library to use Eigen2 BLAS library instead of C++ classes but the scenegraph is already so fast that it is not a priority right now. Next speed up in logic: many things to do there... 2009-05-07 09:13:01 +00:00			`#if 1`
-- Change to make blender with game engine disabled build without errors on 64-bit machines. This code only seems to be used by the game engine anyway; maybe it's only linux which always compiles it regardless of whether game engine is enabled? 2005-11-27 03:47:17 +00:00			`unsigned int GEN_Hash(void * inDWord)`
Initial revision 2002-10-12 11:37:38 +00:00			`{`
Win64: please check my changes if you ran across them ;) But should be fine since no additional crashes were reported! 2008-08-17 17:08:00 +00:00			`uintptr_t key = (uintptr_t)inDWord;`
BGE performance, 3rd round: culling and rasterizer. This commit extend the technique of dynamic linked list to the mesh slots so as to eliminate dumb scan or map lookup. It provides massive performance improvement in the culling and in the rasterizer when the majority of objects are static. Other improvements: - Compute the opengl matrix only for objects that are visible. - Simplify hash function for GEN_HasedPtr - Scan light list instead of general object list to render shadows - Remove redundant opengl calls to set specularity, shinyness and diffuse between each mesh slots. - Cache GPU material to avoid frequent call to GPU_material_from_blender - Only set once the fixed elements of mesh slot - Use more inline function The following table shows the performance increase between 2.48, 1st round and this round of improvement. The test was done with a scene containing 40000 objects, of which 1000 are in the view frustrum approximately. The object are simple textured cube to make sure the GPU is not the bottleneck. As some of the rasterizer processing time has moved under culling, I present the sum of scenegraph(includes culling)+rasterizer time Scenegraph+rasterizer(ms) 2.48 1st round 3rd round All objects static, 323.0 86.0 7.2 all visible, 1000 in the view frustrum All objects static, 219.0 49.7 N/A() all invisible. All objects moving, 323.0 105.6 34.7 all visible, 1000 in the view frustrum Scene destruction 40min 40min 4s () : this time is not representative because the frame rate was at 60fps. In that case, the GPU holds down the GE by frame sync. By design, the overhead of the rasterizer is 0 when the the objects are invisible. This table shows a global speed up between 9x and 45x compared to 2.48a for scenegraph, culling and rasterizer overhead. The speed up goes much higher when objects are invisible. An additional 2-4x speed up is possible in the scenegraph by upgrading the Moto library to use Eigen2 BLAS library instead of C++ classes but the scenegraph is already so fast that it is not a priority right now. Next speed up in logic: many things to do there... 2009-05-07 09:13:01 +00:00			`#if 0`
			`// this is way too complicated`
Initial revision 2002-10-12 11:37:38 +00:00			`key += ~(key << 16);`
			`key ^= (key >> 5);`
			`key += (key << 3);`
			`key ^= (key >> 13);`
			`key += ~(key << 9);`
			`key ^= (key >> 17);`

-- Change to make blender with game engine disabled build without errors on 64-bit machines. This code only seems to be used by the game engine anyway; maybe it's only linux which always compiles it regardless of whether game engine is enabled? 2005-11-27 03:47:17 +00:00			`return (unsigned int)(key & 0xffffffff);`
BGE performance, 3rd round: culling and rasterizer. This commit extend the technique of dynamic linked list to the mesh slots so as to eliminate dumb scan or map lookup. It provides massive performance improvement in the culling and in the rasterizer when the majority of objects are static. Other improvements: - Compute the opengl matrix only for objects that are visible. - Simplify hash function for GEN_HasedPtr - Scan light list instead of general object list to render shadows - Remove redundant opengl calls to set specularity, shinyness and diffuse between each mesh slots. - Cache GPU material to avoid frequent call to GPU_material_from_blender - Only set once the fixed elements of mesh slot - Use more inline function The following table shows the performance increase between 2.48, 1st round and this round of improvement. The test was done with a scene containing 40000 objects, of which 1000 are in the view frustrum approximately. The object are simple textured cube to make sure the GPU is not the bottleneck. As some of the rasterizer processing time has moved under culling, I present the sum of scenegraph(includes culling)+rasterizer time Scenegraph+rasterizer(ms) 2.48 1st round 3rd round All objects static, 323.0 86.0 7.2 all visible, 1000 in the view frustrum All objects static, 219.0 49.7 N/A() all invisible. All objects moving, 323.0 105.6 34.7 all visible, 1000 in the view frustrum Scene destruction 40min 40min 4s () : this time is not representative because the frame rate was at 60fps. In that case, the GPU holds down the GE by frame sync. By design, the overhead of the rasterizer is 0 when the the objects are invisible. This table shows a global speed up between 9x and 45x compared to 2.48a for scenegraph, culling and rasterizer overhead. The speed up goes much higher when objects are invisible. An additional 2-4x speed up is possible in the scenegraph by upgrading the Moto library to use Eigen2 BLAS library instead of C++ classes but the scenegraph is already so fast that it is not a priority right now. Next speed up in logic: many things to do there... 2009-05-07 09:13:01 +00:00			`#else`
			`return (unsigned int)(key ^ (key>>4));`
			`#endif`
Initial revision 2002-10-12 11:37:38 +00:00			`}`
BGE performance, 3rd round: culling and rasterizer. This commit extend the technique of dynamic linked list to the mesh slots so as to eliminate dumb scan or map lookup. It provides massive performance improvement in the culling and in the rasterizer when the majority of objects are static. Other improvements: - Compute the opengl matrix only for objects that are visible. - Simplify hash function for GEN_HasedPtr - Scan light list instead of general object list to render shadows - Remove redundant opengl calls to set specularity, shinyness and diffuse between each mesh slots. - Cache GPU material to avoid frequent call to GPU_material_from_blender - Only set once the fixed elements of mesh slot - Use more inline function The following table shows the performance increase between 2.48, 1st round and this round of improvement. The test was done with a scene containing 40000 objects, of which 1000 are in the view frustrum approximately. The object are simple textured cube to make sure the GPU is not the bottleneck. As some of the rasterizer processing time has moved under culling, I present the sum of scenegraph(includes culling)+rasterizer time Scenegraph+rasterizer(ms) 2.48 1st round 3rd round All objects static, 323.0 86.0 7.2 all visible, 1000 in the view frustrum All objects static, 219.0 49.7 N/A() all invisible. All objects moving, 323.0 105.6 34.7 all visible, 1000 in the view frustrum Scene destruction 40min 40min 4s () : this time is not representative because the frame rate was at 60fps. In that case, the GPU holds down the GE by frame sync. By design, the overhead of the rasterizer is 0 when the the objects are invisible. This table shows a global speed up between 9x and 45x compared to 2.48a for scenegraph, culling and rasterizer overhead. The speed up goes much higher when objects are invisible. An additional 2-4x speed up is possible in the scenegraph by upgrading the Moto library to use Eigen2 BLAS library instead of C++ classes but the scenegraph is already so fast that it is not a priority right now. Next speed up in logic: many things to do there... 2009-05-07 09:13:01 +00:00			`#endif`