This gives around a 30% speedup for the gaussian blur node.
The implementation inside the node itself is pretty
straightforward, but some additional things were needed:
- Aligned malloc. It's needed to load data into SSE registers
faster. Based on aligned_malloc() from Libmv, with
some additional trickery to support arbitrary
alignment (this magic is needed because of MemHead).
In practice only 16-byte alignment is supported because
OSX lacks an aligned malloc with arbitrary alignment.
Not a big deal for now because we only need 16-byte
alignment at the moment. Could be tweaked further
later.
- Memory buffers in the compositor are now aligned to 16 bytes.
Should be harmless for non-SSE cases too, just mentioning it.
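
A minimal sketch of how an aligned allocation can coexist with a
header stored in front of the user pointer. This is not the code
from this commit: AlignedHead, sketch_malloc_aligned(),
sketch_alloc_len() and sketch_free_aligned() are hypothetical names,
and the sketch over-allocates with plain malloc() instead of calling
a platform aligned-malloc function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header kept directly in front of the user pointer. */
typedef struct AlignedHead {
    void *base;  /* pointer returned by malloc(), needed for free() */
    size_t len;  /* size requested by the caller */
} AlignedHead;

/* 'alignment' must be a power of two and at least sizeof(void *). */
static void *sketch_malloc_aligned(size_t len, size_t alignment)
{
    assert(alignment >= sizeof(void *));
    assert((alignment & (alignment - 1)) == 0);

    /* Refuse requests whose total size would overflow size_t. */
    if (len > SIZE_MAX - sizeof(AlignedHead) - (alignment - 1)) {
        return NULL;
    }

    /* Room for the header, the payload and worst-case padding. */
    unsigned char *base = malloc(sizeof(AlignedHead) + len + alignment - 1);
    if (base == NULL) {
        return NULL;
    }

    /* Skip the header, then round up to the requested alignment. */
    uintptr_t user = (uintptr_t)(base + sizeof(AlignedHead));
    user = (user + alignment - 1) & ~(uintptr_t)(alignment - 1);

    AlignedHead *head = (AlignedHead *)user - 1;
    head->base = base;
    head->len = len;
    return (void *)user;
}

/* Size originally requested for an aligned block. */
static size_t sketch_alloc_len(const void *ptr)
{
    return ((const AlignedHead *)ptr - 1)->len;
}

static void sketch_free_aligned(void *ptr)
{
    if (ptr != NULL) {
        free(((AlignedHead *)ptr - 1)->base);
    }
}
```

Calling sketch_malloc_aligned(64 * sizeof(float), 16) returns a
pointer whose low four address bits are zero, which is what aligned
SSE loads require.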
Reviewers: campbellbarton, lukastoenne, jbakker
Reviewed By: campbellbarton
CC: lockal
Differential Revision: https://developer.blender.org/D564
Release builds will now use the lock-free allocator by
default, without any internal locks.
MemHead is also reduced to the minimum possible.
A size_t still needs to be stored in MemHead so that
we keep track of the memory we request from
the system, not the memory the system actually allocates. This
is probably also faster than using malloc's usable
size function.
The lock-free guarded allocator will tell you whether all
the blocks were freed, but won't give you a list
of the unfreed blocks. To get such a list, use the
--debug or --debug-memory command line arguments.
Debug builds have the same behavior as release
builds. This is so tools like valgrind are not
screwed up by the guarded allocator, as they currently
are.
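
A rough sketch of the idea described above, assuming C11 atomics
are available. It is not the allocator from this commit: SketchHead,
sketch_malloc(), sketch_free() and sketch_all_freed() are
hypothetical names. Only the requested size is kept in the header,
and the totals are plain atomic counters, so no lock is taken and no
list of blocks exists.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal header: only the requested size. The max_align_t member
 * pads the header so the payload stays suitably aligned. */
typedef union SketchHead {
    size_t len;
    max_align_t pad_;
} SketchHead;

/* Global statistics, updated without any lock. */
static atomic_size_t sketch_mem_in_use;
static atomic_size_t sketch_blocks;

static void *sketch_malloc(size_t len)
{
    SketchHead *head = malloc(sizeof(SketchHead) + len);
    if (head == NULL) {
        return NULL;
    }
    head->len = len;
    atomic_fetch_add(&sketch_mem_in_use, len);
    atomic_fetch_add(&sketch_blocks, 1);
    return head + 1;
}

static void sketch_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    SketchHead *head = (SketchHead *)ptr - 1;
    atomic_fetch_sub(&sketch_mem_in_use, head->len);
    atomic_fetch_sub(&sketch_blocks, 1);
    free(head);
}

/* Can say whether everything was freed, but not which blocks leaked. */
static int sketch_all_freed(void)
{
    return atomic_load(&sketch_blocks) == 0;
}
```

Because the counters hold what the caller asked for, the reported
total is independent of how much the system allocator rounds each
request up.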
--
svn merge -r59941:59942 -r60072:60073 -r60093:60094 \
-r60095:60096 ^/branches/soc-2013-depsgraph_mt
- Tweaked typedefs in stdint so they match
what we've got in BLI_sys_types (needed to
explicitly tell MSVC the sign).
It doesn't hurt to be more explicit here,
but we'd really be better off with a single stdint
in blender.
- Tweaked allocation macros so MSVC is happy
with structure allocation.
Also made libmv-capi use guarded object allocation.
Ran into some suspicious cases where it was not
clear whether memory was being freed or not.
Now we'll know for sure whether there are leaks or not :)
Having these macros in a guardedalloc header helps
with using them in other areas (for now it's OCIO and libmv,
but in the future there will be more places).
- enabling/disabling no longer prints in the terminal unless in debug mode.
- remove 'header' struct from BLI_storage_types.h, it dates from revision 2 and is not used.
- Add GCC attribute to guardedalloc to warn if the return value from allocation functions isn't used.
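
For illustration, a sketch of how such a warning can be enabled
with GCC's warn_unused_result function attribute. The macro
SKETCH_WARN_UNUSED and the function sketch_callocN() are
hypothetical names, not the ones used in guardedalloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Expands to the attribute on GCC-compatible compilers only. */
#if defined(__GNUC__)
#  define SKETCH_WARN_UNUSED __attribute__((warn_unused_result))
#else
#  define SKETCH_WARN_UNUSED
#endif

/* The attribute goes on the declaration. */
static void *sketch_callocN(size_t len, const char *name) SKETCH_WARN_UNUSED;

static void *sketch_callocN(size_t len, const char *name)
{
    (void)name; /* a real allocator would store the name for reports */
    return calloc(1, len);
}

/* Writing the call as a bare statement, e.g.
 *     sketch_callocN(16, "leak");
 * makes GCC warn that the return value is ignored. */
```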
Double click didn't check the distance the mouse had moved, so you could click twice in different areas of the screen very fast and generate a double click event, which had old mouse coords copied into it but was sent to an operator set to run on single click (because the double click wasn't handled).
Also added the MEM_name_ptr function (included in debug mode only), which prints the name of allocated memory.
Used for debugging where events came from.
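
A small sketch of the kind of check the double click fix above
describes: a second click only counts as a double click if it is
close to the first one in both time and position. SketchClick,
sketch_is_double_click() and both thresholds are assumptions for
illustration, not the window manager code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record of a mouse click. */
typedef struct SketchClick {
    int x, y;     /* mouse position in pixels */
    double time;  /* timestamp in seconds */
} SketchClick;

/* Returns 1 when 'cur' follows 'prev' quickly enough and the
 * mouse moved at most 'max_dist' pixels on each axis. */
static int sketch_is_double_click(const SketchClick *prev,
                                  const SketchClick *cur,
                                  double max_interval,
                                  int max_dist)
{
    if (cur->time - prev->time > max_interval) {
        return 0;
    }
    return abs(cur->x - prev->x) <= max_dist &&
           abs(cur->y - prev->y) <= max_dist;
}
```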
the features that are needed to run the game. Compile tested with
scons and make, but not cmake, which seems to have an issue not related
to these changes. The changes include:
* GLSL support in the viewport and game engine, enable in the game
menu in textured draw mode.
* Synced and merged part of the duplicated blender and gameengine/
gameplayer drawing code.
* Further refactoring of game engine drawing code, especially mesh
storage changed a lot.
* Optimizations in game engine armatures to avoid recomputations.
* A python function to get the framerate estimate in game.
* An option to take object color into account in materials.
* An option to restrict shadow casters to a lamp's layers.
* Increase from 10 to 18 texture slots for materials, lamps, world.
An extra texture slot shows up once the last slot is used.
* Memory limit for undo, not enabled by default yet because it
needs the .B.blend to be changed.
* Multiple undo for image painting.
* An offset for dupligroups, so not all objects in a group have to
be at the origin.
WINDOWS CRASH EMULATION!
If you use the -d (debug) argument for starting blender, it will now:
- set all freed memory to 0xFFFFFFFF
- set all malloced memory to 0xFFFFFFFF
The first option will give nice crashers when you read from freed memory.
The second option is especially for OSX, which has the nasty habit of giving
zeroed mallocs.
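
A sketch of the two fills described above, assuming a small header
that remembers the block size. FillHead, sketch_debug_fill,
sketch_dbg_malloc() and sketch_dbg_free() are hypothetical names;
sketch_debug_fill stands in for the -d argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the -d (debug) argument. */
static int sketch_debug_fill = 0;

/* Header remembering the size so free() knows how much to fill.
 * The max_align_t member keeps the payload suitably aligned. */
typedef union FillHead {
    size_t len;
    max_align_t pad_;
} FillHead;

static void *sketch_dbg_malloc(size_t len)
{
    FillHead *head = malloc(sizeof(FillHead) + len);
    if (head == NULL) {
        return NULL;
    }
    head->len = len;
    if (sketch_debug_fill) {
        /* Every byte 0xFF, so each 32-bit word reads 0xFFFFFFFF. */
        memset(head + 1, 0xFF, len);
    }
    return head + 1;
}

static void sketch_dbg_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    FillHead *head = (FillHead *)ptr - 1;
    if (sketch_debug_fill) {
        /* Poison the payload so stale reads see 0xFFFFFFFF. */
        memset(ptr, 0xFF, head->len);
    }
    free(head);
}
```

Note that an optimizing compiler is allowed to drop the memset()
right before free() as a dead store, so a real implementation may
need to guard against that.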