On Windows we can only do mmap memory allocation up to 4 GB, which causes a
crash when doing very large renders on 64 bit systems with a lot of memory.
As far as I can tell the reason to use mmap is to get around address space
limitations on some 32 bit operating systems, and I can't see a reason to use
it on 64 bit. For the original explanation see here:
http://orange.blender.org/blog/stupid-memory-problems
Fixes T37841.
Release builds will now use the lock-free allocator by
default, without any internal locking.
MemHead is also reduced to the minimum possible.
A size_t still needs to be stored in MemHead so that
we keep track of the memory we request from the system,
not the memory the system actually allocates. This
is probably also faster than using malloc's usable
size function.
The lock-free guarded allocator will tell you whether all
the blocks were freed, but won't give you a list
of unfreed blocks. To get such a list, use the
--debug or --debug-memory command line arguments.
Debug builds have the same behavior as release
builds. This is so tools like valgrind are not
confused by the guarded allocator, as they currently
are.
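A minimal sketch of the idea, not the actual mallocn code: the header
holds only the requested length, and the global counters are updated with
atomics so no lock is needed. All names here (SketchMemHead, sketch_malloc
and so on) are hypothetical, and C11 <stdatomic.h> is an assumption made
for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal header: only the length requested by the caller. */
typedef struct SketchMemHead {
    size_t len;
} SketchMemHead;

/* Global statistics, updated atomically instead of under a lock. */
static atomic_size_t sketch_mem_in_use = 0;    /* bytes requested by callers */
static atomic_size_t sketch_blocks_in_use = 0; /* number of live blocks */

void *sketch_malloc(size_t len)
{
    SketchMemHead *head = malloc(sizeof(SketchMemHead) + len);
    if (head == NULL) {
        return NULL;
    }
    head->len = len;
    atomic_fetch_add(&sketch_mem_in_use, len);
    atomic_fetch_add(&sketch_blocks_in_use, 1);
    /* NOTE: the returned pointer is only aligned to sizeof(size_t) here;
     * a real allocator would pad the header for stricter alignment. */
    return head + 1;
}

void sketch_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    SketchMemHead *head = (SketchMemHead *)ptr - 1;
    /* The stored length is what we requested, not what the system
     * actually reserved for the block. */
    atomic_fetch_sub(&sketch_mem_in_use, head->len);
    atomic_fetch_sub(&sketch_blocks_in_use, 1);
    free(head);
}

/* Tells whether everything was freed, but cannot list the leaked blocks. */
size_t sketch_blocks_unfreed(void)
{
    return atomic_load(&sketch_blocks_in_use);
}

size_t sketch_mem_unfreed(void)
{
    return atomic_load(&sketch_mem_in_use);
}
```

With only counters and no block list, a non-zero sketch_blocks_unfreed()
at exit says that something leaked but not what, which is the trade-off
described above.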
--
svn merge -r59941:59942 -r60072:60073 -r60093:60094 \
-r60095:60096 ^/branches/soc-2013-depsgraph_mt
- Tweaked typedefs in stdint so they match
what we've got in BLI_sys_types (needed to
explicitly tell the sign to MSVC).
It does no harm to be more explicit here,
but we'd really be better off having a single stdint
in blender.
- Tweaked allocation macros so MSVC is happy
with structure allocation.
Also made libmv-capi use guarded object allocation.
Ran into some suspicious cases where it was not
clear whether memory was being freed or not.
Now we'll know for sure whether there are leaks or not :)
Having these macros in a guardedalloc header helps
with using them in other areas (for now it's OCIO and libmv,
but in the future there'll be more places).
- replace numbers with defines for allocation increments and default array size.
- move array reallocation into a static function (deduplicate 2x).
also fix own mistake with uninitialized slop-space var in memory printing statistics.
This commit attempts to fix the following error:
intern\guardedalloc\intern\mallocn.c: In function 'rem_memblock':
intern\guardedalloc\intern\mallocn.c:977:48: error: conversion to 'intptr_t' from 'size_t' may change the sign of the result [-Werror=sign-conversion]
From the references I've managed to find, it appears that
the second arg to munmap() should be size_t not intptr_t.
Fortunately though, we don't use this arg anyways atm, so
this should be quite harmless...
- Re-arrange locks, so no actual memory allocation
(which is relatively slow) happens from inside
the lock. The operating system will take care of any locks
which might be needed there on its own.
- Use a spin lock instead of a mutex: only list
operations happen inside the lock, so there is no need
for a mutex here.
- Use atomic operations for the memory-in-use and total
used blocks counters.
This makes the guarded allocator almost the same speed
as the non-guarded one in files from the Tube project.
There's still MemHead/MemTail overhead which might
be bad for CPU cache utilization.
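The three points above can be sketched roughly as follows. This is an
illustration under assumptions, not the mallocn.c code: the spin lock is
built on a C11 atomic_flag, and every name (DemoBlock, demo_guarded_malloc
and so on) is made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block header, linked into a global list of live blocks. */
typedef struct DemoBlock {
    struct DemoBlock *next, *prev;
    size_t len;
} DemoBlock;

static atomic_flag demo_list_lock = ATOMIC_FLAG_INIT; /* spin lock */
static DemoBlock *demo_list_head = NULL;              /* guarded by the lock */
static atomic_size_t demo_total_blocks = 0;
static atomic_size_t demo_mem_in_use = 0;

static void demo_spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_list_lock, memory_order_acquire)) {
        /* Busy-wait: the critical section is only a few pointer writes. */
    }
}

static void demo_spin_unlock(void)
{
    atomic_flag_clear_explicit(&demo_list_lock, memory_order_release);
}

void *demo_guarded_malloc(size_t len)
{
    /* Slow part first, outside of the lock. */
    DemoBlock *block = malloc(sizeof(DemoBlock) + len);
    if (block == NULL) {
        return NULL;
    }
    block->len = len;
    block->prev = NULL;

    /* Counters need no lock at all. */
    atomic_fetch_add(&demo_total_blocks, 1);
    atomic_fetch_add(&demo_mem_in_use, len);

    /* Only the list insertion happens inside the lock. */
    demo_spin_lock();
    block->next = demo_list_head;
    if (demo_list_head != NULL) {
        demo_list_head->prev = block;
    }
    demo_list_head = block;
    demo_spin_unlock();

    return block + 1;
}

void demo_guarded_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    DemoBlock *block = (DemoBlock *)ptr - 1;

    /* Unlink inside the lock... */
    demo_spin_lock();
    if (block->prev != NULL) {
        block->prev->next = block->next;
    }
    else {
        demo_list_head = block->next;
    }
    if (block->next != NULL) {
        block->next->prev = block->prev;
    }
    demo_spin_unlock();

    /* ...and do the counters and the actual free outside of it. */
    atomic_fetch_sub(&demo_total_blocks, 1);
    atomic_fetch_sub(&demo_mem_in_use, block->len);
    free(block);
}

/* Walks the list under the lock and returns the number of live blocks. */
size_t demo_count_listed(void)
{
    size_t count = 0;
    demo_spin_lock();
    for (const DemoBlock *b = demo_list_head; b != NULL; b = b->next) {
        count++;
    }
    demo_spin_unlock();
    return count;
}

size_t demo_total_blocks_get(void)
{
    return atomic_load(&demo_total_blocks);
}

size_t demo_mem_in_use_get(void)
{
    return atomic_load(&demo_mem_in_use);
}
```

A spin lock is a reasonable fit here only because the protected section is
a handful of pointer assignments; if anything slow were done inside it, a
mutex would be the safer choice.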
Added an option to show a backtrace of where
a non-freed datablock was allocated.
To enable this feature, simply enable DEBUG_BACKTRACE
in the mallocn.c file and all unfreed datablocks will
be followed by a backtrace.
Currently works on linux and osx only,
windows support is on the TODO list.
This feature is disabled by default,
so it does not affect any builds which don't
explicitly define DEBUG_BACKTRACE.
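For illustration, one way such a feature can work on linux and osx is to
record the call stack in the block header at allocation time using
backtrace() from <execinfo.h>, and resolve it with backtrace_symbols() when
reporting a leak. This is a sketch under that assumption, not the code
behind DEBUG_BACKTRACE; the BtSketch* names and the frame limit are
hypothetical.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define BT_SKETCH_MAX_FRAMES 32 /* arbitrary limit chosen for the example */

/* Hypothetical header that remembers where the block was allocated. */
typedef struct BtSketchHead {
    size_t len;
    int frames_num;
    void *frames[BT_SKETCH_MAX_FRAMES];
} BtSketchHead;

void *bt_sketch_malloc(size_t len)
{
    BtSketchHead *head = malloc(sizeof(BtSketchHead) + len);
    if (head == NULL) {
        return NULL;
    }
    head->len = len;
    /* Store raw return addresses only; symbol lookup is deferred. */
    head->frames_num = backtrace(head->frames, BT_SKETCH_MAX_FRAMES);
    return head + 1;
}

/* Prints the allocation backtrace of a block and returns the number of
 * frames printed, or 0 if the symbols could not be resolved. */
int bt_sketch_print_origin(const void *ptr, FILE *out)
{
    const BtSketchHead *head = (const BtSketchHead *)ptr - 1;
    char **symbols = backtrace_symbols(head->frames, head->frames_num);
    if (symbols == NULL) {
        return 0;
    }
    fprintf(out, "unfreed block, len: %zu\n", head->len);
    for (int i = 0; i < head->frames_num; i++) {
        fprintf(out, "  %s\n", symbols[i]);
    }
    free(symbols); /* one free() releases the whole array of strings */
    return head->frames_num;
}

void bt_sketch_free(void *ptr)
{
    if (ptr != NULL) {
        free((BtSketchHead *)ptr - 1);
    }
}
```

Storing the frames makes every header considerably larger, which is one
good reason to keep this kind of feature behind a compile-time define.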
non-threadsafe usage of guarded allocator.
Also added a small chunk of code to check consistency of begin/end
threaded malloc.
All these additional checks are commented out and won't affect
builds, however I found them helpful for troubleshooting issues so
decided to commit them to SVN.
This implements AO baking directly from multi-resolution mesh with much
less memory overhead than regular baker.
Uses the ray distribution implementation from Morten Mikkelsen; raycasting
is based on RayObject, which is also used by Blender Internal.
Works single-threaded for now, multi-threading will be implemented later.