On Windows we can only do mmap memory allocation up to 4 GB, which causes a
crash when doing very large renders on 64 bit systems with a lot of memory.
As far as I can tell the reason to use mmap is to get around address space
limitation on some 32 bit operating systems, and I can't see a reason to use
it on 64 bit. For the original explanation see here:
http://orange.blender.org/blog/stupid-memory-problems
Fixes T37841.
Release builds will now use lock-free allocator by
default without any internal locks happening.
MemHead is also reduces to as minimum as it's possible.
It still need to be size_t stored in a MemHead in order
to make us keep track on memory we're requesting from
the system, not memory which system is allocating. This
is probably also faster than using a malloc's usable
size function.
Lock-free guarded allocator will say you whether all
the blocks were freed, but wouldn't give you a list
of unfreed blocks list. To have such a list use a
--debug or --debug-memory command line arguments.
Debug builds does have the same behavior as release
builds. This is so tools like valgrind are not
screwed up by guarded allocator as they're currently
are.
--
svn merge -r59941:59942 -r60072:60073 -r60093:60094 \
-r60095:60096 ^/branches/soc-2013-depsgraph_mt
- Tweaked typedefs in stdint so they match
what we've got in BLI_sys_types (needed to
explicitly tell sign to MSVC).
Not so much harmful to be more explicit here,
but we really better to have single stdint
int blender.
- Tweaked allocations macros so MSVC is happy
with structures allocation.
Also made libmv-capi use guarded objetc allocation.
Run into some suspecious cases when it was not so
clear whether memory is being freed or not.
Now we'll know for sure whether there're leaks or not :)
Having this macros in a guardedalloc header helps
using them in other areas (for now it's OCIO and libmv,
but in the future it'll be more places).
- replace numbers with defines for allocation increments and default array size.
- move array reallocation into a static function (deduplicate 2x).
also fix own mistake with uninitialized slop-space var in memory printing statistics.
This commit attempts to fix the following error:
intern\guardedalloc\intern\mallocn.c: In function 'rem_memblock':
intern\guardedalloc\intern\mallocn.c:977:48: error: conversion to 'intptr_t' from 'size_t' may change the sign of the result [-Werror=sign-conversion]
From the references I've managed to find, it appears that
the second arg to munmap() should be size_t not intptr_t.
Fortunately though, we don't use this arg anyways atm, so
this should be quite harmless...
- Re-arrange locks, so no actual memory allocation
(which is relatively slow) happens from inside
the lock. operation system will take care of locks
which might be needed there on it's own.
- Use spin lock instead of mutex, since it's just
list operations happens from inside lock, no need
in mutex here.
- Use atomic operations for memory in use and total
used blocks counters.
This makes guarded allocator almost the same speed
as non-guarded one in files from Tube project.
There're still MemHead/MemTail overhead which might
be bad for CPU cache utilization
Added an option to show backtrace from where
non-freed datablock was allocated from.
To enable this feature, simply enable DEBUG_BACKTRACE
in mallocn.c file and all unfreed datablocks will
be followed up by a backtrace.
Currently works on linux and osx only,
windows support is on TODO.
This feature is for sure disabled by default,
so does not affect any builds which don't
explicitly define DEBUG_BACKTRACE.
non-threadsafe usage of guarded allocator.
Also added small chunk of code to check consistency of begin/end
threaded malloc.
All this additional checks are commented and wouldn't affect on
builds, however found them helpful to troubleshoot issues so
decided to commit it to SVN.
This implements AO baking directly from multi-resolution mesh with much
less memory overhead than regular baker.
Uses rays distribution implementation from Morten Mikkelsen, raycast
is based on RayObject also used by Blender Internal.
Works in single-thread yet, multi-threading would be implemented later.
http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Personal%20Builds/ray_linn/GCC-4.7.0-with-ada/mingw-w64-gcc-4.7.0-runtime-2.0.1-static-ada-20120330.7z/download
Other builds may also work but due to the constantly changing nature of the compiler this cannot be guaranteed. I often had to change compilers while building the libraries and this one is the one that did the job for most of them.
This first support is experimental and considered "advanced". To enable pass -DWITH_MINGW64 during cmake configuration. Also make sure to extract the compiler on C:/MinGW and that MinGW/bin is in your path. To build check out lib/mingw64.
Initially the support is lacking until I get every library compiled correctly. For now you should disable WITH_CYCLES(sorry, I know some people are dying to do benchmarks, but still a few libs to go), WITH_IMAGE_OPENEXR, WITH_OPENCOLLADA, WITH_LIBMV and WITH_CODEC_FFMPEG(links but hangs on startup).
Still the tools are working, the memory limit is increased and due to the experimental nature of the setup, full optimization with SSE2 is available, which makes the build quite fast. Also the compiler and especially, the linker are way faster than regular MinGW.
The wiki docs have also updated. Happy testing!
- define __BIG_ENDIAN__ or __LITTLE_ENDIAN__ with cmake & scons.
- ENDIAN_ORDER is now a define rather than a global short.
- replace checks like this with single ifdef: #if defined(__sgi) || defined (__sparc) || defined (__sparc__) || defined (__PPC__) || defined (__ppc__) || defined (__hppa__) || defined (__BIG_ENDIAN__)
- remove BKE_endian.h which isn't used
blender_add_lib now takes a separate include argument to suppress warnings in system includes (mostly ffmpeg & python).
also only build wm_apple.c on apple+carbon configuration.
- modifier code was using sizeof() without knowing the sizeof the array when clearing the modifier type array.
- use BLI_snprintf rather then sprintf where the size of the string is known.
- particle drawing code kept a reference to stack float values (not a problem at the moment but would crash if accessed later).
- enabling/disabling no longer prints in the terminal unless in debug mode.
- remove 'header' struct from BLI_storage_types.h, from revision 2 and is not used.
- Add GCC property to guardedalloc to warn if the return value from allocation functions isn't used.
double click didnt check mouse distance moved so you could click twice in different areas of the screen very fast and generate a double click event which had old mouse coords copied into it but was sent to an operator set to run on single click (because the double click wasnt handled).
Also added MEM_name_ptr function (included in debug mode only), prints the name of allocated memory.
used for debugging where events came from.
replaces another so it can do updates (e.g. dopesheet editor can
sync channel selection).
Also coded a simple optimization for allocating small objects,
based on mempools. It's #ifdef'd out, you can enabled it by
defining OPTIMIZE_SMALL_BLOCKS (e.g. adding -DDOPTIMIZE_SMALL_BLOCKS to
your compiler flags).
We suffer from a great deal of performance loss from the system allocator
(vgroups, ghash, edgehash, the singly-linked list implementation in blenlib,
editmesh, and likely a great many areas I'm forgetting), and this is the
common solution for handling the many-small-objects problem. It's not
really production-ready yet (it's long-term memory consequencers need to
be profiled first, and the implementation tweaked as necassary), but for
people on systems with slow system allocators it's worth trying.
Note that since this creates a guardedalloc<->blenlib link, the build systems
need to be updated accordingly (I've already done this for scons, though I'm
not sure if the player builds).
* Rendering twice or more could crash layer/pass buttons.
* Compositing would crash while drawing the image.
* Rendering animations could also crash drawing the image.
* Compositing could crash
* Starting to rendering while preview render / compo was
still running could crash.
* Exiting while rendering an animation would not abort the
renderer properly, making Blender seemingly freeze.
* Fixes theoretically possible issue with setting malloc
lock with nested threads.
* Drawing previews inside nodes could crash when those nodes
were being rendered at the same time.
There's more crashes, manipulating the scene data or undo can
still crash, this commit only focuses on making sure the image
buffer and render result access is thread safe.
Implementation:
* Rather than assuming the render result does not get freed
during render, which seems to be quite difficult to do given
that e.g. the compositor is allowed to change the size of
the buffer or output different passes, the render result is
now protected with a read/write mutex.
* The read/write mutex allows multiple readers (and pixel
writers) at the same time, but only allows one writer to
manipulate the data structure.
* Added BKE_image_acquire_ibuf/BKE_image_release_ibuf to access
images being rendered, cases where this is not needed (most
code) can still use BKE_image_get_ibuf.
* The job manager now allows only one rendering job at the same
time, rather than the G.rendering check which was not reliable.
* bring back 'player' libtype, after investigation with ideasman.
scons/mingw works nicely, for some reason msvc fails to link still, will look further into it.