252 Commits

Author SHA1 Message Date
Campbell Barton
8d752141ce Support for platforms w/o malloc_usable_size
It was only used for stats; NetBSD doesn't define this function.
2015-06-21 12:33:55 +10:00
Sergey Sharybin
6298632bfa Guardedalloc: Don't use aligned blocks to calculate memory sloppiness
Aligned memory is allocated with memalign() and malloc_usable_size() can't be
used to measure this block.
2015-04-20 19:23:25 +05:00
Campbell Barton
17d96ca2aa GuardedAlloc: safer MEM_SAFE_FREE
Only evaluate the argument once,
so MEM_SAFE_FREE(array[i++]) won't cause incorrect behavior.
2015-03-12 23:49:15 +11:00
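The single-evaluation idea from the commit above can be sketched as follows. This is a minimal illustration, not the actual Blender macro: SAFE_FREE_ONCE and safe_free_demo are hypothetical names, and plain free() stands in for MEM_freeN().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for MEM_SAFE_FREE: the argument expression appears
 * exactly once in the expansion, so side effects such as i++ run once. */
#define SAFE_FREE_ONCE(v) \
    do { \
        void **v_ptr_ = (void **)&(v); \
        if (*v_ptr_) { \
            free(*v_ptr_); \
            *v_ptr_ = NULL; \
        } \
    } while (0)

/* Frees both slots of a small array through the macro and returns the
 * final value of the index. */
static int safe_free_demo(void)
{
    void *array[2];
    int i = 0;

    array[0] = malloc(16);
    array[1] = malloc(16);

    SAFE_FREE_ONCE(array[i++]); /* frees array[0], i becomes 1 */
    SAFE_FREE_ONCE(array[i++]); /* frees array[1], i becomes 2 */

    assert(array[0] == NULL && array[1] == NULL);
    return i;
}
```

A macro that instead expanded to `if (v) { free(v); v = NULL; }` would evaluate array[i++] three times, testing, freeing and clearing three different slots.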
Sergey Sharybin
9fc2c37328 Guardedalloc: Reset peak memory should set peak to currently allocated memory
Otherwise statistics could be really funny looking.
2015-02-19 13:14:06 +05:00
Sergey Sharybin
6c5f63b476 Guardedalloc: Add extra logging and checks in MEM_freeN()
We don't like it when NULL is sent to MEM_freeN(), but there were some
differences between the lockfree and guarded allocators:

- Lockfree would silently crash, in both release and debug modes
- Guarded allocator would print an error message and abort in debug builds,
  but keep working in release builds.

This commit makes the lockfree allocator's behavior match the guarded one.
2015-02-19 01:58:49 +05:00
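A rough sketch of the behavior described above, under stated assumptions: checked_free is a hypothetical wrapper around free(), not the real MEM_freeN(), and the hypothetical DEBUG_ABORT_ON_NULL macro stands in for "debug build".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: reports a NULL argument, aborts only when the
 * (assumed) DEBUG_ABORT_ON_NULL macro is defined, and otherwise keeps
 * working without touching the pointer.
 * Returns 0 when the block was freed, -1 when NULL was passed. */
static int checked_free(void *ptr)
{
    if (ptr == NULL) {
        fprintf(stderr, "checked_free: attempt to free NULL pointer\n");
#ifdef DEBUG_ABORT_ON_NULL
        abort();
#endif
        return -1;
    }
    free(ptr);
    return 0;
}
```

Built without DEBUG_ABORT_ON_NULL, a NULL argument is logged and ignored; built with it, the same call aborts so the caller's bug is caught early.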
Dan Horák
c86c9297dc Fix inconsistent types in guardedalloc
This fixes mixed use of size_t and uintptr_t, which might differ in size.
2014-10-14 16:11:20 +02:00
Sergey Sharybin
f2280661cb Enable atomic peak memory detection
This gives more precise information about memory usage, which might be really handy
when doing memory optimization.

It works well here as far as I can tell, but if for some reason you
experience an unexpected slowdown, please let me know.
2014-10-10 01:55:57 +06:00
32f83a298c Fix build errors in atomic ops and warning in aligned malloc on OS X. 2014-09-25 23:59:38 +02:00
Sergey Sharybin
faf4f29cc0 Guardedalloc: Implement atomic peak memory update
Updating the maximum requires a retry loop, which usually runs a single iteration
and sometimes a few more, but there appear to be no speed regressions.

For now the code is commented out. This way it's easier for others to verify
there are no speed regressions.

Reviewers: campbellbarton

Differential Revision: https://developer.blender.org/D626
2014-09-26 00:40:53 +06:00
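The retry loop mentioned above is the usual compare-and-swap pattern for an atomic maximum. The sketch below assumes C11 <stdatomic.h> rather than Blender's own atomic_ops.h, and all names (mem_in_use, peak_mem, track_alloc, ...) are hypothetical. reset_peak follows the "Reset peak memory" commit listed earlier, which sets the peak to the currently allocated amount.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Static atomics start out zero-initialized. */
static atomic_size_t mem_in_use;
static atomic_size_t peak_mem;

/* Raise peak_mem to at least `candidate` with a compare-and-swap loop.
 * Usually one iteration; more only if another thread changed the peak. */
static void update_peak(size_t candidate)
{
    size_t prev = atomic_load(&peak_mem);
    while (prev < candidate &&
           !atomic_compare_exchange_weak(&peak_mem, &prev, candidate)) {
        /* On failure `prev` now holds the current peak; re-check and retry. */
    }
}

static void track_alloc(size_t len)
{
    /* fetch_add returns the old value, so add len to get the new total. */
    size_t now = atomic_fetch_add(&mem_in_use, len) + len;
    update_peak(now);
}

static void track_free(size_t len)
{
    atomic_fetch_sub(&mem_in_use, len);
}

/* Restart peak tracking from the amount currently allocated. */
static void reset_peak(void)
{
    atomic_store(&peak_mem, atomic_load(&mem_in_use));
}
```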
Campbell Barton
88ee650263 Comments 2014-08-16 10:51:07 +10:00
Tamito Kajiyama
af585e843b Fix inconsistent use of print_error() and fprintf(stderr, ...) in MEM_guarded_printmemlist_internal().
Also extended the size of buf[] in print_error() to prevent mem_printmemlist_pydict_script[]
from getting truncated when MEM_printmemlist_pydict() is used.

Differential revision: https://developer.blender.org/D675

Reviewed by: Campbell Barton
2014-07-25 19:24:24 +09:00
Sergey Sharybin
ecfc2db6e2 I'd tend to declare that dead code is forbidden
All these code blocks commented out with an UNUSED comment are
really useless.
2014-06-16 14:08:22 +06:00
Campbell Barton
788f4858d7 Comment unused macro 2014-06-14 16:27:13 +10:00
Sergey Sharybin
7e20583688 Attempt to fix sign conversion error happening on buildbot 2014-06-14 03:35:22 +06:00
Sergey Sharybin
d0573ce905 Attempt to fix guardedalloc on OSX 2014-06-14 01:52:35 +06:00
Sergey Sharybin
a87fb34eda Take advantage of SSE2 instructions in gaussian blur node
This gives around a 30% speedup for the gaussian blur node.

Pretty much a straightforward implementation inside the node
itself, but some additional things needed to be implemented:

- Aligned malloc. It's needed to load data into SSE registers
  faster. Based on aligned_malloc() from Libmv, with
  some additional trickery to support arbitrary
  alignment (this magic is needed because of MemHead).

  In practice only 16-byte alignment is supported because
  OS X lacks an aligned malloc with arbitrary alignment.
  Not a big deal for now because we only need 16-byte
  alignment at the moment. Could be tweaked further
  later.

- Memory buffers in compositor are now aligned to 16 bytes.
  Should be harmless for non-SSE cases too, just mentioning.

Reviewers: campbellbarton, lukastoenne, jbakker

Reviewed By: campbellbarton

CC: lockal

Differential Revision: https://developer.blender.org/D564
2014-06-14 00:38:07 +06:00
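To illustrate why a header in front of the block complicates aligned allocation, here is a minimal sketch that over-allocates with plain malloc() and aligns the user pointer by hand. It is not the code from this commit (which builds on the platform's aligned allocation functions); AlignedHead and the *_sketch functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored directly in front of the user pointer. */
typedef struct AlignedHead {
    size_t len;    /* size requested by the caller */
    size_t offset; /* bytes from the raw malloc() pointer to the user pointer */
} AlignedHead;

/* `alignment` must be a power of two and at least sizeof(size_t), so the
 * header in front of the aligned address is itself properly aligned. */
static void *aligned_alloc_sketch(size_t len, size_t alignment)
{
    unsigned char *raw;
    uintptr_t user;
    AlignedHead *head;

    assert(alignment >= sizeof(size_t) && (alignment & (alignment - 1)) == 0);

    /* Room for the header plus worst-case padding. */
    raw = malloc(len + sizeof(AlignedHead) + alignment - 1);
    if (raw == NULL) {
        return NULL;
    }
    /* Skip the header, then round up to the next aligned address. */
    user = ((uintptr_t)raw + sizeof(AlignedHead) + alignment - 1) &
           ~((uintptr_t)alignment - 1);
    head = (AlignedHead *)user - 1;
    head->len = len;
    head->offset = (size_t)(user - (uintptr_t)raw);
    return (void *)user;
}

static size_t aligned_len_sketch(const void *ptr)
{
    return ((const AlignedHead *)ptr - 1)->len;
}

static void aligned_free_sketch(void *ptr)
{
    AlignedHead *head = (AlignedHead *)ptr - 1;
    /* Step back to the pointer malloc() originally returned. */
    free((unsigned char *)ptr - head->offset);
}
```

The stored offset is what lets the free function recover the original pointer, since the amount of padding depends on where malloc() happened to place the raw block.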
Matteo F. Vescovi
9b23d9acec Fix compilation error on non-Linux architectures 2014-06-02 16:26:38 +06:00
Sebastian Ramacher
76f7a5bd6b Fix compilation error on kFreeBSD 2014-05-19 16:35:24 +02:00
Campbell Barton
43fb105ff1 Move LIKELY/UNLIKELY into header 2014-04-06 17:25:50 +10:00
Campbell Barton
43a201662a Guarded Alloc: use UNLIKELY for debug memset 2014-04-06 12:58:10 +10:00
Campbell Barton
c16bd951cd Enable GCC pedantic warnings with strict flags,
also modify MIN/MAX macros to prevent shadowing.
2014-03-30 15:04:20 +11:00
Campbell Barton
a99a8a6070 Code cleanup: style and warnings 2014-03-26 07:53:56 +11:00
Campbell Barton
8480bb64ec Code cleanup: style 2014-03-17 21:48:13 +11:00
Sergey Sharybin
a81cf3182f Fix typo in mmap commit from a while ago 2014-01-23 18:41:38 +06:00
282ad434a8 Memory allocation: do not use mmap for memory allocation on 64 bit.
On Windows we can only do mmap memory allocation up to 4 GB, which causes a
crash when doing very large renders on 64 bit systems with a lot of memory.

As far as I can tell the reason to use mmap is to get around address space
limitation on some 32 bit operating systems, and I can't see a reason to use
it on 64 bit. For the original explanation see here:
http://orange.blender.org/blog/stupid-memory-problems

Fixes T37841.
2014-01-23 01:13:46 +01:00
Campbell Barton
a5183d7a87 Code Cleanup: use NULL for pointer checks and remove joke. 2013-11-22 10:43:42 +11:00
Campbell Barton
50d1129a57 add atomic_ops.h to cmake's source code listing. 2013-10-31 14:09:01 +00:00
Sergey Sharybin
4f6dd555b7 Fix for wrong implementation of mmap in lock-free allocator
- Freeing was not using proper block length
- Duplicating memory block was not aware of
  mmaped blocks.
2013-10-20 00:12:54 +00:00
Brecht Van Lommel
1760f5fdcc Fix FreeBSD build with recent malloc changes, patch by Shane Ambler. 2013-10-11 14:41:00 +00:00
Campbell Barton
e220d3228f add MEM_SIZE_OPTIMAL to avoid memory fragmentation & waste lost to slop-space. 2013-10-10 18:18:13 +00:00
Thomas Dinges
223c637a93 * Fix Windows compiler errors after recent Lock-free memory allocator commit.
Patch by Sergey, thanks. :)
2013-10-10 16:11:57 +00:00
Brecht Van Lommel
b880b01db5 Fix OS X build error in malloc code, and warning in rna. 2013-10-10 15:44:47 +00:00
Sergey Sharybin
4bd4037276 Lock-free memory allocator
Release builds will now use the lock-free allocator by
default, without any internal locks happening.

MemHead is also reduced to the minimum possible.
A size_t still needs to be stored in MemHead so that
we keep track of the memory we request from
the system, not the memory the system actually allocates.
This is probably also faster than using malloc's usable
size function.

The lock-free allocator will tell you whether all
the blocks were freed, but won't give you a list
of unfreed blocks. To get such a list, use the
--debug or --debug-memory command line arguments.

Debug builds have the same behavior as release
builds. This is so tools like valgrind are not
confused by the guarded allocator, as they currently
are.

--
svn merge -r59941:59942 -r60072:60073 -r60093:60094 \
          -r60095:60096 ^/branches/soc-2013-depsgraph_mt
2013-10-10 11:58:01 +00:00
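A minimal sketch of the design described above, assuming C11 atomics: each block carries only its requested length, and global counters are updated atomically so no lock is needed. MemHeadSketch, the lf_* counters and the lockfree_* functions are hypothetical names, not the actual implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Header reduced to the requested length only. */
typedef struct MemHeadSketch {
    size_t len;
} MemHeadSketch;

static atomic_size_t lf_totblock;   /* blocks currently allocated */
static atomic_size_t lf_mem_in_use; /* bytes requested by callers */

static void *lockfree_malloc(size_t len)
{
    MemHeadSketch *head = malloc(sizeof(MemHeadSketch) + len);
    if (head == NULL) {
        return NULL;
    }
    head->len = len;
    atomic_fetch_add(&lf_totblock, 1);
    atomic_fetch_add(&lf_mem_in_use, len);
    /* The user pointer follows the header, so it is aligned to
     * sizeof(size_t) rather than to malloc()'s full alignment. */
    return head + 1;
}

static void lockfree_free(void *ptr)
{
    MemHeadSketch *head = (MemHeadSketch *)ptr - 1;
    atomic_fetch_sub(&lf_totblock, 1);
    atomic_fetch_sub(&lf_mem_in_use, head->len);
    free(head);
}

/* Non-zero at exit means something leaked, but without a block list
 * there is no way to say which allocation it was. */
static size_t lockfree_unfreed_blocks(void)
{
    return atomic_load(&lf_totblock);
}
```

This shows the trade-off from the commit message: counters are enough to detect a leak, while naming the leaked blocks requires the list-keeping guarded allocator.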
Sergey Sharybin
ccd2e4375a Fix compilation error after recent libmv change
- Tweaked typedefs in stdint so they match
  what we've got in BLI_sys_types (needed to
  explicitly tell sign to MSVC).

  It does no harm to be more explicit here,
  but we'd really be better off with a single stdint
  in Blender.

- Tweaked allocation macros so MSVC is happy
  with structure allocation.
2013-10-09 19:49:09 +00:00
Sergey Sharybin
49bc310671 Move guarded object allocation to a guardedalloc header
Also made libmv-capi use guarded object allocation.
Ran into some suspicious cases where it was not
clear whether memory was being freed or not.

Now we'll know for sure whether there are leaks or not :)

Having these macros in a guardedalloc header helps with
using them in other areas (for now it's OCIO and libmv,
but in the future it'll be more places).
2013-10-09 08:46:02 +00:00
Sergey Sharybin
44ff79c432 Added brief instructions on how to build a simple memtest 2013-09-05 16:32:44 +00:00
Campbell Barton
2dc988df8c reorder BLI_strict_flags.h include so it's not conflicting with stdio.h on apple. 2013-09-03 04:39:12 +00:00
Campbell Barton
fe427f0561 kd-tree,
- replace numbers with defines for allocation increments and default array size.
- move array reallocation into a static function (deduplicate 2x).

also fix own mistake with uninitialized slop-space var in memory printing statistics.
2013-09-01 08:58:46 +00:00
Joshua Leung
33c68846de Mingw/Windows Compiling Fix
This commit attempts to fix the following error:

intern\guardedalloc\intern\mallocn.c: In function 'rem_memblock':
intern\guardedalloc\intern\mallocn.c:977:48: error: conversion to 'intptr_t' from 'size_t' may change the sign of the result [-Werror=sign-conversion]

From the references I've managed to find, it appears that
the second arg to munmap() should be size_t not intptr_t.
Fortunately though, we don't use this arg anyways atm, so 
this should be quite harmless...
2013-09-01 05:12:36 +00:00
Campbell Barton
9ad5f32fc0 use strict flags for guarded alloc 2013-09-01 02:46:34 +00:00
Brecht Van Lommel
9135425607 Attempted fix for #36569: couldn't unmap memory errors on Windows. The guardedalloc optimizations were not entirely thread safe for mmap. 2013-08-29 23:46:44 +00:00
Campbell Barton
1ac57ccbc8 correct own recent commit, malloc_usable_size() isn't valid for mmap()'d memory. 2013-08-28 22:12:40 +00:00
Campbell Barton
1a6b364c28 should fix builds for osx 2013-08-28 11:22:29 +00:00
Campbell Barton
d1d6a13297 include slop-space in debug statistics (gcc/clang only) 2013-08-28 10:17:26 +00:00
Campbell Barton
b97334f992 add GPL header to treehash.c and add missing includes to cmake. 2013-08-24 03:17:28 +00:00
Sergey Sharybin
c0f8e15295 Speedup for guarded allocator
- Re-arrange locks, so no actual memory allocation
  (which is relatively slow) happens from inside
  the lock. The operating system will take care of any
  locks which might be needed there on its own.

- Use a spin lock instead of a mutex: only list
  operations happen inside the lock, so there is
  no need for a mutex here.

- Use atomic operations for the memory-in-use and total
  used blocks counters.

This makes the guarded allocator almost the same speed
as the non-guarded one in files from the Tube project.

There's still MemHead/MemTail overhead which might
be bad for CPU cache utilization
2013-08-19 10:51:40 +00:00
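The three points above can be sketched together as follows, assuming C11 atomics: malloc() and free() run outside the lock, a spin lock built on atomic_flag guards only the linked-list update, and the block counter is atomic. Block, list_first and the *_sketch functions are hypothetical names, not the actual guardedalloc code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical list node stored in front of each allocation. */
typedef struct Block {
    struct Block *next, *prev;
    size_t len;
} Block;

static atomic_flag list_lock = ATOMIC_FLAG_INIT;
static Block *list_first;
static atomic_size_t blocks_in_use;

static void list_spin_lock(void)
{
    /* Busy-wait: acceptable because the critical section is a few stores. */
    while (atomic_flag_test_and_set_explicit(&list_lock, memory_order_acquire)) {
    }
}

static void list_spin_unlock(void)
{
    atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

static void *guarded_malloc_sketch(size_t len)
{
    /* The slow system allocation happens before taking the lock. */
    Block *b = malloc(sizeof(Block) + len);
    if (b == NULL) {
        return NULL;
    }
    b->len = len;
    b->prev = NULL;

    list_spin_lock();
    b->next = list_first;
    if (list_first) {
        list_first->prev = b;
    }
    list_first = b;
    list_spin_unlock();

    atomic_fetch_add(&blocks_in_use, 1);
    return b + 1;
}

static void guarded_free_sketch(void *ptr)
{
    Block *b = (Block *)ptr - 1;

    list_spin_lock();
    if (b->prev) {
        b->prev->next = b->next;
    }
    else {
        list_first = b->next;
    }
    if (b->next) {
        b->next->prev = b->prev;
    }
    list_spin_unlock();

    atomic_fetch_sub(&blocks_in_use, 1);
    free(b); /* again outside the lock */
}
```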
Sergey Sharybin
018ab045e3 Added check for whether thread lock is being removed while thread is using guarded alloc.
--
svn merge -r58788:58789 ^/branches/soc-2013-depsgraph_mt
2013-08-19 10:38:27 +00:00
Sergey Sharybin
58d7ae891d Blender can be compiled without guardedalloc again
This is useful for benchmark tests, to make CPU cache
utilization as good as we could with current design.
2013-08-15 07:36:56 +00:00
Campbell Barton
ce2e2b141e use gcc malloc attribute for low level allocation functions, prevents gcc from checking if resulting pointers alias existing pointers, also use sentinel attribute for uiButGetStrInfo so incorrect usage gives a warning. 2013-08-05 20:57:13 +00:00
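For readers unfamiliar with the two GCC attributes named above, a small sketch follows. The macro and function names are hypothetical and are not the ones used in Blender. The `malloc` attribute tells the compiler that the returned pointer does not alias any existing pointer, and `sentinel` makes the compiler warn when a variadic call is not terminated with NULL.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

#if defined(__GNUC__)
#  define ATTR_MALLOC_SKETCH __attribute__((malloc))
#  define ATTR_SENTINEL_SKETCH __attribute__((sentinel))
#else
#  define ATTR_MALLOC_SKETCH
#  define ATTR_SENTINEL_SKETCH
#endif

/* The result is fresh memory, so the compiler may assume it aliases nothing. */
static void *alloc_bytes(size_t len) ATTR_MALLOC_SKETCH;

/* Calls must end with a NULL argument, or GCC emits a warning. */
static int count_strings(const char *first, ...) ATTR_SENTINEL_SKETCH;

static void *alloc_bytes(size_t len)
{
    return malloc(len);
}

/* Counts the string arguments up to, but not including, the NULL sentinel. */
static int count_strings(const char *first, ...)
{
    va_list args;
    const char *s;
    int count = 0;

    va_start(args, first);
    for (s = first; s != NULL; s = va_arg(args, const char *)) {
        count++;
    }
    va_end(args);
    return count;
}
```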
Campbell Barton
bd89bd9e1c avoid using MEM_reallocN_id directly, add utility macro for freeing. 2013-08-04 03:00:04 +00:00