- Re-arrange locks, so no actual memory allocation
(which is relatively slow) happens from inside
the lock. operation system will take care of locks
which might be needed there on it's own.
- Use spin lock instead of mutex, since it's just
list operations happens from inside lock, no need
in mutex here.
- Use atomic operations for memory in use and total
used blocks counters.
This makes guarded allocator almost the same speed
as non-guarded one in files from Tube project.
There're still MemHead/MemTail overhead which might
be bad for CPU cache utilization.
TODO: We need smarter 32/64bit compile-time check,
currently i'm afraid only x86 CPU family is
detecting reliably.
Now sounds that stopped playing but are still kept in the device can be differentiated from paused sounds with this state.
This should also fix the performance issues mentioned in [#36466] End of SequencerEntrys not set correctly.
Please test if sound pausing, resuming and stopping works fine in the BGE and sequencer, my tests all worked fine, but there might be a use case that needs some fixing.
This replaces code (pseudo-code):
spin_lock();
update_child_dag_nodes();
schedule_new_nodes();
spin_unlock();
with:
update_child_dag_nodes_with_atomic_ops();
schedule_new_nodes();
The reason for this is that scheduling new nodes implies
mutex lock, and having spin around it is a bad idea.
Alternatives could have been to use spinlock around
child nodes update only, but that would either imply having
either per-node spin-lock or using array to put nodes
ready for update to an array.
Didn't like an alternatives, using atomic operations makes
code much easier to follow, keeps data-flow on cpu nice.
Same atomic ops might be used in other performance-critical
areas later.
Using atomic ops implementation from jemalloc project.
* Remove code for the unused Wave texture variations.
We have quite some unused code in the texture area, I guess it doesn't harm to clean a bit up here.
We can always get the code back from SVN if we need something.
* GPU kernel can now be compiled without __NON_PROGRESSIVE__ again, was broken after my last commit. Also add a check for have_error(), in case the GPU kernel comes without Non-Progressive, to avoid a crash.
* Don't compile progressive kernel twice on CPU, if __NON_PROGRESSIVE__ would be disabled there.
* Non-Progressive integrator is now available on the GPU (CUDA, sm_20 and above).
Implementation details:
* kernel_path_trace() has been split up into two functions:
kernel_path_trace_non_progressive() and kernel_path_trace_progressive().
* We compile two CUDA kernel entry functions (in kernel.cu) for the two integrators, they are still inside one .cubin file but due to the kernel separation there should be no performance problem. I tested with the BMW file on my Geforce 540M and the render times were the same for 100 samples (1.57 min in my case).
This is part of my GSoC project, SVN merge of r59032 + manual merge of UI changes for this from my branch.
for texture system in advance. Patch by Martijn Berger, with some tweaks.
There was about a 10% performance improvement on OS X in my tests with the
images.blend test file. This may be less on other platforms because OS X has
particularly slow mutex locks.
* Render Passes are now available for Subsurface Scattering (Direct, Indirect and Color pass).
This is part of my GSoC project, SVN merge of r58587, r58828 and r58835.
* After some feedback decided to remove this option from the Progressive integrator, it only makes sense for Non-Progressive where we have different values for the sample types.
* Added a node to convert a temperature in Kelvin to an RGB color. This can be used e.g. for lights, to easily find the right color temperature.
= Some common temperatures =
Candle light: 1500 Kelvin
Sunset/Sunrise: 1850 Kelvin
Studio lamps: 3200 Kelvin
Horizon daylight: 5000 Kelvin
Documentation: http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/More#Blackbody
Thanks to Philipp Oeser (lichtwerk), who essentially contributed to this with a patch! :)
This is part of my GSoC 2013 project. SVN merge of r57424, r57487, r57507, r57525, r58253 and r58774
* Added a Ray Depth output to the Light Path node, which gives the user access to the current bounce.
This can be used to limit the maximum ray bounce on a per shader basis. Another use case is to restrict light influence with this, to have a lamp only contribute to the direct lighting.
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/More#Light_Path
This is part of my GSoC 2013 project. SVN merge of r58091 and r58772 from soc-2013-dingto.