MSVC 2008 ignores alignement attribute when assigning from unaligned
float4 vector, returned from other function. Now Cycles uses unaligned
loads instead of casts for win32 in x86 mode.
The AVX kernel functions for reading image textures could be get used from non-AVX
kernels. These are C++ class methods and need to be marked for inlining, all other
functions are static so they don't leak into other kernels.
* Change Info in header, put more important info to the left
* API: Move Camera width/height to camera, add some film properties
* Add ESC key to help menu
This is very basic for now, but can be extended with more info (available devices for example) later.
Thanks to Bastien and Sergey for some help with the glRect coordinates stuff.
* AVX is available on Intel Sandy Bridge and newer and AMD Bulldozer and newer.
* We don't use dedicated AVX intrinsics yet, but gcc auto vectorization gives a 3% performance improvement for Caminandes. Tested on an i5-3570, Linux x64.
* No change for Windows yet, MSVC 2008 does not support AVX.
Reviewed by: brecht
Differential Revision: https://developer.blender.org/D216
Gives up to 15% speedup scenes with voronoi-based textures (up to 25% with volumes) on Haswell. The performance change for other CPUs is much smaller: 1-2%.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D203
This actually works somewhat now, although viewport rendering is broken and any
kind of network error or connection failure will kill Blender.
* Experimental WITH_CYCLES_NETWORK cmake option
* Networked Device is shown as an option next to CPU and GPU Compute
* Various updates to work with the latest Cycles code
* Locks and thread safety for RPC calls and tiles
* Refactored pointer mapping code
* Fix error in CPU brand string retrieval code
This includes work by Doug Gale, Martijn Berger and Brecht Van Lommel.
Reviewers: brecht
Differential Revision: http://developer.blender.org/D36
This code can't actually be enabled for building and is incomplete, but it's
here because we know we want to support this at some point and there's not much
reason to have it in a separate branch if a simple #ifdef can disable it.
This is mostly work towards enabling the __KERNEL_SSE__ option to start using
SIMD operations for vector math operations. This 4.1 kernel performes about 8%
faster with that option but overall is still slower than without the option.
WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the cmake flag for testing this kernel.
Alignment of int3, int4, float3, float4 to 16 bytes seems to give a slight 1-2%
speedup on tested systems with the current kernel already, so is enabled now.
This to avoids build conflicts with libc++ on FreeBSD, these __ prefixed values
are reserved for compilers. I apologize to anyone who has patches or branches
and has to go through the pain of merging this change, it may be easiest to do
these same replacements in your code and then apply/merge the patch.
Ref T37477.
* 32 bit GCC builds now have the SSE BVH optimizations turned off, but still
compile with SSE flags for better performance.
* White color when rendering on Windows seems to have been unrelated to SSE,
rather it was a graphics driver not supporting half float textures, added a
check for that now.
There is some sort of problem with the SSE2 code path, but I couldn't find
the cause, maybe a compiler bug due to the large amount of inlining? For
now I've disabled SSE2 optimizatons in 32 bit GCC builds.
except for curves, that's still missing from the OpenColorIO GLSL shader.
The pixels are stored in a half float texture, converterd from full float with
native GPU instructions and SIMD on the CPU, so it should be pretty quick.
Using a GLSL shader is useful for GPU render because it avoids a copy through
CPU memory.