This way util_simd.cpp would not require modifications
if/when SSE2 is suddenly supported on 32bit platforms.
This also allowed to unleash some issues with util_simd.h
related on the fact that there size_t and int are actually
the same types.
This makes the code a bit easier to understand, and might come in handy
if we want to reuse more Embree code.
Differential Revision: https://developer.blender.org/D482
Code by Brecht, with fixes by Lockal, Sergey and myself.
Gives ~11% speedup for hair.blend, ~10% for koro_final.blend
Also extract few common subexpressions in hair calculation.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D318
MSVC 2008 ignores alignement attribute when assigning from unaligned
float4 vector, returned from other function. Now Cycles uses unaligned
loads instead of casts for win32 in x86 mode.
Gives up to 15% speedup scenes with voronoi-based textures (up to 25% with volumes) on Haswell. The performance change for other CPUs is much smaller: 1-2%.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D203