Gives ~11% speedup for hair.blend, ~10% for koro_final.blend
Also extract few common subexpressions in hair calculation.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D318
MSVC 2008 ignores alignement attribute when assigning from unaligned
float4 vector, returned from other function. Now Cycles uses unaligned
loads instead of casts for win32 in x86 mode.
Gives up to 15% speedup scenes with voronoi-based textures (up to 25% with volumes) on Haswell. The performance change for other CPUs is much smaller: 1-2%.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D203