forked from bartvdbraak/blender
e3a79258d1
This is mostly work towards enabling the __KERNEL_SSE__ option so that vector math can start using SIMD operations. The SSE 4.1 kernel performs about 8% faster with that option, but overall it is still slower than without the option. WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the CMake flag for testing this kernel. Aligning int3, int4, float3 and float4 to 16 bytes already seems to give a slight 1-2% speedup on the tested systems with the current kernel, so it is now enabled. |
||
---|---|---|
CMakeLists.txt | ||
device_cpu.cpp | ||
device_cuda.cpp | ||
device_intern.h | ||
device_memory.h | ||
device_multi.cpp | ||
device_network.cpp | ||
device_network.h | ||
device_opencl.cpp | ||
device_task.cpp | ||
device_task.h | ||
device.cpp | ||
device.h |
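
The 16-byte alignment mentioned in the commit message can be illustrated with a minimal sketch. The type `aligned_float4` and the helper `add4` are hypothetical names chosen for this example and are not the Cycles definitions. The sketch assumes that the point of the alignment is to allow aligned SSE loads and stores, and it falls back to scalar code when `__SSE__` is not defined.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE__)
#  include <xmmintrin.h>  // SSE intrinsics: _mm_load_ps, _mm_add_ps, _mm_store_ps
#endif

// Hypothetical 4-component vector, forced to 16-byte alignment.
// This is an illustration only, not the float4 type used by Cycles.
struct alignas(16) aligned_float4 {
    float x, y, z, w;
};

static_assert(alignof(aligned_float4) == 16, "type must be 16-byte aligned");
static_assert(sizeof(aligned_float4) == 16, "type must fill one SSE register");

// Component-wise addition. With SSE available, the 16-byte alignment
// makes the aligned load/store intrinsics safe to use.
inline aligned_float4 add4(const aligned_float4 &a, const aligned_float4 &b)
{
#if defined(__SSE__)
    const __m128 va = _mm_load_ps(&a.x);  // aligned load of x, y, z, w
    const __m128 vb = _mm_load_ps(&b.x);
    aligned_float4 r;
    _mm_store_ps(&r.x, _mm_add_ps(va, vb));  // aligned store of the sum
    return r;
#else
    // Scalar fallback for targets without SSE.
    return aligned_float4{a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
#endif
}
```

Without the `alignas(16)` specifier, a struct of four floats would typically only be 4-byte aligned, and the aligned load `_mm_load_ps` could fault on an address that is not a multiple of 16.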