forked from bartvdbraak/blender
6bf4115c13
Reduce thread divergence in kernel_shader_eval. Rays are sorted in blocks of 2048 according to shader->id. On an R9 290, Classroom is ~30% faster and Pabellon Barcelone is ~8% faster. No sorting for the CUDA split kernel. Reviewers: sergey, maiself. Reviewed By: maiself. Differential Revision: https://developer.blender.org/D2598 |
||
---|---|---|
kernel_avx2.cpp | ||
kernel_avx.cpp | ||
kernel_cpu_image.h | ||
kernel_cpu_impl.h | ||
kernel_cpu.h | ||
kernel_split_avx2.cpp | ||
kernel_split_avx.cpp | ||
kernel_split_sse2.cpp | ||
kernel_split_sse3.cpp | ||
kernel_split_sse41.cpp | ||
kernel_split.cpp | ||
kernel_sse2.cpp | ||
kernel_sse3.cpp | ||
kernel_sse41.cpp | ||
kernel.cpp |
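
The commit message above describes sorting rays in blocks of 2048 by `shader->id`, so that neighbouring GPU threads tend to evaluate the same shader. The following is a minimal CPU-side sketch of that idea only, not the Cycles split-kernel code: the function name `sort_rays_by_shader`, its parameters, and the use of a plain vector of shader ids are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Cycles): returns a permutation of ray
// indices in which every consecutive block of `block_size` rays is ordered
// by shader id. Rays never move between blocks; only the order inside each
// block changes.
std::vector<int> sort_rays_by_shader(const std::vector<int>& shader_ids,
                                     std::size_t block_size = 2048)
{
    std::vector<int> order(shader_ids.size());
    std::iota(order.begin(), order.end(), 0);  // identity permutation 0..n-1

    if (block_size == 0) {
        return order;  // nothing sensible to sort; keep the original order
    }

    for (std::size_t begin = 0; begin < order.size(); begin += block_size) {
        const std::size_t end = std::min(begin + block_size, order.size());
        // A stable sort keeps rays with equal shader ids in their original
        // relative order inside the block.
        std::stable_sort(order.begin() + static_cast<std::ptrdiff_t>(begin),
                         order.begin() + static_cast<std::ptrdiff_t>(end),
                         [&shader_ids](int a, int b) {
                             return shader_ids[a] < shader_ids[b];
                         });
    }
    return order;
}
```

A caller would process rays in the returned order, for example `order[i]` as the ray handled by thread `i`. Sorting within fixed-size blocks rather than across the whole ray buffer keeps the sort local and cheap, at the cost of leaving some divergence at block boundaries.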