blender/intern/cycles/util
Sergey Sharybin 064caae7b2 Cycles: BVH-related SSE optimization
Several ideas here:

- Optimize calculation of near_{x,y,z} in a way that does not require
  3 if() statements per update, which avoids the cost of branch
  mispredictions.

- Optimization of direction clamping for BVH.

- Optimization of point/direction transform.
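
As an illustration of the first idea, here is a minimal sketch of how a
per-axis near/far index can be computed without an if(). It is not the
code from this commit: the names NearFar, near_far_branchy and
near_far_branchless are hypothetical, and it assumes the index is 0 for
a non-negative inverse direction component and 1 for a negative one.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type: child slot indices for one axis.
struct NearFar {
    int near_idx;
    int far_idx;
};

// Reference version with a data-dependent branch per axis.
inline NearFar near_far_branchy(float idir)
{
    NearFar r;
    if (idir >= 0.0f) {
        r.near_idx = 0;
        r.far_idx = 1;
    }
    else {
        r.near_idx = 1;
        r.far_idx = 0;
    }
    return r;
}

// Branchless version: the IEEE 754 sign bit is already the near index.
// Note: -0.0f and NaNs with the sign bit set map to 1 here, unlike the
// comparison above, so this assumes such inputs are clamped beforehand.
inline NearFar near_far_branchless(float idir)
{
    std::uint32_t bits;
    std::memcpy(&bits, &idir, sizeof(bits)); // well-defined type pun
    const int sign = static_cast<int>(bits >> 31);
    return {sign, sign ^ 1};
}
```

On SSE hardware the same sign bits can be gathered for several lanes at
once (for example with _mm_movemask_ps), which is one plausible way to
update all three axes together; whether the commit does exactly that is
not stated here.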

Brings another ~1.5% speedup, depending on the scene. Unfortunately,
this speedup can't simply be summed with those of the previous commits,
because the gain from each change varies from scene to scene. Overall
it still seems to be a nice, solid speedup of a few percent on Linux,
and a bigger speedup was reported on Windows.

Once again, thanks Maxym for the inspiration!

Still TODO: We have multiple places where we need to calculate the near
x,y,z indices in BVH; for now it's only done for the main BVH traversal.
Will try to move this calculation to a utility function and see if
it can be easily re-used across all the BVH flavors.
2016-10-25 14:47:34 +02:00
CMakeLists.txt Cycles: Add new avxf vectorized data type 2016-10-12 13:54:13 +02:00
util_algorithm.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_aligned_malloc.cpp Cycles: Some cleanup, should be no functional changes 2016-02-16 15:33:00 +01:00
util_aligned_malloc.h Cycles: Use size_t for aligned allocator 2015-02-19 22:19:29 +05:00
util_args.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_atomic.h Cycles: Code cleanup, spaces around keyword and brace 2015-06-01 19:49:52 +05:00
util_avxf.h Cycles: Add new avxf vectorized data type 2016-10-12 13:54:13 +02:00
util_boundbox.h Cycles: Implement unaligned nodes BVH builder 2016-07-07 17:25:48 +02:00
util_color.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_debug.cpp Cycles: Make CUDA adaptive feature compile a Debug flag. 2016-05-06 23:13:33 +02:00
util_debug.h Cycles: Use static assert to control structures alignment 2016-08-11 10:12:06 +02:00
util_foreach.h Optionally use c++11 stuff instead of boost in cycles where possible. We do and continue to depend on boost though 2015-03-29 22:12:40 +02:00
util_function.h Cycles: Correction to previous commit: non-msvc compilers also should use nullptr 2015-03-30 15:17:09 +05:00
util_guarded_allocator.cpp Cycles: Fix static initialization order fiasco 2016-10-24 13:47:39 +02:00
util_guarded_allocator.h Cycles: Stop rendering when bad_alloc happens 2016-04-20 16:19:49 +02:00
util_half.h Cycles: Enable half float support (4 channels and 1 channel) on CUDA. 2016-08-11 22:47:53 +02:00
util_hash.h Cycles: Code cleanup, spaces around keywords 2015-03-28 00:15:15 +05:00
util_image.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_list.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_logging.cpp Cycles: Be ready for gflags namespace auto-detect 2015-01-01 01:31:08 +05:00
util_logging.h Cycles: Log whch optimizations are used for CPU kernels 2016-01-06 20:25:19 +05:00
util_map.h Optionally use c++11 stuff instead of boost in cycles where possible. We do and continue to depend on boost though 2015-03-29 22:12:40 +02:00
util_math_cdf.cpp Cycles: Support user-defined shutter curve 2015-10-28 02:43:06 +05:00
util_math_cdf.h Cycles: Fix compilation error with MSVC 2015-10-28 17:33:31 +05:00
util_math_fast.h Cycles: Fix three numerical issues in the fresnel, normal map and Beckmann code 2016-07-16 20:54:14 +02:00
util_math.h Cycles: Implement SSE-optimized path of util_max_axis() 2016-10-25 13:54:17 +02:00
util_md5.cpp Cycles: Cleanup, indentation and braces 2016-02-03 15:00:55 +01:00
util_md5.h Cycles: add utility function to calculate MD5 hash of a given string 2015-11-21 22:07:59 +05:00
util_opengl.h Cycles: Post-reintegration tweaks to ensure things do compile 2015-01-01 01:31:08 +05:00
util_optimization.h Cycles: Minor cleanup, whitespace around keyword and preprocessor indent 2016-04-13 08:58:52 +02:00
util_param.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_path.cpp Cycles: Improve OpenCL line information handling 2016-09-29 10:20:24 +02:00
util_path.h Cycles: Improve OpenCL line information handling 2016-09-29 10:20:24 +02:00
util_progress.h Fix Cycles compile errors with GCC due to double promotion as errors. 2016-05-22 19:17:22 +02:00
util_queue.h Cycles: Avoid recursion when doing constant fold 2015-12-02 16:19:39 +05:00
util_set.h Cycles: Re-implement some utilities to avoid use of boost 2016-02-06 19:19:20 +01:00
util_simd.cpp Cycles: Add an option to build single kernel only which fits current CPU 2016-03-25 16:09:05 +01:00
util_simd.h Cycles: Add new avxf vectorized data type 2016-10-12 13:54:13 +02:00
util_sky_model_data.h Cleanup: Move Cycles sky model data to util. 2016-02-13 13:41:40 +01:00
util_sky_model.cpp Fix Cycles compile errors with GCC due to double promotion as errors. 2016-05-22 19:17:22 +02:00
util_sky_model.h Cleanup: Move Cycles sky model data to util. 2016-02-13 13:41:40 +01:00
util_sseb.h Cycles: add better specializations for SSE shuffle function and few more wrappers. 2015-03-07 17:25:21 +00:00
util_ssef.h Cycles: Cleanup, indentation and braces 2016-02-03 15:00:55 +01:00
util_ssei.h Cycles: add better specializations for SSE shuffle function and few more wrappers. 2015-03-07 17:25:21 +00:00
util_stack_allocator.h Cycles: Fix issues with stack allocator in MSVC 2016-04-25 13:50:27 +02:00
util_static_assert.h Fix T49286: Compilation error with XCode 7.0 2016-09-08 09:27:51 +02:00
util_stats.h Cycles: Fix static initialization order fiasco 2016-10-24 13:47:39 +02:00
util_string.cpp Cycles OpenCL: use #line directives for better error messages. 2016-07-30 18:25:52 +02:00
util_string.h Cycles OpenCL: use #line directives for better error messages. 2016-07-30 18:25:52 +02:00
util_system.cpp Cycles: Fix compilation error on OSX 2016-06-06 13:52:57 +02:00
util_system.h Cycles: Add support of processor groups 2016-06-06 09:14:37 +02:00
util_task.cpp Cycles: Add support of processor groups 2016-06-06 09:14:37 +02:00
util_task.h Cycles: Use explicit qualifier for single-argument constructors 2016-05-11 16:51:14 +02:00
util_texture.h Cycles: Add single channel texture support for OpenCL. 2016-08-14 20:21:08 +02:00
util_thread.cpp Cycles: Add support of processor groups 2016-06-06 09:14:37 +02:00
util_thread.h Cycles: Add support of processor groups 2016-06-06 09:14:37 +02:00
util_time.cpp Cycles: Re-implement some utilities to avoid use of boost 2016-02-06 19:19:20 +01:00
util_time.h Cycles: Use explicit qualifier for single-argument constructors 2016-05-11 16:51:14 +02:00
util_transform.cpp Fix Cycles compile errors with GCC due to double promotion as errors. 2016-05-22 19:17:22 +02:00
util_transform.h Cycles: BVH-related SSE optimization 2016-10-25 14:47:34 +02:00
util_types.h Cycles: Use more SSE intrinsics for float3 type 2016-10-12 14:43:00 +02:00
util_vector.h Cycles microdisplacement: Support for Catmull-Clark subdivision via OpenSubdiv 2016-08-07 11:13:11 -04:00
util_version.h Cleanup string includes after versioning commits 2016-04-13 09:45:32 +02:00
util_view.cpp Fix Cycles compile errors with GCC due to double promotion as errors. 2016-05-22 19:17:22 +02:00
util_view.h Cleanup: Fix Cycles Apache header. 2014-12-25 02:50:24 +01:00
util_windows.cpp Fix compilation error on 32 bit Windows 2016-06-06 14:01:49 +02:00
util_windows.h Cycles: Add support of processor groups 2016-06-06 09:14:37 +02:00
util_xml.h Cycles: Fix compilation error when OIIO is compiled with external PugiXML parser 2015-01-01 01:31:07 +05:00