blender

Author	SHA1	Message	Date
Sergey Sharybin	61db9ee27a	Cycles: Attempt to work around compilation error on new CUDA toolkit and sm_2x	2017-03-29 11:50:17 +02:00
Sergey Sharybin	6ea54fe9ff	Cycles: Switch to reformulated Pluecker ray/triangle intersection The intention of this commit is to address the issues mentioned in reports T43865, T50164 and T50452. The code is based on Embree code with some extra vectorization to speed up single ray to single triangle intersection. Unfortunately, the fix does not come for free. There is some slowdown on AVX2 processors, mainly due to different vectorization code, which results in a different number of executed instructions and different instructions-per-cycle counters. On the other hand, this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit faster. The performance is as follows (columns: 2.78c AVX2 | 2.78c AVX | Patch AVX2 | Patch AVX): BMW 05:21.09 | 06:05.34 | 05:32.97 (+3.5%) | 05:34.97 (-8.5%); Classroom 16:55.36 | 18:24.51 | 17:10.41 (+1.4%) | 17:15.87 (-6.3%); Fishy Cat 08:08.49 | 08:36.26 | 08:09.19 (+0.2%) | 08:12.25 (-4.7%); Koro 11:22.54 | 11:45.24 | 11:13.25 (-1.5%) | 11:43.81 (-0.3%); Barcelone 14:18.32 | 16:09.46 | 14:15.20 (-0.4%) | 14:25.15 (-10.8%). On GPU the performance is about 1.5-2% slower in my tests on a GTX1080, but I'm afraid we can't do much about that as part of this change and have to consider it a price to pay for a more proper intersection check. Made in collaboration with Maxym Dmytrychenko, big thanks to him! Reviewers: brecht, juicyfruit, lukasstockner97, dingto Differential Revision: https://developer.blender.org/D1574	2017-03-28 17:26:47 +02:00
Sergey Sharybin	a96110e710	Cycles: Remove old non-optimized triangle intersection function It is unused now, and if we want a similar function we should use the Pluecker intersection, which has the same performance with SSE optimization but is more watertight.	2017-03-23 17:59:34 +01:00
Sergey Sharybin	a1348dde2e	Cycles: Fix speed regression on GPU Avoid construction of a temporary array and make the utility function force-inlined. Additionally, avoid calling float4_to_float3 twice. This brings render times back to the same values as before the current patch series.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	a5b6742ed2	Cycles: Move watertight triangle intersection to a utility file This way the code can be reused more easily.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	f8a999c965	Cycles: Move triangle intersection precalc to a util file This is preparation work for the follow-up commit, which will move the remaining parts of the Woop intersection logic to a utility file. Doing it as a separate commit to keep changes more atomic and easier to bisect when/if needed.	2017-03-23 17:45:19 +01:00
Sergey Sharybin	aa0602130b	Cycles: Cleanup, code style and comments	2017-03-23 17:45:19 +01:00
Sergey Sharybin	1c5cceb7af	Cycles: Move intersection math to own header file This has the following benefits: - Modifying the intersection algorithm will not cause so much re-compilation. - It works around header dependency hell and allows us to use vectorization types much more easily there.	2017-03-23 17:45:19 +01:00

8 Commits