3742 Commits

Author SHA1 Message Date
Sergey Sharybin
867d311307 Cycles: Fix warning with MSVC 2017-04-07 18:28:38 +02:00
Sergey Sharybin
7d77b3e813 Cycles: Fix compilation error with certain CUDA and host compiler configuration
This seems to happen on Windows only, happened to Thomas and Nathan already.

Similar to the patch Thomas was showing, but I do not see it committed. So committing
it now in order to make more developers and users happy.
2017-04-07 18:28:38 +02:00
lazydodo
b332fc8f23 [Cycles/msvc] Get cycles_kernel compile time under control.
Ever since we merged the extra texture types (half etc.) and the split kernel, the compile time for cycles_kernel has been going out of control.

It's currently sitting at a cool 1295.762 seconds with our standard compiler (2013/x64/release)

I'm not entirely sure why msvc gets upset with it, but the inlining of the matrix near the bottom of the tri-cubic 3d interpolator is the source of the issue. This patch excludes it from being inlined.

This patch brings it back down to a manageable 186 seconds. (7x faster!!)

With the attached bzzt.blend that @sergey kindly provided, I got the following results with builds with identical hashes:

58:51.73 buildbot
58:04.23 Patched

It's really close. The slight speedup could be explained by the switch replacing multiple ifs (a switch can generate better code than a chain of if/else/if/else statements), but in all honesty it might just have been pure luck (dev box, very polluted, bad for benchmarks). Regardless, this patch doesn't seem to slow anything down in my limited testing.

{F532336}

{F532337}

Reviewers: brecht, lukasstockner97, juicyfruit, dingto, sergey

Reviewed By: brecht, dingto, sergey

Subscribers: InsigMathK, sergey

Tags: #cycles

Differential Revision: https://developer.blender.org/D2595
2017-04-07 10:26:55 -06:00
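
Excluding one function from inlining, as the message above describes, could look roughly like the sketch below. `__declspec(noinline)` (MSVC) and `__attribute__((noinline))` (GCC/Clang) are real compiler extensions, but the macro name `EXAMPLE_NOINLINE` and both functions are hypothetical stand-ins, not the code touched by this commit.

```cpp
#include <cassert>

// Hypothetical portability macro; the name is an assumption for this sketch.
#if defined(_MSC_VER)
#  define EXAMPLE_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#  define EXAMPLE_NOINLINE
#endif

// Stand-in for a large helper (64 weights, as a tri-cubic stencil would have).
// Marking it noinline keeps the compiler from expanding it into every caller.
EXAMPLE_NOINLINE float weighted_sum_64(const float *coeffs, const float *values)
{
  assert(coeffs != nullptr && values != nullptr);
  float sum = 0.0f;
  for (int i = 0; i < 64; ++i) {
    sum += coeffs[i] * values[i];
  }
  return sum;
}

// Caller stays small; the heavy body lives in exactly one place.
inline float interpolate(const float *coeffs, const float *values)
{
  return weighted_sum_64(coeffs, values);
}
```

The result is unchanged either way; only where the function body is emitted differs, which is what can affect compile time.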
Sergey Sharybin
fd08570665 Cycles: Fix access of NULL pointer as array
Was confusing guarded allocator for some reason.
2017-04-07 15:08:00 +02:00
Sergey Sharybin
9706bfd25d Cycles: Fix corrupted mesh render when topology differs at the next frame 2017-04-07 12:49:10 +02:00
Mai Lavelle
91b9db0724 Cycles: Change work pool and global size of split CPU for easier debugging 2017-04-07 06:06:08 -04:00
Mai Lavelle
8f85ee2fc9 Cycles: Fix indentation 2017-04-07 06:06:08 -04:00
Mai Lavelle
5b45fff136 Cycles: Add missing flush 2017-04-07 06:06:08 -04:00
Mai Lavelle
d66ffaebef Cycles: Check ray state properly to avoid endless loop
The state mask wasn't applied before comparison, giving false results. It
shouldn't really happen that a ray state contains any flags that need to
be masked away, but if it does happen it's better to not get stuck.
2017-04-07 06:06:08 -04:00
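
The kind of check described above can be sketched as follows. The bit layout, the constant names and both functions are hypothetical and only illustrate why the mask has to be applied before the comparison; they are not the kernel's actual definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: the low 8 bits hold the ray state, higher bits hold extra flags.
constexpr std::uint32_t RAY_STATE_MASK = 0xFFu;
constexpr std::uint32_t RAY_ACTIVE = 1u;
constexpr std::uint32_t RAY_EXTRA_FLAG = 1u << 8;

// Unmasked comparison: a stray flag bit makes an active ray look like "not active".
inline bool is_state_unmasked(std::uint32_t ray_state, std::uint32_t state)
{
  return ray_state == state;
}

// Masked comparison: flag bits are stripped before comparing.
inline bool is_state(std::uint32_t ray_state, std::uint32_t state)
{
  // The state we compare against must itself fit inside the mask.
  assert((state & ~RAY_STATE_MASK) == 0);
  return (ray_state & RAY_STATE_MASK) == state;
}
```

A loop that waits for "no ray is active" and uses the unmasked form would never see such a ray as active and could therefore fail to make progress on it.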
Sergey Sharybin
52029e689c Cycles: Fix race condition in attributes creation during SVM compilation 2017-04-05 14:57:54 +02:00
Sergey Sharybin
3ce30823ff Cycles: Add utility class to simplify scoped spin locks 2017-04-05 14:57:34 +02:00
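
A scoped spin lock is usually an RAII wrapper: the constructor acquires the lock and the destructor releases it. The sketch below shows the general pattern using `std::atomic`; the class names and the `try_lock` helper are assumptions for illustration and do not reproduce the utility added by this commit.

```cpp
#include <atomic>
#include <cassert>

// Minimal spin lock built on std::atomic<bool> (illustrative only).
class SpinLock {
 public:
  void lock()
  {
    while (locked_.exchange(true, std::memory_order_acquire)) {
      // Busy-wait until the previous owner releases the lock.
    }
  }
  bool try_lock()
  {
    return !locked_.exchange(true, std::memory_order_acquire);
  }
  void unlock()
  {
    assert(locked_.load(std::memory_order_relaxed));  // Must be held when released.
    locked_.store(false, std::memory_order_release);
  }

 private:
  std::atomic<bool> locked_{false};
};

// RAII guard: locks on construction, unlocks when it goes out of scope,
// including on early return or exception.
class ScopedSpinLock {
 public:
  explicit ScopedSpinLock(SpinLock &lock) : lock_(lock) { lock_.lock(); }
  ~ScopedSpinLock() { lock_.unlock(); }
  ScopedSpinLock(const ScopedSpinLock &) = delete;
  ScopedSpinLock &operator=(const ScopedSpinLock &) = delete;

 private:
  SpinLock &lock_;
};
```

The benefit over manual `lock()`/`unlock()` pairs is that no code path can leave the critical section with the lock still held.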
Sergey Sharybin
424901ad7b Cycles: Guard global write access in SVM compilation code 2017-04-05 14:21:49 +02:00
Sergey Sharybin
92aeb84fde Cycles: Tag shaders for update after the threading part is over
This avoids write access happening in non-atomic manner in
Shader::tag_update which modifies the global managers. Even
for 1 byte data types it's quite dangerous.
2017-04-04 15:43:12 +02:00
Sergey Sharybin
5ce95df2c6 Cycles: Fix uninitialized memory access when comparing curve mapping nodes
The issue comes from the fact that float3 is actually a 16-byte aligned
data type and the "padding" was not initialized. This caused memcmp() to
access uninitialized memory.
2017-04-04 15:43:12 +02:00
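
Why padding matters for memcmp() can be shown with a small sketch. The `Float3` struct below is a hypothetical stand-in for a 16-byte aligned vector type, with the fourth lane made an explicit, initialized member so that a bytewise comparison only ever reads defined bytes. This is one possible approach, not necessarily the fix applied in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical 16-byte aligned vector: three used lanes plus an explicit pad lane.
struct alignas(16) Float3 {
  float x, y, z;
  float pad;  // Explicit member instead of implicit padding.
};
static_assert(sizeof(Float3) == 16, "no hidden padding expected");

// Always initializes the pad lane, so every byte of the object is defined.
inline Float3 make_float3(float x, float y, float z)
{
  return Float3{x, y, z, 0.0f};
}

// Bytewise comparison of two arrays of vectors, e.g. two curve tables.
inline bool curves_equal(const Float3 *a, const Float3 *b, std::size_t n)
{
  assert(a != nullptr && b != nullptr);
  return std::memcmp(a, b, n * sizeof(Float3)) == 0;
}
```

With implicit, uninitialized padding the same memcmp() could report a difference between logically identical values, and tools such as memory checkers would flag the read.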
Sergey Sharybin
ab347c8380 Fix T51115: Bump node is broken when the displacement socket is used 2017-04-03 10:51:00 +02:00
Sergey Sharybin
90df1142a3 Cycles: Solve threading conflict in shader synchronization
Update tag might access links (when checking for attributes) and
the links might be in the middle of rebuild in simplification
logic.
2017-03-31 17:08:18 +02:00
Mai Lavelle
4b7d95290f Cycles: More fixes after include changes 2017-03-31 10:12:13 +02:00
Sergey Sharybin
a88801b99b Cycles: Fix missing kernel re-compilation after recent changes
Reported by Mai in IRC, thanks!
2017-03-30 11:45:30 +02:00
Sergey Sharybin
ced8fff5de Fix T51051: Incorrect render on 32bit Linux
The issue was apparently caused by -fno-finite-math-only added to kernel.cpp
CFLAGS. For now just removed this flag from the kernel (we don't really want
it there at this point, and we don't have it for SSE/AVX optimized kernels).

But surely more investigation is needed here.
2017-03-30 11:37:31 +02:00
Sergey Sharybin
9b1564a862 Cycles: Cleanup, rename RegularBVH to BinaryBVH
Makes it more explicit what the structure is from its name.
2017-03-30 09:47:27 +02:00
Sergey Sharybin
66ef0b8834 Cycles: Fix compilation error of app after the include directories change 2017-03-29 16:54:41 +02:00
Sergey Sharybin
48fa2c83eb Cycles: Attempt to work around compilation errors of CUDA on sm_2x 2017-03-29 16:22:51 +02:00
Sergey Sharybin
be17445714 Cycles: Cleanup, indentation 2017-03-29 15:41:56 +02:00
Sergey Sharybin
cc7386ec6b Cycles: Remove toolkit-specific workaround from kernel 2017-03-29 15:07:53 +02:00
Sergey Sharybin
5af4e1ca15 Cycles: Only use CUDA 8.0 as officially supported one
This deprecates CUDA 7.5.
2017-03-29 15:06:47 +02:00
Sergey Sharybin
270df9a60f Cycles: Cleanup, don't use m_ prefix for public properties 2017-03-29 14:45:49 +02:00
Sergey Sharybin
30bed91b78 Cycles: Fix compilation error with visibility flag disabled 2017-03-29 14:28:45 +02:00
Sergey Sharybin
0579eaae1f Cycles: Make all #include statements relative to cycles source directory
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.

For example, it was not obvious whether bvh.h was referring to builder
or traversal, whether node.h is a generic graph node or a shader node
and cases like that.

Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.

This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.

Please note that this patch is lacking changes related to GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.

Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner

Reviewed By: lukasstockner97, maiself, nirved, dingto

Subscribers: brecht

Differential Revision: https://developer.blender.org/D2586
2017-03-29 13:41:11 +02:00
Sergey Sharybin
61db9ee27a Cycles: Attempt to workaround compilation error on new CUDA toolkit and sm_2x 2017-03-29 11:50:17 +02:00
Sergey Sharybin
286adfde38 Cycles: Bring back preview AA samples when using BPT
This was removed in 93426cb. Please be more accurate when
changing interface.
2017-03-29 09:12:26 +02:00
Aaron Carlisle
93426cb295 Fix T51068: Place props in their own row
This allows the props to extend into the blank space that is to the right.
2017-03-28 16:33:05 -04:00
Sergey Sharybin
6ea54fe9ff Cycles: Switch to reformulated Pluecker ray/triangle intersection
The intention of this commit is to address issues mentioned in the
reports T43865, T50164 and T50452.

The code is based on Embree code with some extra vectorization
to speed up single ray to single triangle intersection.

Unfortunately, such a fix is not coming for free. There is some
slowdown for AVX2 processors, mainly due to different vectorization
code, which caused a different number of instructions to be executed
and different instructions-per-cycle counters. On the other hand,
this commit makes pre-AVX2 platforms such as AVX and SSE4.1 a bit
faster. The performance is as follows:

              2.78c AVX2   2.78c AVX   Patch AVX2         Patch AVX
BMW            05:21.09     06:05.34    05:32.97 (+3.5%)   05:34.97 (-8.5%)
Classroom      16:55.36     18:24.51    17:10.41 (+1.4%)   17:15.87 (-6.3%)
Fishy Cat      08:08.49     08:36.26    08:09.19 (+0.2%)   08:12.25 (-4.7%)
Koro           11:22.54     11:45.24    11:13.25 (-1.5%)   11:43.81 (-0.3%)
Barcelone      14:18.32     16:09.46    14:15.20 (-0.4%)   14:25.15 (-10.8%)

On GPU the performance is about 1.5-2% slower in my tests on GTX1080,
but I'm afraid we can't do much about it as part of this change and
consider it a price to pay for a more proper intersection check.

Made in collaboration with Maxym Dmytrychenko, big thanks to him!

Reviewers: brecht, juicyfruit, lukasstockner97, dingto

Differential Revision: https://developer.blender.org/D1574
2017-03-28 17:26:47 +02:00
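
For readers unfamiliar with the technique, a scalar sketch of a Pluecker-style ray/triangle test is given below. It follows the commonly published formulation (edge tests computed relative to the ray origin, accepted when all three have the same sign) and is written from that general description; it is not the vectorized code from this commit, and all type and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator+(Vec3 a, Vec3 b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b)
{
  return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

struct Hit { bool valid; float t, u, v; };

// u is the barycentric weight of p1, v the weight of p2.
inline Hit intersect_pluecker(Vec3 origin, Vec3 dir, Vec3 p0, Vec3 p1, Vec3 p2, float t_max)
{
  assert(t_max >= 0.0f);
  const Hit miss = {false, 0.0f, 0.0f, 0.0f};
  // Move the triangle so the ray origin is at (0, 0, 0).
  const Vec3 v0 = p0 - origin, v1 = p1 - origin, v2 = p2 - origin;
  const Vec3 e0 = v2 - v0, e1 = v0 - v1, e2 = v1 - v2;
  // Signed edge tests: which side of each edge the ray passes on.
  const float U = dot(cross(e0, v2 + v0), dir);
  const float V = dot(cross(e1, v0 + v1), dir);
  const float W = dot(cross(e2, v1 + v2), dir);
  const float min_uvw = std::fmin(U, std::fmin(V, W));
  const float max_uvw = std::fmax(U, std::fmax(V, W));
  if (min_uvw < 0.0f && max_uvw > 0.0f) {
    return miss;  // Mixed signs: the ray passes outside the triangle.
  }
  // Distance along the ray to the triangle plane.
  const Vec3 ng = cross(e1, e0);
  const float den = 2.0f * dot(ng, dir);
  if (den == 0.0f) {
    return miss;  // Ray is parallel to the triangle plane.
  }
  const float t = 2.0f * dot(v0, ng) / den;
  if (t < 0.0f || t > t_max) {
    return miss;
  }
  const float sum = U + V + W;
  if (sum == 0.0f) {
    return miss;  // Degenerate configuration.
  }
  return Hit{true, t, U / sum, V / sum};
}
```

Because a shared edge produces the same edge-test value (up to sign) for both adjacent triangles, a ray crossing that edge is not rejected by both, which is where the improved robustness of this family of tests comes from.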
Sergey Sharybin
3f61280327 Cycles: Pass m128 vectors by const reference 2017-03-28 11:01:11 +02:00
Thomas Dinges
6a5e92c022 Cleanup: Use upper case consistently in adaptive feature compile logging. 2017-03-27 22:52:33 +02:00
Thomas Dinges
7a65f9b171 Cleanup: Resolve todo in CUDA voxel image code. 2017-03-27 22:36:26 +02:00
Thomas Dinges
0df33cc52d Cycles UI: Avoid abbreviation for Hair Extension.
Since 2.5x we should try to avoid such abbreviations in the UI as much as possible, except for common terms like Min / Max.
2017-03-27 21:59:29 +02:00
Thomas Dinges
0cfc557c5d Cycles: Move Shadow Catcher UI option next to Ray Visibility.
Previously it was beneath the Performance UI label, which was incorrect. It's better suited next to Ray Visibility.
2017-03-27 21:51:56 +02:00
Sergey Sharybin
bd053ac7ba Cycles: Correct ifdef around float3 intrinsics 2017-03-27 16:13:07 +02:00
Sergey Sharybin
8d48ea0233 Cycles: Make shadow catcher an optional feature for OpenCL
Solves majority of speed regression on AMD OpenCL.
2017-03-27 10:47:14 +02:00
Hristo Gueorguiev
e07ffcbd1c Cycles: Add OpenCL support for shadow catcher feature
The title says it all actually.
2017-03-27 10:46:59 +02:00
Hristo Gueorguiev
8ada7f7397 Cycles: Remove ccl_addr_space from RNG passed to functions
Simplifies code quite a bit, making it shorter and easier to extend.
Currently no functional changes for users, but is required for the
upcoming work of shadow catcher support with OpenCL.
2017-03-27 10:46:28 +02:00
Sergey Sharybin
d14e39622a Cycles: First implementation of shadow catcher
It uses the idea of accumulating all light that could possibly be reached
along the light path (without taking shadow blocking into account) and
accumulating the total shaded light along the path. Dividing the second
figure by the first one seems to give a good estimate of the shadow.

In fact, to my knowledge, it's something really similar to what is
happening in the denoising branch, so we are aligned here which is good.

The workflow is as follows:

- Create an object which matches the real-life object on which the shadow
  is to be caught.

- Create an approximately similar material on that object.

  This is needed to make indirect light properly affect CG objects
  in the scene.

- Mark object as Shadow Catcher in the Object properties.

Ideally, after doing that it will be possible to render the image and
simply alpha-over it on top of real footage.
2017-03-27 10:46:03 +02:00
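
The estimate described in this message (shaded light divided by all light that could be reached) can be sketched for a single pixel as below. The function name, the scalar inputs, the clamp and the handling of the zero-light case are assumptions made for illustration, not the integrator's actual code.

```cpp
#include <cassert>

// light_shaded:      accumulated light that actually arrives (shadows included).
// light_unshadowed:  accumulated light that could arrive if nothing blocked it.
// Returns 1 for "fully lit" and 0 for "fully in shadow".
inline float shadow_estimate(float light_shaded, float light_unshadowed)
{
  assert(light_shaded >= 0.0f && light_unshadowed >= 0.0f);
  if (light_unshadowed <= 0.0f) {
    // No light reachable at all: treat the point as unshadowed (assumption).
    return 1.0f;
  }
  const float ratio = light_shaded / light_unshadowed;
  return ratio < 1.0f ? ratio : 1.0f;
}
```

For example, a point that could receive 2.0 units of light but only gets 0.5 yields 0.25, which a compositor could use to darken the footage underneath.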
Lukas Stockner
5aaa643947 Cycles: Optimize shaders earlier to skip unnecessary attributes for noninteractive rendering
Before, Cycles would first sync the shader exactly as shown in the UI, then determine and sync the used attributes and later optimize the shader.
Therefore, even completely unconnected nodes would cause unnecessary attributes to be synced.

The reason for this is to avoid frequent resyncs when editing shaders interactively, but it can still be avoided for noninteractive renders - which is what this commit does.

Reviewed by: sergey

Differential Revision: https://developer.blender.org/D2285
2017-03-27 05:36:49 +02:00
Lukas Stockner
e9770adf63 Cycles: Remove obsolete variable from the TileManager 2017-03-24 19:44:05 +01:00
Sergey Sharybin
5b45715f8a Cycles: Correct isfinite check used in integrator
Use fast-math friendly version of this function.

We should probably avoid unsafe fast math, but this is to be done with
real care with all the benchmarks properly done.

For now committing a much safer fix.
2017-03-24 15:39:33 +01:00
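
One way to make a finiteness check robust against fast-math assumptions is to inspect the IEEE 754 bit pattern with integer operations, which the compiler may not fold away under an assumption that NaN/Inf never occur. The sketch below assumes 32-bit IEEE 754 floats; the function names are hypothetical and this is not necessarily how the function used by this commit is written.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// True when f is neither infinity nor NaN, decided purely from the bit pattern:
// a float is non-finite exactly when all eight exponent bits are set.
inline bool isfinite_bits(float f)
{
  static_assert(sizeof(std::uint32_t) == sizeof(float), "32-bit float expected");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return (bits & 0x7F800000u) != 0x7F800000u;
}

// Typical use: replace a non-finite sample with a fallback value.
inline float sanitize(float f, float fallback)
{
  assert(isfinite_bits(fallback));
  return isfinite_bits(f) ? f : fallback;
}
```

The integer comparison behaves the same regardless of floating-point optimization flags, which is the property the commit message asks for.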
Sergey Sharybin
85a5fbf2ce Cycles: Workaround incorrect SSS with CUDA toolkit 8.0.61 2017-03-24 10:08:18 +01:00
Sergey Sharybin
a96110e710 Cycles: Remove old non-optimized triangle intersection function
It is unused now, and if we want a similar function we should use the
Pluecker intersection, which has the same performance with SSE optimization
but is more watertight.
2017-03-23 17:59:34 +01:00
Sergey Sharybin
27248c8636 Cycles: Remove unused macro 2017-03-23 17:59:02 +01:00
Sergey Sharybin
ba8c7d2ba1 Cycles: Use SSE-optimized version of triangle intersection for motion triangles
The title says it all actually. Gives up to 10% speedup on test scenes here
on i7-6800K.

Render times on GPU are unreliable here, but there might be some slowdown
caused by watertight nature of intersections.
2017-03-23 17:58:03 +01:00
Sergey Sharybin
a1348dde2e Cycles: Fix speed regression on GPU
Avoid construction of temporary array and make utility function force-inlined.
Additionally avoid calling float4_to_float3 twice.

This brings render times to the same values as before current patch series.
2017-03-23 17:45:19 +01:00