Commit Graph

3823 Commits

Author SHA1 Message Date
Sergey Sharybin
850bb7a50b Cycles: Cleanup, indentation 2017-05-05 16:54:37 +02:00
Hristo Gueorguiev
8b97e42eca Cycles: Split kernel SSS & Volume data definitions cleanup 2017-05-05 13:42:26 +02:00
Hristo Gueorguiev
b9fda4480f Cycles: Show samples progress for OpenCL split kernel 2017-05-05 13:37:21 +02:00
Hristo Gueorguiev
f3c3483242 Cycles: Workaround for AMD GPU OpenCL compiler
Fix for SSS in BPT.
2017-05-05 13:00:43 +02:00
Pablo Vazquez
d29e3ebcc6 Typo: 'Signle program' -> 'Single program' 2017-05-04 22:15:53 +02:00
Lukas Stockner
ed688e4843 Cycles: Fix crash when assigning KernelGlobals
The memory isn't initialized during allocation, so calling the assignment operator is a bad idea.
2017-05-04 20:49:04 +02:00
Sergey Sharybin
a523dfd2fd Fix T51412: Instant crash with texture plugged into the Displacement output
The issue was caused by unlimited textures commit, root of the issue is that
displacement code updates some of the image slots directly, so it needs to
ensure device vectors are all proper size.
2017-05-04 16:28:22 +02:00
Sergey Sharybin
5fde78dcad Cycles: Fix unused argument warning when building without debug passes 2017-05-04 09:33:51 +02:00
Dalai Felinto
c171f0b3c9 Fix Cycles build on Windows 2017-05-03 21:16:45 +02:00
Lukas Stockner
4cf7fc3b3a Render API/Cycles: Identify Render Passes by their name instead of a type flag
Previously, every RenderPass would have a bitfield that specified its type. That limits the number of passes to 32, which was reached a while ago.
However, most of the code already supported arbitrary RenderPasses since they were also used to store Multilayer EXR images.
Therefore, this commit completely removes the passflag from RenderPass and changes all code to use the unique pass name for identification.
Since Blender Internal relies on hardcoded passes and to preserve compatibility, 32 pass names are reserved for the old hardcoded passes.

To support these arbitrary passes, the Render Result compositor node now adds dynamic sockets. For compatibility, the old hardcoded sockets are always stored and just hidden when the corresponding pass isn't available.

To use these changes, the Render Engine API now includes a function that allows render engines to add arbitrary passes to the render result. To be able to add options for these passes, addons can now add their own properties to SceneRenderLayers.
To keep the compositor input node updated, render engine plugins have to implement a callback that registers all the passes that will be generated.

From a user perspective, nothing should change with this commit.

Differential Revision: https://developer.blender.org/D2443

Differential Revision: https://developer.blender.org/D2444
2017-05-03 16:44:52 +02:00
Hristo Gueorguiev
6bf4115c13 Cycles: Split kernel - sort shaders
Reduce thread divergence in kernel_shader_eval.

Rays are sorted in blocks of 2048 according to shader->id.

On R9 290 Classroom is ~30% faster, and Pabellon Barcelone is ~8% faster.

No sorting for CUDA split kernel.

Reviewers: sergey, maiself

Reviewed By: maiself

Differential Revision: https://developer.blender.org/D2598
2017-05-03 15:30:45 +02:00
Sergey Sharybin
6f9c839f44 Cycles: Fix OpenCL compilation failure after recent color changes
It is really confusing to have some functions available in some devices
and not on another devices.
2017-05-03 14:11:19 +02:00
Sergey Sharybin
44991a0132 Cycles: Use render visibility for duplis when Render Layer option in viewport is used
Previously the logic was different for duplis and regular objects: regular objects
were using render visibility when Render Layer option is enabled which duplis were
always using viewport visibility when rendering from the viewport.

This was quite confusing because caused different results in viewport and render
when artists were expecting them to match 1:1.
2017-05-03 12:14:05 +02:00
Sergey Sharybin
cea0236026 Cycles: Simplify code in SVM image by using new utility function
Can not measure any performance difference, so seems the code is identical
and just shorter.
2017-05-03 11:22:48 +02:00
Sergey Sharybin
e616cd5706 Cycles: Add utility function to convert float4 color from srgb to linear
It will use SSE2 optimized version when is possible.
2017-05-03 11:19:40 +02:00
Mai Lavelle
d187014675 Cycles: Remove extra clFinish from driver workaround
These were causing problems with Nvidia OpenCL.
2017-05-02 14:26:46 -04:00
Mai Lavelle
299d839dc5 Cycles: Output split state element size 2017-05-02 14:26:46 -04:00
Mai Lavelle
915766f42d Cycles: Branched path tracing for the split kernel
This implements branched path tracing for the split kernel.

General approach is to store the ray state at a branch point, trace the
branched ray as normal, then restore the state as necessary before iterating
to the next part of the path. A state machine is used to advance the indirect
loop state, which avoids the need to add any new kernels. Each iteration the
state machine recreates as much state as possible from the stored ray to keep
overall storage down.

Its kind of hard to keep all the different integration loops in sync, so this
needs lots of testing to make sure everything is working correctly. We should
probably start trying to deduplicate the integration loops more now.

Nonbranched BMW is ~2% slower, while classroom is ~2% faster, other scenes
could use more testing still.

Reviewers: sergey, nirved

Reviewed By: nirved

Subscribers: Blendify, bliblubli

Differential Revision: https://developer.blender.org/D2611
2017-05-02 14:26:46 -04:00
Sergey Sharybin
4846184095 Cycles: Fix missing type declaration in OpenCL image
Spotted by Mai in IRC, thanks!
2017-05-02 15:39:33 +02:00
Sergey Sharybin
4384a7cf46 Cycles: Fix CUDA split kernel
Global size y needs to be a multiple of 16.
2017-05-02 15:03:51 +02:00
Sergey Sharybin
4174e533c0 Cycles: Cache split kernels in CUDA device
This way we don't re-load kernels for every sample in the viewport.
Additionally, we don't risk global size changed inbetween of samples.
2017-05-02 15:03:12 +02:00
Mai Lavelle
8f66d6826b Cycles: Fix crashes after recent image changes
Not sure if this is a proper fix, but was getting frequent crashes, so
committing this real quick just to make master sable again. Can be
reverted later if there's a better fix. The changes to images really
need a closer look...
2017-04-28 18:54:18 -04:00
Sergey Sharybin
9ebd737df3 Cycles: Use relative path for #line directives
This way moving Blender bundle around doesn't re-trigger kernels compilation.
2017-04-28 17:46:11 +02:00
Sergey Sharybin
c648ddb9a1 Cycles: Correct comment after previous commit 2017-04-28 16:47:24 +02:00
Sergey Sharybin
9ff88a596c Cycles: Lower default severity level to ERROR 2017-04-28 16:46:30 +02:00
Sergey Sharybin
f383c926b4 Cycles: De-duplicate bit magic for decoding image options in OpenCL kernel 2017-04-28 15:20:34 +02:00
Sergey Sharybin
f8d0b27d9d Cycles: Simplify code around maximum OpenCL info size allocation 2017-04-28 15:15:15 +02:00
Sergey Sharybin
407fd66d0a Cycles: Cleanup, de-duplicate image packing of various types 2017-04-28 15:08:54 +02:00
Sergey Sharybin
0f339a748e Cycles: Quick (real) fix for broken textures on OpenCL
Previous fix did not work for mixed textures. This one will over-allocate
information array, but it's better than not being able to render at all.

Some more cleanup and improvement is coming.
2017-04-28 14:56:22 +02:00
Sergey Sharybin
1ec59c8e06 Revert "Cycles: Fix image textures were completely broken since recent unlimited textures commit"
This reverts commit 8f4166ee495531fa38b676b0a5ef4c482e89f9a5.

The fix was not correct for cases when we've got float textures.
2017-04-28 14:48:40 +02:00
Sergey Sharybin
06034b147a Cycles: Cleanup, spelling and braces 2017-04-28 14:10:21 +02:00
Sergey Sharybin
8f4166ee49 Cycles: Fix image textures were completely broken since recent unlimited textures commit
The indexing was totally wrong in both image packing code and image sampling in kernel.

Fixes T51341: Cycles OpenCL corruption in todays buildbot
2017-04-28 14:04:27 +02:00
Sergey Sharybin
e3fc945fe2 Cycles: Cleanup, always use braces for blocks 2017-04-28 13:39:14 +02:00
Sergey Sharybin
82e5f60302 Cycles: Cleanup, indentation in preprocessor 2017-04-28 13:24:09 +02:00
Sergey Sharybin
20ebeb1a15 Cycles: Cleanup, use ccl::vector instead of std::vector 2017-04-28 13:22:07 +02:00
Sergey Sharybin
4245ed360e Cycles: Cleanup, indentaiton and trailing whitespace and wrapping 2017-04-28 13:21:17 +02:00
Thomas Dinges
a00f54332d Cleanup: Some style and code tweaks to Image Code after changes.
Whitespace and order of switch/case etc. Let's try to stick to float4/byte4/half4/float/byte/half order as defined in "ImageDataType".
2017-04-27 11:11:08 +02:00
Stefan Werner
ec25060a05 Unlimited number of textures for Cycles
This patch allows for an unlimited number of textures in Cycles where the hardware allows. It replaces a number static arrays with dynamic arrays and changes the way the flat_slot indices are calculated. Eventually, I'd like to get to a point where there are only flat slots left and textures off all kinds are stored in a single array.

Note that the arrays in DeviceScene are changed from containing device_vector<T> objects to device_vector<T>* pointers. Ideally, I'd like to store objects, but dynamic resizing of a std:vector in pre-C++11 calls the copy constructor, which for a good reason is not implemented for device_vector. Once we require C++11 for Cycles builds, we can implement a move constructor for device_vector and store objects again.

The limits for CUDA Fermi hardware still apply.

Reviewers: tod_baudais, InsigMathK, dingto, #cycles

Reviewed By: dingto, #cycles

Subscribers: dingto, smellslikedonkey

Differential Revision: https://developer.blender.org/D2650
2017-04-27 09:35:22 +02:00
Mai Lavelle
7c1263c1ee Cycles: Allow samples to finish in split kernel to avoid artifacts when canceling
Previously canceling a render done by the split kernel could cause artifacts
such as very bright or dark tiles. This was caused by unfinished samples
being included in the output buffer. To avoid this we now wait till all the
currently rendering samples have finished, up to a limit of twice the
expected time for them to finish (currently this is no more than 20 seconds,
but usually its much less). If samples still haven't finished by then we
stop anyways in case there's an endless loop occurring.
2017-04-26 10:48:15 -04:00
Mai Lavelle
90b2539248 Cycles: Change OpenCL split kernel to use single program by default
Single program builds twice as fast as multi programs, so its better for
users to have it as the default.
2017-04-26 10:48:15 -04:00
Mai Lavelle
fe81a32f69 Cycles: Enable Correlated Multi Jitter for OpenCL and split kernel
Testing showed no issues so there's no reason to not have this.
2017-04-26 10:48:15 -04:00
Sergey Sharybin
be60e9b8c5 Cycles: Fix over-allocation of triangles storage for triangle primitive hair
Was also causing some bad memory access caused by read data from non-initialized
arrays.

Repoted by bzztploink in IRC, thanks!
2017-04-26 16:00:02 +02:00
Sergey Sharybin
c0f555905d User preferences: Use checkbox for Cycles device selection
It was totally unclear whether the device is enabled or disabled.
Lots of people got fully lost in the current interface.

While the solution is not fully ideal, it is at least solves
ambiguity in the interface.
2017-04-26 16:00:02 +02:00
Sergey Sharybin
9623d93f14 Cycles: Fix access undefined macro on non-MSVC compiler
Also rremove trailing whitespace.
2017-04-26 10:00:31 +02:00
lazydodo
0f2d0ff124 workaround for T50176
This works around a long outstanding issue T50176 with cycles on msvc2015/x86 . root cause is still unknown though,feels like a game of whack'a'mole

Reviewers: sergey, dingto

Subscribers: Blendify

Tags: #cycles

Differential Revision: https://developer.blender.org/D2573
2017-04-25 14:21:46 -06:00
Hristo Gueorguiev
e91dc3a97c Cycles: use safe compiler flags for OpenCL.
Using -cl-fast-relaxed-math assumes no NaN/Inf values in any expression.
This causes problems on overflow, division by zero, square root of negative number.
Comparisons with NaN or infinite value are affected as well.

This patch causes <2% slowdown on benchmark scenes.

Fix T50985: Rendering volume scatter with GPU OpenCL comes to an halt after a few seconds
2017-04-25 20:10:51 +02:00
Hristo Gueorguiev
9d26e32ea2 Workaround for AMD GPU OpenCL compiler. 2017-04-25 20:08:14 +02:00
Sergey Sharybin
ab4f6f01a6 Cycles: Fix strict compiler flags 2017-04-25 14:12:14 +02:00
Sergey Sharybin
1f85a35a3d Cycles: Cleanup, mainly line length in random module
Was doing lots of investigation recently, with need to have lots of things
side by side.
2017-04-25 11:43:20 +02:00
Sergey Sharybin
0a07cdbe80 Cycles: Split vectorized math utilities to a dedicated files
This file was even a bigger mess than vectorized types header,
cleaning it up to make it easier to maintain this files and
extend further.
2017-04-25 10:33:26 +02:00