Commit Graph

3933 Commits

Author SHA1 Message Date
3b12a71972 Fix T52125: principled BSDF missing with macOS OpenCL. 2017-07-20 15:15:43 +02:00
Stefan Werner
c1ca3c8038 Cycles: fixed the SM_2x CUDA kernel build that I broke in my previous commit 2017-07-20 13:28:34 +02:00
Stefan Werner
4bc6faf9c8 Fix T52107: Color management difference when using multiple and different GPUs together
This commit unifies the flattened texture slot names for bindless and regular CUDA textures. Texture indices are now identical across all CUDA architectures, where before Fermi used different indices, which lead to problems when rendering on multi-GPU setups mixing Fermi with newer hardware.
2017-07-20 10:03:27 +02:00
Sergey Sharybin
5f35682f3a Fix T52021: Shadow catcher renders wrong when catcher object is behind transparent object
Tweaked the path radiance summing and alpha to accommodate for possible contribution of
light by transparent surface bounces happening prior to shadow catcher intersection.

This commit will change the way how shadow catcher results looks when was behind semi
transparent object, but the old result seemed to be fully wrong: there were big artifacts
when alpha-overing the result on some actual footage.
2017-07-18 09:46:21 +02:00
Sergey Sharybin
d8906f30d3 Cycles: Remove meaningless camera ray check
In branched path tracing main loop is always a camera ray, with varying
number of transparent bounces.
2017-07-18 09:27:36 +02:00
Mai Lavelle
87164114a3 Cycles: Enable SSS from Principled BSDF only when actually in use
This gives speed up for the split kernel in scenes using the principled BSDF
but without subsurface scattering.
2017-07-12 04:40:24 -04:00
Mai Lavelle
1f933c94a7 Cycles: Fix comparison in principled BSDF
Could have lead to black pixels.
2017-07-11 23:41:22 -04:00
29ec0b1162 Fix T52027: OSL getattribute() crash, when optimizer calls it before rendering. 2017-07-11 22:39:51 +02:00
Sergey Sharybin
e26f61a2b5 Cycles: Disable OpenCL clFlush workarounds
This is something which was reported to work fine by Mai, Benjamin and
confirmed by myself. Disabling this workaround gains us some speedup:

                      Before           Now
bmw27                04:28.42        04:07.79
classroom            09:26.48        08:54.53
fishy_cat            08:44.01        08:18.70
koro                 09:17.98        08:57.18
pavillon_barcelone   12:26.64        11:52.81

Test environment is:
- Ubuntu 16.04, with all updates installed
- AMD RX 480 GPU
- amdgpu pro driver version 17.10-450821
2017-07-11 12:16:58 +02:00
3361f2107b Fix T51967: OSL crash after rendering finished (mainly on Windows). 2017-07-08 02:46:06 +02:00
Sergey Sharybin
fee7f688c3 Cycles: Fix ambiguity in call of min() function 2017-07-07 10:40:19 +02:00
Mai Lavelle
9c3f1ad003 Cycles: Add artificial memory limit debug option for OpenCL 2017-07-06 05:25:46 -04:00
Mai Lavelle
95b345b2fe Revert "Cycles: use std::min and max for extra overloads"
We already have this in util_algorithm.h

This reverts commit cff172c7621d89773baa99a9460f19056efb5f1e.
2017-07-06 04:21:29 -04:00
Mai Lavelle
f9963f29e8 Cycles: Dont allow global size to fall to zero 2017-07-05 20:19:15 -04:00
Mai Lavelle
222b96e5c7 Cycles: Detect out of memory before buffer allocation in OpenCL devices 2017-07-05 20:19:12 -04:00
Mai Lavelle
cff172c762 Cycles: use std::min and max for extra overloads 2017-07-05 19:43:34 -04:00
Sergey Sharybin
31f8ca5034 Cycles: Fix compilation error after recent logging changes
This file uses std::ostream for helper << operators, so need to make sure
corresponding header is included.
2017-07-05 20:40:55 +02:00
Sergey Sharybin
d37dd97e45 Cycles: Pass string by const reference rather than by value
Some of the functions might have been inlined, but others i don't see
how that was possible (don't think virtual functions can be inlined here).

In any case, better be explicitly optimal in the code.
2017-07-05 12:27:41 +02:00
Sergey Sharybin
58c456b12d Cycles: Fix compilation error when building without Glog and no C++11 2017-07-05 12:01:12 +02:00
Lukas Stockner
15fd758bd6 Fix T51950: Abnormally long Cycles OpenCL GPU render times with certain panoramic camera settings
The problem here was that when a "invalid" path is generated by the panoramic camera, it was tagged
as RAY_TO_REGENERATE with the intention of generating a new path in kernel_buffer_update.

However, since that state was not handled in kernel_queue_enqueue, kernel_buffer_update did not
process the path which resulted in an infinite loop.
2017-07-03 18:26:19 +02:00
Lukas Stockner
6782a6076c Cycles: Add missing split kernel to CPUDevice 2017-07-03 18:26:18 +02:00
Luca Rood
d48a9528ca Fix missing return error introduced by last commit
End of non-void function was being reached since
f5535fcb83fd7c1374697923b43565c9e303d225
2017-07-03 12:12:27 +02:00
f5535fcb83 Fi T51023: MixRGB constant folding not effective with clamp option. 2017-07-03 05:25:27 +02:00
cda24d0853 Fix T51855: Cycles emssive objects with NaN transform break lighting. 2017-07-03 05:04:43 +02:00
29c8c50442 Fix T51956: color noise with principled sss, radius 0 and branched path. 2017-07-02 19:21:08 +02:00
52b9516e03 Fix principled BSDF incorrectly missing subsurface component with base color black. 2017-07-02 18:22:24 +02:00
Mai Lavelle
c8fa716c06 Cycles: Use float constants instead of double 2017-06-29 23:07:18 -04:00
Mai Lavelle
56dcfcce05 Cycles: Disable baking in mega kernel when not in use to improve build times 2017-06-29 23:07:18 -04:00
Lukas Stockner
1f3fd8e60a Fix T51909: Cycles: Uninitialized closure normals for the Hair BSDF
As the title says, the normal wasn't set for the Hair BSDF because it wasn't
needed before. However, the denoiser uses it to store the feature passes, so
it needs to be set now.
2017-06-28 21:32:02 +02:00
Lukas Stockner
1979176088 Cycles: Fix excessive sampling weight of glossy Principled BSDF components
If there was any specularity in the Principled BSDF, it would get a sampling
weight of one regardless of its actual impact.

This commit makes Cycles estimate the contribution of the component and adjust
the weighting accordingly, which greatly improves the noise characteristics of
the Principled BSDF in many cases.

Note that this commit might slightly change the brightness of areas when using
MultiGGX and high roughnesses, but the new brightness is more accurate and
closer to the result of Branched Path Tracing. See T51836 for details.

Differential Revision: https://developer.blender.org/D2677
2017-06-22 00:09:56 +02:00
Lukas Stockner
8cb741a598 Fix T51836: Cycles: Fix incorrect PDF approximations of the MultiGGX closures
The PDF of the MultiGGX sampling is approximated by the singlescattering GGX
term as well as a scaled diffuse term that makes up for the energy in the
multiscattering component that's missed by GGX.

However, there were two problems with the glossy terms: The diffuse term missed
a normalization factor, and the singlescattering term was not properly scaled
down based on the albedo estimate.

The glass term was completely wrong and has been rewritten. It uses the fresnel
factor to weight reflection vs. refraction and uses the glossy MultiGGX model
for reflection.
For refraction, the correct singlescattering term is now used, and a new
albedo approximation is used that was derived by evaluating GGX albedo for
roughnesses from 0 to 1 and IORs from 1 to 3 and fitting numerical
approximations to it. The resulting model has a mean relative error of 9e-5,
but could probably be simplified without losing noticable accuracy in the
final render.

The improved PDFs help with glossy highlights (due to better light sampling vs.
closure sampling MIS) and fix the situation described in T51836 where mixing
MultiGGX with other closures (as it happens in e.g. the Principled
BSDF) causes incorrect darkening.
2017-06-22 00:09:56 +02:00
14ea0c5fcc Fix T51849: change Cycles clearcoat gloss to roughness.
This is compatible with UE4 and more consistent with specular and transmission
roughness, even if it deviates from the original Disney BRDF.
2017-06-21 19:55:20 +02:00
Sergey Sharybin
794311c92b Cycles: Fix race condition happening in progress utility
This is not enough to mutex-guard modification code of integer values,
since this operation is NOT atomic. This is not even safe for a single
byte data types.

For now guarded the getter functions, similar to other functions in
this module.

Ideally we want to switch modification to an atomic operations, so we
wouldn't need any locks in the getters.
2017-06-16 10:22:35 +02:00
Sergey Sharybin
64aa0cff89 Cycles: Fix typo in comment 2017-06-14 09:54:07 +02:00
Hristo Gueorguiev
6cfa3ecd4d Fix T51791: Point Density doesn't work on GPU 2017-06-13 13:50:27 +02:00
Sergey Sharybin
40c04dd649 Cycles: Cleanup, indentation 2017-06-13 10:28:38 +02:00
Sergey Sharybin
0aa5431998 Cycles: Fix compilation error of OpenCL mega kernel
Was some mismatch in address space. Seems to be caused by recent additions.

Additionally, moved decoupled ray marching functions under ifdef, so they
don't try to use malloc() functions.

Thanks Mai for testing the patch!
2017-06-13 10:26:45 +02:00
Campbell Barton
00c4f49a6d Cleanup: indentation, long lines 2017-06-12 13:38:21 +10:00
Hristo Gueorguiev
04530c9383 Cycles: adjust supported driver version for AMD GPUs
On Windows 17.Q1 and 17.Q2 return driver version 2236.10.
2017-06-11 23:17:46 +02:00
Lukas Stockner
558bea2252 Cycles Denoising: Add more failsafes for invalid pixels
Now, when there is no usable neighboring pixel for denoising, the noisy value
is preserved instead of producing a NaN.
Also, negative results are clamped to zero.

Note that there are just workarounds that don't fix the underlying problems,
but these issues are very rare and I'm not sure if it's even possible to fix
the underlying problems without introducing a significant slowdown or quality
decrease in other situations.
Because of that and since 2.79 is happening very soon, I just went for these
workarounds for now.
2017-06-11 01:51:39 +02:00
Sergey Sharybin
e097fc4aa6 Cycles: Selectively include denoising in kernel 2017-06-10 04:45:13 -04:00
Mai Lavelle
eb293f59f2 Cycles: Pass all buffers to each kernel call for OpenCL
Technically not passing all buffers used by a kernel is undefined
behavior. We haven't had any issues with this so far on AMD or
Nvidia, but it's known to be a problem with Intel and we received
a report from AMD that this is a problem on newer hardware, so we
need to make this change at some point.

Unfortunately there a cost to being correct, about 5% for the
benchmark scenes. For low sample counts it's even worse, I've
seen up to 50% slowdown. For the latter case I think adjusting
tile updating logic can help, but not sure what that would look
like yet (it would be just a few lines change however).
2017-06-10 04:08:49 -04:00
Mai Lavelle
6238214159 Cycles: Faster split branched path tracing by sharing samples with inactive threads
Unlike regular path tracing, branched path tracing is usually used with lower
sample counts, at least for primary rays. This means that are less samples for
the GPU to work on in parallel and rendering is slower. As there is less work
overall there is also more inactive threads during rendering with BPT. This
patch makes use of those inactive rays to render branched samples in parallel
with other samples.

Each thread that is preparing for a branched sample will attempt to find an
inactive thread and if one is found the state for the sample is copied to that
thread. Potentially, if there are enough inactive threads, 100s of branched
samples could be generated from the same originating thread and ran in
parallel giving large speed ups.

Gives 70% faster render for pavillion midday scene. 20-60% faster on BMW
with car paint replaced with SSS/volumes.
2017-06-10 04:08:49 -04:00
Mai Lavelle
32299d32e7 Cycles: Modify path_radiance_accum_sample to use atomics for split kernel
Samples ran in parallel need a safe way to accumulate their results
with the results of other threads.
2017-06-10 04:08:02 -04:00
Mai Lavelle
6995b50e41 Cycles: Add function to dequeue a ray 2017-06-10 03:51:18 -04:00
Mai Lavelle
4360e8ce13 Cycles: Add atomic decrement functions to util_atomic.h 2017-06-10 03:51:18 -04:00
Mai Lavelle
ea846a4dfc Cycles: Add kernel to enqueue inactive rays
The queue will be used to make reuse of inactive threads to keep
the GPU more busy.
2017-06-10 03:51:18 -04:00
Hristo Gueorguiev
1f0998baa7 Cycles: Blacklist unsupported OpenCL devices
Due to various driver issues with AMD GCN 1 cards we can no longer support
these GPUs. This patch makes them unavailable to select for Cycles rendering.

GCN cards 2 and higher are still supported. Please use the most recent
drivers available to ensure proper functionality.

See here for a list to check which GPUs are supported:
https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units
2017-06-10 03:51:18 -04:00
Lukas Stockner
c73206acc5 Cycles: Fix denoising passes being written when they're not actually generated 2017-06-09 23:02:56 +02:00
Lukas Stockner
0a898e2405 Cleanup Cycles Denoising platform-specific defines 2017-06-09 22:38:16 +02:00