3925 Commits

Author SHA1 Message Date
Sergey Sharybin
e26f61a2b5 Cycles: Disable OpenCL clFlush workarounds
This is something which was reported to work fine by Mai, Benjamin and
confirmed by myself. Disabling this workaround gains us some speedup:

                      Before           Now
bmw27                04:28.42        04:07.79
classroom            09:26.48        08:54.53
fishy_cat            08:44.01        08:18.70
koro                 09:17.98        08:57.18
pavillon_barcelone   12:26.64        11:52.81

Test environment is:
- Ubuntu 16.04, with all updates installed
- AMD RX 480 GPU
- amdgpu pro driver version 17.10-450821
2017-07-11 12:16:58 +02:00
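The before/after timings in e26f61a2b5 can be turned into relative savings with a few lines of arithmetic. The following is a minimal sketch; the helper names `to_seconds` and `reduction_percent` are made up for illustration, and the timings are copied verbatim from the table in that commit message.

```python
def to_seconds(stamp: str) -> float:
    """Convert an MM:SS.ss render time into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def reduction_percent(before: str, now: str) -> float:
    """Relative render-time reduction, in percent of the 'before' time."""
    b = to_seconds(before)
    n = to_seconds(now)
    return (b - n) / b * 100.0


# Timings copied from the commit message of e26f61a2b5 (Before, Now).
timings = {
    "bmw27": ("04:28.42", "04:07.79"),
    "classroom": ("09:26.48", "08:54.53"),
    "fishy_cat": ("08:44.01", "08:18.70"),
    "koro": ("09:17.98", "08:57.18"),
    "pavillon_barcelone": ("12:26.64", "11:52.81"),
}

for scene, (before, now) in timings.items():
    print(f"{scene:20s} {reduction_percent(before, now):5.2f}% less render time")
```

With these numbers, every scene renders a few percent faster, with bmw27 showing the largest relative gain.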
3361f2107b Fix T51967: OSL crash after rendering finished (mainly on Windows). 2017-07-08 02:46:06 +02:00
Sergey Sharybin
fee7f688c3 Cycles: Fix ambiguity in call of min() function 2017-07-07 10:40:19 +02:00
Mai Lavelle
9c3f1ad003 Cycles: Add artificial memory limit debug option for OpenCL 2017-07-06 05:25:46 -04:00
Mai Lavelle
95b345b2fe Revert "Cycles: use std::min and max for extra overloads"
We already have this in util_algorithm.h

This reverts commit cff172c7621d89773baa99a9460f19056efb5f1e.
2017-07-06 04:21:29 -04:00
Mai Lavelle
f9963f29e8 Cycles: Don't allow global size to fall to zero 2017-07-05 20:19:15 -04:00
Mai Lavelle
222b96e5c7 Cycles: Detect out of memory before buffer allocation in OpenCL devices 2017-07-05 20:19:12 -04:00
Mai Lavelle
cff172c762 Cycles: use std::min and max for extra overloads 2017-07-05 19:43:34 -04:00
Sergey Sharybin
31f8ca5034 Cycles: Fix compilation error after recent logging changes
This file uses std::ostream for helper << operators, so need to make sure
corresponding header is included.
2017-07-05 20:40:55 +02:00
Sergey Sharybin
d37dd97e45 Cycles: Pass string by const reference rather than by value
Some of the functions might have been inlined, but for others I don't see
how that was possible (I don't think virtual functions can be inlined here).

In any case, better be explicitly optimal in the code.
2017-07-05 12:27:41 +02:00
Sergey Sharybin
58c456b12d Cycles: Fix compilation error when building without Glog and no C++11 2017-07-05 12:01:12 +02:00
Lukas Stockner
15fd758bd6 Fix T51950: Abnormally long Cycles OpenCL GPU render times with certain panoramic camera settings
The problem here was that when an "invalid" path is generated by the panoramic camera, it was tagged
as RAY_TO_REGENERATE with the intention of generating a new path in kernel_buffer_update.

However, since that state was not handled in kernel_queue_enqueue, kernel_buffer_update did not
process the path which resulted in an infinite loop.
2017-07-03 18:26:19 +02:00
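The failure mode described in 15fd758bd6 can be illustrated with a toy model. This is only a sketch of the general pattern (a state that the enqueue step does not recognise never reaches the step that would resolve it); the names `RayState`, `queue_enqueue`, `buffer_update` and `iterations_until_done` are hypothetical Python stand-ins and do not mirror the actual Cycles kernels.

```python
from enum import Enum, auto


class RayState(Enum):
    ACTIVE = auto()
    TO_REGENERATE = auto()
    FINISHED = auto()


def queue_enqueue(states, handled_states):
    """Queue only the paths whose state the enqueue step knows about."""
    return [i for i, s in enumerate(states) if s in handled_states]


def buffer_update(states, queue):
    """Toy model: a queued path waiting for regeneration becomes finished."""
    for i in queue:
        if states[i] is RayState.TO_REGENERATE:
            states[i] = RayState.FINISHED


def iterations_until_done(states, handled_states, limit=100):
    """Return the number of iterations needed, or None if the limit is hit."""
    states = list(states)
    for iteration in range(1, limit + 1):
        buffer_update(states, queue_enqueue(states, handled_states))
        if all(s is RayState.FINISHED for s in states):
            return iteration
    return None  # without the limit this loop would never terminate


paths = [RayState.FINISHED, RayState.TO_REGENERATE]
# Enqueue step ignores TO_REGENERATE: the path is never processed.
print(iterations_until_done(paths, {RayState.ACTIVE}))
# Enqueue step handles TO_REGENERATE: the loop terminates.
print(iterations_until_done(paths, {RayState.ACTIVE, RayState.TO_REGENERATE}))
```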
Lukas Stockner
6782a6076c Cycles: Add missing split kernel to CPUDevice 2017-07-03 18:26:18 +02:00
Luca Rood
d48a9528ca Fix missing return error introduced by last commit
End of non-void function was being reached since
f5535fcb83fd7c1374697923b43565c9e303d225
2017-07-03 12:12:27 +02:00
f5535fcb83 Fix T51023: MixRGB constant folding not effective with clamp option. 2017-07-03 05:25:27 +02:00
cda24d0853 Fix T51855: Cycles emissive objects with NaN transform break lighting. 2017-07-03 05:04:43 +02:00
29c8c50442 Fix T51956: color noise with principled sss, radius 0 and branched path. 2017-07-02 19:21:08 +02:00
52b9516e03 Fix principled BSDF incorrectly missing subsurface component with base color black. 2017-07-02 18:22:24 +02:00
Mai Lavelle
c8fa716c06 Cycles: Use float constants instead of double 2017-06-29 23:07:18 -04:00
Mai Lavelle
56dcfcce05 Cycles: Disable baking in mega kernel when not in use to improve build times 2017-06-29 23:07:18 -04:00
Lukas Stockner
1f3fd8e60a Fix T51909: Cycles: Uninitialized closure normals for the Hair BSDF
As the title says, the normal wasn't set for the Hair BSDF because it wasn't
needed before. However, the denoiser uses it to store the feature passes, so
it needs to be set now.
2017-06-28 21:32:02 +02:00
Lukas Stockner
1979176088 Cycles: Fix excessive sampling weight of glossy Principled BSDF components
If there was any specularity in the Principled BSDF, it would get a sampling
weight of one regardless of its actual impact.

This commit makes Cycles estimate the contribution of the component and adjust
the weighting accordingly, which greatly improves the noise characteristics of
the Principled BSDF in many cases.

Note that this commit might slightly change the brightness of areas when using
MultiGGX and high roughnesses, but the new brightness is more accurate and
closer to the result of Branched Path Tracing. See T51836 for details.

Differential Revision: https://developer.blender.org/D2677
2017-06-22 00:09:56 +02:00
Lukas Stockner
8cb741a598 Fix T51836: Cycles: Fix incorrect PDF approximations of the MultiGGX closures
The PDF of the MultiGGX sampling is approximated by the singlescattering GGX
term as well as a scaled diffuse term that makes up for the energy in the
multiscattering component that's missed by GGX.

However, there were two problems with the glossy terms: The diffuse term missed
a normalization factor, and the singlescattering term was not properly scaled
down based on the albedo estimate.

The glass term was completely wrong and has been rewritten. It uses the fresnel
factor to weight reflection vs. refraction and uses the glossy MultiGGX model
for reflection.
For refraction, the correct singlescattering term is now used, and a new
albedo approximation is used that was derived by evaluating GGX albedo for
roughnesses from 0 to 1 and IORs from 1 to 3 and fitting numerical
approximations to it. The resulting model has a mean relative error of 9e-5,
but could probably be simplified without losing noticeable accuracy in the
final render.

The improved PDFs help with glossy highlights (due to better light sampling vs.
closure sampling MIS) and fix the situation described in T51836 where mixing
MultiGGX with other closures (as it happens in e.g. the Principled
BSDF) causes incorrect darkening.
2017-06-22 00:09:56 +02:00
14ea0c5fcc Fix T51849: change Cycles clearcoat gloss to roughness.
This is compatible with UE4 and more consistent with specular and transmission
roughness, even if it deviates from the original Disney BRDF.
2017-06-21 19:55:20 +02:00
Sergey Sharybin
794311c92b Cycles: Fix race condition happening in progress utility
It is not enough to mutex-guard only the code that modifies integer values,
since that operation is NOT atomic. This is not safe even for single-byte
data types.

For now guarded the getter functions, similar to other functions in
this module.

Ideally we want to switch modification to atomic operations, so we
wouldn't need any locks in the getters.
2017-06-16 10:22:35 +02:00
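The locking pattern described in 794311c92b (guarding the getters with the same mutex as the modifying code) can be sketched as follows. This is a Python illustration under stated assumptions, not the Cycles progress utility: the class `Progress` and its method names are hypothetical.

```python
import threading


class Progress:
    """Hypothetical progress counter; not the actual Cycles class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rendered_tiles = 0

    def add_finished_tile(self):
        # The read-modify-write is not atomic, so it is guarded.
        with self._lock:
            self._rendered_tiles += 1

    def get_rendered_tiles(self):
        # The getter takes the same lock, so it never observes a
        # value in the middle of an update.
        with self._lock:
            return self._rendered_tiles


def worker(progress, count):
    for _ in range(count):
        progress.add_finished_tile()


progress = Progress()
threads = [threading.Thread(target=worker, args=(progress, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(progress.get_rendered_tiles())
```

Switching to atomic operations, as the commit message suggests, would remove the need for the lock in the getter.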
Sergey Sharybin
64aa0cff89 Cycles: Fix typo in comment 2017-06-14 09:54:07 +02:00
Hristo Gueorguiev
6cfa3ecd4d Fix T51791: Point Density doesn't work on GPU 2017-06-13 13:50:27 +02:00
Sergey Sharybin
40c04dd649 Cycles: Cleanup, indentation 2017-06-13 10:28:38 +02:00
Sergey Sharybin
0aa5431998 Cycles: Fix compilation error of OpenCL mega kernel
There was some mismatch in address space. Seems to be caused by recent additions.

Additionally, moved decoupled ray marching functions under ifdef, so they
don't try to use malloc() functions.

Thanks Mai for testing the patch!
2017-06-13 10:26:45 +02:00
Campbell Barton
00c4f49a6d Cleanup: indentation, long lines 2017-06-12 13:38:21 +10:00
Hristo Gueorguiev
04530c9383 Cycles: adjust supported driver version for AMD GPUs
On Windows, the 17.Q1 and 17.Q2 drivers return driver version 2236.10.
2017-06-11 23:17:46 +02:00
Lukas Stockner
558bea2252 Cycles Denoising: Add more failsafes for invalid pixels
Now, when there is no usable neighboring pixel for denoising, the noisy value
is preserved instead of producing a NaN.
Also, negative results are clamped to zero.

Note that these are just workarounds that don't fix the underlying problems,
but these issues are very rare and I'm not sure if it's even possible to fix
the underlying problems without introducing a significant slowdown or quality
decrease in other situations.
Because of that and since 2.79 is happening very soon, I just went for these
workarounds for now.
2017-06-11 01:51:39 +02:00
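The two failsafes from 558bea2252 can be sketched on a deliberately simplified reconstruction. The assumption here is a plain weighted average over neighboring pixels; the actual denoiser's reconstruction is more involved, and the function name `reconstruct_pixel` is hypothetical.

```python
import math


def reconstruct_pixel(noisy, neighbors, weights):
    """Weighted average of neighbor values with two failsafes.

    1. If no neighbor is usable, keep the noisy input value instead of
       dividing zero by zero and producing a NaN.
    2. Clamp negative results to zero.
    """
    accum = 0.0
    total_weight = 0.0
    for value, weight in zip(neighbors, weights):
        # Skip neighbors that carry no weight or hold an invalid value.
        if weight > 0.0 and math.isfinite(value):
            accum += weight * value
            total_weight += weight
    if total_weight <= 0.0:
        return noisy
    return max(accum / total_weight, 0.0)


print(reconstruct_pixel(0.5, [], []))                      # no neighbors at all
print(reconstruct_pixel(0.5, [1.0, 3.0], [1.0, 1.0]))      # ordinary average
print(reconstruct_pixel(0.5, [-1.0], [1.0]))               # negative result clamped
```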
Sergey Sharybin
e097fc4aa6 Cycles: Selectively include denoising in kernel 2017-06-10 04:45:13 -04:00
Mai Lavelle
eb293f59f2 Cycles: Pass all buffers to each kernel call for OpenCL
Technically not passing all buffers used by a kernel is undefined
behavior. We haven't had any issues with this so far on AMD or
Nvidia, but it's known to be a problem with Intel and we received
a report from AMD that this is a problem on newer hardware, so we
need to make this change at some point.

Unfortunately there is a cost to being correct, about 5% for the
benchmark scenes. For low sample counts it's even worse, I've
seen up to 50% slowdown. For the latter case I think adjusting
tile updating logic can help, but not sure what that would look
like yet (it would be just a few lines change however).
2017-06-10 04:08:49 -04:00
Mai Lavelle
6238214159 Cycles: Faster split branched path tracing by sharing samples with inactive threads
Unlike regular path tracing, branched path tracing is usually used with lower
sample counts, at least for primary rays. This means that there are fewer samples for
the GPU to work on in parallel and rendering is slower. As there is less work
overall there are also more inactive threads during rendering with BPT. This
patch makes use of those inactive rays to render branched samples in parallel
with other samples.

Each thread that is preparing for a branched sample will attempt to find an
inactive thread and if one is found the state for the sample is copied to that
thread. Potentially, if there are enough inactive threads, 100s of branched
samples could be generated from the same originating thread and run in
parallel, giving large speedups.

Gives 70% faster render for pavillion midday scene. 20-60% faster on BMW
with car paint replaced with SSS/volumes.
2017-06-10 04:08:49 -04:00
Mai Lavelle
32299d32e7 Cycles: Modify path_radiance_accum_sample to use atomics for split kernel
Samples run in parallel need a safe way to accumulate their results
with the results of other threads.
2017-06-10 04:08:02 -04:00
Mai Lavelle
6995b50e41 Cycles: Add function to dequeue a ray 2017-06-10 03:51:18 -04:00
Mai Lavelle
4360e8ce13 Cycles: Add atomic decrement functions to util_atomic.h 2017-06-10 03:51:18 -04:00
Mai Lavelle
ea846a4dfc Cycles: Add kernel to enqueue inactive rays
The queue will be used to reuse inactive threads and keep
the GPU busier.
2017-06-10 03:51:18 -04:00
Hristo Gueorguiev
1f0998baa7 Cycles: Blacklist unsupported OpenCL devices
Due to various driver issues with AMD GCN 1 cards we can no longer support
these GPUs. This patch makes them unavailable to select for Cycles rendering.

GCN cards 2 and higher are still supported. Please use the most recent
drivers available to ensure proper functionality.

See here for a list to check which GPUs are supported:
https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units
2017-06-10 03:51:18 -04:00
Lukas Stockner
c73206acc5 Cycles: Fix denoising passes being written when they're not actually generated 2017-06-09 23:02:56 +02:00
Lukas Stockner
0a898e2405 Cleanup Cycles Denoising platform-specific defines 2017-06-09 22:38:16 +02:00
Lukas Stockner
7dc51f87ed Cycles Denoising: Speedup reconstruction by skipping near-zero weights 2017-06-09 22:38:16 +02:00
Lukas Stockner
705c43be0b Cycles Denoising: Merge outlier heuristic and confidence interval test
The previous outlier heuristic only checked whether the pixel is more than
twice as bright compared to the 75% quantile of the 5x5 neighborhood.
While this detected fireflies robustly, it also incorrectly marked a lot of
legitimate small highlights as outliers and filtered them away.

This commit adds an additional condition for marking a pixel as a firefly:
In addition to being above the reference brightness, the lower end of the
3-sigma confidence interval has to be below it.
Since the lower end approximates how low the true value of the pixel might be,
this test separates pixels that are supposed to be very bright from pixels that
are very bright due to random fireflies.

Also, since there is now a reliable outlier filter as a preprocessing step,
the additional confidence interval test in the reconstruction kernel is no
longer needed.
2017-06-09 03:46:11 +02:00
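The combined firefly test from 705c43be0b can be sketched as below. This is a minimal illustration under stated assumptions: the per-pixel variance is assumed to be available as an input, the quantile is taken by simple sorting, and the names `quantile_75` and `is_firefly` are hypothetical rather than taken from the Cycles source.

```python
import math


def quantile_75(values):
    """75% quantile by sorting (nearest lower rank, no interpolation)."""
    ordered = sorted(values)
    return ordered[int(0.75 * (len(ordered) - 1))]


def is_firefly(value, variance, neighborhood):
    """Mark a pixel as an outlier only if both conditions hold.

    - It is more than twice as bright as the 75% quantile of its
      5x5 neighborhood (the reference brightness).
    - The lower end of its 3-sigma confidence interval falls below
      that reference, i.e. its true value might well be much lower.
    """
    reference = 2.0 * quantile_75(neighborhood)
    if value <= reference:
        return False
    lower_bound = value - 3.0 * math.sqrt(variance)
    return lower_bound < reference


neighborhood = [1.0] * 25  # flat 5x5 neighborhood, reference brightness 2.0

print(is_firefly(10.0, 16.0, neighborhood))  # bright and very uncertain
print(is_firefly(10.0, 0.01, neighborhood))  # bright but confidently so
print(is_firefly(1.5, 16.0, neighborhood))   # not bright enough to matter
```

The second case is the one the old heuristic got wrong: a legitimate small highlight is bright relative to its neighbors, but its low variance keeps the whole confidence interval above the reference, so it is not filtered away.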
Sergey Sharybin
6a546fc73e Cycles: Don't leave multiple spaces in the device name 2017-06-08 12:15:24 +02:00
Sergey Sharybin
45d3e22204 Cycles: Display optional board name in system info 2017-06-08 12:10:15 +02:00
Sergey Sharybin
78c0f09d4f Cycles: Cleanup, indentation 2017-06-08 12:03:08 +02:00
Pascal Schoen
c91d2d30df Improve backscatter color of subsurface scattering in Principled BSDF
Differential Revision: https://developer.blender.org/D2685
2017-05-31 07:29:17 +02:00
Sergey Sharybin
46da985c8e Cycles: Cleanup, trailing whitespace 2017-05-30 10:58:12 +02:00
Lukas Stockner
9b914764a9 Fix T51652: Cycles - Persistent Images not storing images
Denoising was setting session parameters for every frame, which was detected as
a change and therefore caused a resync.

Since the parameter modification change is only needed for viewport rendering
(which doesn't support denoising anyways) and resyncing after a frame change
(which isn't affected by denoising settings), an easy fix is to just ignore
the denoising parameters like it's currently done with the samples.
2017-05-30 06:34:53 +02:00
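The fix described in 9b914764a9 amounts to leaving certain fields out of the "did the session parameters change?" comparison. A minimal sketch of that idea follows; the class `SessionParams` here is a hypothetical Python stand-in, and its field names (`tile_size`, `threads`, `samples`, `denoising_radius`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class SessionParams:
    """Hypothetical stand-in for a renderer's session parameters."""

    tile_size: int
    threads: int
    samples: int
    denoising_radius: int

    # Fields that may differ between frames without forcing a resync.
    # No type annotation, so this is a class attribute, not a field.
    _IGNORED = frozenset({"samples", "denoising_radius"})

    def modified(self, other):
        """True if any field outside _IGNORED differs."""
        return any(
            getattr(self, f.name) != getattr(other, f.name)
            for f in fields(self)
            if f.name not in self._IGNORED
        )


previous = SessionParams(tile_size=32, threads=8, samples=128, denoising_radius=8)
same_setup = SessionParams(tile_size=32, threads=8, samples=256, denoising_radius=4)
new_tiles = SessionParams(tile_size=64, threads=8, samples=128, denoising_radius=8)

print(previous.modified(same_setup))  # only ignored fields differ
print(previous.modified(new_tiles))   # a relevant field differs
```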