This is because with the addition of new features to Cycles, these GPUs
experienced significant performance regressions and bugs, all stemming
from bugs in the Metal GPU driver/compiler. The only reasonable way to
work around these issues was to disable parts of Cycles code on
these GPUs to avoid the driver/compiler bugs.
This resulted in increased development time maintaining these platforms
while being unable to deliver feature parity with other
GPU backends.
It has been decided that this development time is better spent
maintaining platforms that are still actively maintained by
hardware/software vendors, and so AMD and Intel GPU support will be
removed from the Metal backend for Cycles.
Pull Request: https://projects.blender.org/blender/blender/pulls/123551
On some Macs, MNEE would be disabled in Cycles to work around a bug.
However this just led to these devices skipping over MNEE related
parts of the rendering pipeline and not properly progressing through
the render.
This commit fixes this issue by properly disabling MNEE on these devices.
Pull Request: https://projects.blender.org/blender/blender/pulls/123765
The currently available `wayland-protocol` libraries in lib-linux_x64
repo do not appear to be advertised 1.36 (or 1.35) versions, since the
tablet protocol is not available among the stable ones.
This reverts commit 2a85eaaf169c51cfb24d4564fa58b4c89ce6d773.
Pull Request: https://projects.blender.org/blender/blender/pulls/123774
when computing coefficients in volume, the volume density of the object
at the top of the stack is used, which leads to wrong result if
overlapping volumes have different scales.
This commit fixes the problem by pre-multiplying the volume density per
object when evaluating the shader.
Pull Request: https://projects.blender.org/blender/blender/pulls/123733
The original paper only considers the minimal distance of the cluster to
the ray, not the interval length, resulting in low weight for distant
lights that have large influence over a long distance.
This commit modifies the measure by considering `theta_b - theta_a` for
local lights and the ray length `t` for distant lights.
Pull Request: https://projects.blender.org/blender/blender/pulls/123537
Ever since the introduction of GPU OIDN denoising on CPU devices,
using the path_tracing_device info to pick the automatic denoiser has
typically led to incorrect results.
This commit fixes this issue by using the denoising device info to pick
the denoiser.
Pull Request: https://projects.blender.org/blender/blender/pulls/123593
Precompiled Cycles kernels make up a considerable fraction of the total size of
Blender builds nowadays. As we add more features and support for more
architectures, this will only continue to increase.
However, since these kernels tend to be quite compressible, we can save a lot
of storage by storing them in compressed form and decompressing the required
kernel(s) during loading.
By using Zstandard compression with a high level, we can get decent compression
ratios (~5x for the current kernels) while keeping decompression time low
(about 30ms in the worse case in my tests). And since we already require zstd
for Blender, this doesn't introduce a new dependency.
While the main improvement is to the size of the extracted Blender installation
(which is reduced by ~400-500MB currently), this also shrinks the download on
Windows, since .zip's deflate compression is less effective. It doesn't help on
Linux since we're already using .tar.xz there, but the smaller installed size
is still a good thing.
See #123522 for initial discussion.
Pull Request: https://projects.blender.org/blender/blender/pulls/123557
The Oren Nayer diffuse BSDF had a energy compensation term added in a
recent commit[1]. This energy compensation term used the colour input
in it's computation. The colour input was clamped in SVM, but not OSL,
resulting in differences between the two backends. This commit resolves
this issue by clamping the colour in the OSL script to match SVM.
[1] 5e40b9bb5cefdaca2d88bb6316e180b2bdf929db
Pull Request: https://projects.blender.org/blender/blender/pulls/123527
Cycles automatic denoiser picker assumed that OIDN could not be
run on the GPU while the CPU was the render device. So if the user was
using their CPU for rendering, the automatic denoiser picker would
"fallback" to a different denoiser (OptiX or CPU OIDN). This was true
in Blender 4.1, but changed in 4.2. The UI assumed that OIDN could run
on the GPU if there was a compatible OIDN GPU device.
This lead to a issue on systems using the CPU for rendering
while having a NVIDIA GPU installed in the system. The
UI suggested that OIDN would be used, and would switch between
CPU and GPU depending on user preferences. But the automatic
denoiser picker in Cycle's backend said OIDN could not run on
the GPU in this situation and would always "fallback" to the
OptiX denoiser running on the NVIDIA GPU.
This created a mismatch between the UI and what Cycles was
acutally doing. This issue did not effect other GPU vendors because
their "fallback" was the OIDN denoiser.
This commit fixes this issue by aligning the Cycles automatic
denoiser picker in the backend with the UI. Using OIDN if a GPU
is supported, falling back to OptiX if it's not supported,
falling back to OIDN CPU if OptiX isn't supported,
then falling back to no denoiser if that's not supported.
Pull Request: https://projects.blender.org/blender/blender/pulls/123530
Add support for VK_EXT_dynamic_rendering_unused_attachments. Although
this is required, it needs to be registered optional for now due to
lacking support when using renderdoc.
Pull Request: https://projects.blender.org/blender/blender/pulls/123483
`VK_EXT_shader_stencil_export` isn't supported by NVIDIA devices.
This extension was recently added to support EEVEE PBR layer selection.
This PR makes this extension optional and selects the work around
when not supported by the physical device.
Fixes#114385
Pull Request: https://projects.blender.org/blender/blender/pulls/123470
Vulkan backend has recently switched to a render graph approach. Many
code was left so we could develop the render graph beside the previous
implementation. Last week we removed the switch. This PR will remove
most of the unused code. There might be some left and will be removed
when detected.
Pull Request: https://projects.blender.org/blender/blender/pulls/123422
Almost certainly not an issue in current codebase (this 'copy' version
of `MEM_cnew` does not seem much used in the first place), but better be
consistent with the 'allocating' version.
Pull Request: https://projects.blender.org/blender/blender/pulls/123445
Sync a bit better the checks on the alignment value between
`MEM_lockfree_mallocN_aligned` and `MEM_guarded_mallocN_aligned`.
The only significant change, in `MEM_guarded_mallocN_aligned`, is the
usage of `ALIGNED_MALLOC_MINIMUM_ALIGNMENT` instead of 'magic value' `8`.
This should not have any effect on 64bits platforms, but on 32bits ones
the minimum alignment would be reduced from `8` to `4` now.
NOTE: we could also consider making these checks part of a utils
function, instead of duplicating them in the codebase.
This patch implements a new Gabor noise node based on [1] but with the
improvements from [2] and the phasor formulation from [3].
We compare with the most popular existing implementation, that of OSL,
from the user's point of view:
- This implementation produces C1 continuous noise as opposed to the
non continuous OSL implementation, so it can be used for bump
mapping and is generally smother. This is achieved by windowing the
Gabor kernel using a Hann window.
- The Bandwidth input of OSL was hard-coded to 1 and was replaced with
a frequency input, which OSL hard codes to 2, since frequency is
more natural to control. This is even more true now that that Gabor
kernel is windowed as opposed to truncated, which means increasing
the bandwidth will just turn the Gaussian component of the Gabor
into a Hann window. While decreasing the bandwidth will eliminate
the harmonic from the Gabor kernel, which is the point of Gabor
noise.
- OSL had three discrete modes of operation for orienting the kernel.
Anisotropic, Isotropic, and a hybrid mode. While this implementation
provides a continuous Anisotropy parameter which users are already
familiar with from the Glossy BSDF node.
- This implementation provides not just the Gabor noise value, but
also its phase and intensity components. The Gabor noise value is
basically sin(phase) * intensity, but the phase is arguably more
useful since it does not suffer from the low contrast issues that
Gabor suffers from. While the intensity is useful to hide the
singularities in the phase.
- This implementation converges faster that OSL's relative to the
impulse count, so we fix the impulses count to 8 for simplicitly.
- This implementation does not implement anisotropic filtering.
Future improvements to the node includes implementing surface noise and
filtering. As well as extending the spectral control of the noise,
either by providing specialized kernels as was done in #110802, or by
providing some more procedural control over the frequencies of the
Gabor.
References:
[1]: Lagae, Ares, et al. "Procedural noise using sparse Gabor
convolution." ACM Transactions on Graphics (TOG) 28.3 (2009): 1-10.
[2]: Tavernier, Vincent, et al. "Making gabor noise fast and
normalized." Eurographics 2019-40th Annual Conference of the European
Association for Computer Graphics. 2019.
[3]: Tricard, Thibault, et al. "Procedural phasor noise." ACM
Transactions on Graphics (TOG) 38.4 (2019): 1-13.
Pull Request: https://projects.blender.org/blender/blender/pulls/121820
This multiscattering term comes from the OpenPBR specification and nicely
preserves energy while correctly modeling increased saturation at high
roughness.
Preparation for adding a diffuse roughness option to the Principled BSDF.
To me, the difference in output and computation seems small enough to
not need an enum for the old behavior.
Note that this also switches sampling to cosine-weighted, in my tests this
gives lower noise. I also checked doing MIS between cosine and uniform,
using the A term as a weight for how often to use cosine (since that term
is Lambertian diffuse), but always using cosine was better.
A nice consequence of that is that you don't get a huge noise jump when
going from 0.0 to 0.01 roughness.
Pull Request: https://projects.blender.org/blender/blender/pulls/123345
Cycles runs a check to see if the camera is possibly inside a
volumetric object by seeing if the bounding box of the camera
and volumetric object intersect.
If the calculation is wrong (Cycles says the camera is outside the
volume when it's inside it), then the volume will not render properly.
This commit resolves most of these issues by making the camera
bounding box larger than before, taking into consideration
features like:
1. The impact DOF could have on the camera ray start position and how
that should impact the bounding box size.
2. Taking into consideration near clipping, which was missed from the
orthographic camera due to a oversight in a previous commit
(08cc73a9bb).
Pull Request: https://projects.blender.org/blender/blender/pulls/123341
The callables generated by OSL reference other external functions
(defined in the OSL services module), in which case OptiX cannot
calculate the right stack size just based on the callable alone, it needs to
know all functions linked together in the pipeline to get to an accurate
result. `optixProgramGroupGetStackSize` has an optional pipeline
argument for this purpose, so make use of that to ensure the correct
stack size is calculated.
Ref #122779
Pull Request: https://projects.blender.org/blender/blender/pulls/123368
This makes it more verbose, and a little clearer that devices prior to 8cx Gen3 are not supported in >=v4.0. It makes the error message from #113674 more prominent than just being printed to cout.
Spurred by an email I got from someone trying to run blender on a Surface Pro X, and getting the not very helpful (to old devices) error.
Pull Request: https://projects.blender.org/blender/blender/pulls/122732