Fixes an issue where Blender would crash if the OptiX denoiser was
selected while an unsupported GPU device (e.g. an Intel GPU) was
chosen in preferences.
The crash occurred because Cycles uses the device from preferences to
set up the denoiser, and there was no check preventing an unsupported
GPU from being used to set up and run the denoiser.
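A minimal sketch of the kind of guard this adds, using illustrative stand-in types rather than the actual Cycles API:

```cpp
/* Illustrative stand-ins for the relevant Cycles types; not the real headers. */
enum DenoiserType { DENOISER_NONE, DENOISER_OPTIX, DENOISER_OPENIMAGEDENOISE };
enum DeviceTypeEnum { DEVICE_CPU, DEVICE_OPTIX, DEVICE_ONEAPI };

struct DeviceInfo {
  DeviceTypeEnum type;
  bool supports_oidn_gpu; /* Hypothetical flag for GPU OIDN support. */
};

/* The guard sketched by this fix: refuse to set up a denoiser on a device
 * that cannot run it, instead of crashing later during setup. */
bool denoiser_supported_on_device(const DeviceInfo &device, DenoiserType denoiser)
{
  switch (denoiser) {
    case DENOISER_OPTIX:
      return device.type == DEVICE_OPTIX; /* OptiX needs an NVIDIA device. */
    case DENOISER_OPENIMAGEDENOISE:
      return device.type == DEVICE_CPU || device.supports_oidn_gpu;
    default:
      return false;
  }
}
```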
Pull Request: https://projects.blender.org/blender/blender/pulls/124001
Ever since the introduction of GPU OIDN denoising on CPU devices,
using the path_tracing_device info to pick the automatic denoiser has
typically led to incorrect results.
This commit fixes this issue by using the denoising device info to pick
the denoiser.
Pull Request: https://projects.blender.org/blender/blender/pulls/123593
Cycles' automatic denoiser picker assumed that OIDN could not be
run on the GPU while the CPU was the render device. So if the user was
using their CPU for rendering, the automatic denoiser picker would
"fall back" to a different denoiser (OptiX or CPU OIDN). This was true
in Blender 4.1, but changed in 4.2. The UI assumed that OIDN could run
on the GPU if there was a compatible OIDN GPU device.
This led to an issue on systems using the CPU for rendering
while having an NVIDIA GPU installed in the system. The
UI suggested that OIDN would be used, and would switch between
CPU and GPU depending on user preferences. But the automatic
denoiser picker in Cycles' backend said OIDN could not run on
the GPU in this situation and would always "fall back" to the
OptiX denoiser running on the NVIDIA GPU.
This created a mismatch between the UI and what Cycles was
actually doing. The issue did not affect other GPU vendors because
their fallback was the OIDN denoiser.
This commit fixes the issue by aligning the Cycles automatic
denoiser picker in the backend with the UI: OIDN is used if a
supported GPU is available, falling back to OptiX if it is not,
then to OIDN on the CPU if OptiX is unsupported, and finally to
no denoiser at all.
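A sketch of that fallback chain, using hypothetical capability flags instead of the real device queries:

```cpp
enum DenoiserType { DENOISER_NONE, DENOISER_OPTIX, DENOISER_OPENIMAGEDENOISE };

/* Mirrors the order described above; the boolean parameters stand in for
 * whatever device capability checks the backend actually performs. */
DenoiserType pick_automatic_denoiser(bool supports_oidn_gpu,
                                     bool supports_optix,
                                     bool supports_oidn_cpu)
{
  if (supports_oidn_gpu) {
    return DENOISER_OPENIMAGEDENOISE; /* Preferred: OIDN on the GPU. */
  }
  if (supports_optix) {
    return DENOISER_OPTIX; /* Fallback: OptiX on an NVIDIA GPU. */
  }
  if (supports_oidn_cpu) {
    return DENOISER_OPENIMAGEDENOISE; /* Fallback: OIDN on the CPU. */
  }
  return DENOISER_NONE; /* Nothing supported: no denoiser. */
}
```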
Pull Request: https://projects.blender.org/blender/blender/pulls/123530
Since #118841 there are more cases where Cycles checks for
graphics interop support. This could lead to a crash when graphics
interop functions are called without an active graphics context.
This change ensures that no graphics interop calls are made when doing
a headless render. To achieve this, device creation is now aware of
the headless mode.
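A rough sketch of threading a headless flag through device creation (the names are illustrative, not the actual Cycles interfaces):

```cpp
/* Stand-in device type; the real Cycles Device class is much larger. */
struct Device {
  bool headless = false;

  bool should_use_graphics_interop() const
  {
    /* Never touch graphics interop during a headless render: there is no
     * active graphics context to interoperate with. */
    if (headless) {
      return false;
    }
    return check_native_interop_support();
  }

  /* Placeholder for the backend-specific interop capability query. */
  bool check_native_interop_support() const { return true; }
};

Device *device_create(bool headless)
{
  Device *device = new Device();
  device->headless = headless; /* Device creation is now headless-aware. */
  return device;
}
```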
Pull Request: https://projects.blender.org/blender/blender/pulls/122844
An additional requirement is to have OpenImageDenoise enabled while the
devices do not support the OIDN denoiser.
Reproduced here in the studio on Linux systems with either dual
Quadro GP100 cards, or a Quadro 6000 + Quadro 6000 ADA.
The reason for the crash is that find_best_device() might return
nullptr, and this was never checked.
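A small sketch of the missing check, with hypothetical types standing in for the Cycles device code:

```cpp
#include <vector>

struct Device {
  bool supports_oidn = false;
};

/* Hypothetical stand-in for find_best_device(): returns nullptr when none of
 * the devices qualifies (e.g. none supports the OIDN denoiser). */
static Device *find_best_denoise_device(const std::vector<Device *> &devices)
{
  for (Device *device : devices) {
    if (device->supports_oidn) {
      return device;
    }
  }
  return nullptr;
}

static bool setup_denoiser(const std::vector<Device *> &devices)
{
  Device *device = find_best_denoise_device(devices);
  /* The missing check: bail out instead of dereferencing a null device. */
  if (device == nullptr) {
    return false;
  }
  /* ... set up the denoiser on `device` ... */
  return true;
}
```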
Pull Request: https://projects.blender.org/blender/blender/pulls/122823
Previously, Cycles would render up to 4 SPP during viewport navigation when
using reduced resolution, even when the overall number of samples was set
lower.
This caused problems with the blue-noise pattern, so ensure that the
number of samples is always clamped to the configured maximum.
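A minimal sketch of the clamp, assuming a fixed 4-sample navigation budget and a hypothetical helper name:

```cpp
#include <algorithm>

/* Samples to render during viewport navigation at reduced resolution:
 * never more than the user-configured maximum. */
int viewport_navigation_samples(int configured_max_samples)
{
  const int navigation_samples = 4;
  return std::min(navigation_samples, configured_max_samples);
}
```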
On an M3 MacBook Pro, this change increases the benchmark score by 8% (with classroom seeing a path-tracing speedup of 15%).
The integrator state is currently stored as struct-of-arrays, with one array per field. Such fine-grained separation can result in poor GPU cache utilisation in cases where multiple fields of the same parent struct are accessed together. This PR changes the layout of the `ray`, `isect`, `subsurface`, and `shadow_ray` structs so that the data is interleaved (per parent struct) instead of separate. To try and keep this change localised, I encapsulated the layout change by extending the integrator state access macros; however, maybe we want to do this more explicitly (e.g. by updating every bit of code that accesses these parts of the state)? Feedback welcome.
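For illustration, a simplified sketch of the two layouts and how an access macro can hide which one is in use (stand-in types and macro names, not the actual integrator state code):

```cpp
struct float3 { float x, y, z; };

/* Struct-of-arrays: each field of the ray lives in its own array. */
struct IntegratorStateSoA {
  float3 *ray_P;
  float *ray_t;
};

/* Interleaved: all fields of one ray are contiguous per path, so fields
 * accessed together share cache lines. */
struct PackedRay {
  float3 P;
  float t;
};
struct IntegratorStateInterleaved {
  PackedRay *ray;
};

/* Access macros can hide which layout is in use, so call sites stay unchanged. */
#define STATE_RAY_SOA(state, path_index, field) ((state)->ray_##field[path_index])
#define STATE_RAY_PACKED(state, path_index, field) ((state)->ray[path_index].field)
```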
Pull Request: https://projects.blender.org/blender/blender/pulls/122015
This is an oversight of #122543, for which benchmarking was done in
headless mode.
The solution is to tweak the policy a little: keep refresh intervals
low for the first 10 seconds of the render, and increase the update
interval to 15 seconds after that. Doing so allows us to:
- Cancel complex files quickly when an error is noticed during the
  first few samples.
- Have a more predictable cancel time after a long render.
- Mitigate the performance regression.
This does not fully solve the regression, but it makes it much more
manageable. Some performance is sacrificed for UI renders, and the
interactivity is not as good as before, but that could be addressed
later by introducing an "Instant Cancel" operation that is able to
stop the render in the middle of a sample.
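A minimal sketch of the time-based policy described above (the function name and the "low" interval value are illustrative; only the 10 s / 15 s thresholds come from this description):

```cpp
/* Returns how often the render buffer should be flushed for display/cancel
 * checks, based on how long the render has been running. */
double guess_display_update_interval(double time_since_render_start)
{
  if (time_since_render_start < 10.0) {
    return 1.0; /* Frequent updates early on: quick cancel of bad renders. */
  }
  return 15.0; /* Later on: fewer updates, lower overhead, predictable cancel. */
}
```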
Performance measured with the Spring file, rendered on an M2 Ultra GPU
(path tracing time in seconds):

  Samples                    300    1024    2048
  Base (prior to #122543)   29.1    85.4   174.1
  This patch                37.0    95.7   180.2
The penalty is close to a constant time (the time within which a more
interactive cancel is possible).
Pull Request: https://projects.blender.org/blender/blender/pulls/122658
Previously, GPU denoisers ignored the render configuration settings and
used any available GPU. With these changes, GPU denoisers will use the
device selected in the Blender Cycles settings.
This allows any GPU denoiser to be used with CPU rendering.
Pull Request: https://projects.blender.org/blender/blender/pulls/118841
In some complex scenes it could take a very long time for Cycles
to respond to a cancel request. This is because Cycles only cancels the
render at a consistent state of the render buffer: when all scheduled
samples have been rendered.
This was caused by the render scheduler over-scheduling the number
of samples in an attempt to improve GPU occupancy.
This fix makes it so the scheduler only compensates for low
occupancy if rendering can happen within the desired update time.
There is no visible difference in the benchmark scenes with this
change.
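A sketch of the added constraint, with hypothetical parameters standing in for the scheduler's internal estimates:

```cpp
/* The scheduler may still bump the sample count to keep the GPU busy, but
 * only when the extra work is estimated to fit within the desired update
 * (and therefore cancel-check) interval. */
int schedule_num_samples(int num_samples_for_update,
                         int num_samples_for_occupancy,
                         double estimated_time_per_sample,
                         double desired_update_interval)
{
  const double occupancy_time = num_samples_for_occupancy * estimated_time_per_sample;
  if (num_samples_for_occupancy > num_samples_for_update &&
      occupancy_time <= desired_update_interval)
  {
    return num_samples_for_occupancy; /* Safe to over-schedule for occupancy. */
  }
  return num_samples_for_update; /* Otherwise keep cancel latency bounded. */
}
```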
Pull Request: https://projects.blender.org/blender/blender/pulls/122543
This patch adds a "shadow" prefix and array index suffixes to the shadow integrator state buffer names. This eliminates confusion when looking at GPU traces, etc.
Pull Request: https://projects.blender.org/blender/blender/pulls/121745
This enables the new lazy module loading behavior introduced in OIDN 2.3,
without breaking compatibility with older versions of OIDN (using separate
code paths).
Also, the detection of OIDN support for devices is now much cleaner, and
devices do not need to be matched by PCI address or device name anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/121362
Use the available `film_pass_pixel_render_buffer()` to access the pointer
to the render buffer.
For the shadow state, a similar function `film_pass_pixel_render_buffer_shadow()`
is created, because `shadow_path` is needed instead of `path`.
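A simplified sketch of what such a pair of helpers can look like (the state fields and signatures here are stand-ins, not the actual Cycles kernel code):

```cpp
/* Stand-in for the integrator state as seen by the film kernels. */
struct KernelState {
  int path_render_pixel_index;        /* Pixel index of the main path. */
  int shadow_path_render_pixel_index; /* Pixel index of the shadow path. */
};

float *film_pass_pixel_render_buffer(const KernelState *state,
                                     float *render_buffer,
                                     int pass_stride)
{
  return render_buffer + state->path_render_pixel_index * pass_stride;
}

float *film_pass_pixel_render_buffer_shadow(const KernelState *state,
                                            float *render_buffer,
                                            int pass_stride)
{
  /* Same computation, but the index comes from the shadow path state. */
  return render_buffer + state->shadow_path_render_pixel_index * pass_stride;
}
```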
The enumerator values for various GPU compute platforms were
added starting with OIDN 2.0, with the Metal GPU type added
in OIDN 2.2.
This is an alternative fix to ebb781675dd, which does not lead to
unhandled cases in the switch statement, and follows the configuration
of OIDN rather than Cycles (as OIDN might report devices which are
disabled in the local Cycles build).
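For illustration, a hedged sketch of a version-guarded switch over the OIDN device types; the enumerators come from the OIDN 2.x C headers, while the mapping and the exact `OIDN_VERSION` guard value are assumptions based on the versions mentioned above:

```cpp
#include <OpenImageDenoise/oidn.h>

const char *oidn_device_type_name(OIDNDeviceType type)
{
  switch (type) {
    case OIDN_DEVICE_TYPE_CPU:
      return "CPU";
    case OIDN_DEVICE_TYPE_SYCL:
      return "SYCL";
    case OIDN_DEVICE_TYPE_CUDA:
      return "CUDA";
    case OIDN_DEVICE_TYPE_HIP:
      return "HIP";
#if OIDN_VERSION >= 20200 /* Assumed guard: Metal enumerator exists in 2.2+. */
    case OIDN_DEVICE_TYPE_METAL:
      return "Metal";
#endif
    default:
      return "unknown";
  }
}
```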
Pull Request: https://projects.blender.org/blender/blender/pulls/119155
- Incorrect accurate prefiltering of albedo and normal (lower than expected quality)
- Changing the prefiltering mode has no immediate effect
- Default memory limit is too high (more than OIDN default)
- Memory limit is applied only to the main filter
- Quality setting applied only to the main filter
Pull Request: https://projects.blender.org/blender/blender/pulls/117930
This is supported on Apple Silicon GPUs and macOS 13.0+.
Co-authored-by: Stefan Werner <stefan.werner@intel.com>
Co-authored-by: Attila Afra <attila.t.afra@intel.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/116124
The OpenImageDenoise API exposes two modes, high quality and balanced.
This currently only has an effect on NVIDIA devices, on which it
provides a noticeable performance improvement without a visible
difference in quality. This change sets quality to balanced for
the viewport, and high quality for final frame rendering, as
that is what makes the most sense.
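A small sketch of selecting the quality per use case through the OIDN C API's "quality" filter parameter; how Cycles actually plumbs the viewport flag through to the denoiser is simplified here:

```cpp
#include <OpenImageDenoise/oidn.h>

/* Balanced quality for interactive viewport denoising, high quality for
 * final frame rendering. */
void set_denoise_quality(OIDNFilter filter, bool is_viewport)
{
  oidnSetFilterInt(filter, "quality",
                   is_viewport ? OIDN_QUALITY_BALANCED : OIDN_QUALITY_HIGH);
}
```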
Ref #115045
Co-authored-by: Werner, Stefan <stefan.werner@intel.com>
Pull Request: #115265
Along with the 4.1 libraries upgrade, we are bumping the clang-format
version from 8-12 to 17. This affects quite a few files.
If you have not already done so, consider pointing your IDE to the
clang-format binary bundled with the Blender precompiled libraries.