Commit Graph

889 Commits

Author SHA1 Message Date
Brecht Van Lommel
073bf8bf52 Cycles: remove WITH_CYCLES_DEBUG, add WITH_CYCLES_DEBUG_NAN
WITH_CYCLES_DEBUG was used for rendering BVH debugging passes. But since we
mainly use Embree an OptiX now, this information is no longer important.

WITH_CYCLES_DEBUG_NAN will enable additional checks for NaNs and invalid values
in the kernel, for Cycles developers. Previously these asserts where enabled in
all debug builds, but this is too likely to crash Blender in scenes that render
fine regardless of the NaNs. So this is behind a CMake option now.

Fixes T90240
2021-07-28 19:27:57 +02:00
Brecht Van Lommel
cf74cd9367 Cycles: upgrade CUDA to 11.4
This fixes a performance regression on Ampere cards, on specific scenes like
classroom. For cycles-x there is little difference, but this is still helpful
for LTS releases, and we need to upgrade at some point anyway.
2021-07-26 19:46:51 +02:00
Campbell Barton
f1e4903854 Cleanup: full sentences in comments, improve comment formatting 2021-06-26 21:50:48 +10:00
Campbell Barton
4b9ff3cd42 Cleanup: comment blocks, trailing space in comments 2021-06-24 15:59:34 +10:00
Kévin Dietrich
cd39e3dec1 OptiX: select BVH build options from Scene params
Currently, the OptiX BVH build options are selected based on whether
we are in background mode (final renders) or not (viewport renders).
In background mode, the BVH is built for fast path tracing and low
memory footprint, while in viewport, it is built for fast updates.

However, on platforms without OpenGL support, the background flag is
always set to true and prevents using fast BVH builds in the viewport.

Now, the BVH options derive from the Scene BVH settings:
* if BVH is static, a fast to trace BVH is built
* if BVH is dynamic, a fast to update BVH is built

Reviewed By: #cycles, brecht

Differential Revision: https://developer.blender.org/D11154
2021-06-22 07:38:28 +02:00
Patrick Mours
b046bc536b Fix T88096: Baking with OptiX and displacement fails
Using displacement runs the shader eval kernel, but since OptiX modules are not loaded when
baking is active, those were not available and therefore failed to launch. This fixes that by falling
back to the CUDA kernels.
2021-05-25 16:56:16 +02:00
Sergey Sharybin
ff51c2e89a Cleanup: Use named unused arguments in Cycles Device 2021-05-21 11:19:33 +02:00
0745afeddb Merge remote-tracking branch 'origin/blender-v2.93-release' 2021-05-20 13:00:07 +02:00
Brecht Van Lommel
0456223cde Fix T87793: Cycles OptiX crash hiding objects in viewport render 2021-05-19 18:30:43 +02:00
Brecht Van Lommel
3e472d87a8 Cycles OpenCL: disable AO preview kernels
These seem to be causing some stability issues, and really are just not that
useful in practice. Compiling them is slow already, so it does not improve
the user experience much to show an AO preview if it's not nearly instant.
2021-05-19 18:30:43 +02:00
Brecht Van Lommel
542b8da831 Merge branch 'blender-v2.93-release' 2021-05-17 20:18:39 +02:00
Brecht Van Lommel
91a5dbbd17 Fix OpenCL group size performance issue on Intel GPUs
Contributed by Intel. On some scenes like classroom with particular integrated
GPUs this speeds up rendering 1.97x. With other benchmarks and GPUs it's
between 0.99-1.14x.
2021-05-17 19:40:57 +02:00
Patrick Mours
94960250b5 Cycles: Fix build with OptiX 7.3 SDK 2021-04-26 14:55:39 +02:00
Patrick Mours
7cbd66d42f Cycles: Initialize all OptiX structs to zero before use
This is done to ensure building with newer OptiX SDK releases that add new struct fields gives
deterministic results (no uninitialized fields and therefore random data is passed to OptiX).
2021-04-13 13:56:15 +02:00
Patrick Mours
f1fe42d912 Cycles: Do not allocate tile buffers on all devices when peer memory is active and denoising is not
Separate tile buffers on all devices only need to exist when denoising is active (so any overlap
being rendered simultaneously does not write to the same memory region).
When denoising is not active they can be distributed like all other memory when peer
memory support is available.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D10858
2021-03-30 14:04:56 +02:00
Brecht Van Lommel
91c44fe885 Cycles: disable NanoVDB for AMD OpenCL
It is causing issue with AMD OpenCL drivers, due to a potential driver bug.

Ref T84461
2021-03-30 00:00:17 +02:00
Brecht Van Lommel
8f93386e62 Fix (apparently harmless) Cycles asan warnings 2021-03-15 20:46:57 +01:00
Patrick Mours
f4f8b6dde3 Cycles: Change device-only memory to actually only allocate on the device
This patch changes the `MEM_DEVICE_ONLY` type to only allocate on the device and fail if
that is not possible anymore because out-of-memory (since OptiX acceleration structures may
not be allocated in host memory). It also fixes high peak memory usage during OptiX
acceleration structure building.

Reviewed By: brecht

Maniphest Tasks: T85985

Differential Revision: https://developer.blender.org/D10535
2021-03-11 14:12:35 +01:00
Campbell Barton
17e1e2bfd8 Cleanup: correct spelling in comments 2021-02-05 16:23:34 +11:00
Patrick Mours
b2e00e8f8e Merge branch 'blender-v2.92-release' 2021-01-29 13:35:21 +01:00
Patrick Mours
9f89166b52 Fix T85148: OptiX viewport denoising regression
Commit 6e74a8b69f215e63e136cb4c497e738371ac798f changed the denoiser input passes default to
include the normal pass. This does not always produce optimal images though, hence why the
default was previously set to only include the color and albedo passes. This restores that behavior, so
that viewport denoising with OptiX produces the same results as before.
2021-01-29 13:35:00 +01:00
Patrick Mours
9b80291412 Merge branch 'blender-v2.92-release' 2021-01-27 15:29:39 +01:00
James Horsley
4fbeb3e6be Fix T85089: Crash when rendering scene that does not fit into GPU memory with CUDA/OptiX
The "cuda_mem_map_mutex" was potentially being locked recursively during the call to
"CUDADevice::move_textures_to_host", which crashed. This moves around the locking and
unlocking of "cuda_mem_map_mutex", so that it doesn't call a function that locks it while
still holding the lock.

Reviewed By: pmoursnv

Maniphest Tasks: T85089, T84734

Differential Revision: https://developer.blender.org/D10219
2021-01-27 15:27:57 +01:00
Kévin Dietrich
bbe6d44928 Cycles: optimize device updates
This optimizes device updates (during user edits or frame changes in
the viewport) by avoiding unnecessary computations. To achieve this,
we use a combination of the sockets' update flags as well as some new
flags passed to the various managers when tagging for an update to tell
exactly what the tagging is for (e.g. shader was modified, object was
removed, etc.).

Besides avoiding recomputations, we also avoid resending to the devices
unmodified data arrays, thus reducing bandwidth usage. For OptiX and
Embree, BVH packing was also multithreaded.

The performance improvements may vary depending on the used device (CPU
or GPU), and the content of the scene. Simple scenes (e.g. with no adaptive
subdivision or volumes) rendered using OptiX will benefit from this work
the most.

On average, for a variety of animated scenes, this gives a 3x speedup.

Reviewed By: #cycles, brecht

Maniphest Tasks: T79174

Differential Revision: https://developer.blender.org/D9555
2021-01-22 16:08:25 +01:00
Brecht Van Lommel
10d2cbfa36 Fix T84872: OptiX GPU + CPU rendering uses branched path samples
Branched path tracing is not supported for OptiX, and it would still use the
number of AA samples from there when branched path was enabled by the user
earlier but auto disabled and hidden in the UI when using OptiX.

Ref D10159
2021-01-20 14:59:23 +01:00
Sergey Sharybin
0d8948387e Cycles: Fix missing OpenCL extensions in certain cases
If extensions string is longer than 1024 then the old code would have
reported empty string instead of extensions.

Now the code does dynamic string allocation to store result of request,
similar to what is done in `OpenCLInfo::get_hardware_id`.

The code looks a bit ugly, but it didn't really change much with this
patch. In other words, the code can become more modern and clear, but
it is considered to be outside of the scope of this change.

Differential Revision: https://developer.blender.org/D10135
2021-01-18 15:47:00 +01:00
Brecht Van Lommel
3732508c64 Fix T84745: build error with TBB 2021
task_group::is_canceling() was removed.
2021-01-15 17:29:36 +01:00
Lukas Stockner
688e5c6d38 Fix T82351: Cycles: Tile stealing glitches with adaptive sampling
In my testing this works, but it requires me to remove the min(start_sample...) part in the
adaptive sampling kernel, and I assume there's a reason why it was there?

Reviewed By: brecht

Maniphest Tasks: T82351

Differential Revision: https://developer.blender.org/D9445
2021-01-11 21:04:49 +01:00
Patrick Mours
c66f00dc26 Fix Cycles rendering with OptiX after instance limit increase when building with old SDK
Commit d259e7dcfbbd37cec5a45fdfb554f24de10d0268 increased the instance limit, but only provided
a fall back for the host code for older OptiX SDKs, not for kernel code. This caused a mismatch when
an old SDK was used (as is currently the case on buildbot) and subsequent rendering artifacts. This
fixes that by moving the bit that is checked to a common location that works with both old an new
SDK versions.
2021-01-08 13:38:26 +01:00
Patrick Mours
d259e7dcfb Cycles: Increase instance limit for OptiX acceleration structure building
For a while now OptiX had support for 28-bits of instance IDs, instead of the initial 24-bits (see also
value reported by OPTIX_DEVICE_PROPERTY_LIMIT_MAX_INSTANCE_ID). This change makes use of
that and also adds an error reported when the number of instances an OptiX acceleration structure is
created with goes beyond the limit, to make this clear instead of just rendering an image with artifacts.

Manifest Tasks: T81431
2021-01-07 19:23:13 +01:00
Patrick Mours
3373d14b1b Fix T83925: Crash when rendering on the CPU with OptiX denoiser enabled
Rendering on the CPU uses the Embree BVH layout, whether the OptiX denoiser is enabled or not.
This means the "build_bvh" function gets a "BVHEmbree" object to fill and not a "BVHMulti" as it
was assuming before, which caused crashes due to memory geting overwritten incorrectly. This
fixes that by redirecting Embree BVH builds to the Embree device.

Manifest Tasks: T83925
2021-01-05 18:37:31 +01:00
Brecht Van Lommel
c4f8aedbc2 Fix T84016: Cycles baking crash with OptiX after recent changes
This worked for CPU + GPU, but not GPU only.
2020-12-24 12:59:35 +01:00
Brecht Van Lommel
b2edc716c1 Fix Cycles OptiX runtime compilation broken after shader raytracing
Need to pass the appropriate flags as we do for compilation as part of the
CMake build.
2020-12-22 15:08:59 +01:00
Campbell Barton
001f2c5d50 Cleanup: spelling 2020-12-15 12:34:25 +11:00
Sergey Sharybin
f762d37790 Cycles: enable OpenCL rendering on recent Intel GPUs
Based on testing by Intel, rendering on Iris GPUs and upcoming Xe GPUs
should work. This is enabled on Windows and Linux.

More testing is needed to verify correctness and performance in production
scenes, but our basic benchmark files seem to give correct results.
2020-12-11 17:37:54 +01:00
Brecht Van Lommel
c6626a2f8a Cleanup: compiler warnings
If you mark one function as override in a class, all must be marked.
2020-12-11 17:37:31 +01:00
Patrick Mours
bfb6fce659 Cycles: Add CPU+GPU rendering support with OptiX
Adds support for building multiple BVH types in order to support using both CPU and OptiX
devices for rendering simultaneously. Primitive packing for Embree and OptiX is now
standalone, so it only needs to be run once and can be shared between the two. Additionally,
BVH building was made a device call, so that each device backend can decide how to
perform the building. The multi-device for instance creates a special multi-BVH that holds
references to several sub-BVHs, one for each sub-device.

Reviewed By: brecht, kevindietrich

Differential Revision: https://developer.blender.org/D9718
2020-12-11 13:24:29 +01:00
Patrick Mours
612b83bbd1 Cycles: Enable baking panel in OptiX and redirect those requests to CUDA for now
This enables support for baking when OptiX is active, but uses CUDA for that behind the scenes, since
the way baking is currently implemented does not work well with OptiX.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D9784
2020-12-08 16:06:39 +01:00
Patrick Mours
c10546f5e9 Cycles: Add support for shader raytracing in OptiX
Support for the AO and bevel shader nodes requires calling "optixTrace" from within the shading
VM, which is only allowed from inlined functions to the raygen program or callables. This patch
therefore converts the shading VM to use direct callables to make it work. To prevent performance
regressions a separate kernel module is compiled and used for this purpose.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D9733
2020-12-04 13:04:11 +01:00
Kévin Dietrich
31a620b942 Cycles API: encapsulate Node socket members
This encapsulates Node socket members behind a set of specific methods;
as such it is no longer possible to directly access Node class members
from exporters and parts of Cycles.

The methods are defined via the NODE_SOCKET_API macros in `graph/
node.h`, and are for getting or setting a specific socket's value, as
well as querying or modifying the state of its update flag.

The setters will check whether the value has changed and tag the socket
as modified appropriately. This will let us know how a Node has changed
and what to update, which is the first concrete step toward a more
granular scene update system.

Since the setters will tag the Node sockets as modified when passed
different data, this patch also removes the various modified methods
on Nodes in favor of Node::is_modified which checks the sockets'
update flags status.

Reviewed By: brecht

Maniphest Tasks: T79174

Differential Revision: https://developer.blender.org/D8544
2020-11-04 13:03:33 +01:00
Kévin Dietrich
57d1aea64f Cycles: add support for BVH refit in OptiX
This avoids recomputing the BVH for geometries that do not have changes in topology but whose vertices are modified (like a simple character animation), and gives up to 40% speedup for BVH building.

This is only available for viewport renders at the moment.

Reviewed By: pmoursnv, brecht

Differential Revision: https://developer.blender.org/D9353
2020-11-03 18:05:29 +01:00
Lukas Stockner
517ff40b12 Cycles: Implement tile stealing to improve CPU+GPU rendering performance
While Cycles already supports using both CPU and GPU at the same time, there
currently is a large problem with it: Since the CPU grabs one tile per thread,
at the end of the render the GPU runs out of new work but the CPU still needs
quite some time to finish its current times.

Having smaller tiles helps somewhat, but especially OpenCL rendering tends to
lose performance with smaller tiles.

Therefore, this commit adds support for tile stealing: When a GPU device runs
out of new tiles, it can signal the CPU to release one of its tiles.
This way, at the end of the render, the GPU quickly finishes the remaining
tiles instead of having to wait for the CPU.

Thanks to AMD for sponsoring this work!

Differential Revision: https://developer.blender.org/D9324
2020-10-31 01:57:39 +01:00
Brecht Van Lommel
f75b09e7e6 Cycles: abort rendering when --cycles-device not found
Rather than just printing a message and falling back to the CPU. For render
farms it's better to avoid a potentially slow render on the CPU if the intent
was to render on the GPU.

Ref T82193, D9086
2020-10-29 16:01:38 +01:00
Brecht Van Lommel
30f626fe4c Revert "Cycles API: encapsulate Node socket members"
This reverts commit 527f8b32b32187f754e5b176db6377736f9cb8ff. It is causing
motion blur test failures and crashes in some renders, reverting until this is
fixed.
2020-10-27 11:40:42 +01:00
Kévin Dietrich
527f8b32b3 Cycles API: encapsulate Node socket members
This encapsulates Node socket members behind a set of specific methods;
as such it is no longer possible to directly access Node class members
from exporters and parts of Cycles.

The methods are defined via the NODE_SOCKET_API macros in `graph/
node.h`, and are for getting or setting a specific socket's value, as
well as querying or modifying the state of its update flag.

The setters will check whether the value has changed and tag the socket
as modified appropriately. This will let us know how a Node has changed
and what to update, which is the first concrete step toward a more
granular scene update system.

Since the setters will tag the Node sockets as modified when passed
different data, this patch also removes the various `modified` methods
on Nodes in favor of `Node::is_modified` which checks the sockets'
update flags status.

Reviewed By: brecht

Maniphest Tasks: T79174

Differential Revision: https://developer.blender.org/D8544
2020-10-26 23:11:14 +01:00
Patrick Mours
841eaebfa4 Cycles: Add support for OptiX 7.2 SDK 2020-10-26 15:43:55 +01:00
Patrick Mours
3df90de6c2 Cycles: Add NanoVDB support for rendering volumes
NanoVDB is a platform-independent sparse volume data structure that makes it possible to
use OpenVDB volumes on the GPU. This patch uses it for volume rendering in Cycles,
replacing the previous usage of dense 3D textures.

Since it has a big impact on memory usage and performance and changes the OpenVDB
branch used for the rest of Blender as well, this is not enabled by default yet, which will
happen only after 2.82 was branched off. To enable it, build both dependencies and Blender
itself with the "WITH_NANOVDB" CMake option.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D8794
2020-10-05 15:03:30 +02:00
Matt McClellan
1d985159ad Fix Cycles OpenCL failing when extension string is long
This can happen for Intel OpenCL, now support arbitrary string length.

Differential Revision: https://developer.blender.org/D9020
2020-10-05 14:04:06 +02:00
Campbell Barton
14b2a35c8b Cleanup: use parenthesis for if statements in macros 2020-09-19 16:28:17 +10:00
Brecht Van Lommel
3dbb231ed2 Fix OpenCL render error in large scenes
In scenes such as Cosmos Laundromat, there were memory allocations bigger than
2GB which would overflow.

Problem and solution found by AMD, thanks!
2020-09-17 13:07:28 +02:00