blender

Author	SHA1	Message	Date
Brecht Van Lommel	45b9542e6c	Merge branch 'blender-v3.6-release' into main	2023-06-15 16:45:15 +02:00
Brecht Van Lommel	0ab58864f3	Fix Cycles Metal AMD crash with shadow caustics, by disabling it Better to disable than crashing, as we are not expecting a quick fix. The cause is likely similar to issues with the light tree, which was already disabled. Ref #104013	2023-06-15 16:33:21 +02:00
Campbell Barton	c12994612b	License headers: use SPDX-FileCopyrightText in intern/cycles	2023-06-14 16:53:23 +10:00
Sahar A. Kashi	557a245dd5	Cycles: add HIP RT device, for AMD hardware ray tracing on Windows HIP RT enables AMD hardware ray tracing on RDNA2 and above, and falls back to a to shader implementation for older graphics cards. It offers an average 25% sample rendering rate improvement in Cycles benchmarks, on a W6800 card. The ray tracing feature functions are accessed through HIP RT SDK, available on GPUOpen. HIP RT traversal functionality is pre-compiled in bitcode format and shipped with the SDK. This is not yet enabled as there are issues to be resolved, but landing the code now makes testing and further changes easier. Known limitations: * Not working yet with current public AMD drivers. * Visual artifact in motion blur. * One of the buffers allocated for traversal has a static size. Allocating it dynamically would reduce memory usage. * This is for Windows only currently, no Linux support. Co-authored-by: Brecht Van Lommel <brecht@blender.org> Ref #105538	2023-04-25 20:19:43 +02:00
Xavier Hallade	9821a2d397	Cycles: pass kernel features to get_bvh_layout_mask This allows to selectively disable Hardware Raytracing in oneAPI backend, depending on features used.	2023-04-18 22:09:42 +02:00
Nikita Sirgienko	3f8c995109	Cycles: add hardware raytracing support to oneAPI device Updated Embree 4 library with GPU support is required for it to be compiled - compatiblity with Embree 3 and Embree 4 without GPU support is maintained. Enabling hardware raytracing is an opt-in user setting for now. Pull Request: https://projects.blender.org/blender/blender/pulls/106266	2023-04-18 22:09:42 +02:00
Sergey Sharybin	d32d787f5f	Clang-Format: Allow empty functions to be single-line For example ``` OIIOOutputDriver::~OIIOOutputDriver() { } ``` becomes ``` OIIOOutputDriver::~OIIOOutputDriver() {} ``` Saves quite some vertical space, which is especially handy for constructors. Pull Request: https://projects.blender.org/blender/blender/pulls/105594	2023-03-29 16:50:54 +02:00
Campbell Barton	bb2dc141f2	Cleanup: spelling in comments	2023-03-27 12:08:14 +11:00
Brecht Van Lommel	74de2e23a5	Merge branch 'blender-v3.5-release'	2023-03-17 21:53:51 +01:00
Brecht Van Lommel	cc6d8cd573	Fix #105442 : Cycles CUDA and HIP host memory fallback not working Transforming the host pointer should not be done in an assert, it only works in debug builds then. Caused by 6dcfb6d.	2023-03-17 21:52:29 +01:00
Campbell Barton	b3625e6bfd	Cleanup: comment blocks	2023-03-09 10:39:49 +11:00
Michael Jones	7842347ec8	Cycles: Fix hanging unit tests when MetalRT is enabled This patch fixes hanging unit tests when MetalRT is enabled. It simplifies and fixes the kernel selection logic by baking the MetalRT-specific options into `kernels_md5` rather than expanding out and testing MetalRT bit flags explicitly. Pull Request #105270	2023-02-28 11:42:08 +01:00
Campbell Barton	a99022e22d	Cleanup: spelling in comments	2023-02-07 14:17:01 +11:00
Nikita Sirgienko	6dcfb6df9c	Cycles: Abstract host memory fallback for GPU devices Host memory fallback in CUDA and HIP devices is almost identical. We remove duplicated code and create a shared generic version that other devices (oneAPI) will be able to use. Reviewed By: brecht Differential Revision: https://developer.blender.org/D17173	2023-02-06 22:19:32 +01:00
Brecht Van Lommel	87f7b630b5	Cleanup: make format	2023-01-05 19:43:19 +01:00
Michael Jones	a7cc6e015c	Cycles: Additional Metal kernel specialisation exposed through UI This patch adds a new "Kernel Optimization Level" dropdown menu to control Metal kernel specialisation. Currently this defaults to "full" optimisation, on the assumption that the changes proposed in D16371 will address usability concerns around app responsiveness and shader cache housekeeping. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16514	2023-01-04 23:36:52 +00:00
Jacques Lucke	2540a52f91	Cleanup: quiet unused parameter warning	2023-01-04 17:30:55 +01:00
Michael Jones	77c3e67d3d	Cycles: Improved render start/stop responsiveness on Metal All kernel specialisation is now performed in the background regardless of kernel type, meaning that the first render will be visible a few seconds sooner. The only exception is during benchmark warm up, in which case we wait for all kernels to be cached. When stopping a render, we call a new `cancel()` method on the device which causes any outstanding compilation work to be cancelled, and we destroy the device in a detached thread so that any stale queued compilations can be safely purged without blocking the UI for longer than necessary. Reviewed By: brecht Differential Revision: https://developer.blender.org/D16371	2023-01-04 16:00:53 +00:00
Weizhen Huang	ee89f213de	Cycles: improve many lights sampling using light tree Uses a light tree to more effectively sample scenes with many lights. This can significantly reduce noise, at the cost of a somewhat longer render time per sample. Light tree sampling is enabled by default. It can be disabled in the Sampling > Lights panel. Scenes using light clamping or ray visibility tricks may render different as these are biased techniques that depend on the sampling strategy. The implementation is currently disabled on AMD HIP. This is planned to be fixed before the release. Implementation by Jeffrey Liu, Weizhen Huang, Alaska and Brecht Van Lommel. Ref T77889	2022-12-05 16:09:03 +01:00
Patrick Mours	a859837cde	Cleanup: Move OptiX denoiser code from device into denoiser class Cycles already treats denoising fairly separate in its code, with a dedicated `Denoiser` base class used to describe denoising behavior. That class has been fully implemented for OIDN (`denoiser_oidn.cpp`), but for OptiX was mostly empty (`denoiser_optix.cpp`) and denoising was instead implemented in the OptiX device. That meant denoising code was split over various files and directories, making it a bit awkward to work with. This patch moves the OptiX denoising implementation into the existing `OptiXDenoiser` class, so that everything is in one place. There are no functional changes, code has been mostly moved as-is. To retain support for potential other denoiser implementations based on a GPU device in the future, the `DeviceDenoiser` base class was kept and slightly extended (and its file renamed to `denoiser_gpu.cpp` to follow similar naming rules as `path_trace_work_*.cpp`). Differential Revision: https://developer.blender.org/D16502	2022-11-15 15:50:01 +01:00
Patrick Mours	e6b38deb9d	Cycles: Add basic support for using OSL with OptiX This patch generalizes the OSL support in Cycles to include GPU device types and adds an implementation for that in the OptiX device. There are some caveats still, including simplified texturing due to lack of OIIO on the GPU and a few missing OSL intrinsics. Note that this is incomplete and missing an update to the OSL library before being enabled! The implementation is already committed now to simplify further development. Maniphest Tasks: T101222 Differential Revision: https://developer.blender.org/D15902	2022-11-09 15:30:21 +01:00
Sebastian Herhoz	75a6d3abf7	Cycles: add Path Guiding on CPU through Intel OpenPGL This adds path guiding features into Cycles by integrating Intel's Open Path Guiding Library. It can be enabled in the Sampling > Path Guiding panel in the render properties. This feature helps reduce noise in scenes where finding a path to light is difficult for regular path tracing. The current implementation supports guiding directional sampling decisions on surfaces, when the material contains a least one diffuse component, and in volumes with isotropic and anisotropic Henyey-Greenstein phase functions. On surfaces, the guided sampling decision is proportional to the product of the incident radiance and the normal-oriented cosine lobe and in volumes it is proportional to the product of the incident radiance and the phase function. The incident radiance field of a scene is learned and updated during rendering after each per-frame rendering iteration/progression. At the moment, path guiding is only supported by the CPU backend. Support for GPU backends will be added in future versions of OpenPGL. Ref T92571 Differential Revision: https://developer.blender.org/D15286	2022-09-27 15:56:32 +02:00
Brecht Van Lommel	011d3c75a7	Cleanup: compiler warning	2022-07-15 15:20:53 +02:00
Michael Jones	da4ef05e4d	Cycles: Apple Silicon optimization to specialize intersection kernels The Metal backend now compiles and caches a second set of kernels which are optimized for scene contents, enabled for Apple Silicon. The implementation supports doing this both for intersection and shading kernels. However this is currently only enabled for intersection kernels that are quick to compile, and already give a good speedup. Enabling this for shading kernels would be faster still, however this also causes a long wait times and would need a good user interface to control this. M1 Max samples per minute (macOS 13.0): PSO_GENERIC PSO_SPECIALIZED_INTERSECT PSO_SPECIALIZED_SHADE barbershop_interior 83.4 89.5 93.7 bmw27 1486.1 1671.0 1825.8 classroom 175.2 196.8 206.3 fishy_cat 674.2 704.3 719.3 junkshop 205.4 212.0 257.7 koro 310.1 336.1 342.8 monster 376.7 418.6 424.1 pabellon 273.5 325.4 339.8 sponza 830.6 929.6 1142.4 victor 86.7 96.4 96.3 wdas_cloud 111.8 112.7 183.1 Code contributed by Jason Fielder, Morteza Mostajabodaveh and Michael Jones Differential Revision: https://developer.blender.org/D14645	2022-07-15 13:40:04 +02:00
Xavier Hallade	a02992f131	Cycles: Add support for rendering on Intel GPUs using oneAPI This patch adds a new Cycles device with similar functionality to the existing GPU devices. Kernel compilation and runtime interaction happen via oneAPI DPC++ compiler and SYCL API. This implementation is primarly focusing on Intel® Arc™ GPUs and other future Intel GPUs. The first supported drivers are 101.1660 on Windows and 22.10.22597 on Linux. The necessary tools for compilation are: - A SYCL compiler such as oneAPI DPC++ compiler or https://github.com/intel/llvm - Intel® oneAPI Level Zero which is used for low level device queries: https://github.com/oneapi-src/level-zero - To optionally generate prebuilt graphics binaries: Intel® Graphics Compiler All are included in Linux precompiled libraries on svn: https://svn.blender.org/svnroot/bf-blender/trunk/lib The same goes for Windows precompiled binaries but for the graphics compiler, available as "Intel® Graphics Offline Compiler for OpenCL™ Code" from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html, for which path can be set as OCLOC_INSTALL_DIR. Being based on the open SYCL standard, this implementation could also be extended to run on other compatible non-Intel hardware in the future. Reviewed By: sergey, brecht Differential Revision: https://developer.blender.org/D15254 Co-authored-by: Nikita Sirgienko <nikita.sirgienko@intel.com> Co-authored-by: Stefan Werner <stefan.werner@intel.com>	2022-06-29 12:58:04 +02:00
Brecht Van Lommel	9cfc7967dd	Cycles: use SPDX license headers * Replace license text in headers with SPDX identifiers. * Remove specific license info from outdated readme.txt, instead leave details to the source files. * Add list of SPDX license identifiers used, and corresponding license texts. * Update copyright dates while we're at it. Ref D14069, T95597	2022-02-11 17:47:34 +01:00
Michael Jones	a44366a642	Cycles: Expose "Use MetalRT" checkbox For curve-heavy scenes, memory consumption regressed when we switched from MetalRT to bvh2. Allow users to opt in to MetalRT to workaround this. Reviewed By: brecht Differential Revision: https://developer.blender.org/D14071	2022-02-10 17:32:46 +00:00
Michael Jones	9558fa5196	Cycles: Metal host-side code This patch adds the Metal host-side code: - Add all core host-side Metal backend files (device_impl, queue, etc) - Add MetalRT BVH setup files - Integrate with Cycles device enumeration code - Revive `path_source_replace_includes` in util/path (required for MSL compilation) This patch also includes a couple of small kernel-side fixes: - Add an implementation of `lgammaf` for Metal [Nemes, Gergő (2010), "New asymptotic expansion for the Gamma function", Archiv der Mathematik](https://users.renyi.hu/~gergonemes/) - include "work_stealing.h" inside the Metal context class because it accesses state now Ref T92212 Reviewed By: brecht Maniphest Tasks: T92212 Differential Revision: https://developer.blender.org/D13423	2021-12-07 15:52:21 +00:00
Thomas Dinges	83a4d51997	Cleanup: Remove unused show_samples() device code in Cycles.	2021-11-17 11:16:48 +01:00
Hans Goudey	9e611c5616	Merge branch 'blender-v3.0-release'	2021-11-05 16:33:08 -05:00
Brecht Van Lommel	97ff37bf54	Cycles: perform CPU film reading in the kernel, to use AVX2 half conversion Adds a bunch of CPU kernel function to process on row of pixels, and use those instead of calling unoptimized implementations. Fixes T92598	2021-11-05 22:04:36 +01:00
Thomas Dinges	5327413b37	Cleanup: Remove Cycles device checks for half float. All supported devices support half float now, so we can remove the check. Differential Revision: https://developer.blender.org/D13021	2021-11-01 10:18:30 +01:00
Brecht Van Lommel	fd25e883e2	Cycles: remove prefix from source code file names Remove prefix of filenames that is the same as the folder name. This used to help when #includes were using individual files, but now they are always relative to the cycles root directory and so the prefixes are redundant. For patches and branches, git merge and rebase should be able to detect the renames and move over code to the right file.	2021-10-26 15:37:04 +02:00
Brian Savery	044a77352f	Cycles: add HIP device support for AMD GPUs NOTE: this feature is not ready for user testing, and not yet enabled in daily builds. It is being merged now for easier collaboration on development. HIP is a heterogenous compute interface allowing C++ code to be executed on GPUs similar to CUDA. It is intended to bring back AMD GPU rendering support on Windows and Linux. https://github.com/ROCm-Developer-Tools/HIP. As of the time of writing, it should compile and run on Linux with existing HIP compilers and driver runtimes. Publicly available compilers and drivers for Windows will come later. See task T91571 for more details on the current status and work remaining to be done. Credits: Sayak Biswas (AMD) Arya Rafii (AMD) Brian Savery (AMD) Differential Revision: https://developer.blender.org/D12578	2021-09-28 19:18:55 +02:00
Brecht Van Lommel	d7f803f522	Fix T91641: crash rendering with 16k environment map in Cycles Protect against integer overflow.	2021-09-23 17:48:16 +02:00
Campbell Barton	4d66cbd140	Cleanup: spelling in comments	2021-09-22 14:54:01 +10:00
Brecht Van Lommel	0803119725	Cycles: merge of cycles-x branch, a major update to the renderer This includes much improved GPU rendering performance, viewport interactivity, new shadow catcher, revamped sampling settings, subsurface scattering anisotropy, new GPU volume sampling, improved PMJ sampling pattern, and more. Some features have also been removed or changed, breaking backwards compatibility. Including the removal of the OpenCL backend, for which alternatives are under development. Release notes and code docs: https://wiki.blender.org/wiki/Reference/Release_Notes/3.0/Cycles https://wiki.blender.org/wiki/Source/Render/Cycles Credits: * Sergey Sharybin * Brecht Van Lommel * Patrick Mours (OptiX backend) * Christophe Hery (subsurface scattering anisotropy) * William Leeson (PMJ sampling pattern) * Alaska (various fixes and tweaks) * Thomas Dinges (various fixes) For the full commit history, see the cycles-x branch. This squashes together all the changes since intermediate changes would often fail building or tests. Ref T87839, T87837, T87836 Fixes T90734, T89353, T80267, T80267, T77185, T69800	2021-09-21 14:55:54 +02:00
Sergey Sharybin	ff51c2e89a	Cleanup: Use named unused arguments in Cycles Device	2021-05-21 11:19:33 +02:00
Brecht Van Lommel	0456223cde	Fix T87793: Cycles OptiX crash hiding objects in viewport render	2021-05-19 18:30:43 +02:00
Brecht Van Lommel	3e472d87a8	Cycles OpenCL: disable AO preview kernels These seem to be causing some stability issues, and really are just not that useful in practice. Compiling them is slow already, so it does not improve the user experience much to show an AO preview if it's not nearly instant.	2021-05-19 18:30:43 +02:00
Brecht Van Lommel	91c44fe885	Cycles: disable NanoVDB for AMD OpenCL It is causing issue with AMD OpenCL drivers, due to a potential driver bug. Ref T84461	2021-03-30 00:00:17 +02:00
Brecht Van Lommel	10d2cbfa36	Fix T84872: OptiX GPU + CPU rendering uses branched path samples Branched path tracing is not supported for OptiX, and it would still use the number of AA samples from there when branched path was enabled by the user earlier but auto disabled and hidden in the UI when using OptiX. Ref D10159	2021-01-20 14:59:23 +01:00
Patrick Mours	bfb6fce659	Cycles: Add CPU+GPU rendering support with OptiX Adds support for building multiple BVH types in order to support using both CPU and OptiX devices for rendering simultaneously. Primitive packing for Embree and OptiX is now standalone, so it only needs to be run once and can be shared between the two. Additionally, BVH building was made a device call, so that each device backend can decide how to perform the building. The multi-device for instance creates a special multi-BVH that holds references to several sub-BVHs, one for each sub-device. Reviewed By: brecht, kevindietrich Differential Revision: https://developer.blender.org/D9718	2020-12-11 13:24:29 +01:00
Brecht Van Lommel	f75b09e7e6	Cycles: abort rendering when --cycles-device not found Rather than just printing a message and falling back to the CPU. For render farms it's better to avoid a potentially slow render on the CPU if the intent was to render on the GPU. Ref T82193, D9086	2020-10-29 16:01:38 +01:00
Stefan Werner	009971ba7a	Cycles: Separate Embree device for each CPU Device. Before, Cycles was using a shared Embree device across all instances. This could result in crashes when viewport rendering and material preview were using Cycles simultaneously. Fixes issue T80042 Maniphest Tasks: T80042 Differential Revision: https://developer.blender.org/D8772	2020-09-01 21:00:55 +02:00
Kévin Dietrich	c82166ffcd	Cycles: move some Scene related methods out of Session This moves `Session::get_requested_device_features`, `Session::load_kernels`, and `Session::update_scene` out of `Session` and into `Scene`, as mentioned in D8544. Reviewed By: brecht Differential Revision: https://developer.blender.org/D8590	2020-08-18 11:50:37 +02:00
Brecht Van Lommel	93791381fe	Cleanup: reduce hardcoded numbers in denoising neighbor tiles code	2020-07-10 17:10:05 +02:00
Brecht Van Lommel	0a3bde6300	Cycles: add denoising settings to the render properties Enabling render and viewport denoising is now both done from the render properties. View layers still can individually be enabled/disabled for denoising and have their own denoising parameters. Note that the denoising engine also affects how denoising data passes are output even if no denoising happens on the render itself, to make the passes compatible with the engine. This includes internal refactoring for how denoising parameters are passed along, trying to avoid code duplication and unclear naming. Ref T76259	2020-06-24 15:17:36 +02:00
Brecht Van Lommel	207338bb58	Cycles: port curve-ray intersection from Embree for use in Cycles GPU This keeps render results compatible for combined CPU + GPU rendering. Peformance and quality primitives is quite different than before. There are now two options: * Rounded Ribbon: render hair as flat ribbon with (fake) rounded normals, for fast rendering. Hair curves are subdivided with a fixed number of user specified subdivisions. This gives relatively good results, especially when used with the Principled Hair BSDF and hair viewed from a typical distance. There are artifacts when viewed closed up, though this was also the case with all previous primitives (but different ones). * 3D Curve: render hair as 3D curve, for accurate results when viewing hair close up. This automatically subdivides the curve until it is smooth. This gives higher quality than any of the previous primitives, but does come at a performance cost and is somewhat slower than our previous Thick curves. The main problem here is performance. For CPU and OpenCL rendering performance seems usually quite close or better for similar quality results. However for CUDA and Optix, performance of 3D curve intersection is problematic, with e.g. 1.45x longer render time in Koro (though there is no equivalent quality and rounded ribbons seem fine for that scene). Any help or ideas to optimize this are welcome. Ref T73778 Depends on D8012 Maniphest Tasks: T73778 Differential Revision: https://developer.blender.org/D8013	2020-06-22 13:28:01 +02:00
Brecht Van Lommel	e50f1ddc65	Cycles: use TBB for task pools and task scheduler No significant performance improvement is expected, but it means we have a single thread pool throughout Blender. And it should make adding more parallellization in the future easier. After previous refactoring commits this is basically a drop-in replacement. One difference is that the task pool had a mechanism for scheduling tasks to the front of the queue to minimize memory usage. TBB has a smarter algorithm to balance depth-first and breadth-first scheduling of tasks and we assume that removes the need to manually provide hints to the scheduler. Fixes T77533	2020-06-22 13:27:37 +02:00

1 2 3 4

167 Commits