blender

Author	SHA1	Message	Date
Hristo Gueorguiev	06c051363b	Cycles: split kernel_shadow_blocked to AO & DL parts Reduces memory allocation for split kernel. This allows for faster rendering due to bigger global size, especially when GPU memory is limited. Performance results (R9 290, total render time; before / after / change): BMW 4:37 / 4:34 / -1.1 %; Classroom 14:43 / 14:30 / -1.5 %; Fishy Cat 11:20 / 11:04 / -2.4 %; Koro 12:11 / 12:04 / -1.0 %; Pabellon Barcelona 22:01 / 20:44 / -5.8 %; Pabellon Barcelona(*) 15:32 / 15:09 / -2.5 %. (*) without glossy connected to volume	2017-03-09 17:09:37 +01:00
Hristo Gueorguiev	57e26627c4	Cycles: SSS and Volume rendering in split kernel Decoupled ray marching is not supported yet. Transparent shadows are always enabled for volume rendering. Changes in kernel/bvh and kernel/geom are from Sergey. This simplifies code significantly, and prepares it for record-all transparent shadow function in split kernel.	2017-03-09 17:09:37 +01:00
Sergey Sharybin	97c4c2689f	Cycles: Make it more obvious message which initialization failed	2017-03-08 13:57:21 +01:00
Sergey Sharybin	ecfbfe478b	Cycles: Log which device kernels are being loaded for	2017-03-08 12:33:51 +01:00
Sergey Sharybin	712f7c3640	Cycles: Make it possible to access KernelGlobals from split data initialization function	2017-03-08 11:02:54 +01:00
Sergey Sharybin	ef7c36f5ed	Cycles: Cleanup, remove residue of previous split kernel data This is all in split data state array.	2017-03-08 10:26:29 +01:00
Mai Lavelle	64751552f7	Cycles: Fix indentation	2017-03-08 01:31:32 -05:00
Mai Lavelle	306034790f	Cycles: Calculate size of split state buffer kernel side By calculating the size of the state buffer in the kernel rather than the host less code is needed and the size actually reflects the requested features. Will also be a little faster in some cases because of larger global work size.	2017-03-08 01:31:30 -05:00
Mai Lavelle	997e345bd2	Cycles: Fix crash after failed kernel build Pointers to kernels were uninitialized leading to freeing of random memory addresses. Another reason it would be good to use smart pointers.	2017-03-08 01:31:09 -05:00
Mai Lavelle	18e50927f7	Cycles: Faster building of split kernel Simple change to make it so that only kernels that have been modified are rebuilt. Might only be useful during development.	2017-03-08 01:31:09 -05:00
Mai Lavelle	cd7d5669d1	Cycles: Remove sum_all_radiance kernel This was only needed for the previous implementation of parallel samples. As we don't have that any more it can be removed. Real reason for removal tho is this: `per_sample_output_buffers` was being calculated too small and artifacts resulted. The tile buffer is already the correct size and calculating the size for `per_sample_output_buffers` is a bit difficult with the current layout of the code. As `per_sample_output_buffers` was only needed for `sum_all_radiance`, removing that kernel and writing output to the tile buffer directly fixes the artifacts.	2017-03-08 01:31:07 -05:00
Mai Lavelle	4cf501b835	Cycles: Split path initialization into own kernel This makes it easier to initialize things correctly in the data_init kernel before they are needed by path tracing.	2017-03-08 01:30:43 -05:00
Mai Lavelle	b78e543af9	Cycles: Add names to buffer allocations This is to help debug and track memory usage for generic buffers. We have similar for textures already since those require a name, but for buffers the name is only for debugging purposes.	2017-03-08 01:24:55 -05:00
Mai Lavelle	817873cc83	Cycles: CUDA implementation of split kernel	2017-03-08 01:24:53 -05:00
Mai Lavelle	0892352bfe	Cycles: CPU implementation of split kernel	2017-03-08 00:52:41 -05:00
Sergey Sharybin	a87766416f	Cycles: Report device maximum allocation and detected global size	2017-03-08 00:52:41 -05:00
Mai Lavelle	365a4239c5	Cycles: Workaround for driver hangs Simple workaround for some issues we've been having with AMD drivers hanging and rendering systems unresponsive. Unfortunately this makes things a bit slower, but it's better than having to do hard reboots. Will be removed when drivers have been fixed. Define CYCLES_DISABLE_DRIVER_WORKAROUNDS to disable for testing purposes.	2017-03-08 00:52:41 -05:00
Mai Lavelle	230c00d872	Cycles: OpenCL split kernel refactor This does a few things at once: - Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`. - Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains. - Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall. - Moves kernel logic out of OpenCL kernel files so they can later be used by other device types. - Replaced OpenCL specific APIs with new generic versions - Tiles can now be seen updating during rendering	2017-03-08 00:52:41 -05:00
Mai Lavelle	520b53364c	Cycles: Add OpenCL kernel for zeroing memory buffers Transferring memory to the device was very slow and there's really no need when only zeroing a buffer.	2017-03-08 00:52:41 -05:00
Mai Lavelle	bc652766e8	Cycles: Expose passes size to device tasks This is needed so devices can know the size of a tile buffer before any tiles are acquired.	2017-03-08 00:52:41 -05:00
Mai Lavelle	0f56f7a811	Cycles: Allow device_memory to be used directly This is useful for when there's no host side memory attached to the buffer	2017-03-08 00:52:41 -05:00
Sergey Sharybin	5acac13eb4	Cycles: Fix compilation error on vanilla Ubuntu 16.10 Patch by @swerner, thanks!	2017-02-27 15:22:51 +01:00
Sergey Sharybin	2c30fd83f1	Cycles: Additionally report all OpenCL cflags This way we can control exact spaces and such added to the cflags which is crucial to troubleshoot certain drivers.	2017-02-22 10:06:02 +01:00
Sergey Sharybin	333dc8d60f	Fix T50719: Memory usage won't reset to zero while re-rendering on two video cards Was only visible with Persistent Images option ON.	2017-02-20 11:02:19 +01:00
Aaron Carlisle	e5d8c2a67f	Use new manual URL	2017-01-23 19:10:37 -05:00
lazydodo	5a8b5a0377	Land D2339 by bliblu bli	2016-12-09 08:28:04 -07:00
Lukas Stockner	a2ebc5268f	Cycles: Refactor Progress system to provide better estimates The Progress system in Cycles had two limitations so far: - It just counted tiles, but ignored their size. For example, when rendering a 600x500 image with 512x512 tiles, the right 88x500 tile would count for 50% of the progress, although it only covers 15% of the image. - Scene update time was incorrectly counted as rendering time - therefore, the remaining time started very long and gradually decreased. This patch fixes both problems: First of all, the Progress now has a function to ignore time spans, and that is used to ignore scene update time. The larger change is the tile size: Instead of counting samples per tile, so that the final value is num_samples*num_tiles, the code now counts every sample for every pixel, so that the final value is num_samples*num_pixels. Along with that, some unused variables were removed from the Progress and Session classes. Reviewers: brecht, sergey, #cycles Subscribers: brecht, candreacchio, sergey Differential Revision: https://developer.blender.org/D2214	2016-12-03 05:02:21 +01:00
Sergey Sharybin	9aa8d1bc45	Cycles: Fix strict compilation warnings Should be no functional changes.	2016-11-22 16:39:03 +01:00
Sergey Sharybin	af7343ae22	Cycles: Attempt to fix compilation error on ppc64el There is some define conflict between system headers and clew, so delay include of clew.h as much as possible. This is something which needed to be done in the code before the refactor, hopefully such change will still work.	2016-11-21 13:32:41 +01:00
Lukas Stockner	dd921238d9	Cycles: Refactor Device selection to allow individual GPU compute device selection Previously, it was only possible to choose a single GPU or all of that type (CUDA or OpenCL). Now, a toggle button is displayed for every device. These settings are tied to the PCI Bus ID of the devices, so they're consistent across hardware addition and removal (but not when swapping/moving cards). From the code perspective, the more important change is that now, the compute device properties are stored in the Addon preferences of the Cycles addon, instead of directly in the User Preferences. This allows for a cleaner implementation, removing the Cycles C API functions that were called by the RNA code to specify the enum items. Note that this change is neither backwards- nor forwards-compatible, but since it's only a User Preference no existing files are broken. Reviewers: #cycles, brecht Reviewed By: #cycles, brecht Subscribers: brecht, juicyfruit, mib2berlin, Blendify Differential Revision: https://developer.blender.org/D2338	2016-11-07 03:19:29 +01:00
Martijn Berger	c02cce7b75	cycles, cuDeviceComputeCapability is deprecated as of cuda 5.0	2016-11-04 14:49:54 +01:00
Martijn Berger	4fdf68271c	Cycles standalone, compile fix UINT_MAX is not defined in device_cuda.cpp	2016-11-02 10:56:16 +01:00
Sergey Sharybin	80a6e5beb5	Cycles: Remove explicit std:: from types where possible We have our own abstraction level on top of the STL's implementation. This commit will guarantee our tweaks are used for all cases.	2016-10-24 12:31:11 +02:00
Sergey Sharybin	48997d2e40	Cycles: Cleanup, style	2016-10-24 12:26:12 +02:00
Lukas Stockner	f7ce482385	Cycles: Fix another OpenCL logging issue Previously an error message would be printed whenever the OpenCL build produced output. However, some frameworks seem to print extra information even if the build succeeded, so now the actual returned error is checked as well. When --debug-cycles is activated, the build output will always be printed, otherwise it only gets printed if there was an error.	2016-10-21 02:49:00 +02:00
Lukas Stockner	cd843409d3	Fix T49630: Cycles: Swapped shader and bake kernels The problem here was, as the title says, that the two kernels were swapped. Since shader evaluation is only used for building the sampling map when World MIS is enabled, rendering without it would still work fine, although baking also was broken.	2016-10-17 12:28:01 +02:00
Lukas Stockner	d5dd12e56c	Cycles: Improve OpenCL kernel compilation logging The previous refactor changed the code to use a separate logging mechanism to support multithreaded compilation. However, since that's not supported by any frameworks yet, it just resulted in bad logging behaviour. So, this commit changes the logging to go directly to stdout/stderr once again by default.	2016-10-17 11:51:18 +02:00
Lukas Stockner	9ea71bc674	Cycles: Split device_opencl.cpp into multiple files for easier maintenance There are no user-visible changes, just some internal restructuring. Differential Revision: https://developer.blender.org/D2231	2016-10-09 15:49:50 +02:00
Sergey Sharybin	80837d06de	Cycles: Support earlier tile rendering termination on cancel It will discard the whole tile, but that is still friendlier than an interface that stays fully locked until the tile is fully sampled. Sorry if it causes PITA to merge for the opencl split work, but this issue was bothering a lot when collecting benchmarks.	2016-09-29 16:00:25 +02:00
Sergey Sharybin	333366dbcf	Cycles: Fix typo in shader cancel routines	2016-09-29 15:48:10 +02:00
Sergey Sharybin	2372e67dd6	Cycles: Don't sum up memory usage of all devices together for the stats	2016-09-23 12:43:23 +02:00
Sergey Sharybin	91e0a16f2f	Cycles: Use XDG's .cache folder for cached kernels Basically just moves cached kernels from ~/.config/blender/BLENDER_VERSION to ~/.cache/cycles/kernels. This has following benefits: - Follows XDG specification more closely, not as if it's totally crucial or measurable by users, but still nice. - Prevents unexpected sizes of config folder, makes disk space usage more predictable for users. - Allows to share kernels across multiple Blender versions, which makes debugging easier at times close to release. - "Copy Previous Settings" operator will no longer be copying possibly gigabytes of cached kernels, which used to lead to really nasty disk usage and annoying delays of copying settings. - In the future we can have some smart logic to clear old unused cached kernels. Currently only done for Linux and OSX. Windows still follows old "cache" folder logic, but it's not really important for now because we don't support kernel compilation on this platform yet. Reviewers: dingto, juicyfruit, brecht Reviewed By: brecht Differential Revision: https://developer.blender.org/D2197	2016-09-12 09:39:05 +02:00
Mai Lavelle	76b6c77f2c	Cycles microdisplacement: Allow kernels to be built without patch evaluation Kernels can now be built without patch evaluation when not needed by the scene (Catmull-Clark subdivision not in use), giving a performance boost for some devices.	2016-08-15 11:13:18 -04:00
Thomas Dinges	9d236ac06c	Cycles: Enable half float support (4 channels and 1 channel) on CUDA. Atm OpenEXR half files benefit from this and will use only 1/2 of the memory now. More space for HDRs! Part of my GSoC 2016.	2016-08-11 22:47:53 +02:00
Thomas Dinges	c2a7317d1f	CUDA: We don't support Toolkits < 7.5, update error message.	2016-08-09 11:41:25 +02:00
Sergey Sharybin	29dc04d9bb	Cycles: Report human-readable string of compilation error code It is possible that compilation will fail without giving anything in the log buffer. For these cases giving a tip about error code will be really handy. Patch by @Ilia, thanks!	2016-08-04 12:14:43 +02:00
Sergey Sharybin	b416168d85	Cycles: Cleanup, trailing whitespace	2016-08-02 14:09:34 +02:00
Sergey Sharybin	7b8b16a18c	Cycles: Some cleanup in CUDA device file	2016-08-02 14:09:34 +02:00
Sergey Sharybin	ad48f13099	Cycles: Include NVCC compiler flags into md5 hash This way we can easily switch between toolkits without worrying whether some kernel was compiled with old or new CUDA toolkit. It's also now possible to switch machine architecture and have proper cached kernel detected. Not as if it happens every day, but i did such a bitness switch back in the days :)	2016-08-02 14:09:34 +02:00
Sergey Sharybin	6353ecb996	Cycles: Tweaks to support CUDA 8 toolkit All the changes are mainly giving explicit tips on inlining functions, so they match how inlining worked with previous toolkit. This makes kernels compiled by CUDA 8 render on average at the same speed as previous kernels. Some scenes are somewhat faster, some of them are somewhat slower. But slowdown is within 1% so far. On a positive side it allows us to enable newer generation cards on buildbots (so GTX 10x0 will be officially supported soon).	2016-08-01 15:54:29 +02:00
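
The progress arithmetic described in commit a2ebc5268f (600x500 image, 512x512 tiles) can be checked with a small sketch. This is not the Cycles implementation; the function names `tile_rects`, `progress_by_tiles`, and `progress_by_pixels` are hypothetical and exist only to illustrate the difference between counting tiles and counting pixel samples.

```python
def tile_rects(width, height, tile_w, tile_h):
    """Split an image into tiles, clipping tiles at the right and bottom edges.

    Returns a list of (tile_width, tile_height) pairs in row-major order.
    """
    rects = []
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            rects.append((min(tile_w, width - x), min(tile_h, height - y)))
    return rects


def progress_by_tiles(rects, done):
    """Old-style estimate (assumed): every finished tile counts equally."""
    return len(done) / len(rects)


def progress_by_pixels(rects, done, num_samples):
    """New-style estimate (assumed): count every sample for every pixel."""
    total = num_samples * sum(w * h for w, h in rects)
    finished = num_samples * sum(rects[i][0] * rects[i][1] for i in done)
    return finished / total


if __name__ == "__main__":
    rects = tile_rects(600, 500, 512, 512)
    done = [1]  # only the narrow right-hand tile has finished
    print(rects)
    print(progress_by_tiles(rects, done))
    print(progress_by_pixels(rects, done, num_samples=10))
```

With only the right-hand 88x500 tile finished, the tile-based estimate reports half of the work done, while the pixel-based estimate reports 88/600 of it, roughly the 15% quoted in the commit message.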

396 Commits