blender

Author	SHA1	Message	Date
Sergey Sharybin	136d7a4f62	Cycles: Only whitelist AMD GPU devices in the OpenCL section Only those ones are priority for now, all the rest are still testable if CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment variables are set.	2015-05-09 23:40:26 +05:00
Sergey Sharybin	2840a5de8f	Cycles: Workaround for AMD compiler crashing building the split kernel It's a bug in the compiler, but it's nice to have a working kernel until that bug is fixed.	2015-05-09 19:56:38 +05:00
George Kyriazis	7f4479da42	Cycles: OpenCL kernel split This commit contains all the work related to the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet: no motion blur, camera blur, SSS or volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel also only implements regular path tracing, and supporting the branched one will take a bit. Branched path tracing is still exposed in the interface, which is a bit misleading, and will be hidden there soon. More features will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200	2015-05-09 19:52:40 +05:00
Sergey Sharybin	f680c1b54a	Cycles: Communicate number of closures and nodes feature set to the device This way device can actually make a decision of how it can optimize the kernel in order to make it most efficient.	2015-05-09 19:28:00 +05:00
Sergey Sharybin	6fc1669679	Cycles: Initial work towards selective nodes support compilation The goal is to be able to compile the kernel with only the nodes which are actually needed to render the current scene, hence improving performance of the kernel. The idea is: - Have a few node groups, starting with a group which contains nodes which are used really often, and then a couple of groups which will be extensions of this one. - Have feature-based node disabling, so it's possible to disable nodes related to features which are not used with the currently used nodes group. This commit only lays down the needed routines for this approach, actual split will happen later after gathering statistics from a bunch of production scenes.	2015-05-09 19:22:16 +05:00
Sergey Sharybin	17c95d0a96	Cycles: Add utility function to count maximum number of closures used by session This will be used by split kernel in order to compile most optimal kernel. Maximum number of closures is actually being cached in the session, so viewport rendering will not trigger kernel re-loading when number of closures goes down.	2015-05-09 19:17:49 +05:00
Sergey Sharybin	5068f7dc01	Cycles: Add utility function to graph to query number of closures used in it Currently unused but will be needed soon for the split kernel work.	2015-05-09 19:13:32 +05:00
Sergey Sharybin	b3299bace0	Cycles: Pass requested tile size to the device via device task This is currently unused but crucial for things like calculating amount of device memory required to deal with the tasks. Maybe not really best place to store it, but consider it good enough for now.	2015-05-09 19:09:07 +05:00
Sergey Sharybin	0e4ddaadd4	Cycles: Change the way how we pass requested capabilities to the device Previously we only had experimental flag passed to device's load_kernel() which was all fine. But since we're going to have some extra parameters passed there, it makes sense to wrap them into a single struct, which will make it easier to pass stuff around.	2015-05-09 19:05:49 +05:00
Sergey Sharybin	d69c80f717	Cycles: Presumably correct workaround for addrspace in camera motion blur	2015-05-09 19:04:19 +05:00
Sergey Sharybin	c9133778cf	Cycles: Add CPU compat headers to some of the OSL implementation files This header was already included in some of the implementation files, and this change is needed for some upcoming changes in the way how kernel_types.h works.	2015-05-09 19:04:16 +05:00
Sergey Sharybin	7eac672e4f	Cycles: Set default closure values to some of the nodes Previously it was only set at compilation time which is all fine but does not let us check which closure the node corresponds to prior to the compilation.	2015-05-09 19:04:09 +05:00
Thomas Dinges	900fc43bb4	Cleanup: Remove unused ray type flags. They were added for completeness, but it seems we don't need them.	2015-05-08 12:10:26 +02:00
Sergey Sharybin	9ca2b76a9f	Cycles: Cleanup, make it more clear what endif closes what ifdef	2015-05-07 15:02:43 +05:00
Campbell Barton	165598e49e	Correct typo: ifdef'd now, but obviously wrong	2015-05-07 10:12:12 +10:00
Sergey Sharybin	b45ad4b214	Cycles: Fix for wrong clamp usage in fast math	2015-05-06 00:01:40 +05:00
Thomas Dinges	d01b226870	Cleanup: Remove leftover from Distorted Noise node in XML reader.	2015-05-05 10:38:45 +02:00
Sv. Lockal	7201f6d14c	Cycles: Use curve approximation for blackbody instead of lookup table Now we calculate color in range 800..12000 using an approximation a/x+bx+c for R and G and ((at + b)t + c)t + d for B. Max absolute error for RGB for non-lut function is less than 0.0001, which is enough to get the same 8 bit/channel color as for OSL with a noticeable performance difference. However there is a slight visible difference between previous non-OSL implementation because of lookup table interpolation and offset-by-one mistake. The previous implementation gave black color outside of soft range (t > 12000), now it gives the same color as for 12000. Also blackbody node without input connected is being converted to value input at shader compile time. Reviewers: dingto, sergey Reviewed By: dingto Subscribers: nutel, brecht, juicyfruit Differential Revision: https://developer.blender.org/D1280	2015-05-05 06:11:54 +00:00
Campbell Barton	e59bd19fa7	Cleanup: style & const's	2015-05-05 05:19:49 +10:00
Thomas Dinges	66f96e555c	Cycles: Fix copy / paste mistake in XML reader.	2015-05-04 14:31:20 +02:00
Sergey Sharybin	b7d0ff0ad6	Separate scene simplification into viewport and render This way it is possible to have viewport simplification bumped all the way up, making viewport really responsive but still have final render to use highest subdivision possible. Reviewers: lukastoenne, campbellbarton, dingto Reviewed By: campbellbarton, dingto Subscribers: dingto, nutel, eyecandy, venomgfx Differential Revision: https://developer.blender.org/D1273	2015-05-04 16:31:10 +05:00
Sergey Sharybin	16794f908f	Cycles: Fix possible uninitialized XML read state which might cause crashes	2015-04-30 15:46:09 +05:00
Sergey Sharybin	41d817f15d	Fix T44548: Cycles Tube Mapping off / not compatible with BI Was a typo in original implementation, probably a result of some code reshuffle happened for optimization reasons.	2015-04-30 14:27:16 +05:00
Thomas Dinges	4eab0e72b3	Cleanup: Update some comments and add ToDo.	2015-04-29 23:56:46 +02:00
Thomas Dinges	b3def11f5b	Cycles: Record all possible volume intersections for SSS and camera checks This replaces sequential ray moving followed with scene intersection with single BVH traversal, which gives us all possible intersections. Only implemented for CPU, due to qsort and a bigger memory usage on GPU which we rather avoid. GPU still uses the regular bvh volume intersection code, while CPU now uses the new code. This improves render performance for scenes with: a) Camera inside volume mesh b) SSS mesh intersecting a volume mesh/domain In simple volume files (not much geometry) performance is roughly the same (slightly faster). In files with a lot of geometry, the performance increase is larger. bmps.blend with a volume shader and camera inside the mesh, it renders ~10% faster here. Patch by Sergey and myself. Differential Revision: https://developer.blender.org/D1264	2015-04-29 23:31:06 +02:00
Sergey Sharybin	7aab5c6ca9	Cycles: Fix wrong termination criteria in SSS volume stack update Another issue spotted with Thomas.	2015-04-30 01:20:17 +05:00
Sergey Sharybin	e5f3193df3	Cycles: Fix wrong order in object flags calculations Object flags are depending on bounding box which is only available after mesh synchronization. This was broken since 7fd4c44 which happened quite close to the release and oddly enough was not spotted by anyone. Render test is coming for this. Was spotted by Thomas Dinges while working on another patch.	2015-04-30 01:09:48 +05:00
Sergey Sharybin	d6b28bbb1d	Cycles: Fix crashes when loading cache created with pre-leaf split builds	2015-04-29 15:48:49 +05:00
Sergey Sharybin	2e91bcfb9d	Fix T44544: Cached BVH is broken since BVH leaf split Still need to solve issues with reading old cache with new builds.	2015-04-29 15:38:07 +05:00
Thomas Dinges	5e423775da	Cleanup: Move Cycles volume stack update for subsurface into kernel_volume.h.	2015-04-28 11:20:27 +02:00
Thomas Dinges	58a2b10a65	Cycles: Initialize portal variable directly, so we can avoid the one NULL check.	2015-04-27 23:12:53 +02:00
Lukas Stockner	f478c2cfbd	Cycles: Added support for light portals This patch adds support for light portals: objects that help sampling the environment light, therefore improving convergence. Using them for other lights in a unidirectional pathtracer is virtually useless. The sampling is done with the area-preserving code already used for area lamps. MIS is used both for combination of different portals and for combining portal- and envmap-sampling. The direction of portals is considered, they aren't used if the sampling point is behind them. Reviewers: sergey, dingto, #cycles Reviewed By: dingto, #cycles Subscribers: Lapineige, nutel, jtheninja, dsisco11, januz, vitorbalbio, candreacchio, TARDISMaker, lichtwerk, ace_dragon, marcog, mib2berlin, Tunge, lopataasdf, lordodin, sergey, dingto Differential Revision: https://developer.blender.org/D1133	2015-04-28 01:30:16 +05:00
Sergey Sharybin	ae7d84dbc1	Cycles: Use native saturate function for CUDA This is more a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1) into a single instruction and uses 4 instructions instead. Original patch by @lockal with own modification: Don't make changes outside of the kernel. They don't make any difference anyway and term saturate() has a bit different meaning outside of kernel. This gives around 2% of speedup in Barcelona file, but in more complex shader setups with lots of math nodes with clamping speedup could be much nicer. Subscribers: dingto Projects: #cycles Differential Revision: https://developer.blender.org/D1224	2015-04-28 00:38:32 +05:00
Thomas Dinges	bc160d8a85	Cleanup: Code style.	2015-04-26 00:42:26 +02:00
Thomas Dinges	8dd055cd47	Cleanup: Update Lookup table comments.	2015-04-26 00:06:38 +02:00
Lukas Stockner	60c5a2f2d2	Cycles: Add Mirror ball mapping to camera panorama options The projection code was already in place, so this just exposes the option. Differential Revision: https://developer.blender.org/D1079	2015-04-25 23:51:56 +02:00
Campbell Barton	b82d571c85	Cleanup: style	2015-04-21 15:53:32 +10:00
Sergey Sharybin	828abaf11c	Cycles: Split BVH nodes storage into inner and leaf nodes This way we can get rid of inefficient memory usage caused by BVH boundbox part being unused by leaf nodes but still being allocated for them. Doing such split allows to save 6 of float4 values for QBVH per leaf node and 3 of float4 values for regular BVH per leaf node. This translates into following memory save using 01.01.01.G rendered without hair (device memory size / device memory peak / global memory peak): before the patch: 4957 / 5051 / 7668; with the patch: 4467 / 4562 / 7332. The measurements are done against current master. Still need to run speed tests and it's hard to predict if it's faster or not: on the one hand leaf nodes are now much more coherent in cache, on the other hand they're not so much coherent with regular nodes anymore. Reviewers: brecht, juicyfruit Subscribers: venomgfx, eyecandy Differential Revision: https://developer.blender.org/D1236	2015-04-20 17:29:51 +05:00
Sergey Sharybin	cd44449578	Cycles: Synchronize images after building mesh BVH This way memory overhead caused by the BVH building is not so visible and peak memory usage will be reduced. Implementing this idea is not so straightforward actually, because we need to synchronize images used for true displacement before meshes. Detecting whether image is used for true displacement is not so straightforward, so for now all displacement types will synchronize images used for them. Such change brings memory usage from 4.1G to 4.0G with the 01_01_01_D scene from gooseberry. With 01_01_01_G scene it's 7.6G vs. 6.8G (before and after the patch). Reviewers: campbellbarton, juicyfruit, brecht Subscribers: eyecandy Differential Revision: https://developer.blender.org/D1217	2015-04-20 17:29:51 +05:00
Dalai Felinto	394c5318c6	Bake-API: reduce memory footprint when baking more than one object (Fix T41092) Combine all the highpoly pixel arrays into a single array with a lookup object_id for each of the highpoly objects. Note: This changes the Bake API, external engines should refer to the bake_api.c for the latest API. Many thanks to Sergey Sharybin for the complete review, changes suggestion and feedback. (you rock!) Reviewers: sergey Subscribers: pildanovak, marcclintdion, monio, metalliandy, brecht Maniphest Tasks: T41092 Differential Revision: https://developer.blender.org/D772	2015-04-17 12:25:37 -03:00
Campbell Barton	d1f9fcaabc	Cleanup: style	2015-04-13 22:08:51 +10:00
Sergey Sharybin	35812e65f4	Cycles: Fix compilation error on windows after recent logging changes	2015-04-10 22:35:10 +05:00
Sergey Sharybin	aac0df956f	Cycles: Cleanup, make more clear what camera utility functions are private/public	2015-04-10 16:25:35 +05:00
Sergey Sharybin	e073562f80	Cycles: Make transform from viewplane a generic utility function	2015-04-10 15:53:14 +05:00
Sergey Sharybin	2f5dd83759	Cycles: Add some statistics logging Covers number of entities in the scene (objects, meshes etc), also reports sizes of textures being allocated.	2015-04-10 15:37:49 +05:00
Sergey Sharybin	7ea4163e1e	Cycles: Fix BVH counter on mesh updates	2015-04-09 22:23:59 +05:00
Sergey Sharybin	cca4405437	Cycles: Fix wrong render result in certain configuration of render layer's surface/hair There were some synchronization missing in cases when only one of those settings was disabled. Also added a render test for such configurations now.	2015-04-09 21:22:48 +05:00
Sergey Sharybin	bf11e362c5	Fix T44046: Cycles speed regression in 2.74 (CPU only) Issue was caused by MSVC not being able to optimize some code out in the same way as GCC/Clang does, so now those parts of code are explicitly unfolded in order to help compilers out. This makes speed loss much less drastic on my laptop. That's probably as good as we can do with MSVC without investing infinite amount of time trying to work around the optimizer.	2015-04-08 18:47:25 +05:00
Sergey Sharybin	7621ff7e55	Cycles: Code cleanup, indentation. Was wrong in the multiview commit	2015-04-08 15:35:01 +05:00
Sergey Sharybin	09a746b857	Cycles: Cleanup, typos	2015-04-08 01:15:38 +05:00
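
The memory figures quoted in commit 828abaf11c (split of BVH storage into inner and leaf nodes) can be turned into absolute and relative savings with a few lines of arithmetic. The sketch below is only an illustration and is not part of the Blender source: the names `before`, `after`, `savings` and `leaf_bytes_saved` are made up here, the unit of the quoted figures is not stated in the log so none is assumed, and a float4 is taken to be four 32-bit floats (16 bytes).

```python
# Figures copied from the message of commit 828abaf11c.
# The log does not state the unit, so the values are treated as plain numbers.
before = {
    "device memory size": 4957,
    "device memory peak": 5051,
    "global memory peak": 7668,
}
after = {
    "device memory size": 4467,
    "device memory peak": 4562,
    "global memory peak": 7332,
}

# Assumption: a float4 holds four 32-bit floats.
FLOAT4_BYTES = 4 * 4


def savings(before, after):
    """Return {metric: (absolute saving, saving in percent of 'before')}."""
    result = {}
    for name, old in before.items():
        diff = old - after[name]
        result[name] = (diff, 100.0 * diff / old)
    return result


def leaf_bytes_saved(float4_count):
    """Bytes saved per leaf node when float4_count float4 values are dropped."""
    return float4_count * FLOAT4_BYTES


if __name__ == "__main__":
    for name, (diff, percent) in savings(before, after).items():
        print(f"{name}: saved {diff} ({percent:.1f}%)")
    print("QBVH bytes saved per leaf node:", leaf_bytes_saved(6))
    print("Regular BVH bytes saved per leaf node:", leaf_bytes_saved(3))
```

The per-leaf counts (6 float4 values for QBVH, 3 for regular BVH) come from the same commit message; everything else is derived from them.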

2391 Commits