blender

Author	SHA1	Message	Date
Sergey Sharybin	e7721f5ec8	Cycles: Fix SSS with spatial splits and motion blur	2016-07-25 13:55:03 +02:00
Sergey Sharybin	17e7454263	Cycles: Reduce memory usage by de-duplicating triangle storage There are several internal changes for this: First idea is to make __tri_verts to behave similar to __tri_storage, meaning, __tri_verts array now contains all vertices of all triangles instead of just mesh vertices. This saves some lookup when reading triangle coordinates in functions like triangle_normal(). In order to make it efficient needed to store global triangle offset somewhere. So no __tri_vindex.w contains a global triangle index which can be used to read triangle vertices. Additionally, the order of vertices in that array is aligned with primitives from BVH. This is needed to keep cache as much coherent as possible for BVH traversal. This causes some extra tricks needed to fill the array in and deal with True Displacement but those trickery is fully required to prevent noticeable slowdown. Next idea was to use this __tri_verts instead of __tri_storage in intersection code. Unfortunately, this is quite tricky to do without noticeable speed loss. Mainly this loss is caused by extra lookup happening to access vertex coordinate. Fortunately, tricks here and there (i,e, some types changes to avoid casts which are not really coming for free) reduces those losses to an acceptable level. So now they are within couple of percent only, On a positive site we've achieved: - Few percent of memory save with triangle-only scenes. Actual save in this case is close to size of all vertices. On a more fine-subdivided scenes this benefit might become more obvious. - Huge memory save of hairy scenes. For example, on koro.blend there is about 20% memory save. Similar figure for bunny.blend. This memory save was the main goal of this commit to move forward with Hair BVH which required more memory per BVH node. So while this sounds exciting, this memory optimization will become invisible by upcoming Hair BVH work. But again on a positive side, we can add an option to NOT use Hair BVH and then we'll have same-ish render times as we've got currently but will have this 20% memory benefit on hairy scenes.	2016-07-07 17:25:48 +02:00
Sergey Sharybin	700722f686	Cycles: Cleanup, indent nested preprocessor directives Quite straightforward, main trick is happening in path_source_replace_includes(). Reviewers: brecht, dingto, lukasstockner97, juicyfruit Differential Revision: https://developer.blender.org/D1794	2016-03-25 13:55:42 +01:00
Sergey Sharybin	c93069083e	Cycles: Tweaks for 32bit CUDA binaries Tweak some inline policies. Not totally crazy yet, and in fact we now have one less ifdef statement now.	2016-02-15 19:11:02 +01:00
Sergey Sharybin	8bca34fe32	Cysles: Avoid having ShaderData on the stack This commit introduces a SSS-oriented intersection structure which is replacing old logic of having separate arrays for just intersections and shader data and encapsulates all the data needed for SSS evaluation. This giver a huge stack memory saving on GPU. In own experiments it gave 25% memory usage reduction on GTX560Ti (722MB vs. 946MB). Unfortunately, this gave some performance loss of 20% which only happens on GPU. This is perhaps due to different memory access pattern. Will be solved in the future, hopefully. Famous saying: won in memory - lost in time (which is also valid in other way around).	2015-11-25 13:01:22 +05:00
Sergey Sharybin	3d3d805b64	Cycles: Prepare code for OpenCL camera/motion blur The kernels are now compiling just fine, but there're some issues during rendering. This is still to be investigated.	2015-05-14 18:48:56 +05:00
George Kyriazis	7f4479da42	Cycles: OpenCL kernel split This commit contains all the work related on the AMD megakernel split work which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely someone else which we're forgetting to mention. Currently only AMD cards are enabled for the new split kernel, but it is possible to force split opencl kernel to be used by setting the following environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1. Not all the features are supported yet, and that being said no motion blur, camera blur, SSS and volumetrics for now. Also transparent shadows are disabled on AMD device because of some compiler bug. This kernel is also only implements regular path tracing and supporting branched one will take a bit. Branched path tracing is exposed to the interface still, which is a bit misleading and will be hidden there soon. More feature will be enabled once they're ported to the split kernel and tested. Neither regular CPU nor CUDA has any difference, they're generating the same exact code, which means no regressions/improvements there. Based on the research paper: https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf Here's the documentation: https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit Design discussion of the patch: https://developer.blender.org/T44197 Differential Revision: https://developer.blender.org/D1200	2015-05-09 19:52:40 +05:00
Sergey Sharybin	e2354e64d2	Cycles: Cleanup, spaces around assignment operator Did some bad spacing in recent commits, better to get rid of those so they does not confuse those who're working on sources.	2015-04-07 00:25:54 +05:00
Sergey Sharybin	edb7195f27	Cycles: Bring back distance check in re-intersection From more investigation of the numeric failures in the kernel it appears the check was rather correct. But in theory it;s also needed for the motion triangles.	2015-02-10 19:07:55 +05:00
Campbell Barton	abd38c00f1	Cycles: set hit values in-order	2014-10-11 11:17:08 +02:00
Martijn Berger	25ec0d97f9	make "tri_shader" an int instead of a float tri_shader does no longer need to a float. Reviewers: dingto, sergey Reviewed By: dingto, sergey Subscribers: dingto Projects: #cycles Differential Revision: https://developer.blender.org/D789	2014-09-24 13:34:28 +02:00
Sergey Sharybin	bfaf4f2d0d	Fix T41219: Cycles backface detection doesn't work properly Root of the issue goes back to the on-fly normals commit and the latest fix for it wasn't actually correct. I've mixed two fixes in there. So the idea here goes back to storing negative scaled object flag and flip runtime-calculated normal if this flag is set, which is pretty much the same as the original fix for the issue from me. The issue with motion blur wasn't caused by the rumtime normals patch and it had issues before, because it already did runtime normals calculation. Now made it so motion triangles takes the negative scale flag into account. This actually makes code more clean imo and avoids rather confusing flipping code in mesh.cpp.	2014-08-13 16:35:54 +06:00
Thomas Dinges	49df707496	Cycles: Calculate face normal on the fly. Instead of pre-calculation and storage, we now calculate the face normal during render. This gives a small slowdown (~1%) but decreases memory usage, which is especially important for GPUs, where you have limited VRAM. Part of my GSoC 2014.	2014-06-13 21:59:13 +02:00
Brecht Van Lommel	f8ce417eba	Fix T40320: wrong render layer visibility with cycles deformation motion blur.	2014-05-23 16:11:59 +02:00
Sv. Lockal	e7c2578576	Cycles: avoid 1.0f/(1.0f/x) divisions, which msvc (only) can't optimize. This makes bmw scene in msvc 12 builds 6% faster. It also gives a minor speedup for SSE hair in all compilers.	2014-04-03 22:08:53 +04:00
Brecht Van Lommel	393216a6df	Cycles code refactor: move more code to geom folder, add some comments.	2014-03-29 13:03:48 +01:00
Brecht Van Lommel	e8b1cfed0a	Cycles code refactor: replace magic ~0 values in the code with defines.	2014-03-29 13:03:47 +01:00
Brecht Van Lommel	6020d00990	Cycles: add support for mesh deformation motion blur.	2014-03-29 13:03:47 +01:00

18 Commits