Commit Graph

25 Commits

Author SHA1 Message Date
Sergey Sharybin
9815f8a623 Cycles: Cleanup of OpenCL split kernel routines
The idea is to switch from allocating separate buffers for shader data's
structure of arrays to allocating one huge memory block and do some index
trickery to make it accessed as SOA.

This saves quite reasonable amount of lines of code in device_opencl and
also makes it possible to get rid of special declaration of ShaderData
structure.

As a side effect it also makes it easier to experiment with SOA vs. AOS
for split kernel.

Works fine here on NVidia GTX580, Intel CPU amd AMD Fiji cards.

Reviewers: #cycles, brecht, juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1593
2016-01-30 00:23:06 +01:00
Sergey Sharybin
e2161ca854 Cycles: Remove few function arguments needed only for the split kernel
Use KernelGlobals to access all the global arrays for the intermediate
storage instead of passing all this storage things explicitly.

Tested here with Intel OpenCL, NVIDIA GTX580 and AMD Fiji, didn't see
any artifacts, so guess it's all good.

Reviewers: juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1736
2016-01-28 18:59:27 +01:00
Sergey Sharybin
1f273cec00 Cycles: Tweak inline policy for some functions
The goal is to make Experimental kernel closer in performance to the
official kernel, avoiding spills and such.

There should not be big impact on official kernel, own tests showed
few percent performance drop on laptop's GPU. CPU was always the
same speed on AVX, AVX2 and SSE4.1 CPUs i've been testing here.

This seems to be the last essential step before we can get rid of
Experimental kernel and enable SSS officially on GPU without causing
some major performance issues.

Surely some more tweaks are possibly required, but that we can do
for until cows go home anyway.
2016-01-14 14:53:05 +05:00
Thomas Dinges
83e73a2100 Cycles: Refactor how we pass bounce info to light path node.
This commit changes the way how we pass bounce information to the Light
Path node. Instead of manualy copying the bounces into ShaderData, we now
directly pass PathState. This reduces the arguments that we need to pass
around and also makes it easier to extend the feature.

This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similar to the Transparent Depth Output: Replace a
Transmission lightpath after X bounces with another shader, e.g a Diffuse
one. This can be used to avoid black surfaces, due to low amount of max
bounces.

Reviewed by Sergey and Brecht, thanks for some hlp with this.

I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
2016-01-06 23:43:29 +01:00
Thomas Dinges
53eab562b4 Cleanup: Remove some outdated comments related to split kernel. 2015-05-21 20:32:20 +02:00
George Kyriazis
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
Thomas Dinges
b3def11f5b Cycles: Record all possible volume intersections for SSS and camera checks
This replaces sequential ray moving followed with scene intersection with
single BVH traversal, which gives us all possible intersections.

Only implemented for CPU, due to qsort and a bigger memory usage on GPU
which we rather avoid. GPU still uses the regular bvh volume intersection code, while CPU now uses the new code.

This improves render performance for scenes with:
a) Camera inside volume mesh
b) SSS mesh intersecting a volume mesh/domain

In simple volume files (not much geometry) performance is roughly the same
(slightly faster). In files with a lot of geometry, the performance
increase is larger. bmps.blend with a volume shader and camera inside the
mesh, it renders ~10% faster here.

Patch by Sergey and myself.

Differential Revision: https://developer.blender.org/D1264
2015-04-29 23:31:06 +02:00
Thomas Dinges
ee36e75b85 Cleanup: Fix Cycles Apache header.
This was already mixed a bit, but the dot belongs there.
2014-12-25 02:50:24 +01:00
Sergey Sharybin
4832538ad0 Cycles: Keep STACK_MAX_HITS private in kernel_shadow
This way adding record_all for other things becomes easier and doesn't
lead to naming conflicts.
2014-09-26 14:23:48 +06:00
Thomas Dinges
1b5ec32ed9 Cleanup: Avoid some defines for scene_intersect(), related to Min Width. 2014-09-24 11:32:29 +02:00
Sergey Sharybin
0b5fda5678 Condition was inverted in the previous transparent shadows commit
Handbook example what happens when you've got loads of patches
and not double-check stuff before committing.
2014-06-30 17:00:51 +06:00
Sergey Sharybin
cb95544e41 Fix T40836: Cycles volume scattering shader crash
Volume scatter might happen before path termination, so
need to check transparent bounces and consider shadow an
opaque when max transparent bounces are reached.

TODO: CPU code seems to have different branching in conditions
which made me thinking it does different things with volume
attenuation, but from the render results it seems the same
exact things are happening there. Worth looking into making
simplifying code a bit here to improve readability.
2014-06-30 16:24:43 +06:00
Campbell Barton
238a6149af Fix T40289: Cycles leaking memory
error in recent commit
2014-05-21 16:00:20 +10:00
caed2394e2 Fix cycles bug with new transparent shadow code, giving too much volume shadow. 2014-05-15 21:31:58 +02:00
Campbell Barton
dc13969e48 Style cleanup: indentation, braces 2014-05-05 02:19:08 +10:00
61eba8fd06 Fix T39843: cycles memory leak rendering with high transparent depth. 2014-04-25 15:30:12 +02:00
9ab259f55b Cycles: shadow function optimization for transparent shadows (CPU only).
Old algorithm:

Raytrace from one transparent surface to the next step by step. To minimize
overhead in cases where we don't need transparent shadows, we first trace a
regular shadow ray. We check if the hit primitive was potentially transparent,
and only in that case start marching. this gives extra ray cast for the cases
were we do want transparency.

New algorithm:

We trace a single ray. If it hits any opaque surface, or more than a given
number of transparent surfaces is hit, then we consider the geometry to be
entirely blocked. If not, all transparent surfaces will be recorded and we
will shade them one by one to determine how much light is blocked. This all
happens in one scene intersection function.

Recording all hits works well in some cases but may be slower in others. If
we have many semi-transparent hairs, one intersection may be faster because
you'd be reinteresecting the same hairs a lot with each step otherwise. If
however there is mostly binary transparency then we may be recording many
unnecessary intersections when one of the first surfaces blocks all light.

We found that this helps quite nicely in some scenes, on koro.blend this can
give a 50% reduction in render time, on the pabellon barcelona scene and a
forest scene with transparent leaves it was 30%. Some other files rendered
maybe 1% or 2% slower, but this seems a reasonable tradeoff.

Differential Revision: https://developer.blender.org/D473
2014-04-21 19:34:25 +02:00
04a10907dc Code cleanup: remove old closure sampling code Cycles.
This was the original code to get things working on old GPUs, but now it is no
longer in use and various features in fact depend on this to work correctly to
the point that enabling this code is too buggy to be useful.
2014-04-21 16:14:37 +02:00
Carlo Andreacchio
7765b73f6d Cycles: add Transparent Depth output to Light Path node.
This can for example be useful if you want to manually terminate the path at
some point and use a color other than black.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D454
2014-04-21 14:44:36 +02:00
e8b1cfed0a Cycles code refactor: replace magic ~0 values in the code with defines. 2014-03-29 13:03:47 +01:00
889d77e6f6 Cycles Volume Render: heterogeneous (textured) volumes support.
Volumes can now have textured colors and density. There is a Volume Sampling
panel in the Render properties with these settings:

* Step size: distance between volume shader samples when rendering the volume.
  Lower values give more accurate and detailed results but also increased render
  time.
* Max steps: maximum number of steps through the volume before giving up, to
  protect from extremely long render times with big objects or small step sizes.

This is much more compute intensive than homogeneous volume, so when you are not
using a texture you should enable the Homogeneous Volume option in the material
or world for faster rendering.

One important missing feature is that Generated texture coordinates are not yet
working in volumes, and they are the default coordinates for nearly all texture
nodes. So until that works you need to plug in object texture coordinates or a
world space position.

This is work by "storm", Stuart Broadfoot, Thomas Dinges and myself.
2013-12-30 00:04:02 +01:00
fe222643b4 Cycles Volume Render: add volume emission support.
This is done using the existing Emission node and closure (we may add a volume
emission node, not clear yet if it will be needed).

Volume emission only supports indirect light sampling which means it's not very
efficient to make small or far away bright light sources. Using direct light
sampling and MIS would be tricky and probably won't be added anytime soon. Other
renderers don't support this either as far as I know, lamps and ray visibility
tricks may be used instead.
2013-12-28 23:20:53 +01:00
2b39214c4d Cycles Volume Render: add support for overlapping volume objects.
This works pretty much as you would expect, overlapping volume objects gives
a more dense volume. What did change is that world volume shaders are now
active everywhere, they are no longer excluded inside objects.

This may not be desirable and we need to think of better control over this.
In some cases you clearly want it to happen, for example if you are rendering
a fire in a foggy environment. In other cases like the inside of a house you
may not want any fog, but it doesn't seem possible in general for the renderer
to automatically determine what is inside or outside of the house.

This is implemented using a simple fixed size array of shader/object ID pairs,
limited to max 15 overlapping objects. The closures from all shaders are put
into a single closure array, exactly the same as if an add shader was used to
combine them.
2013-12-28 20:12:11 +01:00
e369a5c485 Cycles Volume Render: support for rendering of homogeneous volume with absorption.
This is the simplest possible volume rendering case, constant density inside
the volume and no scattering or emission. My plan is to tweak, verify and commit
more volume rendering effects one by one, doing it all at once makes it
difficult to verify correctness and track down bugs.

Documentation is here:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Materials/Volume

Currently this hooks into path tracing in 3 ways, which should get us pretty
far until we add more advanced light sampling. These 3 hooks are repeated in
the path tracing, branched path tracing and transparent shadow code:

* Determine active volume shader at start of the path
* Change active volume shader on transmission through a surface
* Light attenuation over line segments between camera, surfaces and background

This is work by "storm", Stuart Broadfoot, Thomas Dinges and myself.
2013-12-28 16:57:10 +01:00
133f770ab3 Code cleanup: move shadow_blocked function into separate file. 2013-12-28 16:57:10 +01:00