Commit Graph

75 Commits

Author SHA1 Message Date
Sergey Sharybin
03cb146afa Fix T43496: Infinite loop in kernel when using surface attribute for volume
The issue was caused bu the optimization in surface attributes for cases when
there's only a volume shader used. Some attributes doesn't make sense in that
case and were skipped from calculation.

However, it is possible that kernel would still try to access them (because of
the shader setup etc). Prevented an infinite loop in the kernel now, which
should not have much affect on regular renders.
2015-01-31 14:39:19 +05:00
Sergey Sharybin
3f5771475d Cycles: Don't perform re-intersection if ray distance is zero
It is possible that ray distance will be zero which would make intersection
refinement return NaN as the refined position which would later lead to all
sort of mathematical issues.

Don't think there are ways to improve intersection accuracy for such rays
so just return original intersection coordinate.

This should fix T43475.

TODO: Need to look into possible issues in Ashikhmin BSDF which might return
zero-length reflected/transmitted ray?
2015-01-31 01:49:48 +05:00
Sergey Sharybin
09ac6cae09 Cycles: Cleanup and optimization comment update 2015-01-17 00:15:47 +05:00
Sergey Sharybin
5719ed1225 Cycles: Add leaf primitives sanity check asserts to the kernel
This way we'll notice that leaf splitting didn't happen correct pretty easily
in debug builds.

There'll be absolutely no impact on release builds.
2015-01-12 15:05:14 +05:00
Sergey Sharybin
bc7ff3c2b4 Cycles: Enable leaf split by primitive type and adopt BVH traversal for this
This commit enables BVH leaf nodes split by the primitive type and makes it
so BVH traversal code is now aware and benefits from this.

As was mentioned in original commit, this change is crucial to be able to do
single ray to multiple triangle intersection. But it also appears to give
barely visible speedup in some scene.

In any case there should be no noticeable slowdown, and this change is what
we need to have anyway.
2015-01-12 15:04:52 +05:00
Sergey Sharybin
2a8a56929b Cycles: Fix unneeded int/float conversion happened in previous commit 2015-01-02 17:21:24 +05:00
Sergey Sharybin
4f2583ee13 Fix T43027: OpenCL kernel compilation broken after QBVH
OpenCL apparently does not support templates, so the idea of generic
function for swapping is a bit of a failure. Now it is either inlined
into the code (in triangle intersection) or has specific implementation
for QBVH.

This is probably even better, because we can't create QBVH-specific
function in util_math anyway.
2015-01-02 14:58:01 +05:00
Sergey Sharybin
7778f0ff20 Cycles: Fix MSVC which desn't like condition to be split by preprocessor 2014-12-29 21:10:37 +05:00
Sergey Sharybin
4088fad6dd Cycles: Add asserts around BVH stack pushes
This way we're kind of safer to troubleshoot possible stack overflow issues.
2014-12-29 14:02:15 +05:00
Sergey Sharybin
40517283ca Cycles: Bump stack size for QBVH traversal code
Traversal now can push up to 2x of nodes to the stack, so need some tweaks
to the stack size.
2014-12-29 13:37:18 +05:00
Sergey Sharybin
9c4aba11c9 Cycles: Add some sanity check asserts in the traversal code
This way we'll be sure (in debug builds) that regular BVH traversal is not used
for QBVH tree (could happen because of mismatch of logic in kernel and render).
2014-12-29 13:35:31 +05:00
Sergey Sharybin
91bbaaa271 Cycles: Fix visibility check for instanced nodes
The issue is that only instance node contains proper visibility flags,
nodes from instanced BVH are not correct.
2014-12-27 23:33:50 +05:00
Sergey Sharybin
cd095aae13 Cycles: Distance optimization for QBVH
This commit implements heuristic which allows to skip nodes pushed to the stack
from intersection if distance to them is larger than the distance to the current
intersection.

This should solve speed regression which i didn't notice in the original QBVH
commit (which could have because i had WIP version of this patch applied in my
local branch).

From quick tests speed seems to be much closer to what is was with regular BVH.

There's still some possible code cleanup, but they'll need a bit of assembly
code check and now i want to make it so artists can happily use Cycles over the
holidays.
2014-12-25 22:40:02 +05:00
Sergey Sharybin
9e57babd8d Cycles: Fix really bad bug with shadow rays on non-SSE CPUs
basically shadow rays were totally broken and most of the time did not record
any intersections, leading to really ad rendering artifacts.

This commit makes it so regardless of enabled optimization level render result
would be the same.
2014-12-25 14:30:05 +05:00
Sergey Sharybin
fe06ec82a9 Cycles: Workaround CUDA 6.5.16 error after watertight commit
This issue doesn't happen with 6.5.12 and there's slight piece of hope it'll be
fixed in next toolkit releases..

For now we're forcing CUDA to not inline ray precalculation. This could lead to
some speed regression, but wouldn't expect it to be huge -- this code does not
run that often comparing to actual triangle intersection.
2014-12-25 14:15:37 +05:00
Thomas Dinges
ee36e75b85 Cleanup: Fix Cycles Apache header.
This was already mixed a bit, but the dot belongs there.
2014-12-25 02:50:24 +01:00
Thomas Dinges
4ab821c675 Cleanup: Typo fixes for comments. 2014-12-25 02:42:06 +01:00
Sergey Sharybin
03f28553ff Cycles: Implement QBVH tree traversal
This commit implements traversal for QBVH tree, which is based on the old loop
code for traversal itself and Embree for node intersection.

This commit also does some changes to the loop inspired by Embree:

- Visibility flags are only checked for primitives.

  Doing visibility check for every node cost quite reasonable amount of time
  and in most cases those checks are true-positive.

  Other idea here would be to do visibility checks for leaf nodes only, but
  this would need to be investigated further.

- For minimum hair width we extend all the nodes' bounding boxes.

  Again doing curve visibility check is quite costly for each of the nodes and
  those checks returns truth for most of the hierarchy anyway.

There are number of possible optimization still, but current state is good
enough in terms it makes rendering faster a little bit after recent watertight
commit.

Currently QBVH is only implemented for CPU with SSE2 support at least. All
other devices would need to be supported later (if that'd make sense from
performance point of view).

The code is enabled for compilation in kernel. but blender wouldn't use it
still.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
30b12b1b27 Cycles: Code cleanup, de-duplicate definition of FEATURE
Previously every BVH traversal file was defining macro to check which features
should be compiled in, now this macro is defined in the parent header.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
0476e2c87a Cycles: Rework BVH functions calls a little bit
Basic idea is to allow multiple implementation per feature-set, meaning this
commit tries to make it easier to hook new algorithms for BVH traversal.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
ab8d9c4b88 Cycles: Add some utility functions and structures
Most of them are not currently used but are essential for the further work.

- CPU kernels with SSE2 support will now have sse3b, sse3f and sse3i

- Added templatedversions of min4, max4 which are handy to use with register
  variables.

- Added util_swap function which gets arguments by pointers.
  So hopefully it'll be a portable version of std::swap.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f770bc4757 Cycles: Implement watertight ray/triangle intersection
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection

  http://jcgt.org/published/0002/01/05/paper.pdf

This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.

Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.

But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and it's watertight single ray intersection is not any faster that this.

Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.

Reviewers: brecht, juicyfruit

Subscribers: dingto, ton

Differential Revision: https://developer.blender.org/D819
2014-12-25 02:50:49 +05:00
Sergey Sharybin
a888b8beaf Cycles; Code cleanup, make it more obvious what #endif belongs to 2014-12-25 02:50:49 +05:00
Sergey Sharybin
144096faad Cycles: Make it more clear offsets in BVH construction
Previously offsets were calculated based on the BVH node size,
which is wrong and real PITA in cases when some extra data is
to be added into (or removed from) the node.

Now use offsets which are not calculated form the node size.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
345ed4dd10 Cycles: Don't do node visibility check in subsurface and volume traversal
Visibility flags are set to all visibility anyway, So there was no reason
to perform that test.

TODO: We need to investigate if having primitive intersection functions
which doesn't do visibility check gives any speedup here as well.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f4df3ec05a Cycles: Move triangle intersection functions into own file
This way extending intersection routines with some pre-calculation step wouldn't
explode the single file size, hopefully keeping them all in a nice maintainable
state.
2014-12-25 02:50:48 +05:00
Sergey Sharybin
d06b1a5d8b Cycles: Missed some changes in the previous hair motion blur fix
So now cases when object has both hair motion blur and deformation motion blur
vector pass is all correct.

We could get rid of the flag in the future, still need to look deeper into all
the areas trying to find a more clear solution.
2014-11-19 02:42:22 +05:00
Sergey Sharybin
729dc98be1 Fix T42475: Vector motion blur on hair
Issue was caused by mismatch in pre/post transform matrix spaces for mesh and
curve vectors. This happened because of current way how static transform apply
works: it only stores post/pre in the world space if there's triangle motion
exists. This lead to situation when there's no triangle motion happening but
was hair motion happening.

After long time of trying to solve it in a nice way, ended up solving it in
a bit slow way -- pre/post transform is still storing in the same spaces as
they used to be stored and just convert hair pre/post position to a world
space in the kernel.

This is because currently it's not so clear how to deal with cases when curve
and mesh motion needs different space of pre/post transform (which happens in
cases when only one of the motions exists).

Would think of some magic, and meanwhile artists could be happy with proper
render results.
2014-11-18 15:05:15 +01:00
Sergey Sharybin
4a9b912b96 Fix T42411: Camera inside volume + particle dupli (object/group) doesn't work
The issue was caused by missing current object instance initialization after
object was ignored for instance push.
2014-11-04 19:55:05 +01:00
Sergey Sharybin
d2d1b19170 Cycles: Expose volume voxel data interpolation to the interface
It is per-material setting which could be found under the Volume settings
in the material and world context buttons.

There could still be some code-wise improvements, like using variable-size
macro for interp3d instead of having interp3d_ex to which you can pass the
interpolation method.
2014-10-22 19:53:06 +06:00
Campbell Barton
abd38c00f1 Cycles: set hit values in-order 2014-10-11 11:17:08 +02:00
Sergey Sharybin
45ce901079 Cycles: Remove redundant float4->float3 conversion
Not as if it gives noticeable changes render-time, but it's just weird to
convert float4 to float 3 to just access individual x/y/z components.

Plus some compilers might be more stupid than GCC and don't optimize this
out well.
2014-10-09 11:48:47 +02:00
Thomas Dinges
dde740bcd7 Cycles / CUDA: Change inline rules for BVH intersection functions.
* On sm_30 and above there is no change (was not inlined already before), this just fixes a speed regression from yesterday. 6359c36ba407
* On sm_2x (tested with sm_21), I get a nice 8% speedup in the bmw scene with this. As a bonus, cubin compilation time and memory usage is significantly reduced. Regular cubin size went from 2.5MB to 2.0MB, Experimental one from 3.8MB to 2.5MB.
2014-10-05 03:53:51 +02:00
Sergey Sharybin
15969e8a30 Cycles: Fix wrong ifdef check around shadows record all 2014-10-04 16:21:05 +02:00
Sergey Sharybin
27d660ad20 Cycles: Add support for debug passes
Currently only summed number of traversal steps and intersections used by the
camera ray intersection pass is implemented, but in the future we will support
more debug passes which would help checking what things makes the scene slow.
Example of such extra passes could be number of bounces, time spent on the
shader tree evaluation and so.

Implementation from the Cycles side is pretty much straightforward, could only
mention here that it's a build-time option disabled by default.

From the blender side it's implemented as a PASS_DEBUG with several subtypes
possible. This way we don't need to create an extra DNA pass type for each of
the debug passes, saving us a bits.

Reviewers: campbellbarton

Reviewed By: campbellbarton

Differential Revision: https://developer.blender.org/D813
2014-10-04 19:00:26 +06:00
Thomas Dinges
6359c36ba4 Cycles: Remove a workaround for Titan GPUs, not needed anymore with the latest CUDA compiler. 2014-10-04 01:29:08 +02:00
Thomas Dinges
cdbac018a2 Cycles, some tweaks to scene_intersect_shadow_all()
* Function returns a bool, not an uint.
* Remove GPU ifdefs, this is CPU only due to malloc / qsort.
2014-10-03 20:41:38 +02:00
Thomas Dinges
dc1ca0c94f Cycles: Fix OpenCL compile after new Volume BVH introduction and add some comments. 2014-10-03 17:23:45 +02:00
Sergey Sharybin
7dabfb2048 Cycles: Speedup of kernel side camera-in-volume detection
The idea is to only count intersections with objects which has volumetric shader
and ignore all other objects.

This is probably as fast as we can go without involving some forth level magic.
2014-10-03 12:55:31 +06:00
Martijn Berger
25ec0d97f9 make "tri_shader" an int instead of a float
tri_shader does no longer need to a float.

Reviewers: dingto, sergey

Reviewed By: dingto, sergey

Subscribers: dingto

Projects: #cycles

Differential Revision: https://developer.blender.org/D789
2014-09-24 13:34:28 +02:00
Thomas Dinges
1b5ec32ed9 Cleanup: Avoid some defines for scene_intersect(), related to Min Width. 2014-09-24 11:32:29 +02:00
Sergey Sharybin
c256072e91 Cycles: Correction to previous commit -- forgot to take instancing into account 2014-08-14 11:48:50 +06:00
Sergey Sharybin
bfaf4f2d0d Fix T41219: Cycles backface detection doesn't work properly
Root of the issue goes back to the on-fly normals commit and the
latest fix for it wasn't actually correct. I've mixed two fixes
in there.

So the idea here goes back to storing negative scaled object flag
and flip runtime-calculated normal if this flag is set, which is
pretty much the same as the original fix for the issue from me.

The issue with motion blur wasn't caused by the rumtime normals
patch and it had issues before, because it already did runtime
normals calculation. Now made it so motion triangles takes the
negative scale flag into account.

This actually makes code more clean imo and avoids rather confusing
flipping code in mesh.cpp.
2014-08-13 16:35:54 +06:00
Campbell Barton
9c3025cd26 Spelling 2014-08-02 16:53:52 +10:00
Sergey Sharybin
34937f6547 Fix T41139: Cycles Hair BSDF roughness problem 2014-07-27 19:51:28 +06:00
Sergey Sharybin
eb8f85d8be Fix T41116: Motion Blur causes random black surfaces on rigged models
Fix T41115: Motion Blur renders Objects Black - But not in Viewport Preview

This actually extends previous fix to normals and makes it all much nicer now.

Worth doing some intense testing, quick one worked just fine but there always
could be some corner cases.
2014-07-23 18:01:35 +06:00
Sergey Sharybin
9fcaac5009 Fix T41147: Static BVH shading problem
Fix T41079: Solid black render of object with negative scale and smooth shading

In both cases the issue was caused by negative scaled objects with single mesh
users for which scale gets applied when using static BVH.

Since the on-fly normals calculation land normals for such cases weren't flipped
leading them to point to a wrong direction.

Added a special object flag for this, which is a bit of a bummer because now
we've got less bits for real useful things, but this is the only way to get
proper normals without adding more complexity in the on-fly calculations.
2014-07-23 13:00:52 +06:00
Thomas Dinges
5aec61f849 Cycles: Compile fixes for CUDA Volumetrics.
* CUDA can be compiled with Volume support again, change line 78 kernel_types.h for that.

Volumes are still fragile on GPU though, got some Memory/Address CUDA errors in tests.. needs to be investigated more deeply.
2014-07-05 02:04:07 +02:00
Thomas Dinges
0ce3a755f8 Cycles: Add support for uchar4 attributes.
* Added support for uchar4 attributes to Cycles' attribute system.
* This is used for Vertex Colors now, which saves some memory (4 unsigned characters, instead of 4 floats).
* GPU Texture Limit on sm_20 and sm_21 decreased from 95 to 94, because we need a new texture for the uchar4 attributes. This is no problem for sm_30 or newer.

Part of my GSoC 2014.
2014-06-13 23:40:54 +02:00
Thomas Dinges
49df707496 Cycles: Calculate face normal on the fly.
Instead of pre-calculation and storage, we now calculate the face normal during render.
This gives a small slowdown (~1%) but decreases memory usage, which is especially important for GPUs,
where you have limited VRAM.

Part of my GSoC 2014.
2014-06-13 21:59:13 +02:00