Commit Graph

13 Commits

Author SHA1 Message Date
Sergey Sharybin
298d8681a0 Fix T43596: Refraction BSDF crashes blender on pre-sse4 CPU
This is the same issue T43475: SSE4 code is more robust to non-finite values
in the ray origin/direction. So for now added a check before doing BVH traversal
for pre-SSE4 CPUs.

For sure actual root of the issue is a bit different and much more tricky to
solve, especially without disturbing render results too much. Still looking
into this.

In any case, it's kinda fine to have such a check, we might later make it to be
a kernel_assert() instead of just a return.
2015-02-10 17:36:05 +05:00
Sergey Sharybin
b83d851901 Cycles: Another attempt to solve 32bit CUDA kernel
Previous fix didn't quite work well. For some reason everything worked fine when
using native nvcc in 32bit environment, but cross-compiling from 64bit platform
it was still running out of memory.

For now just made it so all the kernels are slower on 32bit CUDA as a temporary
solution. Either it'll be solved in next CUDA releases (by dropped 32bit? =\) or
we'll find better workaround.
2015-02-09 16:14:44 +05:00
Sergey Sharybin
da06dab4e5 Cycles: Use pre-aligned triangle vertex coordinates for subsurface intersection
This gives small speedup (around 2% in quick tests) for ray scattering.
2015-02-04 14:49:19 +05:00
Sergey Sharybin
432e478f43 Cycles: Further tweaks to T43511 to solve compilation error on 32bit platforms 2015-02-02 22:09:02 +05:00
Sergey Sharybin
31263192bb Fix T43511: Major slow down with many instanced objects in cycles GPU
Slowdown was caused by watertight intersection commit and follow-up workaorund
for compiler crash which uninlined utility function which rotates the ray.

Now it's only uninlined for sm_50 and sm_52 experimental kernels which are the
only ones which failed to compile.

Rendering still might be a bit slower but at least shouldn't be that dramatic.
2015-02-02 17:35:57 +05:00
Sergey Sharybin
3f5771475d Cycles: Don't perform re-intersection if ray distance is zero
It is possible that ray distance will be zero which would make intersection
refinement return NaN as the refined position which would later lead to all
sort of mathematical issues.

Don't think there are ways to improve intersection accuracy for such rays
so just return original intersection coordinate.

This should fix T43475.

TODO: Need to look into possible issues in Ashikhmin BSDF which might return
zero-length reflected/transmitted ray?
2015-01-31 01:49:48 +05:00
Sergey Sharybin
2a8a56929b Cycles: Fix unneeded int/float conversion happened in previous commit 2015-01-02 17:21:24 +05:00
Sergey Sharybin
4f2583ee13 Fix T43027: OpenCL kernel compilation broken after QBVH
OpenCL apparently does not support templates, so the idea of generic
function for swapping is a bit of a failure. Now it is either inlined
into the code (in triangle intersection) or has specific implementation
for QBVH.

This is probably even better, because we can't create QBVH-specific
function in util_math anyway.
2015-01-02 14:58:01 +05:00
Sergey Sharybin
fe06ec82a9 Cycles: Workaround CUDA 6.5.16 error after watertight commit
This issue doesn't happen with 6.5.12 and there's slight piece of hope it'll be
fixed in next toolkit releases..

For now we're forcing CUDA to not inline ray precalculation. This could lead to
some speed regression, but wouldn't expect it to be huge -- this code does not
run that often comparing to actual triangle intersection.
2014-12-25 14:15:37 +05:00
Thomas Dinges
4ab821c675 Cleanup: Typo fixes for comments. 2014-12-25 02:42:06 +01:00
Sergey Sharybin
ab8d9c4b88 Cycles: Add some utility functions and structures
Most of them are not currently used but are essential for the further work.

- CPU kernels with SSE2 support will now have sse3b, sse3f and sse3i

- Added templatedversions of min4, max4 which are handy to use with register
  variables.

- Added util_swap function which gets arguments by pointers.
  So hopefully it'll be a portable version of std::swap.
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f770bc4757 Cycles: Implement watertight ray/triangle intersection
Using this paper: Sven Woop, Watertight Ray/Triangle Intersection

  http://jcgt.org/published/0002/01/05/paper.pdf

This change is expected to address quite reasonable amount of reports from the
bug tracker, plus it might help reducing the noise in some scenes.

Unfortunately, it's currently about 7% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (tests reshuffle, using SIMD in a nice way or so) we can avoid the speed
regression.

But perhaps smartest thing to do here would be to change single triangle / ray
intersection with multiple triangles / ray intersections. That's how Embree does
this and it's watertight single ray intersection is not any faster that this.

Currently only triangle intersection is modified accordingly to the paper, in
the future we would also want to modify the node / ray intersection.

Reviewers: brecht, juicyfruit

Subscribers: dingto, ton

Differential Revision: https://developer.blender.org/D819
2014-12-25 02:50:49 +05:00
Sergey Sharybin
f4df3ec05a Cycles: Move triangle intersection functions into own file
This way extending intersection routines with some pre-calculation step wouldn't
explode the single file size, hopefully keeping them all in a nice maintainable
state.
2014-12-25 02:50:48 +05:00