Commit Graph

8 Commits

Author SHA1 Message Date
Mai Lavelle
96868a3941 Fix T50888: Numeric overflow in split kernel state buffer size calculation
Overflow led to the state buffer being too small and the split kernel to
get stuck doing nothing forever.
2017-03-11 05:39:28 -05:00
Hristo Gueorguiev
57e26627c4 Cycles: SSS and Volume rendering in split kernel
Decoupled ray marching is not supported yet.

Transparent shadows are always enabled for volume rendering.

Changes in kernel/bvh and kernel/geom are from Sergey.
This simiplifies code significantly, and prepares it for
record-all transparent shadow function in split kernel.
2017-03-09 17:09:37 +01:00
Sergey Sharybin
712f7c3640 Cycles: Make it possible to access KernelGlobals from split data initialization function 2017-03-08 11:02:54 +01:00
Mai Lavelle
fe7cc94dfa Cycles: Fix strict warning about unused variable 2017-03-08 01:31:32 -05:00
Mai Lavelle
306034790f Cycles: Calculate size of split state buffer kernel side
By calculating the size of the state buffer in the kernel rather than the host
less code is needed and the size actually reflects the requested features.

Will also be a little faster in some cases because of larger global work size.
2017-03-08 01:31:30 -05:00
Mai Lavelle
cd7d5669d1 Cycles: Remove sum_all_radiance kernel
This was only needed for the previous implementation of parallel samples. As
we don't have that any more it can be removed.

Real reason for removal tho is this: `per_sample_output_buffers` was being
calculated too small and artifacts resulted. The tile buffer is already
the correct size and calculating the size for `per_sample_output_buffers`
is a bit difficult with the current layout of the code. As
`per_sample_output_buffers` was only needed for `sum_all_radiance`,
removing that kernel and writing output to the tile buffer directly
fixes the artifacts.
2017-03-08 01:31:07 -05:00
Mai Lavelle
817873cc83 Cycles: CUDA implementation of split kernel 2017-03-08 01:24:53 -05:00
Mai Lavelle
230c00d872 Cycles: OpenCL split kernel refactor
This does a few things at once:

- Refactors host side split kernel logic into a new device
  agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and
  allows as many threads as will fit in memory regardless of tile size, which
  can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the
  number of arguments passed to kernels. Means there's less code to deal
  with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by
  other device types.
- Replaced OpenCL specific APIs with new generic versions
- Tiles can now be seen updating during rendering
2017-03-08 00:52:41 -05:00