6671 Commits

Author SHA1 Message Date
Lukas Stockner
d544a61e8a Cycles: Update remaining time once per second without waiting for a tile change
Previously, the code would only update the status string if the main status changed.
However, the main status did not include the remaining time, and therefore it wasn't updated until the number of rendered tiles (which is part of the main status) changed.

This commit therefore makes the BlenderSession remember the time of the last status update and forces a status update if the last one was more than a second ago.

Reviewers: sergey

Differential Revision: https://developer.blender.org/D2465
2017-03-20 15:28:36 +01:00
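The throttling described in d544a61e8a can be sketched in a few lines. This is a minimal Python illustration, not the BlenderSession code: the class name `StatusReporter`, its `maybe_update` method, and the injectable clock are assumptions made for the example.

```python
import time


class StatusReporter:
    """Hypothetical stand-in for the object that owns the status line."""

    def __init__(self, clock=time.monotonic, min_interval=1.0):
        self._clock = clock
        self._min_interval = min_interval
        self._last_status = None
        self._last_update_time = None

    def maybe_update(self, status):
        """Return True when the status line should be refreshed."""
        now = self._clock()
        changed = status != self._last_status
        # Force a refresh if the last one was more than min_interval ago,
        # even though the main status text itself did not change.
        stale = (
            self._last_update_time is None
            or now - self._last_update_time > self._min_interval
        )
        if changed or stale:
            self._last_status = status
            self._last_update_time = now
            return True
        return False
```

Passing the clock in as a parameter keeps the sketch testable without sleeping; a real session would presumably read the system time directly.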
Sergey Sharybin
a201b99c5a Fix T50975: Cycles: Light sampling threshold inadvertently clamps negative lamps 2017-03-20 14:48:55 +01:00
Sergey Sharybin
18bf900b31 Fix T50990: Random black pixels in Cycles when rendering material with Multiscatter GGX 2017-03-20 12:07:41 +01:00
Campbell Barton
eaf88f564c Remove register_module use in Cycles 2017-03-20 12:16:51 +11:00
Sergey Sharybin
ea3d7a7f58 Fix T50968: Cycles crashes when image datablock points to a directory
See more details about the root cause here:

  https://github.com/OpenImageIO/oiio/pull/1640
2017-03-17 14:47:12 +01:00
Sergey Sharybin
d6b4fb6429 Cycles: Fix mistake in previous split kernel commits
Own stupid mistake. Reported by nirved in IRC, thanks!
2017-03-17 11:55:59 +01:00
Sergey Sharybin
a58350b07f Cycles: Cleanup, indentation 2017-03-17 10:25:37 +01:00
Sergey Sharybin
e361adbca2 Cycles: Fix compilation error of LCG RNG 2017-03-17 09:58:08 +01:00
Sergey Sharybin
439a277aa5 Cycles: Silence strict compiler warning 2017-03-17 09:56:44 +01:00
Mai Lavelle
2cae58524c Cycles: Improve memory usage of CPU split kernel by using smaller global size 2017-03-17 01:54:10 -04:00
Mai Lavelle
60a344b43d Cycles: Fix handling of barriers 2017-03-17 01:54:04 -04:00
Sergey Sharybin
1cad64900e Cycles: Define ccl_local variables in kernel functions
Declaring ccl_local in a device function is not supported
by certain compilers.
2017-03-16 11:27:17 +01:00
Sergey Sharybin
1ff753baa4 Cycles: Workaround for compilation error caused by passing KernelGlobals
Pass globals as a bare pointer, same as it used to be prior to split kernel rework.

AMD CPU platform and Intel OpenCL were complaining about this.

Perhaps we shouldn't pass globals as a pointer at all; this isn't really
portable and can perhaps cause issues on 32-bit.
2017-03-16 11:27:17 +01:00
Sergey Sharybin
26620f3f87 Cycles: Avoid some ccl_local in various kernels 2017-03-16 11:27:17 +01:00
Mai Lavelle
4833a71621 Cycles: Adjust global size for OpenCL CPU devices to make them faster 2017-03-16 06:11:42 -04:00
Sergey Sharybin
c44cdd5905 Cycles: Allow rendering a range of resumable chunks
The range is controlled using the following command line arguments:

  --cycles-resumable-start-chunk
  --cycles-resumable-end-chunk

These are 1-based indices of the range to render.
2017-03-15 16:00:01 +01:00
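As an illustration of the 1-based range from c44cdd5905, here is a minimal Python sketch that parses the two flags and converts them to 0-based chunk indices. The flag names come from the commit message; the helper name `parse_chunk_range`, the validation, and the assumption that the end chunk is inclusive are not from the source.

```python
import argparse


def parse_chunk_range(argv):
    """Return 0-based chunk indices for a 1-based start/end range."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--cycles-resumable-start-chunk", type=int, required=True)
    parser.add_argument("--cycles-resumable-end-chunk", type=int, required=True)
    # Ignore any unrelated arguments that share the command line.
    args, _unknown = parser.parse_known_args(argv)

    start = args.cycles_resumable_start_chunk
    end = args.cycles_resumable_end_chunk
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start chunk <= end chunk")

    # Assumption: the end chunk is inclusive, so chunks 2..4 map to 1, 2, 3.
    return list(range(start - 1, end))
```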
Sergey Sharybin
c5dba540d7 Cycles: Use argument parser for resumable render feature
Currently there are no functional changes, but we will be adding
a couple more options here soon.
2017-03-15 16:00:01 +01:00
Sergey Sharybin
5ba51de84a Cycles: Cleanup, indentation 2017-03-14 16:54:16 +01:00
Mai Lavelle
8dd0355c21 Cycles: Try to avoid infinite loops by catching invalid ray states 2017-03-14 06:22:57 -04:00
Sergey Sharybin
76acaefdd7 Cycles: Cleanup, wipe obviously outdated parts of split kernel comments 2017-03-13 17:16:16 +01:00
lazydodo
0c72008592 fix msvc warnings about unknown opencl pragmas 2017-03-13 10:08:14 -06:00
Sergey Sharybin
aa36c73c33 Cycles: Add missing header in the file 2017-03-13 16:59:09 +01:00
Hristo Gueorguiev
f169ff8b88 Fix T50925: Add AO approximation to split kernel 2017-03-13 11:15:58 +01:00
Sergey Sharybin
8794a43b68 Cycles: Make MESA compiler more happy
While this compiler is not officially supported yet, getting it to work is
a nice thing because more and more AMD cards will fall under MESA driver.

It's also nice to use explicit comparison with NULL, which makes it more
clear whether variable is a boolean or pointer. Even Rust enforces this!

Patch by Ian Bruce with own modifications.
2017-03-13 09:57:25 +01:00
68ca973f7f Fix T50628: gray out cycles device menu when no device configured only for GPU Compute. 2017-03-12 18:00:17 +01:00
Campbell Barton
bcc8c04db4 Cleanup: code style & cmake 2017-03-12 02:47:53 +11:00
Mai Lavelle
96868a3941 Fix T50888: Numeric overflow in split kernel state buffer size calculation
Overflow led to the state buffer being too small and the split kernel to
get stuck doing nothing forever.
2017-03-11 05:39:28 -05:00
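The failure mode in 96868a3941 is easy to reproduce with arithmetic alone. The sketch below emulates an unsigned 32-bit multiply in Python by masking the result; the element count and per-element size are made-up numbers chosen to trigger the wrap, not values from Cycles.

```python
UINT32_MASK = 0xFFFFFFFF
UINT64_MASK = 0xFFFFFFFFFFFFFFFF


def buffer_size_32bit(num_elements, bytes_per_element):
    """Size as an unsigned 32-bit multiply would compute it (wraps silently)."""
    return (num_elements * bytes_per_element) & UINT32_MASK


def buffer_size_64bit(num_elements, bytes_per_element):
    """Same calculation carried out in 64 bits."""
    return (num_elements * bytes_per_element) & UINT64_MASK


# Hypothetical workload: 2048 * 2048 elements of 1200 bytes each.
elements = 2048 * 2048
wrapped = buffer_size_32bit(elements, 1200)
full = buffer_size_64bit(elements, 1200)
```

Here `full` is 5033164800 bytes while `wrapped` is only 738197504, so an allocation sized from the 32-bit result would be far too small for the state it has to hold.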
2d3c44389a Fix OpenCL warnings about doubles on some platforms. 2017-03-11 00:55:23 +01:00
Sergey Sharybin
59fd21296a Cycles: Cleanup, extra semicolon and space 2017-03-10 15:38:30 +01:00
Mai Lavelle
4a2cde3f0e Cycles: Enable SSS and volumes for CUDA and Nvidia OpenCL split kernel 2017-03-10 02:09:41 -05:00
Hristo Gueorguiev
9de9f25b24 Cycles: add single program debug option for split kernel
Single program generally compiles kernels faster (2-3 times), loads faster,
takes less drive space (2-3 times), and reduces the number of cached kernels.
2017-03-09 17:09:37 +01:00
Hristo Gueorguiev
06c051363b Cycles: split kernel_shadow_blocked to AO & DL parts
Reduces memory allocation for split kernel.

This allows for faster rendering due to bigger global size,
especially when GPU memory is limited.

Performance results:

                         R9 290 total render time
                        Before    After   Change
BMW                      4:37      4:34   -1.1 %
Classroom               14:43     14:30   -1.5 %
Fishy Cat               11:20     11:04   -2.4 %
Koro                    12:11     12:04   -1.0 %
Pabellon Barcelona      22:01     20:44   -5.8 %
Pabellon Barcelona(*)   15:32     15:09   -2.5 %

(*) without glossy connected to volume
2017-03-09 17:09:37 +01:00
Hristo Gueorguiev
e8b5a5bf5b Cycles: Speedup transparent shadows in split kernel
This commit enables record-all transparent shadows rays.

Performance results:

               R9 290 render time (without synchronization), seconds
                        Before    After   Change
BMW                      261.5    262.5   +0.4 %
Classroom                869.6    867.3   -0.3 %
Fishy Cat                657.4    639.8   -2.7 %
Koro                    1909.8    692.8  -63.7 %
Pabellon Barcelona      1633.3   1238.0  -24.2 %
Pabellon Barcelona(*)   1158.1    903.8  -22.0 %

(*) without glossy connected to volume
2017-03-09 17:09:37 +01:00
Hristo Gueorguiev
57e26627c4 Cycles: SSS and Volume rendering in split kernel
Decoupled ray marching is not supported yet.

Transparent shadows are always enabled for volume rendering.

Changes in kernel/bvh and kernel/geom are from Sergey.
This simplifies code significantly, and prepares it for
record-all transparent shadow function in split kernel.
2017-03-09 17:09:37 +01:00
Mai Lavelle
c837bd5ea5 Cycles: Fix CUDA build error for some compilers
Needed to include `util_types.h` before using `uint`.
2017-03-08 16:44:43 -05:00
Sergey Sharybin
97c4c2689f Cycles: Make the message more obvious about which initialization failed 2017-03-08 13:57:21 +01:00
Sergey Sharybin
75cb4850f0 Cycles: Use 1-based line number for #line directives
AMD CPU platform was complaining about #line 0 directives in the code.
2017-03-08 12:45:18 +01:00
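To make the 1-based numbering in 75cb4850f0 concrete, here is a small Python sketch. It assumes the directives are emitted by a step that stitches several source files into one, which the commit message does not spell out; the helper name `with_line_directives` is hypothetical.

```python
def with_line_directives(sources):
    """Concatenate (filename, text) pairs, prefixing each with a #line directive."""
    parts = []
    for filename, text in sources:
        # "#line 1" tells the compiler that the next line is line 1 of
        # `filename`. Emitting "#line 0" is what some compilers reject.
        parts.append('#line 1 "%s"\n' % filename)
        parts.append(text if text.endswith("\n") else text + "\n")
    return "".join(parts)
```

For example, `with_line_directives([("a.h", "int a;")])` produces a two-line string whose first line is `#line 1 "a.h"`.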
Sergey Sharybin
ecfbfe478b Cycles: Log which device kernels are being loaded for 2017-03-08 12:33:51 +01:00
Sergey Sharybin
712f7c3640 Cycles: Make it possible to access KernelGlobals from split data initialization function 2017-03-08 11:02:54 +01:00
Sergey Sharybin
ef7c36f5ed Cycles: Cleanup, remove residue of previous split kernel data
This is all in split data state array.
2017-03-08 10:26:29 +01:00
Mai Lavelle
64751552f7 Cycles: Fix indentation 2017-03-08 01:31:32 -05:00
Mai Lavelle
fe7cc94dfa Cycles: Fix strict warning about unused variable 2017-03-08 01:31:32 -05:00
Mai Lavelle
306034790f Cycles: Calculate size of split state buffer kernel side
By calculating the size of the state buffer in the kernel rather than the host
less code is needed and the size actually reflects the requested features.

Will also be a little faster in some cases because of larger global work size.
2017-03-08 01:31:30 -05:00
Mai Lavelle
997e345bd2 Cycles: Fix crash after failed kernel build
Pointers to kernels were uninitialized leading to freeing of random memory
addresses. Another reason it would be good to use smart pointers.
2017-03-08 01:31:09 -05:00
Mai Lavelle
18e50927f7 Cycles: Faster building of split kernel
Simple change to make it so that only kernels that have been modified are
rebuilt. Might only be useful during development.
2017-03-08 01:31:09 -05:00
Mai Lavelle
223f45818e Cycles: Initialize rng_state for split kernel
Because the split kernel can render multiple samples in parallel it is
necessary to have everything initialized before rendering of any samples
begins. The code that normally handles initialization of
`rng_state` (`kernel_path_trace_setup()`) only does so for the first sample,
which was causing artifacts in the split kernel due to uninitialized
`rng_state` for some samples.

Note that because the split kernel can render samples in parallel this
means that the split kernel is incompatible with the LCG.
2017-03-08 01:31:09 -05:00
Mai Lavelle
cd7d5669d1 Cycles: Remove sum_all_radiance kernel
This was only needed for the previous implementation of parallel samples. As
we don't have that any more it can be removed.

Real reason for removal tho is this: `per_sample_output_buffers` was being
calculated too small and artifacts resulted. The tile buffer is already
the correct size and calculating the size for `per_sample_output_buffers`
is a bit difficult with the current layout of the code. As
`per_sample_output_buffers` was only needed for `sum_all_radiance`,
removing that kernel and writing output to the tile buffer directly
fixes the artifacts.
2017-03-08 01:31:07 -05:00
Mai Lavelle
4cf501b835 Cycles: Split path initialization into own kernel
This makes it easier to initialize things correctly in the data_init kernel
before they are needed by path tracing.
2017-03-08 01:30:43 -05:00
Mai Lavelle
5b8f1c8d34 Cycles: Separate kernel loading time from render time 2017-03-08 01:24:55 -05:00
Mai Lavelle
b78e543af9 Cycles: Add names to buffer allocations
This is to help debug and track memory usage for generic buffers. We
have similar for textures already since those require a name, but for
buffers the name is only for debugging purposes.
2017-03-08 01:24:55 -05:00
Mai Lavelle
817873cc83 Cycles: CUDA implementation of split kernel 2017-03-08 01:24:53 -05:00
Mai Lavelle
0892352bfe Cycles: CPU implementation of split kernel 2017-03-08 00:52:41 -05:00
Mai Lavelle
352ee7c3ef Cycles: Remove ccl_fetch and SOA 2017-03-08 00:52:41 -05:00
Sergey Sharybin
a87766416f Cycles: Report device maximum allocation and detected global size 2017-03-08 00:52:41 -05:00
Mai Lavelle
365a4239c5 Cycles: Workaround for driver hangs
Simple workaround for some issues we've been having with AMD drivers hanging
and rendering systems unresponsive. Unfortunately this makes things a bit
slower, but it's better than having to do hard reboots. Will be removed when
drivers have been fixed.

Define CYCLES_DISABLE_DRIVER_WORKAROUNDS to disable for testing purposes.
2017-03-08 00:52:41 -05:00
Mai Lavelle
230c00d872 Cycles: OpenCL split kernel refactor
This does a few things at once:

- Refactors host side split kernel logic into a new device
  agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and
  allows as many threads as will fit in memory regardless of tile size, which
  can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the
  number of arguments passed to kernels. Means there's less code to deal
  with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by
  other device types.
- Replaced OpenCL specific APIs with new generic versions
- Tiles can now be seen updating during rendering
2017-03-08 00:52:41 -05:00
Mai Lavelle
520b53364c Cycles: Add OpenCL kernel for zeroing memory buffers
Transferring memory to the device was very slow and there's really no
need when only zeroing a buffer.
2017-03-08 00:52:41 -05:00
Mai Lavelle
dfd6055eb0 Cycles: Add more atomic operations 2017-03-08 00:52:41 -05:00
Mai Lavelle
bc652766e8 Cycles: Expose passes size to device tasks
This is needed so devices can know the size of a tile buffer before any
tiles are acquired.
2017-03-08 00:52:41 -05:00
Mai Lavelle
0f56f7a811 Cycles: Allow device_memory to be used directly
This is useful when there's no host-side memory attached to the buffer.
2017-03-08 00:52:41 -05:00
Sergey Sharybin
0e995e0bfe Cycles: Fix strict -Wpedantic warnings with GCC
Patch by Stefan Werner, thanks!
2017-03-06 14:18:26 +01:00
Sergey Sharybin
3623f32b48 FFmpeg: Update for the deprecated API in 3.2.x
Should be no functional changes.
2017-03-06 10:34:57 +01:00
Jörg Müller
f75b52eca1 Fix T50843: Pitched Audio renders incorrectly in VSE
There was a bug in the intended code behaviour to always seek with a
pitch of 1.0 regardless of pitch/pitch animation/doppler effects.

Check the bug report for a more detailed explanation of problems
concerning pitch and seeking.
2017-03-05 12:19:32 +01:00
Sergey Sharybin
810d7d4694 Cycles: Fix possibly uninitialized variable
Hopefully this was the reason for the randomly disappearing textures in our renders.
2017-03-03 10:10:26 +01:00
Sergey Sharybin
351c9239ed Cleanup: Use explicit unsigned int in atomics 2017-03-01 12:01:19 +01:00
Sergey Sharybin
87f236cd10 Cycles: Fix division by zero in volume code which was producing -nan 2017-02-28 17:33:06 +01:00
Aaron Carlisle
6d1ac79514 Cleanup: Grey --> Gray 2017-02-27 19:33:57 -05:00
Sergey Sharybin
5acac13eb4 Cycles: Fix compilation error on vanilla Ubuntu 16.10
Patch by @swerner, thanks!
2017-02-27 15:22:51 +01:00
Sergey Sharybin
f1b21d5960 Fix T50634: Hair Primitive as Triangles + Hair shader with a texture = crash
Attributes were not resized after pushing new triangles to the mesh.
2017-02-27 15:21:14 +01:00
Sergey Sharybin
209a64111e Fix part of T50634: Hair Primitive as Triangles + Hair shader with a texture = crash
Wrong formula was used to calculate needed verts and tris to be reserved.
2017-02-27 15:21:14 +01:00
Sergey Sharybin
00ceb6d2f4 Cycles: Make it clearer that values never change by using const qualifier 2017-02-27 15:21:14 +01:00
Sergey Sharybin
cc78690be3 Cycles: Forgot this in previous commit 2017-02-27 12:54:35 +01:00
Sergey Sharybin
238db604c5 Cycles: Add more logs about what's going on in shader optimization 2017-02-27 12:38:24 +01:00
Sergey Sharybin
845ba1a6fb Cycles: Experiment with replacing Sharp Glossy with GGX when Filter Glossy is used
The idea is to make it simpler to remove noise from scenes when some prop uses
Sharp glossy closure and causes noise in certain cases. Previously Sharp Glossy
was not affected by Filter Glossy at all, which was quite confusing.

Here is a file which demonstrates the issue: {F417797}

After applying the patch all the noise from the scene is gone.

This change also solves fireflies reported in T50700.

Reviewers: brecht, lukasstockner97

Differential Revision: https://developer.blender.org/D2416
2017-02-27 12:33:59 +01:00
8c5826f59a Fix T50698: Cycles baking artifacts with transparent surfaces. 2017-02-25 03:12:53 +01:00
15f1072ee2 Fix build error with macOS / clang / c++11. 2017-02-25 03:12:53 +01:00
Sergey Sharybin
1e29286c8c Cycles: Fix compilation warning with CUDA on OSX 2017-02-24 14:33:10 +01:00
Sergey Sharybin
50328b41a7 Cycles: Fix compilation error on 32bit Linux 2017-02-23 17:30:26 +01:00
Sergey Sharybin
4e12113bea Cycles: Fix wrong render results with texture limit and half-float textures 2017-02-23 14:46:22 +01:00
Sergey Sharybin
13e075600a Cycles: Add utility function to convert float to half
Handles overflow and underflow, but not NaN/inf.
2017-02-23 14:42:06 +01:00
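One way a float-to-half conversion with the properties named in 13e075600a could look is sketched below in Python. It is an illustration under assumptions, not the Cycles utility: it truncates the mantissa instead of rounding, flushes anything below the smallest normal half to zero, and clamps anything above the largest finite half. NaN and infinity are not preserved; they land in the clamp branch.

```python
import struct


def float_to_half_bits(value):
    """Return the 16-bit pattern of `value` as a half float (truncating)."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = (bits & 0x80000000) >> 16
    exponent = bits & 0x7F800000

    if exponent < 0x38800000:
        # Underflow: smaller than the smallest normal half (2**-14).
        magnitude = 0
    elif exponent > 0x47000000:
        # Overflow: clamp to the largest finite half (65504).
        magnitude = 0x7BFF
    else:
        # Drop 13 mantissa bits, then rebias the exponent from 127 to 15.
        magnitude = ((bits & 0x7FFFFFFF) >> 13) - 0x1C000
    return sign | magnitude
```

For values that a half can represent exactly, the result matches Python's own `struct` half format (`"e"`), which makes a convenient cross-check.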
Sergey Sharybin
60592f6778 Fix T50748: Render Time incorrect when refreshing rendered preview in GPU mode 2017-02-23 10:51:06 +01:00
Sergey Sharybin
36c4fc1ea9 Cycles: Fix shading with autosmooth and custom normals
New logic of split_faces was leaving mesh in a proper state
from Blender's point of view, but Cycles wanted loop normals
to be "flushed" to vertex normals.

Now we do such a flush from Cycles side again, so we don't
leave bad meshes behind.

Thanks Bastien for assistance here!
2017-02-22 10:54:36 +01:00
Sergey Sharybin
2c30fd83f1 Cycles: Additionally report all OpenCL cflags
This way we can control exact spaces and such added to the cflags
which is crucial to troubleshoot certain drivers.
2017-02-22 10:06:02 +01:00
Mai Lavelle
4e9b17da4c Cycles: Speedup by avoiding extra calculations in noise texture when unneeded
Noise texture is now faster when the color socket is unused. Potential for
speedup spotted by @nutel.

Some performance results:

                     Render Time Before    After    Difference
Gooseberry benchmark         47:51.34    45:55.57       -4%
Koro                         12:24.92    12:18.46     -0.8%
Simple cube (Color socket)      48.53       48.72     +0.3%
Simple cube (Fac socket)        48.74       32.78    -32.7%
Goethe displacement           1:21.18     1:08.47    -15.6%
Cycles brick displacement     3:02.38     2:16.76    -25.0%
Large displacement scene     23:54.12    20:09.62    -15.6%

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2513
2017-02-21 07:24:33 -05:00
Sergey Sharybin
696836af1d Fix T50718: Regression: Split Normals Render Problem with Cycles
The issue seems to be caused by vertex normal being re-calculated
to something else than loop normal, which also caused wrong loop
normals after re-calculation.

For now issue is solved by preserving CD_NORMAL for loops after
split_faces() is finished, so render engine can access original
proper value.
2017-02-20 11:56:02 +01:00
Sergey Sharybin
333dc8d60f Fix T50719: Memory usage won't reset to zero while re-rendering on two video cards
Was only visible with Persistent Images option ON.
2017-02-20 11:02:19 +01:00
9992e6a169 Fix a few compiler warnings with macOS / clang. 2017-02-18 23:59:34 +01:00
Sergey Sharybin
306acb7dda Fix T50687: Cycles baking time estimate and progress bar doesn't work / progress when baking with high samples 2017-02-16 17:15:08 +01:00
Sergey Sharybin
6468cb5f9c Faces split: Don't leave CD_NORMAL after split
This is supposed to be a temporary layer.

If someone needs loop normals after split it should explicitly
ask for that.
2017-02-16 11:00:17 +01:00
Sergey Sharybin
e22d4699cb Cycles: Cleanup, style 2017-02-15 20:33:49 +01:00
Sergey Sharybin
fe47163a1e Cycles: Fix CUDA compilation error after recent changes 2017-02-15 15:01:08 +01:00
Sergey Sharybin
8b8c0d0049 Cycles: Don't calculate primitive time if BVH motion steps are not used
Solves memory regression in the default configuration.
2017-02-15 12:59:31 +01:00
Sergey Sharybin
6cdc954e8c Cycles: Pass special flag whether BVH motion steps are used
Doesn't currently change anything, but will be needed for some future
work here.

It uses existing padding in kernel BVH structure, so there is
nothing changed memory-wise.
2017-02-15 12:45:06 +01:00
Sergey Sharybin
dc7bbd731a Cycles: Fix wrong hair render results when using BVH motion steps
The issue here was mainly coming from minimal pixel width feature
which is quite commonly enabled in production shots.

This feature will use some probabilistic heuristic in the curve
intersection function to check whether we need to return intersection
or not. This probability is calculated for every intersection check.
Now, when we use multiple BVH nodes for curve primitives we increase
probability of that primitive to be considered a good intersection
for us. This is similar to increasing minimal width of curve.

What is worse here is that the change in the intersection probability
fully depends on exact layout of BVH, meaning probability might
change differently depending on a view angle, the way how builder
binned the primitives and such. This makes it impossible to do
simple check like dividing probability by number of BVH steps.

Other solution might have been to split BVH into fully independent
trees, but that will increase memory usage of all the static
objects in the scenes, which is also not something desirable.

For now the simplest but robust approach is used: store BVH primitives
time and test it in curve intersection functions. This solves the
regression, but has two downsides:

- Uses more memory.

  which isn't surprising, and ANY solution to this problem will
  use more memory.

  What we still have to do is to avoid this memory increase for
  cases when we don't use BVH motion steps.

- Reduces number of maximum available textures on pre-kepler cards.

  There is not much we can do here, hardware gets old but we need
  to move forward on more modern hardware..
2017-02-15 12:45:04 +01:00
Sergey Sharybin
088c6a17ba Cycles: Fix missing initialization of triangle BVH steps
Likely was harmless for Blender, but better be safe here.
2017-02-15 12:44:52 +01:00
Sergey Sharybin
5723aa8c02 Cycles: Fix wrong pointiness caused by precision issues 2017-02-15 12:40:13 +01:00
Sergey Sharybin
930186d3df Cycles: Optimize sorting of transparent intersections on CUDA 2017-02-13 18:24:45 +01:00
Sergey Sharybin
21dbfb7828 Cycles: Fix wrong transparent shadows with CUDA
Was a bug in recent optimization commit.
2017-02-13 18:22:10 +01:00
Sergey Sharybin
581c819013 Cycles: Fix wrong shading on GPU when background has NaN pixels and MIS enabled
Quite simple fix for now which only deals with this case. Maybe we want to do
some "clipping" on image load time so regular textures wouldn't give NaN as
well.
2017-02-13 16:32:55 +01:00
Sergey Sharybin
81eee0f536 Cycles: Use fast math without finite optimization
This allows us to use faster math and still have reliable
isnan/isfinite tests.

Only do it for host side, kernels stay unchanged.

Thanks Lukas Stockner for the tip!
2017-02-13 16:25:35 +01:00
Sergey Sharybin
37afa965a4 Fix T50655: Pointiness is too slow to calculate
Optimize vertex de-duplication the same way as we do for Remove Doubles.
2017-02-13 12:00:10 +01:00
Sergey Sharybin
594015fb7e Cycles: Use Cycles-side mesh instead of C++ RNA
Those are now matching and it's faster to skip C++ RNA to
calculate pointiness.
2017-02-13 10:40:05 +01:00
Sergey Sharybin
5552e83b53 Cycles: Don't use built-in API for image sequences in preview mode
Our Python API is not ready for such things at all. Better be slower
but more correct until we improve our API.
2017-02-11 22:24:59 +01:00
Sergey Sharybin
cd4309ced0 Cycles: Cleanup, move EdgeMap to blender_util
It's a better place for such a utility structure. Still not fully ideal tho.
2017-02-10 13:34:10 +01:00
Sergey Sharybin
0178915ce9 Cycles: Make a utility class for edge map
Simplifies some logic.
2017-02-10 13:34:09 +01:00
Sergey Sharybin
fd7e9f7974 Cycles: Fix pointiness attribute giving wrong results with autosplit
Basically made the algorithm to handle vertices with the same coordinate
as a single vertex.
2017-02-10 13:34:09 +01:00
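The idea in fd7e9f7974 of treating vertices that share a coordinate as one vertex can be sketched as follows. This Python example is an assumption-laden illustration: it merges only exactly equal coordinates through a dictionary, whereas a real implementation may use a tolerance, and the function name `merge_coincident_vertices` is hypothetical.

```python
def merge_coincident_vertices(vertices):
    """Return (unique_vertices, remap), where remap[i] indexes unique_vertices."""
    index_of = {}
    unique = []
    remap = []
    for co in vertices:
        key = tuple(co)
        if key not in index_of:
            # First time this coordinate is seen: it becomes the representative.
            index_of[key] = len(unique)
            unique.append(key)
        remap.append(index_of[key])
    return unique, remap
```

A per-vertex attribute such as pointiness can then be computed once per entry in `unique_vertices` and copied back to every original vertex through `remap`, so split faces no longer change the result.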
Sergey Sharybin
d395d81bfc Cycles: Cleanup: Use less indentation by inverting condition 2017-02-10 13:34:09 +01:00
Sergey Sharybin
0b65b889ef Cycles: Calculate all vertex attribute after faces generation
This way the calculation is not spread over multiple places.
2017-02-10 13:34:09 +01:00
Sergey Sharybin
b26da8b467 Cycles: Cleanup: use vector instead of bare malloc
This way memory is more "manageable" and easier to follow.
2017-02-10 13:34:09 +01:00
Sergey Sharybin
b16fd22018 Cycles: Fix regression with transparent shadows in volume 2017-02-08 14:00:48 +01:00
Sergey Sharybin
da31a82832 Cycles: Solve speed regression by casting opaque ray first 2017-02-08 14:00:48 +01:00
Sergey Sharybin
04cf1538b5 Cycles: Fix compilation error on OpenCL 2017-02-08 14:00:48 +01:00
Sergey Sharybin
31a025f51e Cycles: Split shadow functions to avoid some duplicated calculations 2017-02-08 14:00:48 +01:00
Sergey Sharybin
dde40989f3 Cycles: Store shadow intersections in the kernel globals
Seems CUDA failed to de-duplicate the array across multiple inlined
versions of the shadow_blocked(). Helped it a bit with that now.

Gives about 100MB memory improvement on scenes after the previous
commit and brings the memory "regression" down to only 100MB compared to
the master branch now.
2017-02-08 14:00:48 +01:00
Sergey Sharybin
7447950bc3 Cycles: Speedup transparent shadows on CUDA
This commit enables record-all behavior of transparent shadows
rays.

Render time differences are as follows:

               GTX 1080 render time
BMW                  -0.5%
Fishy Cat            -0.0%
Pabellon Barcelona   -11.6%
Classroom            +1.2%
Koro                 -58.6%

Kernel will now use some extra VRAM memory to store the intersection
array (200MB on my configuration). This we can optimize out with some
further commits.
2017-02-08 14:00:48 +01:00
Sergey Sharybin
9830eeb44b Cycles: Implement record-all transparent shadow function for GPU
The idea is to record all possible transparent intersections when
shooting a transparent ray on GPU (similar to what we were already
doing on CPU).

This avoids the need for a whole ray-to-scene intersection query
for each intersection and greatly speeds up cases like transparent
hair at the cost of extra memory.

This commit lays the groundwork for now, and the feature is kept
disabled until some further tweaks are made.
2017-02-08 14:00:48 +01:00
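The record-all approach from 9830eeb44b can be illustrated with a toy model. In this Python sketch every recorded hit is a hypothetical `(distance, transmittance)` pair; the real kernel evaluates shaders at each hit, which is not modeled here.

```python
def shadow_throughput(hits):
    """Attenuate a shadow ray through all recorded transparent hits.

    `hits` holds (distance, transmittance) pairs gathered by a single
    traversal, in whatever order the traversal found them.
    """
    throughput = 1.0
    # Process hits front to back, as a renderer would when shading them.
    for _distance, transmittance in sorted(hits):
        throughput *= transmittance
        if throughput <= 0.0:
            # Fully blocked: the remaining hits cannot matter.
            return 0.0
    return throughput
```

The point of the design is that the scene is traversed once to fill `hits`, instead of re-shooting the ray from each transparent surface to find the next one. The intersection array is where the extra memory goes.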
Sergey Sharybin
9c3d202e56 Cycles: Use a utility function to sort intersections array 2017-02-08 14:00:48 +01:00
Sergey Sharybin
58a10122d0 Cycles: Make GPU version of shadow_blocked() closer to CPU
Now we break the traversal cycle and then perform volume attenuation
and check with zero throughput. Not sure it makes any measurable sense
at this moment, but in the future it might help de-duplicating some
extra logic here.
2017-02-08 14:00:48 +01:00
Sergey Sharybin
98a1855803 Cycles: De-duplicate transparent shadows attenuation
Fair amount of code was duplicated for CPU and GPU, now we are
using inlined function to avoid such duplication.
2017-02-08 14:00:48 +01:00
Sergey Sharybin
53896d4235 Fix T49253: Cycles blackbody is wrong on AVX2 CPU on Windows
Seems to be a bug in the optimizer, but managed to reshuffle in a way
which should also give some speedup.
2017-02-07 13:05:19 +01:00
Jorge Bernal
dbdc346e9f CMake: Remove MOTO library dependency when it is not needed
It is not necessary to add MOTO library dependency when we use
WITH_IK_SOLVER (now it uses Eigen) or we use WITH_MOD_BOOLEAN (it was
used by bsp intern library some time ago but it is not present in the
code anymore).

Reviewers: mont29, sergey

Subscribers: mont29, sergey

Differential Revision: https://developer.blender.org/D2477
2017-02-06 19:29:42 +01:00
Phil Christensen
351c409317 C++ conformance fixes (MSVC /permissive-)
We (the Microsoft C++ team) use the Blender project as part of our "Real world code" tests.
I noticed a place in WIN32 specific code (dvpapi.cpp:85) where a string literal is losing
its const-ness when being passed to BLI_dynlib_open().  This is not permitted when using the
/permissive- conformance compiler switch (see our blog
https://blogs.msdn.microsoft.com/vcblog/2016/11/16/permissive-switch/)

My suggested fix is to add const and propagate it where needed.  Another possible fix would be
to explicitly cast away the const.

Reviewers: mont29, sergey, LazyDodo

Subscribers: Blendify, sergey, mont29, LazyDodo

Tags: #platform:_windows

Differential Revision: https://developer.blender.org/D2495
2017-02-06 10:44:56 +01:00
Sergey Sharybin
e1e85454ea Cycles: Cleanup, order of arguments to EXPECT_EQ
The order was wrong from the semantic point of view, caused
by some legacy workarounds in Libmv. Didn't realize this was
not how things were expected to be used.
2017-02-03 11:35:34 +01:00
Lukas Stockner
fa19940dc6 Cycles: Fix rng_state initialization when using resumable rendering 2017-02-01 05:43:17 +01:00
Sergey Sharybin
326516c9d7 Cycles: Fix spelling in comment 2017-01-31 12:08:19 +01:00
Sergey Sharybin
0330741548 Cycles: Add option to replace GI with AO approximation after certain amount of bounces
This is a speed up option which is mainly useful for viewport. Gives nice speedup in
the barbershop scene of 2x when replacing GI with AO after 2nd bounce without losing
too much detail.

Reviewers: brecht

Subscribers: eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D2383
2017-01-27 14:21:49 +01:00
lazydodo
64f5afdb89 [Cycles/MSVC/Testing] Fix broken test code.
Currently the tests don't run on windows for the following reasons

1) render_graph_finalize has a linking issue due to missing a bunch of libraries (not sure why this is not an issue for linux)
2) This one is more interesting, in test/python/cmakelists.txt ${TEST_BLENDER_EXE_BARE} and ${TEST_BLENDER_EXE} are flat out wrong, but for some reason this doesn't matter for most tests, cause ctest will actually go out and look for the executable and fix the path for you *BUT* only for the command, if you use them in any of the parameters it'll happily pass on the wrong path.
3) on linux you can just run a .py file, windows is not as awesome and needs to be told to run it with python.
4) had to use the NAME/COMMAND long form of add_test otherwise $<TARGET_FILE:blender> doesn't get expanded, why? beats me.
5) missing idiff.exe for msvc2015/x64 in the libs folder.

This patch addresses 1-4 , but given I have no working Linux build environment, I'm unsure if it'll break anything there

5 has been fixed in rBL61751

Reviewers: juicyfruit, brecht, sergey

Reviewed By: sergey

Subscribers: Blendify

Tags: #cycles, #automated_testing

Differential Revision: https://developer.blender.org/D2367
2017-01-25 09:37:19 -07:00
Sergey Sharybin
ced20b74e5 Fix T50032: Wrong render result when same image is used with and without alpha 2017-01-25 14:02:59 +01:00
Sergey Sharybin
8ea09252c8 Fix T50517: Rendering expecting time is negative 2017-01-25 11:18:12 +01:00
Mai Lavelle
a7d5cabd4e Fix T49405: Crash when baking with adaptive subdivision
Blender's baking system currently doesn't support the topology used by
adaptive subdivision and primitive ids will be wrong or out of range
leading to crashes. Updating the baking system to support other
topologies would be a bit involved, so for now we simply disable
subdivision while baking to avoid crashes.
2017-01-25 00:40:45 -05:00
Sergey Sharybin
d84df351d0 Cycles: Don't rely on indirectly included algorithm 2017-01-24 16:39:16 +01:00
Aaron Carlisle
e5d8c2a67f Use new manual URL 2017-01-23 19:10:37 -05:00
Sergey Sharybin
bc096e1eb8 Cycles: Split ShaderData object and shader flags
We started to run out of bits there, so now we separate flags
which came from __object_flags and which are either runtime or
coming from __shader_flags.

Rule now is: SD_OBJECT_* flags are to be tested against new
object_flags field of ShaderData, all the rest flags are to
be tested against flags field of ShaderData.

There should be no user-visible changes, and time difference
should be minimal. In fact, from tests here can only see hardly
measurable difference and sometimes the new code is somewhat
faster (all within a noise floor, so hard to tell for sure).

Reviewers: brecht, dingto, juicyfruit, lukasstockner97, maiself

Differential Revision: https://developer.blender.org/D2428
2017-01-23 12:56:55 +01:00
Sergey Sharybin
b9311b5e5a Cycles: Make object flag names more obvious that they are object and not shader 2017-01-23 12:14:17 +01:00
Sergey Sharybin
77982e159c Cycles: Fix typo in the panel name
No user visible changes, it was a typo in the name of the class.

Spotted by povmaniac in IRC, thanks!
2017-01-23 10:35:15 +01:00
Sergey Sharybin
2268f41418 Cycles: Update current Cycles version 2017-01-23 10:25:59 +01:00
Bastien Montagne
ce8889175a Fix T50491: Cycles UI breaks when pushing F8.
Cycles add-on did not actually support reloading correctly.

When you want to correctly reload sub-modules (i.e. modules of an add-on
which is a package), you need to use importlib, a mere import will do
nothing with already loaded modules (RNA classes are sort of
pre-registered when they are evaluated, through the meta-class system).
2017-01-22 12:42:14 +01:00
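The reloading point in ce8889175a (a plain import does nothing for an already loaded module, while importlib reloads it) can be demonstrated outside Blender. The sketch below is an assumption-laden stand-in: it writes a throwaway module named hypothetical_ui to a temporary directory instead of using the Cycles add-on's real sub-modules.

```python
import importlib
import os
import sys
import tempfile


def demo_reload():
    """Return the module's VALUE after first import, re-import, and reload."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hypothetical_ui.py")
        with open(path, "w") as f:
            f.write("VALUE = 1\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("hypothetical_ui")
            first = module.VALUE

            # Change the source on disk (different length, so any cached
            # bytecode is treated as stale).
            with open(path, "w") as f:
                f.write("VALUE = 22\n")

            # A second import just returns the entry from sys.modules.
            module = importlib.import_module("hypothetical_ui")
            after_import = module.VALUE

            # importlib.reload() re-executes the module's source.
            importlib.reload(module)
            after_reload = module.VALUE
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("hypothetical_ui", None)
    return first, after_import, after_reload
```

Calling demo_reload() shows the value staying unchanged after the second import and only updating after the explicit reload.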
Sergey Sharybin
43268c1997 Cycles: Use more const qualifiers to avoid possible issues 2017-01-20 17:54:17 +01:00
Sergey Sharybin
a1c21e0b50 Cycles: Cleanup, split one gigantic function into two smaller ones 2017-01-20 17:52:48 +01:00
Sergey Sharybin
1ad04c7d65 Cycles: Store time in BVH nodes
This way we can stop traversing a BVH node early on.

Gives about a 2-2.5x render time improvement with 3 BVH steps.
Hopefully this gives no measurable performance loss for scenes with
a single BVH step.

Traversal is currently only implemented for QBVH, meaning old CPUs
and GPUs do not benefit from this change.
2017-01-20 12:46:18 +01:00
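The early-out idea in 1ad04c7d65 can be sketched as follows. This is a simplified Python illustration, not the QBVH traversal itself: the Node class, its time_from/time_to fields, and count_visited are hypothetical names, and spatial intersection tests are left out entirely so that only the time check is shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Time range covered by the primitives under this node (hypothetical fields).
    time_from: float
    time_to: float
    children: List["Node"] = field(default_factory=list)


def count_visited(node: Node, ray_time: float) -> int:
    """Count nodes visited for a ray sampled at ray_time."""
    # Early out: nothing under this node exists at the ray's time.
    if ray_time < node.time_from or ray_time > node.time_to:
        return 0
    return 1 + sum(count_visited(child, ray_time) for child in node.children)
```

For a tree whose children cover disjoint time ranges, a ray only descends into the subtrees whose range contains its time, so most of the tree is skipped.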
Sergey Sharybin
c4890cd354 Cycles: Add option to split triangle motion primitives by time steps
Similar to the previous commit, the statistics are as follows:

BVH Steps     Render time (sec)       Memory usage (MB)
    0                46                    260
    1                27                    373
    2                18                    598
    3                15                    826

Scene used for the tests is the agent's body from one of the barber
shop scenes (no textures or anything, just a diffuse material).

Once again this is limited to regular (non-spatial split) BVH.
Support for spatial split with this feature will come later.
2017-01-20 12:46:18 +01:00
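To make the trade-off in the c4890cd354 table easier to read, the figures can be expressed relative to the 0-step baseline. The numbers below are copied from that table; the helper name relative_to_baseline is hypothetical.

```python
# (BVH steps, render time in seconds, memory usage in MB) from the table above.
triangle_stats = [
    (0, 46, 260),
    (1, 27, 373),
    (2, 18, 598),
    (3, 15, 826),
]


def relative_to_baseline(stats):
    """Return (steps, speedup factor, memory factor) relative to the first row."""
    _, base_time, base_mem = stats[0]
    return [(steps, base_time / time, mem / base_mem) for steps, time, mem in stats]


for steps, speedup, memory in relative_to_baseline(triangle_stats):
    print(f"{steps} steps: {speedup:.2f}x faster, {memory:.2f}x memory")
```

At 3 BVH steps this works out to roughly a 3x speedup for roughly 3.2x the memory on this test scene.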
Sergey Sharybin
5298853e95 Cycles: Add option to split curve motion primitives by time steps
The idea is to create several smaller BVH nodes for each of the motion
curve primitives. This acts as a forced spatial split for the single
primitive.

This speeds up rendering of motion-blurred hair at the cost
of extra memory usage. The numbers are as follows:

BVH Steps     Render time (sec)       Memory usage (MB)
    0               258                    191
    1               123                    278
    2                69                    453
    3                43                    627

Scene used for the tests is the agent's hair from one of the barber
shop scenes.

Currently it's limited to scenes without spatial split enabled,
since the spatial split builder requires some changes to work properly
with motion step coordinates.
2017-01-20 12:46:18 +01:00
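The idea in 5298853e95 of creating several smaller BVH nodes per motion primitive can be illustrated with a deliberately simplified one-dimensional sketch. It assumes a single point moving linearly from x0 to x1 over the shutter time [0, 1], which is not how Cycles represents curves; split_time_bounds and num_splits are hypothetical names.

```python
def split_time_bounds(x0, x1, num_splits):
    """Return (time_from, time_to, lower, upper) for each time slice.

    Assumes linear motion of a 1D point from x0 (time 0) to x1 (time 1).
    """
    slices = []
    for i in range(num_splits):
        t0 = i / num_splits
        t1 = (i + 1) / num_splits
        # Position at the start and end of this time slice.
        a = x0 + (x1 - x0) * t0
        b = x0 + (x1 - x0) * t1
        slices.append((t0, t1, min(a, b), max(a, b)))
    return slices
```

Without splitting, one bound has to cover the whole motion from x0 to x1. With four slices, each bound covers a quarter of that extent, which gives tighter nodes for rays at a given time, at the cost of storing four entries instead of one.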
Sergey Sharybin
d50d370755 Cycles: Add utility function to calculate curve boundbox from given 4 keys
Also fixed some issues with motion keys calculation:

- Clamp lower and upper limits of curves so we can safely call those
  functions for the very first and very last curve segment.
- Fixed wrong indexing for the curve radius array.
- Fixed wrong motion attribute offset calculation.
2017-01-20 12:46:18 +01:00
Sergey Sharybin
6f900c383a Cycles: Cleanup, trailing whitespace 2017-01-20 12:46:18 +01:00
Sergey Sharybin
26cdc64a7f Cycles: Split motion triangle file once again, avoids annoying forward declarations 2017-01-20 12:46:17 +01:00
Sergey Sharybin
14d343a8f9 Cycles: Move motion triangle intersection functions to own file
Mimics how regular triangles are working and makes it more clear where
the stuff is located in the kernel.

Needed to have some forward declarations because of the current placement
of things in the kernel.
2017-01-20 12:46:17 +01:00
Sergey Sharybin
ebc695ef2c Cycles: Cleanup, better variable name 2017-01-20 12:46:17 +01:00
Sergey Sharybin
20eb1fe3c1 Cycles: Add utility function to fetch motion keys while on CPU side 2017-01-20 12:46:17 +01:00
Sergey Sharybin
938ec3a743 Cycles: Cleanup, comments 2017-01-20 12:46:16 +01:00
Sergey Sharybin
461214508c Cycles: Add utility function to fetch motion triangle when on CPU side 2017-01-20 12:46:15 +01:00