Commit Graph

4397 Commits

Author SHA1 Message Date
Mai Lavelle
e389ae9dca Cycles: Set error if a split kernel fails to load
To help catch cases where adding a new kernel is missed for one of the
device implementations.
2017-11-11 01:01:14 -05:00
Sergey Sharybin
db7a78a2be Cycles: Fix compilation error with latest OIIO
There was some changes about namespaces, which causes ambiguities.

Replaces using namespace with an explicit symbols we need. Is good idea to NOT
pull in the whole namespace anyway!
2017-11-10 10:04:33 +01:00
a466d7ae24 Cycles: better distance sampling for chromatic volume extinction.
Previously we picked one of the RGB channels with equal probability, but this
works poorly in a dense volume after many bounces. Now we take into account
the throughput and single scattering albedo.

This makes it a little more practical to do brute force SSS with volumes, but
is still very inefficient because we do direct light sampling at every volume
bounce even when inside an opaque mesh. In theory there could be a light inside
the mesh so we can't automatically disable direct lighting.
2017-11-10 01:37:10 +01:00
21a535840d Fix T53270: crash with multiscatter GGX after recent refactoring.
In fact this was an existing issue when exceeding the number of available
closure, but it's more common now that we set the number to 0 for shadows
and emission
2017-11-09 20:28:00 +01:00
1ffa01b6f8 Fix (harmless) valgrind warning. 2017-11-09 20:28:00 +01:00
bd4bea3e98 Cycles: avoid reallocating tile denoising memory many times during render. 2017-11-09 20:28:00 +01:00
Dalai Felinto
08a023d7ca Cycles: Silence warning when building without OSL 2017-11-09 08:39:30 -02:00
Mai Lavelle
087331c495 Cycles: Replace __MAX_CLOSURE__ build option with runtime integrator variable
Goal is to reduce OpenCL kernel recompilations.

Currently viewport renders are still set to use 64 closures as this seems to
be faster and we don't want to cause a performance regression there. Needs
to be investigated.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2775
2017-11-09 01:04:06 -05:00
26f39e6359 Cycles: add bevel shader, for raytrace based rounded edges.
The algorithm averages normals from nearby surfaces. It uses the same
sampling strategy as BSSRDFs, casting rays along the normal and two
orthogonal axes, and combining the samples with MIS.

The main concern here is that we are introducing raytracing inside
shader evaluation, which could be quite bad for GPU performance and
stack memory usage. In practice it doesn't seem so bad though.

Note that using this feature can easily slow down renders 20%, and
that if you care about performance then it's better to use a bevel
modifier. Mainly this is useful for baking, and for cases where the
mesh topology makes it difficult for the bevel modifier to work well.

Differential Revision: https://developer.blender.org/D2803
2017-11-07 22:35:12 +01:00
f79f386731 Code refactor: rename subsurface to local traversal, for reuse. 2017-11-07 22:35:12 +01:00
d0af56fe3b Cycles: antialias normal baking if the mesh has a bump map. 2017-11-07 22:35:12 +01:00
ff34e48911 Cycles: add an extra CUDA synchronize before rendering.
It should not be needed as far as I know, but just in case it fixes any
of the recent issues like T52572.
2017-11-07 22:35:12 +01:00
e74b229342 Fix incorrect MIS weights in Cycles with multiple lights.
This causes some difference in the classroom scene, where ray visibility
tricks are used and break the MIS balance. Otherwise there doesn't seem
to be much effect, but better to use the right formulas. Problem originally
identified by Lukas.
2017-11-07 22:35:12 +01:00
Sergey Sharybin
1a1fb5a47c Cycles: Cleanup, style 2017-11-07 13:55:58 +01:00
8a72be7697 Cycles: reduce closure memory usage for emission/shadow shader data.
With a Titan Xp, reduces path trace local memory from 1092MB to 840MB.
Benchmark performance was within 1% with both RX 480 and Titan Xp.

Original patch was implemented by Sergey.

Differential Revision: https://developer.blender.org/D2249
2017-11-05 20:48:33 +01:00
c571be4e05 Code refactor: sum transparent and absorption weights outside closures. 2017-11-05 18:13:44 +01:00
2c02a04c46 Code refactor: remove emission and background closures, sum directly. 2017-11-05 18:13:44 +01:00
cac3d4d166 Cycles: fix inefficient attribute map storage, saves 615MB in victor scene. 2017-11-05 18:00:48 +01:00
5801ef71e4 Code refactor: device memory cleanups, preparing for mapped host memory. 2017-11-05 15:22:04 +01:00
5475314f49 Cycles: reserve CUDA local memory ahead of time.
This way we can log the amount of memory used, and it will be important
for host mapped memory support.
2017-11-05 15:22:04 +01:00
33b5e8daff Code refactor: replace CUDA array with linear memory for 1D and 2D textures.
This is a prequisite for getting host memory allocation to work. There appears
to be no support for 3D textures using host memory. The original version of
this code was written by Stefan Werner for D2056.
2017-11-04 02:23:00 +01:00
6ec599c682 Fix T53247: mixed CPU + GPU render wrong texture limits. 2017-11-03 20:32:29 +01:00
50c129760d Fix Cycles showing empty tiles while they are being denoised. 2017-11-02 15:23:55 +01:00
ff97dcebf3 Fix T53182: cancelling save buffers + denoising render clears image. 2017-11-02 14:31:05 +01:00
Mai Lavelle
5cb8730689 Cycles: Add another limit to OpenCL memory usage
Some drivers may report very large allocation sizes, which could cause
unnecessary memory usage. This is now limited to 2gb which should
still be enough to get the needed performance benefits without waste.
2017-11-02 08:14:21 -04:00
Sergey Sharybin
71f46bc367 Cycles: Add utility function to distinguish between scatter and absorption volume ID 2017-11-01 11:10:51 +01:00
Sergey Sharybin
5d7138c08a Cycles: Cleanup, make it more obvious what preprocessor belongs to 2017-11-01 11:10:10 +01:00
Sergey Sharybin
7f45acee80 Cycles: Cleanup, delete trailing whitespace 2017-11-01 11:06:55 +01:00
Sergey Sharybin
5296c2e099 Experiment with adding output file meta data from render engine
The idea is to make it possible to report extra meta data from
render engine to the file writing. This way we can provide
additional information such as number of samples rendered by
resumable Cycles rendering so we can easily combine files back.

Currently only report number of samples from Cycles when rendering
a single render-layer scene. This is something what was required
here at the studio. We can easily extend that further.

Ideally we would also need to support non-string metadata, but
that's for later.

Reviewers: mont29, campbellbarton

Reviewed By: mont29, campbellbarton

Subscribers: sybren, candreacchio

Differential Revision: https://developer.blender.org/D2502
2017-10-31 15:05:53 +01:00
Sergey Sharybin
46963f359d Cycles: Bump version number to 1.9.0
This matches Blender Release 2.79.
2017-10-31 13:34:34 +01:00
Sergey Sharybin
39671ac504 Fix crash of standalone app after recent refactor 2017-10-31 13:34:23 +01:00
bbc7eb8ae5 Cycles: restore SOBOL_SKIP hack, for some cases where it helps still. 2017-10-29 16:44:20 +01:00
171c4e982f Cycles: use AO factor to let user adjust intensity of AO bounces.
We are already using the AO distance, so might as well offer this extra
control over the intensity. Useful when an interior scene is supposed to
be significantly darker than the background shader.
2017-10-25 21:46:23 +02:00
83877632a3 Fix one more assert being triggered due to recent changes. 2017-10-25 01:22:16 +02:00
34fe3f9c06 Code refactor: remove MEM_WRITE_ONLY, always use MEM_READ_WRITE.
It's unlikely the driver can do useful optimizations with this, and if
we sum multiple samples we are reading from the memory anyway.
2017-10-24 23:53:09 +02:00
fe253389e0 Fix Cycles gtests build on macOS. 2017-10-24 17:52:20 +02:00
ec49503a33 Fix T53146: incomplete multi GPU and CPU + GPU memory statistics.
Part due to recent changes, part old bug.
2017-10-24 17:40:43 +02:00
Sergey Sharybin
e03df90bf3 Cycles: Fix compilation in debug mode
Please check compilation before committing refactor changes!
2017-10-24 12:09:02 +02:00
Sergey Sharybin
eccd18a91f Cycles: Fix compilation error without C++11 2017-10-24 11:14:01 +02:00
Sergey Sharybin
d0f48d33f4 Cycles: Fix memory leak in test and simplify code 2017-10-24 11:12:28 +02:00
Sergey Sharybin
1dd33b2f23 Cycles: Fix test compilation failure after recent refactor
The test will leak CPU devices, but is all passing other than that.
Leak will be fixed shortly.

P.S. Committing code refactor without running regression tests, tsk ;)
2017-10-24 10:48:16 +02:00
a1aad1f8d1 Fix T53134: denoising with CPU + GPU render leaves some tiles noisy. 2017-10-24 04:09:48 +02:00
070a668d04 Code refactor: move more memory allocation logic into device API.
* Remove tex_* and pixels_* functions, replace by mem_*.
* Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices.
* No longer create device_memory and call mem_* directly, always go
  through device_only_memory, device_vector and device_pixels.
2017-10-24 01:25:19 +02:00
aa8b4c5d81 Code refactor: use device_only_memory and device_vector in more places. 2017-10-24 01:25:13 +02:00
7ad9333fad Code refactor: store device/interp/extension/type in each device_memory. 2017-10-24 01:03:59 +02:00
ae41f38f78 Code refactor: pass device to scene, check OSL with device info. 2017-10-24 01:03:59 +02:00
57a0cb797d Code refactor: avoid some unnecessary device memory copying. 2017-10-21 20:58:28 +02:00
92ec4863c2 Code refactor: simplify image device memory allocation. 2017-10-21 20:58:28 +02:00
0836795a0d Fix issue with resumable rendering in recent changes. 2017-10-21 20:57:52 +02:00
6199a606a6 Cycles: disable progressive refine if denoising or save buffers is used.
Progressive refine undoes memory saving from save buffers, so enabling
both does not make much sense. Previously enabling progressive refine
would disable denoising, but it should be the other way around since
denoise actually affects the render result.

Includes some code refactor for progressive refine render buffers, and
avoids recomputing tiles for each progressive sample.
2017-10-21 20:29:21 +02:00
dc9eb8234f Cycles: combined CPU + GPU rendering support.
CPU rendering will be restricted to a BVH2, which is not ideal for raytracing
performance but can be shared with the GPU. Decoupled volume shading will be
disabled to match GPU volume sampling.

The number of CPU rendering threads is reduced to leave one core dedicated to
each GPU. Viewport rendering will also only use GPU rendering still. So along
with the BVH2 usage, perfect scaling should not be expected.

Go to User Preferences > System to enable the CPU to render alongside the GPU.

Differential Revision: https://developer.blender.org/D2873
2017-10-21 20:13:44 +02:00
3df2e6d76b Fix T53109: denoising variance debug passes not working after recent changes. 2017-10-20 14:41:24 +02:00
Sergey Sharybin
910dd7fb1b Cycles: Add extra logging in CUDA device detection code 2017-10-19 11:26:10 +02:00
d85a0a722e Fix part of T53038: principled BSDF clearcoat weight has no effect with 0 roughness. 2017-10-18 23:35:54 +02:00
Sergey Sharybin
01a0649354 Cycles: Fix wrong shading when some mesh triangle has non-finite coordinate
This is fully unpredictable for artists when one damaged object makes the whole
scene to render incorrectly. This involves two main changes:

- It is not enough to check triangle bounds to be valid when building BVH.
  This is because triangle might have some finite vertices and some non-finite.

- We shouldn't add non-finite triangle area to the overall area for MIS.
2017-10-18 12:19:53 +02:00
92611dada6 Fix T53098, T53079: OpenCL world texture errors after recent changes. 2017-10-18 03:13:25 +02:00
Campbell Barton
99520e3f92 Cleanup: use 'e' prefix for enum typedefs
Convention was only followed loosely,
apply to DNA where changes aren't likely to conflict.

(Skipped ModifierType for eg).
2017-10-17 13:49:20 +11:00
811dbf5525 Code cleanup: deduplicate primitive refit code. 2017-10-15 21:53:58 +02:00
2e50add164 Fix OpenCL performance regression after cubic interpolation.
Reorganize code to reduce register pressure.
2017-10-15 17:46:50 +02:00
Sergey Sharybin
5ea729845d Fix T53048: OSL Volume is broken in Blender 2.79
Was a mistake in optimization commit which was disconnecting closures and nodes
which does not make sense for volume output.

OSL script we can't ignore and can't currently know in advance if it's a proper
volume shader or not. So we never disconnect OSL nodes from volume output.

This is a good candidate for corrective release.
2017-10-11 15:22:40 +05:00
Sergey Sharybin
4fce3c7ac0 Cycles: Speedup up tangent space calculation
This patch goes away form using C++ RNA during tangent space calculation which
avoids quite a bit of overhead. Now all calculation is done using data which
already exists in ccl::Mesh. This means, tangent space is now calculated from
triangles, which doesn't seem to be any different (at least as far as regression
tests are concerned).

One of the positive sides is that this change makes it possible to move tangent
space calculation from blender/ to render/ so we will have Cycles standalone
supporting tangent space.

Reviewers: brecht, lukasstockner97, campbellbarton

Differential Revision: https://developer.blender.org/D2810
2017-10-11 13:19:15 +05:00
Sergey Sharybin
a421607569 Cycles: Add utility function to calculate triangle's normal 2017-10-11 13:18:59 +05:00
Sergey Sharybin
552d15c976 Cycles: Add utility function to remove given attribute 2017-10-11 13:18:59 +05:00
Sergey Sharybin
4782000fd5 Cycles: Fix possible race condition when initializing devices list 2017-10-11 12:48:19 +05:00
Sergey Sharybin
8d73ba58b6 Cycles: Fix compilation of sm_20 and sm_21 kernels
Was broken since the bicubic commit for GPU support.
2017-10-10 12:26:02 +05:00
e360d003ea Cycles: schedule more work for non-display and compute preemption CUDA cards.
This change affects CUDA GPUs not connected to a display or connected to a
display but supporting compute preemption so that the display does not
freeze. I couldn't find an official list, but compute preemption seems to be
only supported with GTX 1070+ and Linux (not GTX 1060- or Windows).

This helps improve small tile rendering performance further if there are
sufficient samples x number of pixels in a single tile to keep the GPU busy.
2017-10-08 21:12:16 +02:00
Mathieu Menuet
5aa08eb3cc Fix T53017: Cycles not detecting AMD GPU when there is an NVidia GPU too.
Best guess is that cuInit() somehow interferes with the AMD graphics driver
on Windows, and switching the initialization order to do OpenCL first seems
to solve the issue.
2017-10-08 18:36:02 +02:00
cdb0b3b1dc Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading. 2017-10-08 13:17:33 +02:00
f61c340bc1 Cycles: OpenCL bicubic and tricubic texture interpolation support. 2017-10-08 02:55:44 +02:00
c040dedc12 Fix incorrect MIS with principled BSDF and specular roughness 0. 2017-10-07 22:10:02 +02:00
d7eabc6765 Code cleanup: simplify cmake kernel install. 2017-10-07 15:32:20 +02:00
2d92988f6b Cycles: CUDA bicubic and tricubic texture interpolation support.
While cubic interpolation is quite expensive on the CPU compared to linear
interpolation, the difference on the GPU is quite small.
2017-10-07 15:30:57 +02:00
23098cda99 Code refactor: make texture code more consistent between devices.
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-07 14:53:14 +02:00
Sergey Sharybin
83ce02879f Cycles: Fix possible race condition when generating Beckmann table
Two issues here:

- Checking table size to be non-zero is not a proper way to go here. This is
  because we first resize the table and then fill it in. So it was possible that
  non-initialized table was used.

  Trickery with using temporary memory and then doing table.swap() might work,
  but we can not guarantee that table size will be set after the data pointer.

- Mutex guard was useless, because every thread was using own mutex. Need to
  make mutex guard static so all threads are using same mutex.
2017-10-06 21:06:15 +05:00
Sergey Sharybin
837383ac78 Cycles: Cleanup, indendation 2017-10-06 19:33:59 +05:00
Sergey Sharybin
a950af8e24 Fix T53012: Shadow catcher creates artifacts on contact area
The issue was caused by light sample being evaluated to nan at some point.
This is root of the cause which is to be fixed, but is very hard to trace down
especially via ssh (the issue only happens on AVX2 release build). Will give it
a closer look when back to my AVX2 machine.

For until then this is a good check to have anyway, it corresponds to what's
happening in regular radiance sum.
2017-10-06 17:27:34 +05:00
Sergey Sharybin
0d3c8d0701 Cycles: Cleanup, indentation and wrapping 2017-10-06 16:54:37 +05:00
4537e85584 Fix T53001: more workarounds for crash in AMD compiler with recent drivers. 2017-10-05 17:57:58 +02:00
fb99ea79f8 Code refactor: split displace/background into separate kernels, remove luma. 2017-10-05 17:57:58 +02:00
49199963bf Fix incorrect CUDA remaining time estimate after previous commit. 2017-10-04 23:25:51 +02:00
6da6f8d33f Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL.
The work size is still very conservative, and this doesn't help for progressive
refine. For that we will need to render multiple tiles at the same time. But this
should already help for denoising renders that require too much memory with big
tiles, and just generally soften the performance dropoff with small tiles.

Differential Revision: https://developer.blender.org/D2856
2017-10-04 21:58:47 +02:00
77f300e2a9 Fix use of uninitialized memory in Cycles normal baking. 2017-10-04 21:11:14 +02:00
5bb677e592 Code refactor: zero render buffers outside of kernel.
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
2017-10-04 21:11:14 +02:00
12f4538205 Code refactor: use split variance calculation for mega kernels too.
There is no significant difference in denoised benchmark scenes and
denoising ctests, so might as well make it all consistent.
2017-10-04 21:11:14 +02:00
e3e16cecc4 Code refactor: remove rng_state buffer and compute hash on the fly.
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
2017-10-04 21:11:14 +02:00
5b7d6ea54b Code refactor: add WorkTile struct for passing work to kernel.
This makes sharing some code between mega/split in following commits a bit
easier, and also paves the way for rendering multiple tiles later.
2017-10-04 21:11:14 +02:00
660e8e59e7 Fix T52645, T52645: AMD OpenCL compiler crash with recent drivers.
Work around the bug by reshuffling code.
2017-10-04 21:00:46 +02:00
Sergey Sharybin
61d5c5a64f Fix T52981: 2D Curve shapes do not render untill extruded
Regression since 9298c53.
2017-10-03 15:29:39 +05:00
f55735e533 CMake: support CUDA 9 toolkit, and automatically disable sm_2x binaries.
Fermi cards (GTX 4xx and 5xx) are no longer supported with this version, so
we can keep supporting both CUDA 8 and 9 for a while.
2017-10-01 14:14:53 +02:00
9298c53e4c Fix T52943: don't export curves objects with no faces to Cycles.
Also skip any objects with zero ray visibility and meshes with
zero faces.
2017-09-29 14:54:34 +02:00
d2bbd41b4e Fix Cycles OpenCL compiler error after recent changes. 2017-09-29 14:54:10 +02:00
Kim Christensen
2a36ee16c1 Fix T52574: make Cycles rendered tile counter more clear.
Differential Revision: https://developer.blender.org/D2853
2017-09-28 15:18:53 +02:00
400e6f37b8 Cycles: reduce subsurface stack memory usage.
This is done by storing only a subset of PathRadiance, and by storing
direct light immediately in the main PathRadiance. Saves about 10% of
CUDA stack memory, and simplifies subsurface indirect ray code.
2017-09-28 15:18:43 +02:00
88520dd5b6 Code refactor: simplify CUDA context push/pop.
Makes it possible to call a function like mem_alloc() when the context is
already active. Also fixes some missing pops in case of errors.
2017-09-27 13:43:21 +02:00
Sergey Sharybin
cb6f07f59e Cycles: Cleanup, indentation 2017-09-25 11:15:54 +05:00
Sergey Sharybin
c0480bc972 Cycles: Fix compilation error of OpenCL megakernel on Apple 2017-09-23 17:07:19 +05:00
Sergey Sharybin
b460b8fb4a Cycles: Fix compilation error of megakernel on NVidia device
It is more readable to explicitly compare to NULL anyway.
2017-09-23 17:03:02 +05:00
07ec0effb6 Code cleanup: simplify kernel side work stealing code. 2017-09-21 22:29:18 +02:00
18a353dd24 Fix T52368: Cycles OSL trace() failing on Windows 32 bit. 2017-09-20 19:38:08 +02:00
14223357e5 Fix T52853: harmless Cycles test failure in debug mode. 2017-09-20 19:38:08 +02:00
90d4b823d7 Cycles: use defensive sampling for picking BSDFs and BSSRDFs.
For the first bounce we now give each BSDF or BSSRDF a minimum sample weight,
which helps reduce noise for a typical case where you have a glossy BSDF with
a small weight due to Fresnel, but not necessarily small contribution relative
to a diffuse or transmission BSDF below.

We can probably find a better heuristic that also enables this on further
bounces, for example when looking through a perfect mirror, but I wasn't able
to find a robust one so far.
2017-09-20 19:38:08 +02:00
095a01a73a Cycles: slightly improve BSDF sample stratification for path tracing.
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.
2017-09-20 19:38:08 +02:00
b3afc8917c Code cleanup: refactor BSSRDF closure sampling, for next commit. 2017-09-20 19:38:08 +02:00
d029399e6b Code cleanup: remove SOBOL_SKIP hack, seems no longer needed. 2017-09-20 19:38:08 +02:00
d750d182e5 Code cleanup: remove hack to avoid seeing transparent objects in noise.
Previously the Sobol pattern suffered from some correlation issues that
made the outline of objects like a smoke domain visible. This helps
simplify the code and also makes some other optimizations possible.
2017-09-20 19:38:08 +02:00
Carlo Andreacchio
ab9079f459 Fix Cycles adaptive compile without volumes broken after recent changes.
Differential Revision: https://developer.blender.org/D2847
2017-09-18 12:52:32 +02:00
Hristo Gueorguiev
6798a061b7 Cycles: Fix compilation error with OpenCL split kernel 2017-09-16 12:33:03 +02:00
Sergey Sharybin
7aafa32c09 Fix T51416: Blender Crashes while moving Sliders
The issue here was that removing datablock from main database will poke editors
update, which includes buttons context to free users of texture. Since Cycles
will free datablocks from job thread, it might crash Blender since main thread
might be in the middle of drawing.

Solved by exposing extra arguments to bpy.data.foo.remove() which indicates
whether we want to perform ID user count and interface updates. While scripts
shouldn't be using those normally, this is the only way to allow Cycles to skip
interface update when removing datablock.

Reviewers: mont29

Reviewed By: mont29

Differential Revision: https://developer.blender.org/D2840
2017-09-14 17:03:40 +05:00
32449e1b21 Code cleanup: store branch factor in PathState. 2017-09-13 15:24:14 +02:00
9e258fc641 Code cleanup: avoid used of uninitialized value in case of precision issue. 2017-09-13 15:24:14 +02:00
37d9e65ddf Code cleanup: abstract shadow catcher logic more into accumulation code. 2017-09-13 15:24:14 +02:00
f77cdd1d59 Code cleanup: deduplicate some branched and split kernel code.
Benchmarks peformance on GTX 1080 and RX 480 on Linux is the same for
bmw27, classroom, pabellon, and about 2% faster on fishy_cat and koro.
2017-09-13 15:24:14 +02:00
c4c450045d Code cleanup: tweak inlining for 2% better CUDA performance with hair. 2017-09-13 15:24:14 +02:00
Mathieu Menuet
659ba012b0 Cycles: change AO bounces approximation to do more glossy and transmission.
Rather than treating all ray types equally, we now always render 1 glossy
bounce and unlimited transmission bounces. This makes it possible to get
good looking results with low AO bounces settings, making it useful to
speed up interior renders for example.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2818
2017-09-12 15:37:35 +02:00
de6ecc82ed Fix rare firefly in volume equiangular sampling when sampling short distance. 2017-09-12 12:50:44 +02:00
cd6c9e9e5f Cycles: improve sample stratification on area lights for path tracing.
Previously we used a 1D sequence to select a light, and another 2D sequence
to sample a point on the light. For multiple lights this meant each light
would get a random subset of a 2D stratified sequence, which is not
guaranteed to be stratified anymore.

Now we use only a 2D sequence, split into segments along the X axis, one for
each light. The samples that fall within a segment then each are a stratified
sequence, at least in the limit. So for example for two lights, we split up
the unit square into two segments [0,0.5[ x [0,1[ and [0.5,1[ x [0,1[.

This doesn't make much difference in most scenes, mainly helps if you have a
few large area lights or some types of HDR backgrounds.
2017-09-12 12:45:29 +02:00
d454a44e96 Fix Cycles bug in RR termination, probability should never be > 1.0.
This causes render differences in some scenes, for example fishy_cat
and pabellon scenes render brighter in a few spots. This is an old
bug, not due to recent RR changes.
2017-09-12 12:43:26 +02:00
Sergey Sharybin
467d92b8f1 Cycles: Tweaks to avoid compilation error of megakernel
Also moved code out of deep-inside ifdef block, otherwise it was quite confusing.
2017-09-12 13:33:46 +05:00
Sergey Sharybin
ae41a08288 Cycles: Attempt to work around compilation of sm_20 and sm_21
Disabled forceinline for those architectures, which seems to be compiling
successfully more often.

There might be ~3% slowdown based on quick tests, but better be rendering
something rather than failing to compile kernels again and again.

Those architectures will be doomed for abandon once we'll switch to toolkit 9.
2017-09-08 18:37:54 +02:00
ce1f2e271d Cycles: disable fast math flags, only use a subset.
Empty BVH nodes are set to NaN which must be preserved all the way to the
tnear <= tfar test which can then give false for empty nodes. This needs
strict semantices and careful argument ordering for min() and max(), so
the second argument is used if either of the arguments is NaN.

Fixes T52635: crash in BVH traversal with SSE4.1.

Differential Revision: https://developer.blender.org/D2828
2017-09-08 15:12:37 +02:00
c10ea88420 Fix T52660: CUDA volume texture rendering not working on Fermi GPUs. 2017-09-06 18:12:45 +02:00
2d407fc288 Fix T52661: mesh light shader using backfacing not working, after new sampling. 2017-09-06 13:51:48 +02:00
dd8016f708 Fix T52652: Cycles image box mapping has flipped textures.
This breaks backwards compatibility some in that 3 sides will be mapped
differently now, but difficult to avoid and can be considered a bugfix.
2017-09-06 13:51:45 +02:00
Sergey Sharybin
750e38a526 Cycles: Fix compilation error with CUDA after recent changes 2017-09-05 16:52:45 +02:00
Sergey Sharybin
f01e43fac3 Fix T52433: Volume Absorption color tint
Need to exit the volume stack when shadow ray laves the medium.

Thanks Brecht for review and help in troubleshooting!
2017-09-05 15:48:34 +02:00
Sergey Sharybin
b0bbb5f34f Cycles: Cleanup, style 2017-09-05 12:43:02 +02:00
Sergey Sharybin
885c0a5f90 Cycles: Fix compilation warning 2017-09-04 13:28:15 +02:00
Sergey Sharybin
33249f6987 Fix T52533: Blender shuts down when rendering duplicated smoke domain 2017-09-04 13:14:54 +02:00
Campbell Barton
18b7f05480 Cycles: follow strict class naming convention 2017-09-01 16:08:25 +10:00
Sergey Sharybin
018137f762 Cycles: Cleanup, indentation and trailing whitespace 2017-08-31 14:47:49 +02:00
Sergey Sharybin
8b9e1707a1 Cycles: Fix typo in comment 2017-08-31 13:24:32 +02:00
Stefan Werner
68dfa0f1b7 Fixing T52477 - switching from custom ray/triangle intersection code to the one from util_intersection.h. This fixes the bug and makes the code more readable and maintainable. 2017-08-30 11:48:49 +02:00
Mai Lavelle
124ffb45a6 Cycles: Fix build with networking enabled 2017-08-30 00:19:44 -04:00
1457e5ea73 Fix Cycles Windows render errors with BVH2 CPU rendering.
One problem is that it was always using __mm_blendv_ps emulation even if the
instruction was supported. The other that the emulation function was wrong.

Thanks a lot to Ray Molenkamp for tracking this one down.
2017-08-29 22:55:35 +02:00
Sergey Sharybin
12f627cd9f Cycles: Cleanup, naming of variable
Always use b_ prefix for C++ RNA data.
2017-08-25 21:30:20 +02:00
Sergey Sharybin
ee61a97632 Cycles: Add assert to catch possibly wrong logic 2017-08-25 21:30:20 +02:00
Lukas Stockner
f9a3d01452 Cycles: Mark pixels with negative values as outliers
If a pixel has negative components, something already went wrong, so the best option is to just ignore it.

Should be good for 2.79.
2017-08-25 17:46:15 +02:00
Sergey Sharybin
90299e4216 Cycles: Add utility function to query current value of scoped timer 2017-08-25 14:27:34 +02:00
Sergey Sharybin
12d527f327 Cycles: Correct logging of sued CPU intrisics 2017-08-25 14:27:34 +02:00
Sergey Sharybin
dfae3de6bd Cycles: Fix stack overflow during traversal caused by floating overflow
Would be nice to be able to catch this with assert as well, will see what would
be the best way to do this/.\

Need to verify with Mai that this solves crash for her and maybe consider
porting this to 2.79.
2017-08-25 14:27:34 +02:00
Sergey Sharybin
436d1b4e90 Cycles: FIx issue with -0 being considered a non-finite value 2017-08-24 14:32:56 +02:00
76b74a93a8 Fix Cycles CUDA transparent shadow error after recent fix in c22b52c.
Fishy cat benchmark was rendering with wrong shadows. Cause is unclear,
adding printf or rearranging code seems to avoid this issue, possibly a
compiler bug. This reverts the fix and solves the OSL bug elsewhere.
2017-08-24 03:43:02 +02:00
b85d36d811 Code cleanup: remove shader context.
This was needed when we accessed OSL closure memory after shader evaluation,
which could get overwritten by another shader evaluation. But all closures
are immediatley converted to ShaderClosure now, so no longer needed.
2017-08-24 03:43:02 +02:00
Mai Lavelle
579edb1510 Cycles: Add maximum depth stat to bvh builder 2017-08-23 06:54:26 -04:00
Mai Lavelle
2540741dee Fix implementation of atomic update max and move to a central location
While unlikely to have had any serious effects because of limited use, the
previous implementation was not actually atomic due to a data race and
incorrectly coded CAS loop. We also had duplicates of this code in a few
places, it's now been moved to a single location with all other atomic
operations.
2017-08-23 06:54:25 -04:00
Sergey Sharybin
5c60721c9e Fix T51805: Overlapping volumes renders incorrect on AMD GPU
We need to make sure we can store all volume closures for all objects in volume
stack. This is a bit tricky to detect what would be the "nestness" level of
volumes so for now use maximum possible stack depth. Might cause some slowdown,
but better to give reliable render output than to fail quickly.

Should be safe for 2.79 after extra eyes.
2017-08-23 12:35:23 +02:00
049932c4c3 Fix panorama render crash with split kernel, due to incorrect buffer pointer.
Also some refactoring to clarify variable usage scope.
2017-08-22 00:41:07 +02:00
296d74c4b1 Cycles: reorganize Performance panel layout, move viewport BVH type to debug. 2017-08-21 19:05:17 +02:00
43a6cf1504 Cycles: attempt to recover from crashing CUDA/OpenCL drivers on Windows.
I don't know if this will actually work, needs testing. Ref T52064.
2017-08-20 23:18:25 +02:00
41e6068c76 Revert "Cycles: remove square samples option."
This reverts commit 757c24b6bceaeeae95f743b72b6a7040880a0ebf.

We'll revisit this when doing deeper sampling changes.
2017-08-20 23:46:05 +02:00
1d1ddd48db Fix T52470: cycles OpenCL hair rendering not working after recent changes. 2017-08-20 23:32:20 +02:00
ce0fce2207 Code cleanup: deduplicate some bsdf node methods. 2017-08-20 17:37:22 +02:00
b5f8063fb9 Cycles: support baking normals plugged into BSDFs, averaged with closure weight. 2017-08-20 16:51:53 +02:00
0b07c2c8a2 Code cleanup: remove copy of shader graph for bump, no longer needed. 2017-08-20 14:27:51 +02:00
c22b52cd36 Fix T52452: OSL trace broken after shadow catcher recent changes.
We should only early out with any hit in BVH traversal if the only visibility
bits used are opaque shadow. Not when opaque shadow is one of multiple bits.
2017-08-19 18:14:16 +02:00
cfa8b762e2 Code cleanup: move rng into path state.
Also pass by value and don't write back now that it is just a hash for seeding
and no longer an LCG state. Together this makes CUDA a tiny bit faster in my
tests, but mainly simplifies code.
2017-08-19 18:14:16 +02:00
4d428d14af Fix T52443: Cycles OpenCL build error after recent mesh lights changes. 2017-08-19 01:02:55 +02:00
Stefan Werner
7a4696197d Cycles: Fix for a division by zero that could happen with solid angle triangle light sampling 2017-08-17 15:07:59 +02:00
Stefan Werner
8141eac2f8 Improved triangle sampling for mesh lights
This implements Arvo's "Stratified sampling of spherical triangles". Similar to how we sample rectangular area lights, this is sampling triangles over their solid angle. It does significantly improve sampling close to the triangle, but doesn't do much for more distant triangles. So I added a simple heuristic to switch between the two methods. Unfortunately, I expect this to add render time in any case, even when it does not make any difference whatsoever. It'll take some benchmarking with various scenes and hardware to estimate how severe the impact is and if it is worth the change.

Reviewers: #cycles, brecht

Reviewed By: #cycles, brecht

Subscribers: Vega-core, brecht, SteffenD

Tags: #cycles

Differential Revision: https://developer.blender.org/D2730
2017-08-17 12:44:32 +02:00
Lukas Stockner
5492d2cb67 Cycles: Calculate correct remaining time when using a larger pixel size 2017-08-17 02:00:44 +02:00
Lukas Stockner
66c1b23aa1 Cycles/BI: Add a pixel size option for speeding up viewport rendering
This patch adds "Pixel Size" to the performance options, which allows to render
in a smaller resolution, which is especially useful for displays with high DPI.

Reviewers: Severin, dingto, sergey, brecht

Reviewed By: brecht

Subscribers: Severin, venomgfx, eyecandy, brecht

Differential Revision: https://developer.blender.org/D1619
2017-08-15 01:22:40 +02:00
Stefan Werner
86eb8980d3 Cycles: Fixed broken camera motion blur when motion was not set to center on frame
Reviewers: #cycles, sergey

Reviewed By: #cycles, sergey

Subscribers: sergey

Differential Revision: https://developer.blender.org/D2787
2017-08-14 20:24:30 +02:00
Sergey Sharybin
4e6324dd59 Cycles: Guard memcpy to potentially re-allocating memory with lock
Basically, make re-alloc and memcpy from the same lock, otherwise one
thread might be re-allocating thread while another one is trying to
copy data there.

Reported by Mohamed Sakr in IRC, thanks!
2017-08-14 14:55:47 +02:00
dc7fcebb33 Code cleanup: make L_transparent part of PathRadiance. 2017-08-13 01:19:07 +02:00
7542282c06 Code cleanup: make DebugData part of PathRadiance. 2017-08-13 01:19:07 +02:00
fce405059f Code cleanup: make it easier to test only Sobol, CMJ or Pseudorandom. 2017-08-13 01:19:07 +02:00
8f97108353 Cycles: optimize CPU split kernel data init. 2017-08-12 20:43:34 +02:00
601f94a3c2 Code cleanup: remove unused Cycles random number code. 2017-08-12 20:40:38 +02:00
6919393a51 Fix T52372: CUDA build error after recent changes. 2017-08-12 20:37:06 +02:00
d7639d57dc Fix T52368: OSL trace() crash after recent changes. 2017-08-12 14:32:52 +02:00
85ad248c36 Code cleanup: fix warning and improve terminology. 2017-08-12 13:18:05 +02:00
Sergey Sharybin
2e25754ecd Cycles: Clarify new argument in PathRadiance 2017-08-11 13:49:50 +02:00
Sergey Sharybin
bd069a89aa Fix T52229: Shadow Catcher artifacts when under transparency
Added some extra tirckery to avoid background being tinted dark with transparent
surface. Maybe a bit hacky, but seems to work fine.
2017-08-11 13:49:50 +02:00
757c24b6bc Cycles: remove square samples option.
It doesn't seem that useful in practice, was mostly added to match some
other renderers but also seems to be causing user confusing and accidental
long render times. So let's just keep the UI simple and remove this.

Differential Revision: https://developer.blender.org/D2768
2017-08-11 01:10:56 +02:00
8a7c207f0b Cycles: change defaults for filter glossy, clamp and branched path AA.
We're adding some bias by default, which now I think is the right thing
to do from a usability point of view since you really need to use those
options anyway to get clean renders in a practical time.

Differential Revision: https://developer.blender.org/D2769
2017-08-11 01:10:50 +02:00
267e75158a Fix T52322: denoiser broken on Windows after recent changes.
It's not clear why this only happened on Windows, but the code
was wrong and should do a bitcast here instead of conversion.
2017-08-11 01:09:35 +02:00
Sergey Sharybin
422fddab87 Cycles: Fix instanced shadow catcher objects influencing each other 2017-08-10 09:22:33 +02:00
Sergey Sharybin
5a618ab737 Cycles: De-duplicate trace-time object visibility calculation
We already have enough files to worry about in BVH builders. no need to add yet
another copy-paste code which is tempting to be running out of sync.
2017-08-10 09:21:02 +02:00
Sergey Sharybin
176ad9ecdd Cycles: Remove ulong usage
This is a bit confusing, especially when one mixes OpenCL code where ulong equals
to uint64_t with CPU side code where ulong is expected to be something else from
the naming.

This commit makes it so we use explicit name, common on all platforms.
2017-08-09 14:08:58 +02:00
Mai Lavelle
55d28e604e Cycles: Proper fix for recent OpenCL image crash
Problem was that some code checks to see if device_pointer is null or
not and the new allocator wasn't even setting the pointer to anything
as it tracks memory location separately. Setting the pointer to non
null keeps all users of device_pointer happy.
2017-08-09 04:27:39 -04:00
Mai Lavelle
06bf34227b Revert "Cycles: Fix crash changing image after recent OpenCL changes"
This reverts commit f2809ae0a671057caa1005e2b9cc91648c33dd1f.
2017-08-09 04:24:03 -04:00
Sergey Sharybin
99c13519a1 Cycles: More fixes for Windows 32 bit
- Apparently MSVC does not support compound literals
  in C++ (at least by the looks of it).

- Not sure how opencl_device_assert was managing to
  set protected property of the Device class.
2017-08-08 22:32:51 +02:00
Sergey Sharybin
c961737d0f Cycles: Fix compilation error of filter kernels on 32 bit Windows
We don't enable global SSE optimizations in regular kernel, and we
keep those disabled on Linux 32bit.

One possible workaround would be to pass arguments by ccl_ref, but
that is quite a few of code which better be done accurately.
2017-08-08 22:01:17 +02:00
Sergey Sharybin
f2809ae0a6 Cycles: Fix crash changing image after recent OpenCL changes
Steps to reproduce:
- Create shader Image texture -> Diffuse BSDF -> Output. Do NOT select image yet!
- Start viewport render.
- Select image from the ID browser of Image Texture node.

Thing is: with the memory manager we always need to inform device that memory
was freed.
2017-08-08 17:17:04 +02:00
Sergey Sharybin
0e57282999 Cycles: Fix compilation error without C++11
Common folks, nobody considered master a C++11 only branch. Such decision is to
be done officially and will involve changes in quite a few infrastructure related
areas.
2017-08-08 17:02:26 +02:00
Sergey Sharybin
19d19add1e Cycles: Cleanup, de-duplicate function parameter list
Was only needed to sue const reference on CPU. Now it is done using ccl_ref.
2017-08-08 15:27:25 +02:00
Sergey Sharybin
fd397a7d28 Cycles: Add utility macro ccl_ref
It is defined to & for CPU side compilation, and defined to an empty for any GPU
platform. The idea here is to use this macro instead of #ifdef block with bunch
of duplicated lines just to make it so CPU code is efficient.

Eventually we might switch to references on CUDA as well, but that would require
some intensive testing.
2017-08-08 15:27:25 +02:00
Mai Lavelle
ec8ae4d5e9 Cycles: Pack kernel textures into buffers for OpenCL
Image textures were being packed into a single buffer for OpenCL, which
limited the amount of memory available for images to the size of one
buffer (usually 4gb on AMD hardware). By packing textures into multiple
buffers that limit is removed, while simultaneously reducing the number
of buffers that need to be passed to each kernel.

Benchmarks were within 2%.

Fixes T51554.

Differential Revision: https://developer.blender.org/D2745
2017-08-08 07:12:04 -04:00
Sergey Sharybin
451ccf7396 Cycles: Cleanup, move curve intersection functions to own file
This way curve file becomes much shorter and it's also easier to write a
benchmark application to check performance before/after future changes.
2017-08-07 20:53:30 +02:00
Sergey Sharybin
77a7a7f455 Cycles: Cleanup, trailign whitespace 2017-08-07 20:53:30 +02:00
Sergey Sharybin
95fe9b2617 Cycles: Cleanup, remove bvh prefix from curve functions
Those are nothing to do with BVH, and can be used separately.
2017-08-07 20:53:30 +02:00
Sergey Sharybin
a4bbce8949 Cycles: Fix compilation error on NVidia OpenCL after recent refactor
Still need to verify this is proper thing to do for AMD OpenCL. At least now
i can compile OpenCL kernel on my laptop with sm21 card.
2017-08-07 20:52:24 +02:00
fc38276d74 Fix Cycles shadow catcher objects influencing each other.
Since all the shadow catchers are already assumed to be in the footage,
the shadows they cast on each other are already in the footage too. So
don't just let shadow catchers skip self, but all shadow catchers.

Another justification is that it should not matter if the shadow catcher
is modeled as one object or multiple separate objects, the resulting
render should be the same.

Differential Revision: https://developer.blender.org/D2763
2017-08-07 17:54:26 +02:00
dc4d850d10 Fix Windows build errors with recent Cycles SIMD refactoring. 2017-08-07 17:54:26 +02:00
Sergey Sharybin
580741b317 Cycles: Cleanup, space after keyword 2017-08-07 14:47:51 +02:00
ee77c1e917 Code refactor: use float4 instead of intrinsics for CPU denoise filtering.
Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
a24fbf3323 Code refactor: add, remove, optimize various SSE functions.
* Remove some unnecessary SSE emulation defines.
* Use full precision float division so we can enable it.
* Add sqrt(), sqr(), fabs(), shuffle variations, mask().
* Optimize reduce_add(), select().

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
a8cc0d707e Code refactor: split defines into separate header, changes to SSE type headers.
I need to use some macros defined in util_simd.h for float3/float4, to emulate
SSE4 instructions on SSE2. But due to issues with order of header includes this
was not possible, this does some refactoring to make it work.

Differential Revision: https://developer.blender.org/D2764
2017-08-07 14:01:24 +02:00
5e4bad2c00 Cycles: remove option to disable transparent shadows globally.
We already detect this automatically based on shading nodes and per shader
settings, and performance of this option is ok now all devices.

Differential Revision: https://developer.blender.org/D2767
2017-08-07 14:01:24 +02:00
2a74f36dac Fix Cycles CUDA adaptive megakernel build error. 2017-08-07 00:27:08 +02:00
45dcd20ca9 Cycles: CUDA split performance tweaks, still far from megakernel.
On Pabellon, 25.8s mega, 35.4s split before, 32.7s split after.
2017-08-05 14:32:59 +02:00
cd023b6cec Cycles: remove min bounces, modify RR to terminate less.
Differential Revision: https://developer.blender.org/D2766
2017-08-04 23:11:03 +02:00
Sergey Sharybin
0d01cf4488 Cycles: Extra tweaks to performance of header expansion
Two main things here:

1. Replace all unsafe for #line directive characters into a single loop,
   avoiding multiple iterations and multiple temporary strings created.

2. Don't merge token char by char but calculate start and end point and
   then copy all substring at once.

This gives about 15% speedup of source processing time. At this point
(with all previous commits from today) we've shrinked down compiled
sources size from 108 MB down to ~5.5 MB and lowered processing time
from 4.5 sec down to 0.047 sec on my laptop running Linux (this was a
constant time which Blender will always spent first time loading kernel,
even if we've got compiled clbin).
2017-08-03 08:07:06 +02:00
Sergey Sharybin
f879cac032 Cycles: Avoid some expensive operations in header expansions
Basically gather lines as-is during traversal, avoiding allocating
memory for all the lines in headers.

Brings additional performance improvement abut 20%.
2017-08-02 20:59:19 +02:00
Sergey Sharybin
a280697e77 Cycles: Support "precompiled" headers in include expansion algorithm
The idea here is that it is possible to mark certain include statements
as "precompiled" which means all subsequent includes of that file will
be replaced with an empty string.

This is a way to deal with tricky include pattern happening in single
program OpenCL split kernel which was including bunch of headers about
10 times.

This brings preprocessing time from ~1sec to ~0.1sec on my laptop.
2017-08-02 20:59:19 +02:00
Sergey Sharybin
4ad39964fd Cycles: Speed up #include expansion algorithm
The idea is to re-use files which were already processed. Gives about 4x speedup
of processing time (~4.5sec vs 1.0sec) on my laptop for the whole OpenCL kernel.

For users it will mean lower delay before OpenCL rendering might start.
2017-08-02 20:59:19 +02:00
Jeff Knox
e93804318f Fix T51450: viewport render time keeps increasing after render is done.
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2747
2017-07-25 01:47:04 +02:00
9e929c911e Fix Cycles multi scatter GGX different render results with Clang and GCC.
The order of evaluation of function arguments is undefined, and the order
was reversed between these compilers. This was causing regressions tests
to give different results between Linux and macOS.
2017-07-23 23:25:12 +02:00
e982ebd6d4 Fix T52152: allow zero roughness for Cycles principled BSDF, don't clamp. 2017-07-22 23:58:51 +02:00
ec831ee7d1 Fix Cycles denoising NaNs with a 1 sample renders.
This was causing different render results with different compilers. We
can't do much useful with 1 sample, but better for debugging.
2017-07-22 23:58:51 +02:00
b4528d8897 Fix use of uninitialized value in Cycles, probably did not cause a bug. 2017-07-22 21:29:10 +02:00
db8bc1d982 Fix a few harmless maybe uninitialized warnings with GCC 5.4.
GCC seems to detect uninitialized into function calls now, but then isn't
always smart enough to see that it is actually initialized. Disabling this
warning entirely seems a bit too much, so initialize a bit more now.
2017-07-21 00:54:58 +02:00
2b132fc3f7 Fix T52135: Cycles should not keep generated/packed images in memory after render. 2017-07-20 23:47:05 +02:00
a4cd7b7297 Fix potential memory leak in Cycles loading of packed/generated images. 2017-07-20 23:46:58 +02:00
3b12a71972 Fix T52125: principled BSDF missing with macOS OpenCL. 2017-07-20 15:15:43 +02:00
Stefan Werner
c1ca3c8038 Cycles: fixed the SM_2x CUDA kernel build that I broke in my previous commit 2017-07-20 13:28:34 +02:00
Stefan Werner
4bc6faf9c8 Fix T52107: Color management difference when using multiple and different GPUs together
This commit unifies the flattened texture slot names for bindless and regular CUDA textures. Texture indices are now identical across all CUDA architectures, where before Fermi used different indices, which lead to problems when rendering on multi-GPU setups mixing Fermi with newer hardware.
2017-07-20 10:03:27 +02:00
Sergey Sharybin
5f35682f3a Fix T52021: Shadow catcher renders wrong when catcher object is behind transparent object
Tweaked the path radiance summing and alpha to accommodate for possible contribution of
light by transparent surface bounces happening prior to shadow catcher intersection.

This commit will change the way how shadow catcher results looks when was behind semi
transparent object, but the old result seemed to be fully wrong: there were big artifacts
when alpha-overing the result on some actual footage.
2017-07-18 09:46:21 +02:00
Sergey Sharybin
d8906f30d3 Cycles: Remove meaningless camera ray check
In branched path tracing main loop is always a camera ray, with varying
number of transparent bounces.
2017-07-18 09:27:36 +02:00
Mai Lavelle
87164114a3 Cycles: Enable SSS from Principled BSDF only when actually in use
This gives speed up for the split kernel in scenes using the principled BSDF
but without subsurface scattering.
2017-07-12 04:40:24 -04:00
Mai Lavelle
1f933c94a7 Cycles: Fix comparison in principled BSDF
Could have lead to black pixels.
2017-07-11 23:41:22 -04:00
29ec0b1162 Fix T52027: OSL getattribute() crash, when optimizer calls it before rendering. 2017-07-11 22:39:51 +02:00
Sergey Sharybin
e26f61a2b5 Cycles: Disable OpenCL clFlush workarounds
This is something which was reported to work fine by Mai, Benjamin and
confirmed by myself. Disabling this workaround gains us some speedup:

                      Before           Now
bmw27                04:28.42        04:07.79
classroom            09:26.48        08:54.53
fishy_cat            08:44.01        08:18.70
koro                 09:17.98        08:57.18
pavillon_barcelone   12:26.64        11:52.81

Test environment is:
- Ubuntu 16.04, with all updates installed
- AMD RX 480 GPU
- amdgpu pro driver version 17.10-450821
2017-07-11 12:16:58 +02:00
3361f2107b Fix T51967: OSL crash after rendering finished (mainly on Windows). 2017-07-08 02:46:06 +02:00
Sergey Sharybin
fee7f688c3 Cycles: Fix ambiguity in call of min() function 2017-07-07 10:40:19 +02:00
Mai Lavelle
9c3f1ad003 Cycles: Add artificial memory limit debug option for OpenCL 2017-07-06 05:25:46 -04:00
Mai Lavelle
95b345b2fe Revert "Cycles: use std::min and max for extra overloads"
We already have this in util_algorithm.h

This reverts commit cff172c7621d89773baa99a9460f19056efb5f1e.
2017-07-06 04:21:29 -04:00
Mai Lavelle
f9963f29e8 Cycles: Dont allow global size to fall to zero 2017-07-05 20:19:15 -04:00
Mai Lavelle
222b96e5c7 Cycles: Detect out of memory before buffer allocation in OpenCL devices 2017-07-05 20:19:12 -04:00
Mai Lavelle
cff172c762 Cycles: use std::min and max for extra overloads 2017-07-05 19:43:34 -04:00
Sergey Sharybin
31f8ca5034 Cycles: Fix compilation error after recent logging changes
This file uses std::ostream for helper << operators, so need to make sure
corresponding header is included.
2017-07-05 20:40:55 +02:00
Sergey Sharybin
d37dd97e45 Cycles: Pass string by const reference rather than by value
Some of the functions might have been inlined, but others i don't see
how that was possible (don't think virtual functions can be inlined here).

In any case, better be explicitly optimal in the code.
2017-07-05 12:27:41 +02:00
Sergey Sharybin
58c456b12d Cycles: Fix compilation error when building without Glog and no C++11 2017-07-05 12:01:12 +02:00
Lukas Stockner
15fd758bd6 Fix T51950: Abnormally long Cycles OpenCL GPU render times with certain panoramic camera settings
The problem here was that when a "invalid" path is generated by the panoramic camera, it was tagged
as RAY_TO_REGENERATE with the intention of generating a new path in kernel_buffer_update.

However, since that state was not handled in kernel_queue_enqueue, kernel_buffer_update did not
process the path which resulted in an infinite loop.
2017-07-03 18:26:19 +02:00
Lukas Stockner
6782a6076c Cycles: Add missing split kernel to CPUDevice 2017-07-03 18:26:18 +02:00
Luca Rood
d48a9528ca Fix missing return error introduced by last commit
End of non-void function was being reached since
f5535fcb83fd7c1374697923b43565c9e303d225
2017-07-03 12:12:27 +02:00
f5535fcb83 Fi T51023: MixRGB constant folding not effective with clamp option. 2017-07-03 05:25:27 +02:00
cda24d0853 Fix T51855: Cycles emssive objects with NaN transform break lighting. 2017-07-03 05:04:43 +02:00
29c8c50442 Fix T51956: color noise with principled sss, radius 0 and branched path. 2017-07-02 19:21:08 +02:00
52b9516e03 Fix principled BSDF incorrectly missing subsurface component with base color black. 2017-07-02 18:22:24 +02:00
Mai Lavelle
c8fa716c06 Cycles: Use float constants instead of double 2017-06-29 23:07:18 -04:00
Mai Lavelle
56dcfcce05 Cycles: Disable baking in mega kernel when not in use to improve build times 2017-06-29 23:07:18 -04:00
Lukas Stockner
1f3fd8e60a Fix T51909: Cycles: Uninitialized closure normals for the Hair BSDF
As the title says, the normal wasn't set for the Hair BSDF because it wasn't
needed before. However, the denoiser uses it to store the feature passes, so
it needs to be set now.
2017-06-28 21:32:02 +02:00
Lukas Stockner
1979176088 Cycles: Fix excessive sampling weight of glossy Principled BSDF components
If there was any specularity in the Principled BSDF, it would get a sampling
weight of one regardless of its actual impact.

This commit makes Cycles estimate the contribution of the component and adjust
the weighting accordingly, which greatly improves the noise characteristics of
the Principled BSDF in many cases.

Note that this commit might slightly change the brightness of areas when using
MultiGGX and high roughnesses, but the new brightness is more accurate and
closer to the result of Branched Path Tracing. See T51836 for details.

Differential Revision: https://developer.blender.org/D2677
2017-06-22 00:09:56 +02:00
Lukas Stockner
8cb741a598 Fix T51836: Cycles: Fix incorrect PDF approximations of the MultiGGX closures
The PDF of the MultiGGX sampling is approximated by the singlescattering GGX
term as well as a scaled diffuse term that makes up for the energy in the
multiscattering component that's missed by GGX.

However, there were two problems with the glossy terms: The diffuse term missed
a normalization factor, and the singlescattering term was not properly scaled
down based on the albedo estimate.

The glass term was completely wrong and has been rewritten. It uses the fresnel
factor to weight reflection vs. refraction and uses the glossy MultiGGX model
for reflection.
For refraction, the correct singlescattering term is now used, and a new
albedo approximation is used that was derived by evaluating GGX albedo for
roughnesses from 0 to 1 and IORs from 1 to 3 and fitting numerical
approximations to it. The resulting model has a mean relative error of 9e-5,
but could probably be simplified without losing noticable accuracy in the
final render.

The improved PDFs help with glossy highlights (due to better light sampling vs.
closure sampling MIS) and fix the situation described in T51836 where mixing
MultiGGX with other closures (as it happens in e.g. the Principled
BSDF) causes incorrect darkening.
2017-06-22 00:09:56 +02:00
14ea0c5fcc Fix T51849: change Cycles clearcoat gloss to roughness.
This is compatible with UE4 and more consistent with specular and transmission
roughness, even if it deviates from the original Disney BRDF.
2017-06-21 19:55:20 +02:00
Sergey Sharybin
794311c92b Cycles: Fix race condition happening in progress utility
This is not enough to mutex-guard modification code of integer values,
since this operation is NOT atomic. This is not even safe for a single
byte data types.

For now guarded the getter functions, similar to other functions in
this module.

Ideally we want to switch modification to an atomic operations, so we
wouldn't need any locks in the getters.
2017-06-16 10:22:35 +02:00
Sergey Sharybin
64aa0cff89 Cycles: Fix typo in comment 2017-06-14 09:54:07 +02:00
Hristo Gueorguiev
6cfa3ecd4d Fix T51791: Point Density doesn't work on GPU 2017-06-13 13:50:27 +02:00
Sergey Sharybin
40c04dd649 Cycles: Cleanup, indentation 2017-06-13 10:28:38 +02:00
Sergey Sharybin
0aa5431998 Cycles: Fix compilation error of OpenCL mega kernel
Was some mismatch in address space. Seems to be caused by recent additions.

Additionally, moved decoupled ray marching functions under ifdef, so they
don't try to use malloc() functions.

Thanks Mai for testing the patch!
2017-06-13 10:26:45 +02:00
Campbell Barton
00c4f49a6d Cleanup: indentation, long lines 2017-06-12 13:38:21 +10:00
Hristo Gueorguiev
04530c9383 Cycles: adjust supported driver version for AMD GPUs
On Windows 17.Q1 and 17.Q2 return driver version 2236.10.
2017-06-11 23:17:46 +02:00
Lukas Stockner
558bea2252 Cycles Denoising: Add more failsafes for invalid pixels
Now, when there is no usable neighboring pixel for denoising, the noisy value
is preserved instead of producing a NaN.
Also, negative results are clamped to zero.

Note that there are just workarounds that don't fix the underlying problems,
but these issues are very rare and I'm not sure if it's even possible to fix
the underlying problems without introducing a significant slowdown or quality
decrease in other situations.
Because of that and since 2.79 is happening very soon, I just went for these
workarounds for now.
2017-06-11 01:51:39 +02:00
Sergey Sharybin
e097fc4aa6 Cycles: Selectively include denoising in kernel 2017-06-10 04:45:13 -04:00
Mai Lavelle
eb293f59f2 Cycles: Pass all buffers to each kernel call for OpenCL
Technically not passing all buffers used by a kernel is undefined
behavior. We haven't had any issues with this so far on AMD or
Nvidia, but it's known to be a problem with Intel and we received
a report from AMD that this is a problem on newer hardware, so we
need to make this change at some point.

Unfortunately there a cost to being correct, about 5% for the
benchmark scenes. For low sample counts it's even worse, I've
seen up to 50% slowdown. For the latter case I think adjusting
tile updating logic can help, but not sure what that would look
like yet (it would be just a few lines change however).
2017-06-10 04:08:49 -04:00
Mai Lavelle
6238214159 Cycles: Faster split branched path tracing by sharing samples with inactive threads
Unlike regular path tracing, branched path tracing is usually used with lower
sample counts, at least for primary rays. This means that are less samples for
the GPU to work on in parallel and rendering is slower. As there is less work
overall there is also more inactive threads during rendering with BPT. This
patch makes use of those inactive rays to render branched samples in parallel
with other samples.

Each thread that is preparing for a branched sample will attempt to find an
inactive thread and if one is found the state for the sample is copied to that
thread. Potentially, if there are enough inactive threads, 100s of branched
samples could be generated from the same originating thread and ran in
parallel giving large speed ups.

Gives 70% faster render for pavillion midday scene. 20-60% faster on BMW
with car paint replaced with SSS/volumes.
2017-06-10 04:08:49 -04:00
Mai Lavelle
32299d32e7 Cycles: Modify path_radiance_accum_sample to use atomics for split kernel
Samples ran in parallel need a safe way to accumulate their results
with the results of other threads.
2017-06-10 04:08:02 -04:00
Mai Lavelle
6995b50e41 Cycles: Add function to dequeue a ray 2017-06-10 03:51:18 -04:00
Mai Lavelle
4360e8ce13 Cycles: Add atomic decrement functions to util_atomic.h 2017-06-10 03:51:18 -04:00
Mai Lavelle
ea846a4dfc Cycles: Add kernel to enqueue inactive rays
The queue will be used to make reuse of inactive threads to keep
the GPU more busy.
2017-06-10 03:51:18 -04:00
Hristo Gueorguiev
1f0998baa7 Cycles: Blacklist unsupported OpenCL devices
Due to various driver issues with AMD GCN 1 cards we can no longer support
these GPUs. This patch makes them unavailable to select for Cycles rendering.

GCN cards 2 and higher are still supported. Please use the most recent
drivers available to ensure proper functionality.

See here for a list to check which GPUs are supported:
https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units
2017-06-10 03:51:18 -04:00
Lukas Stockner
c73206acc5 Cycles: Fix denoising passes being written when they're not actually generated 2017-06-09 23:02:56 +02:00
Lukas Stockner
0a898e2405 Cleanup Cycles Denoising platform-specific defines 2017-06-09 22:38:16 +02:00
Lukas Stockner
7dc51f87ed Cycles Denoising: Speedup reconstruction by skipping near-zero weights 2017-06-09 22:38:16 +02:00
Lukas Stockner
705c43be0b Cycles Denoising: Merge outlier heuristic and confidence interval test
The previous outlier heuristic only checked whether the pixel is more than
twice as bright compared to the 75% quantile of the 5x5 neighborhood.
While this detected fireflies robustly, it also incorrectly marked a lot of
legitimate small highlights as outliers and filtered them away.

This commit adds an additional condition for marking a pixel as a firefly:
In addition to being above the reference brightness, the lower end of the
3-sigma confidence interval has to be below it.
Since the lower end approximates how low the true value of the pixel might be,
this test separates pixels that are supposed to be very bright from pixels that
are very bright due to random fireflies.

Also, since there is now a reliable outlier filter as a preprocessing step,
the additional confidence interval test in the reconstruction kernel is no
longer needed.
2017-06-09 03:46:11 +02:00
Sergey Sharybin
6a546fc73e Cycles: Don't leave multiple spaces in the device name 2017-06-08 12:15:24 +02:00
Sergey Sharybin
45d3e22204 Cycles: Display optional board name in system info 2017-06-08 12:10:15 +02:00
Sergey Sharybin
78c0f09d4f Cycles: Cleanup, indentation 2017-06-08 12:03:08 +02:00
Pascal Schoen
c91d2d30df Improve backscatter color of subsurface scattering in Principled BSDF
Differential Revision: https://developer.blender.org/D2685
2017-05-31 07:29:17 +02:00
Sergey Sharybin
46da985c8e Cycles: Cleanup, trailing whitespace 2017-05-30 10:58:12 +02:00
Lukas Stockner
9b914764a9 Fix T51652: Cycles - Persistant Images not storing images
Denoising was setting session parameters for every frame, which was detected as
a change and therefore caused a resync.

Since the parameter modification change is only needed for viewport rendering
(which doesn't support denoising anyways) and resyncing after a frame change
(which isn't affected by denoising settings), an easy fix is to just ignore
the denoising parameters like it's currently done with the samples.
2017-05-30 06:34:53 +02:00
Lukas Stockner
0021268311 Cycles: Cleanup: Remove semicolons from line endings in Python code 2017-05-26 02:15:09 +02:00
Lukas Stockner
3722ed13cd Cycles: Update compositor when debug or denoising passes are changed 2017-05-26 02:13:21 +02:00
Lukas Stockner
2bc008e8a9 Cycles: Cleanup: b_srlay is always used now, no more need to silence warning 2017-05-26 01:55:32 +02:00
Sergey Sharybin
55c15ad9de Cycles: Use falltrhough attribute to help catching missing break statements 2017-05-24 17:23:54 +02:00
Pascal Schoen
e20a33b89d Fix T51589: Principled Subsurface Scattering, wrong shadow color
Apply mix of subsurface and base color (wrt subsurface) for rays that
have transmitted the surface.
2017-05-24 07:37:02 +02:00
Sergey Sharybin
7add6b89bc Fix T51592: Simplify AO Cycles setting remains active while Simplify is disabled 2017-05-23 10:34:03 +02:00
Sergey Sharybin
34b689892b Fix T51568: CUDA error in viewport render after fix for for OpenCL
Seems re-loading module invalidates memory pointers by the looks of it,
which gives an error on the next kernel call.

Not sure how to move memory pointer from one CUDA module to another one,
so for now simply disabling kernel re-load for CUDA devices. Not ideal,
but better than failing render.

Feature-selective option for CUDA is not an official feature anyway.
2017-05-22 12:28:21 +02:00
Lukas Stockner
3bf69b26ef Cycles Denoising: Skip feature pass writing for volume-only shaders
Volume shaders without anything connected to the surface output are treated
as if they had a transparent BSDF as the surface shader in Cycles, so the
denoiser should skip feature pass writing for them just as it does with an
actual transparent BSDF.
2017-05-21 05:40:13 +02:00
Lukas Stockner
96769f3b19 Cycles Denoising: Skip confidence interval test for outlier central pixels
If the central pixel is an outlier, the denoiser is supposed to predict its
value from the surrounding pixels. However, in some cases the confidence
interval test would reject every single surrounding pixel, which leaves the
model fitting with no data to work with.
2017-05-21 05:26:13 +02:00
Sergey Sharybin
38a2bf665b Cycles: Cleanup, style and unused arguments
- Some arguments were inapproriatry tagged as unused
  using (void)foo semantic.

  Only use such semantic in tricky casses, when something
  needs to be ignored in release builds or something is
  dependent on tricky ifndef policy.

  For rest of the cases just use void foo(int /bar*/)
  semantic, which ensures variable is not used. Solves
  confusion and code running out of sync with later
  development.

- Used proper unused semantic to some arguments.

- Added braces to make code easier to follow, tricky
  indentation with ifdef, uh.
2017-05-20 05:21:27 -07:00
Lukas Stockner
6cd1d34dc1 Cycles Denoising: Prevent overfitting when using a very low radius
For example, when using a radius of 1, only 9 pixels (due to weighting maybe
even less) will be used, but the transform code may still decide to use a
5-dimensional (or even higher) fit.
This causes severe overfitting and therefore weird pixel values.

To avoid this, this commit limits the amount of dimensions to a third of the
pixel number. For a radius of 3 or more, this doesn't change anything, but
for 1 and 2 it can prevent fireflies and/or negative values being produced.
2017-05-19 23:33:22 +02:00
Lukas Stockner
3dee1f079f Fix T51560: Black pixels on a denoising render
Once again, numerical instabilities causing the Cholesky decomposition to fail.

However, further increasing the diagonal correction just because of a few
pixels in very specific scenes and settings seems unjustified.
Therefore, this commit simply falls back to the basic NLM-filtered pixel
if the more advanced model fails.
2017-05-19 23:31:49 +02:00
Mai Lavelle
177385dc43 Cycles: Reload kernels from Session when requested features change
This fixes T49496.
2017-05-19 16:24:19 -04:00
Sergey Sharybin
8d4aff31ce Cycles: Fix compilation error after recent changes
Spotted by Steffen Dünner, thanks@
2017-05-19 16:48:42 +02:00
Sergey Sharybin
ef549b9e55 Cycles: Cleanup, always use parenthesis
Easier to read/follow, and more robust for the further changes.
2017-05-19 12:57:51 +02:00
Sergey Sharybin
908bb8bd82 Cycles: Cleanup, indentation in preprocessor 2017-05-19 12:54:46 +02:00
Sergey Sharybin
90a62404cb Cycles: Cleanup, variable names
Don't use camel case for variable names. Leave that for the structures.
2017-05-19 12:52:12 +02:00
Sergey Sharybin
3a634524e3 Cycles: Cleanup, useless new lines 2017-05-19 12:45:22 +02:00
Sergey Sharybin
de86da521c Cycles: Cleanup, braces after function definition
I wouldn't mind switching fully to Google style, but i am against of
mixing two different styles in same project. So just stick to brace
at the new line after function definition.
2017-05-19 12:43:26 +02:00
Sergey Sharybin
803337f3f6 \0;115;0cCycles: Cleanup, use ccl_restrict instead of ccl_restrict_ptr
There were following issues with ccl_restrict_ptr:

- We already had ccl_restrict for all platforms.

- It was secretly adding `const` qualifier to the declaration,
  which is quite weird since non-const pointer can also be
  declared as restricted.

- We never in Blender are using foo_ptr or FooPtr type definitions,
  so not sure why we should introduce such a thing here.

- It is absolutely wrong from semantic point of view to put pointer
  into the restrict macro -- const is a part of type, not part of
  hint for compiler that some pointer is never aliased.
2017-05-19 12:41:03 +02:00
Sergey Sharybin
8e655446d1 Fix T51537: Light passes are summed twice for split kernel since denoise commit
Denoise commit introduced kernel_write_result() which saves light passes, so
no need to call both kernel_write_result() and kernel_write_light_passes() from
the split kernel.

Weirdly enough. kernel_write_result() does not take care about debug passes.
2017-05-19 12:14:03 +02:00
Lukas Stockner
4a04d7ae89 Fix T51553: Cycles Volume Emission turns black when strength is 0 or color is black
The problem was that Cycles implicitly uses a transparent surface shader when only
volume nodes are used, but since the black emission shader gets optimized away,
it was no longer detected and therefore no transparent surface was used.

Therefore, the shader now stores whether volume nodes were connected before
optimizing.
2017-05-19 04:59:35 +02:00
Lukas Stockner
cf1127f380 Fix T51506: Wrong shadow catcher color when using selective denoising 2017-05-19 04:04:54 +02:00
Mai Lavelle
29f4a8510c Cycles: Fix random noise pattern seen with multiscatter bsdf and split kernel
Differentials were unset if roughness was low giving undefined behavior.
2017-05-18 21:39:23 -04:00
Lukas Stockner
a21277b996 Fix T51555: Cycles tile count is incorrect when denoising is enabled
Now rendered and denoised tiles are counted and displayed separately.
2017-05-19 03:29:18 +02:00
Lukas Stockner
ffd83a34ab Fix T51502: Cycles denoising not using correctly aligned width for NLM on CUDA 2017-05-19 02:06:54 +02:00
Lukas Stockner
740cd28748 Cycles Denoising: Add more robust outlier heuristic to avoid artifacts
Extremely bright pixels in the rendered image cause the denoising algorithm
to produce extremely noticable artifacts. Therefore, a heuristic is needed
to exclude these pixels from the filtering process.

The new approach calculates the 75% percentile of the 5x5 neighborhood of
each pixel and flags the pixel if it is more than twice as bright.

During the reconstruction process, flagged pixels are skipped. Therefore,
they don't cause any problems for neighboring pixels, and the outlier pixels
themselves are replaced by a prediction of their actual value based on their
feature pass values and the neighboring pixels.

Therefore, the denoiser now also works as a smarter despeckling filter that
uses a more accurate prediction of the pixel instead of a simple average.
This can be used even if denoising isn't wanted by setting the denoising
radius to 1.
2017-05-18 21:55:56 +02:00
Lukas Stockner
b3a3459e1a Cycles Denoising: Fix wrong order of denoising feature passes 2017-05-18 21:55:56 +02:00