Commit Graph

7025 Commits

Author SHA1 Message Date
Sergey Sharybin
8d73ba58b6 Cycles: Fix compilation of sm_20 and sm_21 kernels
Was broken since the bicubic commit for GPU support.
2017-10-10 12:26:02 +05:00
e360d003ea Cycles: schedule more work for non-display and compute preemption CUDA cards.
This change affects CUDA GPUs not connected to a display or connected to a
display but supporting compute preemption so that the display does not
freeze. I couldn't find an official list, but compute preemption seems to be
only supported with GTX 1070+ and Linux (not GTX 1060- or Windows).

This helps improve small tile rendering performance further if there are
sufficient samples x number of pixels in a single tile to keep the GPU busy.
2017-10-08 21:12:16 +02:00
Mathieu Menuet
5aa08eb3cc Fix T53017: Cycles not detecting AMD GPU when there is an NVidia GPU too.
Best guess is that cuInit() somehow interferes with the AMD graphics driver
on Windows, and switching the initialization order to do OpenCL first seems
to solve the issue.
2017-10-08 18:36:02 +02:00
cdb0b3b1dc Code refactor: use DeviceInfo to enable QBVH and decoupled volume shading. 2017-10-08 13:17:33 +02:00
f61c340bc1 Cycles: OpenCL bicubic and tricubic texture interpolation support. 2017-10-08 02:55:44 +02:00
c040dedc12 Fix incorrect MIS with principled BSDF and specular roughness 0. 2017-10-07 22:10:02 +02:00
d7eabc6765 Code cleanup: simplify cmake kernel install. 2017-10-07 15:32:20 +02:00
2d92988f6b Cycles: CUDA bicubic and tricubic texture interpolation support.
While cubic interpolation is quite expensive on the CPU compared to linear
interpolation, the difference on the GPU is quite small.
2017-10-07 15:30:57 +02:00
23098cda99 Code refactor: make texture code more consistent between devices.
* Use common TextureInfo struct for all devices, except CUDA fermi.
* Move image sampling code to kernels/*/kernel_*_image.h files.
* Use arrays for data textures on Fermi too, so device_vector<Struct> works.
2017-10-07 14:53:14 +02:00
Sergey Sharybin
83ce02879f Cycles: Fix possible race condition when generating Beckmann table
Two issues here:

- Checking table size to be non-zero is not a proper way to go here. This is
  because we first resize the table and then fill it in. So it was possible that
  non-initialized table was used.

  Trickery with using temporary memory and then doing table.swap() might work,
  but we can not guarantee that table size will be set after the data pointer.

- Mutex guard was useless, because every thread was using own mutex. Need to
  make mutex guard static so all threads are using same mutex.
2017-10-06 21:06:15 +05:00
Sergey Sharybin
837383ac78 Cycles: Cleanup, indendation 2017-10-06 19:33:59 +05:00
Sergey Sharybin
a950af8e24 Fix T53012: Shadow catcher creates artifacts on contact area
The issue was caused by light sample being evaluated to nan at some point.
This is root of the cause which is to be fixed, but is very hard to trace down
especially via ssh (the issue only happens on AVX2 release build). Will give it
a closer look when back to my AVX2 machine.

For until then this is a good check to have anyway, it corresponds to what's
happening in regular radiance sum.
2017-10-06 17:27:34 +05:00
Sergey Sharybin
0d3c8d0701 Cycles: Cleanup, indentation and wrapping 2017-10-06 16:54:37 +05:00
4537e85584 Fix T53001: more workarounds for crash in AMD compiler with recent drivers. 2017-10-05 17:57:58 +02:00
fb99ea79f8 Code refactor: split displace/background into separate kernels, remove luma. 2017-10-05 17:57:58 +02:00
49199963bf Fix incorrect CUDA remaining time estimate after previous commit. 2017-10-04 23:25:51 +02:00
6da6f8d33f Cycles: CUDA faster rendering of small tiles, using multiple samples like OpenCL.
The work size is still very conservative, and this doesn't help for progressive
refine. For that we will need to render multiple tiles at the same time. But this
should already help for denoising renders that require too much memory with big
tiles, and just generally soften the performance dropoff with small tiles.

Differential Revision: https://developer.blender.org/D2856
2017-10-04 21:58:47 +02:00
77f300e2a9 Fix use of uninitialized memory in Cycles normal baking. 2017-10-04 21:11:14 +02:00
5bb677e592 Code refactor: zero render buffers outside of kernel.
This was originally done with the first sample in the kernel for better
performance, but it doesn't work anymore with atomics. Any benefit was
very minor anyway, too small to measure it seems.
2017-10-04 21:11:14 +02:00
12f4538205 Code refactor: use split variance calculation for mega kernels too.
There is no significant difference in denoised benchmark scenes and
denoising ctests, so might as well make it all consistent.
2017-10-04 21:11:14 +02:00
e3e16cecc4 Code refactor: remove rng_state buffer and compute hash on the fly.
A little faster on some benchmark scenes, a little slower on others, seems
about performance neutral on average and saves a little memory.
2017-10-04 21:11:14 +02:00
5b7d6ea54b Code refactor: add WorkTile struct for passing work to kernel.
This makes sharing some code between mega/split in following commits a bit
easier, and also paves the way for rendering multiple tiles later.
2017-10-04 21:11:14 +02:00
660e8e59e7 Fix T52645, T52645: AMD OpenCL compiler crash with recent drivers.
Work around the bug by reshuffling code.
2017-10-04 21:00:46 +02:00
Ray Molenkamp
57d7e5b6ee Fix T42489 and T52936: Loading blend with minimized window results in crash or empty screen on windows.
Reviewed By: @brecht , @sergey

Differential Revision: http://developer.blender.org/D2866
2017-10-04 11:44:22 -06:00
Sergey Sharybin
61d5c5a64f Fix T52981: 2D Curve shapes do not render untill extruded
Regression since 9298c53.
2017-10-03 15:29:39 +05:00
f55735e533 CMake: support CUDA 9 toolkit, and automatically disable sm_2x binaries.
Fermi cards (GTX 4xx and 5xx) are no longer supported with this version, so
we can keep supporting both CUDA 8 and 9 for a while.
2017-10-01 14:14:53 +02:00
9298c53e4c Fix T52943: don't export curves objects with no faces to Cycles.
Also skip any objects with zero ray visibility and meshes with
zero faces.
2017-09-29 14:54:34 +02:00
d2bbd41b4e Fix Cycles OpenCL compiler error after recent changes. 2017-09-29 14:54:10 +02:00
Campbell Barton
5a1954a5cb Drop platform support for Solaris & AIX
These platforms didn't see maintenance in years.
This commit just removes ifdef's & cmake check.
2017-09-29 19:16:34 +10:00
c10ac1bb5c macOS: officially upgrade to 10.9 libraries from lib/darwin.
This removes a bunch of code that is no longer needed, and running
"make update" will now automatically download the new libraries.

Differential Revision: https://developer.blender.org/D2861
2017-09-28 20:53:06 +02:00
Kim Christensen
2a36ee16c1 Fix T52574: make Cycles rendered tile counter more clear.
Differential Revision: https://developer.blender.org/D2853
2017-09-28 15:18:53 +02:00
400e6f37b8 Cycles: reduce subsurface stack memory usage.
This is done by storing only a subset of PathRadiance, and by storing
direct light immediately in the main PathRadiance. Saves about 10% of
CUDA stack memory, and simplifies subsurface indirect ray code.
2017-09-28 15:18:43 +02:00
88520dd5b6 Code refactor: simplify CUDA context push/pop.
Makes it possible to call a function like mem_alloc() when the context is
already active. Also fixes some missing pops in case of errors.
2017-09-27 13:43:21 +02:00
Sergey Sharybin
0d4e519b74 OpenVDB: Fix compilation error against OpenVDB 4
One crucial thing here: OpenVDB shoudl be compiled WITHOUT
OPENVDB_ENABLE_3_ABI_COMPATIBLE flag. This is how OpenVDB's Makefile is
configured and it's not really possible to detect this for a compiled library.

If we ever want to support that option, we need to add extra CMake argument and
use old version 3 API everywhere.
2017-09-25 14:44:17 +05:00
Bastien Montagne
1d8aebaa09 Add an 'atomic cas' wrapper for pointers.
Avoids having to repeat obfuscating castings everywhere...
2017-09-25 10:40:50 +02:00
Sergey Sharybin
cb6f07f59e Cycles: Cleanup, indentation 2017-09-25 11:15:54 +05:00
Sergey Sharybin
c0480bc972 Cycles: Fix compilation error of OpenCL megakernel on Apple 2017-09-23 17:07:19 +05:00
Sergey Sharybin
b460b8fb4a Cycles: Fix compilation error of megakernel on NVidia device
It is more readable to explicitly compare to NULL anyway.
2017-09-23 17:03:02 +05:00
Aaron Carlisle
efd5e3c254 Remove quicktime support
It has been deprecated since at least macOS 10.9 and fully removed in 10.12.

I am unsure if we should remove it only in 2.8. But you cannot build blender with it supported when using a modern xcode version anyway so I would tend towards just removing it also for 2.79 if that ever happens.

Reviewers: mont29, dfelinto, juicyfruit, brecht

Reviewed By: mont29, brecht

Subscribers: Blendify, brecht

Maniphest Tasks: T52807

Differential Revision: https://developer.blender.org/D2333
2017-09-22 16:40:05 -04:00
07ec0effb6 Code cleanup: simplify kernel side work stealing code. 2017-09-21 22:29:18 +02:00
Stefan Werner
ee30a4381f Added extra "const" to satisfy the strict clang version in Xcode 9 2017-09-20 21:47:45 +02:00
18a353dd24 Fix T52368: Cycles OSL trace() failing on Windows 32 bit. 2017-09-20 19:38:08 +02:00
14223357e5 Fix T52853: harmless Cycles test failure in debug mode. 2017-09-20 19:38:08 +02:00
90d4b823d7 Cycles: use defensive sampling for picking BSDFs and BSSRDFs.
For the first bounce we now give each BSDF or BSSRDF a minimum sample weight,
which helps reduce noise for a typical case where you have a glossy BSDF with
a small weight due to Fresnel, but not necessarily small contribution relative
to a diffuse or transmission BSDF below.

We can probably find a better heuristic that also enables this on further
bounces, for example when looking through a perfect mirror, but I wasn't able
to find a robust one so far.
2017-09-20 19:38:08 +02:00
095a01a73a Cycles: slightly improve BSDF sample stratification for path tracing.
Similar to what we did for area lights previously, this should help
preserve stratification when using multiple BSDFs in theory. Improvements
are not easily noticeable in practice though, because the number of BSDFs
is usually low. Still nice to eliminate one sampling dimension.
2017-09-20 19:38:08 +02:00
b3afc8917c Code cleanup: refactor BSSRDF closure sampling, for next commit. 2017-09-20 19:38:08 +02:00
d029399e6b Code cleanup: remove SOBOL_SKIP hack, seems no longer needed. 2017-09-20 19:38:08 +02:00
d750d182e5 Code cleanup: remove hack to avoid seeing transparent objects in noise.
Previously the Sobol pattern suffered from some correlation issues that
made the outline of objects like a smoke domain visible. This helps
simplify the code and also makes some other optimizations possible.
2017-09-20 19:38:08 +02:00
Sergey Sharybin
3241905f40 Fix T52818: Tangent space calculation is really slow for high-density mesh with degenerated topology
Now we replace O(N^2) computational complexity with O(N) extra memory penalty.
Memory is much cheaper than CPU time. Keep in mind, memory penalty is like
4 megabytes per 1M vertices.
2017-09-19 17:50:09 +05:00
Sergey Sharybin
2dab6f499c Mikkspace: Cleanup, reduce indentation level 2017-09-19 17:50:09 +05:00