Commit Graph

6291 Commits

Author SHA1 Message Date
Sergey Sharybin
10a25b655a Cycles: Add AVX2 path to subsurface triangle intersection
Similar to regular triangle intersection case. Gives about 3% speedup rendering
SSS object on my desktop,

Question: how to avoid such a code duplication in a nice way without speed loss?
2016-10-24 16:56:41 +02:00
Sergey Sharybin
da8f5d6eac Cycles: Don't use guarded vector for statically initialized data
This will confuse hell of a guarded allocators because it is possible
to have allocation happened prior to Blender's guarded allocator is
fully initialized.

This was causing crashes and assert failures when running blender
with fully guarded memory allocator.
2016-10-24 14:18:22 +02:00
Sergey Sharybin
14a55bc059 Cycles: Fix shadowing variable which also causes use of uninitialized variable
Was causing wrong aperture for panorama cameras.

Seems to be a regression in 371d357.
2016-10-24 14:04:31 +02:00
Sergey Sharybin
cde18cf3b3 Cycles: Fix static initialization order fiasco
Initialization order of global stats and node types was not strictly
defined and it was possible to have node types initialized first and
stats after that. This will zero out memory which was allocated from
the statistics causing assert failure when de-initializing node types.
2016-10-24 13:47:39 +02:00
Sergey Sharybin
963aa7e270 Cycles: Fix uninitialized variable from the previous commit 2016-10-24 12:54:24 +02:00
Sergey Sharybin
80a6e5beb5 Cycles: Remove explicit std:: from types where possible
We have our own abstraction level on top of the STL's implementation.
This commit will guarantee our tweaks are used for all cases.
2016-10-24 12:31:11 +02:00
Sergey Sharybin
48997d2e40 Cycles: Cleanup, style 2016-10-24 12:26:12 +02:00
Sergey Sharybin
3f29259676 Fix T49818: Crash when rendering with motion blur
It was possible to have non-initialized unaligned BVH split
to be used when regular BVH split SAH was inf. Now we ensure
that unaligned splitter is only used when it's really initialized.

It's a regression and should be in 2.78a.
2016-10-24 11:47:32 +02:00
Sergey Sharybin
1e1811357d Cycles: Cleanup, spaces 2016-10-24 11:47:32 +02:00
Hristo Gueorguiev
8905c5c874 Cycles: OpenCL 3d textures support.
Note that volume rendering is not supported yet, this is a step towards that.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2299
2016-10-22 23:49:29 +02:00
371d3570e0 Fix Cycles address space OpenCL error after recent fix. 2016-10-22 23:36:30 +02:00
9d0ac94d52 Fix T49750: Cycles wrong ray differentials for perspective and stereo cameras. 2016-10-22 16:37:26 +02:00
Jörg Müller
132478d4b8 Fix T49657: Audio backend "Jack" should be named "JACK". 2016-10-22 14:20:47 +02:00
Jörg Müller
d5ee031f76 Fix T49764: Audio strips crackle when animating the volume
- Implemented linear interpolation for volume changes in the software
mixer.
- Using this in the software device.
2016-10-22 13:39:55 +02:00
Lukas Stockner
f7ce482385 Cycles: Fix another OpenCL logging issue
Previously an error message would be printed whenever the OpenCL build produced output.
However, some frameworks seem to print extra information even if the build succeeded, so now the actual returned error is checked as well.
When --debug-cycles is activated, the build output will always be printed, otherwise it only gets printed if there was an error.
2016-10-21 02:49:00 +02:00
Lukas Stockner
cd843409d3 Fix T49630: Cycles: Swapped shader and bake kernels
The problem here was, as the title says, that the two kernels were swapped.
Since shader evaluation is only used for building the samling map when World MIS is enabled, rendering without it would still work fine, although baking also was broken.
2016-10-17 12:28:01 +02:00
Lukas Stockner
d5dd12e56c Cycles: Improve OpenCL kernel compilation logging
The previous refactor changed the code to use a separate logging mechanism to support multithreaded compilation.
However, since that's not supported by any frameworks yes, it just resulted in bad logging behaviour.
So, this commit changes the logging to go diectly to stdout/stderr once again by default.
2016-10-17 11:51:18 +02:00
Scott Wu
7fec7eee20 Cycles: use near clipping distance in panorama camera.
Reviewed By: sergey, brecht, dfelinto

Differential Revision: https://developer.blender.org/D1952
2016-10-15 00:26:59 +02:00
Sergey Sharybin
0ddb8d9b13 Cycles: Disable optimization of operator / for float3
This was giving some speedup but made intersection tests to fail
from watertight point of view.

Needs deeper investigation, but need to quickly get it fixed for
the studio.
2016-10-14 13:53:26 +02:00
7f5441b916 Fix T49640: Cycles constant folding incorrect for texture coordinates. 2016-10-12 18:42:38 +02:00
21e65d7457 Fix build error with WITH_CYCLES_NATIVE_ONLY and recent AVX2 changes. 2016-10-12 17:35:03 +02:00
Sergey Sharybin
22cdf44101 Cycles: Use const reference for register variables in non-OpenCL code
This is something tested by @LazyDodo and suggested by Maxym to make
MSVC happier.
2016-10-12 14:48:59 +02:00
Sergey Sharybin
e588106d45 Cycles: Use more SSE intrinsics for float3 type
This gives about 5% speedup on AVX2 kernels (other kernels still
have SSE disabled for math operations) and this solves the slowdown
of koro scene mention in the previous commit.

The title says it all actually. This commit also contains
changes to pass float3 as const reference in affected functions.

This should make MSVC happier without breaking OpenCL because it's
only done in areas which are ifdef-ed for non-OpenCL.

Another patch based on inspiration from Maxym Dmytrychenko, thanks!
2016-10-12 14:43:00 +02:00
Sergey Sharybin
42aeb608e7 Cycles: Implement AVX2 version of triangle_intersect
This commit basically vectorizes existing code using AVX2 instructions
(without modifying algorithm itself). This gives quite nice speedups:

  BMW:        -8%
  Classroom:  -5%
  Cat:        -5%
  Koro:       +1%
  Barcelona:  -8%

That's on Linux machine, reported performance improvement on Windows
goes up to 20%.

Not currently sure why Koro is somewhat slower because it mainly uses
curve intersection tests, could be a time noise? Or osmething with the
cache utilization perhaps? In any case speedup in other scenes makes
me thinking that current state is acceptable for initial implementation.

This is again inspired by Maxym Dmytrychenko.
2016-10-12 14:11:55 +02:00
Sergey Sharybin
6a4ec3ca43 Cycles: Add new avxf vectorized data type
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.

Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.

Hopefully the copyright message is correct.
2016-10-12 13:54:13 +02:00
Sergey Sharybin
fa62a989b4 Cycles: Enable SSE options of math module for AVX2 kernels
Currently this does not give measurable difference, but is required
ground work for some upcoming further optimization of AVX2 kernels.
2016-10-12 12:54:31 +02:00
Sergey Sharybin
87d08a5dc1 Cycles: Get rid of ifdef-ed noinline policy 2016-10-12 12:15:24 +02:00
Sergey Sharybin
cc95172667 Cycles: Fix use of uninitialized variable in SSS
When ray hits curve segment with SSS shader it was possible to have
uninitialized hit_P variable used for sampling.

Seems that was a reason of our headache of difference between AVX2
and SSE4 render results here, so now we can revert all the nasty
ifdef-ed inline policies.
2016-10-12 12:12:28 +02:00
Sergey Sharybin
edd9d89673 Cycles: Cleanup, style 2016-10-12 11:54:33 +02:00
Lukas Stockner
9ea71bc674 Cycles: Split device_opencl.cpp into multiple files for easier maintenance
There are no user-visible changes, just some internal restructuring.

Differential Revision: https://developer.blender.org/D2231
2016-10-09 15:49:50 +02:00
74e0f900c5 Fix a few compile errors with C++11 on macOS. 2016-10-08 15:03:53 +02:00
Lukas Stockner
2dccf5a6e8 Cycles: Fix OpenCL split kernel compilation after recent CUDA 8 performance fix 2016-10-07 18:50:43 +02:00
5a0f397eaa Fix T49523: very slow normal map tangent computation for rendering in 2.78. 2016-10-06 03:12:04 +02:00
b4f9766ed1 Cycles CUDA: make CUDA 8.0 the officially supported version for all platforms. 2016-10-03 22:15:26 +02:00
a3abb020e3 Fix Cycles CUDA performance on CUDA 8.0.
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.

On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2269
2016-10-03 22:15:25 +02:00
49ad4215ba Fix fluid sim build error with MSVC. 2016-10-03 22:15:24 +02:00
lazydodo
3ee5ce155c [Windows/Cycles/Clang] Fix compilation error with clang-cl on windows 2016-10-02 14:01:23 -06:00
3f9b69287d Fluids: improve multithreaded CPU usage.
Fixes for clamp-omp, fewer shared variables, fix some cases of threads writing
to the same memory location. Issue found by Jens Verwiebe, who reports 30%
speedup with 16 core CPU, when using this with a recent clang-omp version.
2016-10-02 16:38:14 +02:00
Alexander Gavrilov
40eedd5df9 Cycles: implement partial constant folding for exponentiation.
This is also an important mathematical operation that can be folded
if it is known that one argument is a certain constant. For colors
the operation is provided as a Gamma node.

The SVM Gamma node needs a small fix to make it follow the 0 ^ 0 == 1
rule, same as the Power node, or the Gamma node itself in OSL mode.

Reviewers: #cycles

Differential Revision: https://developer.blender.org/D2263
2016-10-01 14:37:03 +03:00
20c6d5e3cb Fix MSVC compiler warning due to using */* to start comment. 2016-10-01 01:55:34 +02:00
Sergey Sharybin
80837d06de Cycles: Support earlier tile rendering termination on cancel
It will discard the whole tile, but it's still kind of more friendly than
fully locked interface (sort of) for until tile is fully sampled.

Sorry if it causes PITA to merge for the opencl split work, but this issue
bothering a lot when collecting benchmarks.
2016-09-29 16:00:25 +02:00
Sergey Sharybin
333366dbcf Cycles: Fix typo in shader cancel routines 2016-09-29 15:48:10 +02:00
Sergey Sharybin
31ebbe40a0 Cycles: Improve OpenCL line information handling
Previously it was falling back to just a path after #include
statement was finished. Now we fall back to a proper current
file name after dealing with the preprocessor statement.
2016-09-29 10:20:24 +02:00
Sergey Sharybin
94c919349b Cycles: Cleanup file headers
Some of the files were wrongly attributing code to some other
organizations and in few places proper attribution was missing.

This is mainly either a copy-paste error (when new file was
created from an existing one and header wasn't updated) or due
to some refactor which split non-original-BF code with purely
BF code.

Should solve some confusion around.
2016-09-29 10:11:40 +02:00
lazydodo
26d7d995db Fix Windows mouse wheel scroll speed
In Windows, event dispatching code is throwing out the wheel scroll count value.
Despite of how many fast you move the wheel, it only make one-notch scroll event.

This patch convert wheel event to multiple 1-notch wheel events.

This also correct the handling of smooth scroll mouse wheel (which can report smaller than 1-notch wheel movement) by accumulating the small wheel delta values.

Reviewers: djnz, shadowrom, elubie, #platform:_windows, sergey, juicyfruit, brecht

Reviewed By: shadowrom, elubie, #platform:_windows, brecht

Subscribers: dingto, elubie, brachi, brecht

Differential Revision: https://developer.blender.org/D143
2016-09-28 17:22:02 -06:00
Sergey Sharybin
0ec87f1227 Cycles: Cleanup, indentation 2016-09-28 17:05:33 +02:00
Sergey Sharybin
e1bfb89da2 Cycles: Fix compilation error with minimal feature set 2016-09-28 17:03:59 +02:00
Mike Erwin
5d0de39238 fix Mac build for Xcode < 8
We need a long-term fix, but this will get 2.78 out the door.
2016-09-27 16:16:47 +02:00
Mike Erwin
2ebb367b0f cleanup: spacing & alignment 2016-09-27 14:56:58 +02:00
Lukas Stockner
07de832e22 Cycles: Use correct light sampling PDF for MIS calculation with Branched Path Tracing
The light sampling functions calculate light sampling PDF for the case that the light has been randomly selected out of all lights.
However, since BPT handles lamps and meshlights separately, this isn't the case. So, to avoid a wrong result, the code just included the 0.5 factor in the throughput.

In theory, however, the correction should be made to the sampling probability, which needs to be doubled. Now, for the regular calculation, that's no real difference since the throughput is divided by the pdf.
However, it does matter for the MIS calculation - it's unbiased both ways, but including the factor in the PDF instead of the throughput should give slightly better results.

Reviewers: sergey, brecht, dingto, juicyfruit

Differential Revision: https://developer.blender.org/D2258
2016-09-25 23:16:05 +02:00