Commit Graph

2626 Commits

Author SHA1 Message Date
Sergey Sharybin
28f798f86e Cycles: Initial support for OpenCL capabilities reports
For now it's just generic information, still need to expose memory, workgroup
sizes and so on.
2015-06-05 14:17:30 +02:00
Sergey Sharybin
23b068ce8a Fix T44922: Split kernel renders black when using Bump node
Was missing feature detection in the BumpNode in the previous selective nodes
compilation commit.
2015-06-02 11:53:10 +05:00
Sergey Sharybin
27c1262e21 Fix T44908: Blender crashes when trying to use cycles experimental displacement
The issue was caused by the reshuffle needed to make objects flags have proper
object's bounding box to solve regressions in SSS objects intersecting volumes.

There's actually a feedback loop happening here, which is now solved in quite
naive way -- for the true displacement we consider all objects are capable of
intersecting volumes, synchronize object flags prior to displacement shader
tasks runs and then re-update object flags for proper bounding box.

Not sure what will be the proper solution here, we can't do preliminary check
of intersection for displacement shader, but on the other hand we don't really
need this flag for displacement shader anyway.
2015-06-02 00:04:30 +05:00
Thomas Dinges
a6101cde06 Cycles XML API: * Add Bump and Holdout Node * Add todo comments for various things. * SSS falloff now works. 2015-06-01 19:56:39 +05:00
Thomas Dinges
b10bc3a6ec Cycles: Number keys 0-3 can be used in interactive mode now to set max bounces. 2015-06-01 19:56:36 +05:00
Sergey Sharybin
3127d47029 Cycles: Fix wrong max nodes group used for the viewport render 2015-06-01 19:49:53 +05:00
Sergey Sharybin
9d4d55e78b Cycles: Strip meaningless empty output from the NVidia OpenCL compiler 2015-06-01 19:49:53 +05:00
Sergey Sharybin
f0a0b1eaac Cycles: Assert in the cases when SVM node was not handled
This will help figuring out cases when node was not properly handled by the SVM
by aborting execution on CPU, where all the nodes are expected to be supported.
2015-06-01 19:49:52 +05:00
Sergey Sharybin
ecd4ee75af Cycles: Implement selective nodes compilation
This commit finishes initial selective nodes compilation into kernel, which
helps a lot performance-wise for AMD OpenCL kernels.

Split by node groups is based on statistics from simple scenes like BMW and
more complex scenes like mango and gooseberry production files. Further
tweaks are always possible, but it should be a good starting point.

TODO: Still need to ignore unused nodes when calculating requested shader
features.
2015-06-01 19:49:52 +05:00
Sergey Sharybin
c0235da53c Cycles: Fix some typos in the selective nodes compilation 2015-06-01 19:49:52 +05:00
Sergey Sharybin
399a27b261 Cycles: Code cleanup, spaces around keyword and brace 2015-06-01 19:49:52 +05:00
Sergey Sharybin
f45f2ac687 Cycles: Fix missing features gathering from the bump graph 2015-06-01 19:49:52 +05:00
Sergey Sharybin
4d8cf1329d Cycles: Add bump feature for selective nodes compilation
For now it is unused in the kernel, actual usage will come with
the next commits.
2015-06-01 19:49:52 +05:00
Sergey Sharybin
36ef6d1532 Cycles: Report build flags used for the OpenCL kernel compilation
For now it's reported to stdout, matching the CUDA behavior.
In the future we can hide this into GLog logging once the kernels
are considered all stable and so.
2015-06-01 19:49:52 +05:00
Sergey Sharybin
14251e8b45 Cycles: Shader node features are to be inherited from the base class 2015-06-01 19:49:52 +05:00
Thomas Dinges
3511e2d6ae Cycles: Enable Object Motion on AMD OpenCL.
Like Camera Motion, only available in the Experimental kernel.
This should be it for the upcoming release, we now support almost everything, apart from Transparent Shadows, SSS and Volume.
2015-05-28 22:10:53 +02:00
Thomas Dinges
46d8bcb617 Cleanup: Remove unused Noise Basis texture code.
Same as last commit, code is unused and this one actually would have required some fixes,
as these variants output values outside the 0-1 value range, which doesn't fit Cycles shader design.
2015-05-28 01:07:37 +02:00
Thomas Dinges
20f6a0f2d7 Cleanup: Remove unused Voronoi texture code.
Let's finally delete this code, after 4 years of being unused,
there really is no excuse anymore.

If we decide to extend the procedural textures in SVM, we can do this anytime in the future.
2015-05-28 00:36:33 +02:00
Sergey Sharybin
92022218c2 Cycles: Code cleanup, split kernel 2015-05-27 13:08:17 +05:00
Sergey Sharybin
84ad20acef Fix T44833: Can't use ccl_local space in non-kernel functions
This commit re-shuffles code in split kernel once again and makes it so common
parts which are in the headers are only responsible for doing all the work needed
for the specified ray index. Getting the ray index, checking its validity and
enqueuing tasks now happen in the device-specific part of the kernel.

This actually makes sense because enqueuing is indeed device-specific and, e.g.,
with CUDA we'll want to enqueue kernels from kernel and avoid CPU roundtrip.

TODO:
- Kernel comments are still placed in the common header files, but since queue
  related stuff is not passed to those functions those comments might need to
  be split as well.

  Just currently read them considering that they're also covering the way how
  all devices are invoking the common code path.

- Arguments might need to be wrapped into KernelGlobals, so we don't need to
  pass them all around as function arguments.
2015-05-26 22:54:02 +05:00
Sergey Sharybin
6245f4a39c Cycles: Enable advanced shading for NVidia OpenCL kernel
It was kept disabled due to render artifacts which were in fact caused by bad
memory access, which is fixed in the previous commit.

We now also can make it enabled in regular AMD split kernel after someone tests
the updated code.
2015-05-26 21:29:21 +05:00
Sergey Sharybin
cf19012fb0 Fix T44831: Crash when using Intel OpenCL with split kernel
The issue was caused by underallocation of object motion related arrays,
which happened by accident.
2015-05-26 21:29:21 +05:00
Sergey Sharybin
7487a4d4ac Fix T44763: Surface Panel does not update correctly according to Node Output for Cycles UI 2015-05-26 16:15:34 +05:00
Campbell Barton
2c3c477223 Cleanup: warning, spelling 2015-05-26 16:46:33 +10:00
Sergey Sharybin
62f2d9b566 Cycles: Fix compilation error of split kernel
The code was failing to compile at runtime because of some path differences,
and it seems we don't need to specify full path to the file which originally
seemed to be needed to make include directives expansion work correctly.
2015-05-25 14:18:01 +05:00
Thomas Dinges
a3ef51bba5 Fix T44833, OpenCL compile error on AMD.
This was broken after the kernel file restructure.
Variables allocated in the __local address space can only be defined
inside a __kernel function.

We probably need to solve this a bit differently once we do the CUDA
kernel split, but this fix should be good enough until then.
2015-05-25 01:02:06 +02:00
Thomas Dinges
c3ab5b3089 Fix T44830, wrong sample progress number when using split device.
Value was not set, moved it out of the constructor into
device_opencl_create() now.
2015-05-25 00:37:01 +02:00
Sergey Sharybin
2c503d8303 Cycles: Restructure kernel files organization
Since the kernel split work we now have quite a few new files, the majority
of which are related to the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to follow which files are
actual implementation of Cycles kernel.

Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.

Another change is in the way how split kernel invokes logic. Previously all the
logic was implemented directly in the .cl files, which makes it a bit tricky to
re-use the logic across other devices. Since we'll likely be looking into doing
same split work for CUDA devices eventually it makes sense to move logic from
.cl files to header files. Those files are stored in kernel/split. This does not
mean the header files will not give error messages when tried to be included
from other devices and their arguments will likely be changed, but having such
separation is a good start anyway.

There should be no functional changes.

Reviewers: juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1314
2015-05-22 16:31:34 +05:00
Thomas Dinges
a934730368 Cycles: Remove TM / R and whitespace from OpenCL device names.
Was already done for CPU devices, now we also do this for OpenCL.
2015-05-21 23:43:18 +02:00
Thomas Dinges
53eab562b4 Cleanup: Remove some outdated comments related to split kernel. 2015-05-21 20:32:20 +02:00
Sergey Sharybin
7938bd1877 Cycles: Remove OSL from split headers
Split kernel is mainly useful for GPUs which can not support OSL in visible
future anyway.
2015-05-21 16:12:50 +05:00
Sergey Sharybin
329f704601 Cycles: Move utility atomics function to util_atomic.h
No functional changes, just better to keep all atomic functions in a single place,
they might become handy later.
2015-05-21 16:12:50 +05:00
Sergey Sharybin
d4c676e81b Cycles: CYCLES_OPENCL_DEBUG now affects split kernel as well 2015-05-21 14:30:33 +05:00
Sergey Sharybin
f18d77b874 Cycles: Restore some lost custom cflags passed to the kernel compilation
They were lost during simplification of kernel loading but might be rather
crucial for the performance.

Also made it so cflags are shared across kernels. Surely it might lead to
some unwanted kernel re-compilation but at the same time they might easily
run out of sync with the changes in kernel and so.
2015-05-21 14:05:53 +05:00
Sergey Sharybin
148ed4e05e Cycles: Cleanup, synchronize name across file name, program and kernel names 2015-05-20 23:10:07 +05:00
Sergey Sharybin
6f48df45ee Cycles: Simplify code around kernel loading 2015-05-20 23:10:07 +05:00
Martijn Berger
8dd9b7cc5f Cycles standalone, add device type in output listing 2015-05-20 17:11:09 +02:00
Sergey Sharybin
da34136de1 Cycles: Check for validity of the tiles arrays in progressive refine
In certain configurations (for example when start resolution is set to small
value for background render and progressive refine enabled) number of tiles
might change in the tile manager. This situation will confuse progressive
refine feature and likely cause crash.

We might also add some settings verification in the session constructor, but
having an assert with brief explanation about what's wrong should already be
much better than nothing.
2015-05-19 12:42:07 +05:00
Sergey Sharybin
f868be6295 Cycles: Check for whether update/write callbacks are set prior to calling them
This changes the progressive refine part, regular update was already checking
for whether callbacks are set.
2015-05-19 12:42:07 +05:00
Sv. Lockal
88acb3c599 Fix T44707: cycles border render regression 2015-05-18 11:37:19 +10:00
Martijn Berger
3ed009af96 Change behavior of cycles xml to conform to the spec: "Each XML document has exactly one single root element" 2015-05-17 23:41:38 +02:00
Thomas Dinges
105b87a3f7 Cycles: Enable advanced shading on AMD / OpenCL.
That is needed for Motion Blur and Render Passes to work properly.
I hope there are no nasty side effects, but we need to test this.
2015-05-17 19:29:33 +02:00
Thomas Dinges
dae566894a Cycles / OpenCL: Enable Camera Motion and Hair for AMD.
Only enabled for the Experimental kernel though, so the feature set must
be changed in the UI to use the features.
2015-05-17 18:46:25 +02:00
Thomas Dinges
14c2bc53c0 Cleanup: Typos, typos everywhere. :D 2015-05-17 18:32:31 +02:00
Thomas Dinges
effb912061 Cycles Standalone: Expose various light settings. 2015-05-17 12:36:42 +02:00
Thomas Dinges
347843f6fe Cycles Standalone: Update help screen. 2015-05-17 12:10:30 +02:00
Campbell Barton
847ec075eb Cleanup: pep8 2015-05-17 17:26:01 +10:00
Campbell Barton
daeb3069cf Cleanup: typos 2015-05-17 16:09:32 +10:00
Campbell Barton
31e96cbf96 Cleanup: style, spelling 2015-05-15 23:38:53 +10:00
Thomas Dinges
7c06190882 Cycles: Make animated seed a builtin feature.
For animations, you often want an animated render seed (noise pattern).

This could be done by e.g. setting a driver on the seed value.
Now it's a little checkbox, that can be enabled.

The animated seed is based on the current Blender frame and
the seed value itself. Simply enabling it, will already result in an animated
seed (different on each Blender frame), but it can be randomized further
by setting a different seed value.

Disabled per default, so no backward compatibility break.

Differential Revision: https://developer.blender.org/D1285
2015-05-15 13:54:59 +02:00
Sergey Sharybin
c86a6f3efb Cycles: Enable CMJ for Intel/NVidia experimental split kernels
It is still disabled for AMD devices since can't test if it works fine
on this hardware.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
c2b9f78415 Cycles: Pass __KERNEL_EXPERIMENTAL__ to OpenCL split kernels
Experimental feature set is currently unavailable for megakernel, it'll
require some changes to the cache system to distinguish cached regular
kernels from cached experimental kernels.

Currently unused, but some features will be enabled soon.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
2ab909a88c Cycles: Make experimental kernel build option more generic
Previously it was explicitly mentioning it's NVidia kernel related option,
but in fact it's also handy for the OpenCL kernel.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
c9e8888f87 Cycles: Disable bake OpenCL kernel for NVidia devices prior to sm_30
Driver fails to compile kernel in reasonable time for those devices here,
so for easier testing of the OpenCL split kernel work disabling bake kernel
for now.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
960d7df56f Cycles: Pass device compute capabilities to kernel via build options
This way it's possible to do device-selective feature disabling/enabling.
Currently only supported for NVidia devices via OpenCL extension.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
03f9d5a4cf Cycles: Cleanup, move build options string calculation into the device class
This way it's easier to access platform name, device ID and other stuff which
might be needed to define build options.
2015-05-15 13:22:47 +05:00
Julian Eisel
a92d8a34a8 Add material reorder buttons for Cycles as well 2015-05-15 01:25:03 +02:00
Sergey Sharybin
3c10ec96b5 Cycles: Enable object motion blur on Intel OpenCL platform
This required allocating some memory related to object transform needed
by ShaderData and currently it is done for all the platforms. Since we're
targeting full feature-complete platforms this is rather acceptable at
this point and in the future we'll do selective NO_HAIR/NO_SSS/NO_BLUR
kernels.

This is experimental still and in fact there're some major issues on
NVidia platform and it's not really clear if it's a bug in compiler,
some uninitialized variable or other kind of issue.
2015-05-15 00:48:12 +05:00
Sergey Sharybin
03565218d5 Cycles: Various fixes
Some stupid fixes like spaces around operator and missing semicolon,
plus fix for wrong detection of ShaderData SOA size. That was harmless
since there's only one closure array, but still better to fix this.
2015-05-15 00:42:05 +05:00
Sergey Sharybin
f6c6dd44de Cycles: Remove meaningless ifdef checks for features in device_opencl
This file was actually checking for features enabled on CPU and surely all
of them were enabled, so removing them does not cause any difference.

Ideally we'll need to do runtime feature detection and just pass some stuff
as NULL to the kernel, or maybe also have variadic kernel entry points which
is also possible quite easily.
2015-05-14 23:44:19 +05:00
Sergey Sharybin
5c34266383 Cycles: Enable camera motion blur in split kernel for Intel/NVidia
It's good for testing and seems to work quite reliably here.

This is probably not totally cheap in terms of performance, but this we
could solve quite easily by selective kernel compilation once other
things are tested/proved to be reliable.
2015-05-14 23:35:19 +05:00
Sergey Sharybin
0a60c7d8ee Cycles: Fix missing camera-in-volume update when using certain render layers configurations 2015-05-14 19:08:13 +05:00
Sergey Sharybin
3d3d805b64 Cycles: Prepare code for OpenCL camera/motion blur
The kernels are now compiling just fine, but there're some issues
during rendering. This is still to be investigated.
2015-05-14 18:48:56 +05:00
Sergey Sharybin
5a63edb929 Cycles: Use special _auto versions of transform function in motion blur code
Doing this as a separate commit so it's easier to revert in the future, once
OpenCL 2.0 is becoming our requirement.
2015-05-14 18:48:56 +05:00
Sergey Sharybin
33439626f1 Cycles: Add transformation functions with specified addrspace
This is required for OpenCL prior to 2.0 and those functions will become
handy when working on camera/motion blur support in split kernel.
2015-05-14 18:48:56 +05:00
Sergey Sharybin
79aa50dc53 Cycles: Enable hair for split kernels when using Intel or NVidia drivers
Apart from simply enabling these features, the needed changes to the code were made.
Technical change, replacing SD access from "simple" structure to SOA.
2015-05-14 18:48:56 +05:00
Thomas Dinges
0e80eb82e0 Cycles: Resize light_data after possible light removal. 2015-05-14 01:13:40 +02:00
Thomas Dinges
67eb2c7897 Cycles: Remove Emission shaders from the graph if color or strength is 0. 2015-05-14 01:13:40 +02:00
Thomas Dinges
fc31bae66f Cleanup: Avoid temp variable in portal sampling code. 2015-05-13 19:54:52 +02:00
Sergey Sharybin
93867ae549 Cycles: Cleanup: use generic utility function to set kernel arguments 2015-05-13 19:56:24 +05:00
Sergey Sharybin
51a6bc8faa Cycles: Inline sizeof of elements needed for the split kernel
No need to store them in the class, they're unlikely to be changed
and if they do change we're in big trouble anyway.

A more appropriate approach would then be to typedef these things in
kernel_types.h, but still use inlined sizeof().
2015-05-13 19:56:24 +05:00
Thomas Dinges
0a6e32173e Cleanup / Cycles: De-Duplicate Portal data fetch and side check. 2015-05-13 16:05:30 +02:00
Sergey Sharybin
f0f481031c Fix T44616: Cycles crashes loading 42k by 21k textures
Simple integer overflow issue.

TODO(sergey): Check on CPU cubic sampling, it might also need size_t.
2015-05-12 18:48:55 +05:00
Sv. Lockal
c7bccb30bf Cycles: check for F16C support with __cpuid, as we do for BMI and BMI2 2015-05-11 15:49:36 +00:00
Antony Riakiotakis
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
Antony Riakiotakis
e38f914421 Cycles: use vertex buffers when possible to draw tiles on the screen.
Not terribly necessary in this case, since we are just drawing a quad,
but makes blender overall more GL 3.x core ready.
2015-05-11 16:28:41 +02:00
Antony Riakiotakis
5588a51c9c Cycles OpenGL: Don't use full matrix transform when we can just use
simple addition.
2015-05-11 13:10:19 +02:00
Sv. Lockal
d55868c3b2 Cycles: And yet another compilation fix after half-float commit for clang.
Suggested by Brecht, tested with gcc > 4.4 and Clang
2015-05-10 19:32:32 +00:00
Sv. Lockal
3ec168465d Cycles: fix compilation on 32-bit Windows for half-floats
Reported by IRC user HG1.
2015-05-10 19:06:43 +00:00
Sv. Lockal
8db2a9a352 Cycles: Add -mf16c for previous commit for Scons
Thanks to Dingto for noticing!
2015-05-10 17:51:04 +00:00
Sv. Lockal
2ec221aa28 Cycles: Use native float->half conversion instructions for Haswell CPUs.
This makes OCIO viewport color correction a little bit faster (about -0.5s for 100 samples)
Also set max half float value to 65504.0 to conform with IEEE 754.
2015-05-10 16:35:51 +00:00
Sergey Sharybin
3a2c0ccdd0 Cycles: Correction to opencl whitelist check
Was using platform as a device id accidentally.
2015-05-10 20:02:06 +05:00
Thomas Dinges
a47ade34c2 Cycles: Fix tiny greying out inconsistency for Volume settings. 2015-05-10 12:59:18 +02:00
Thomas Dinges
e8be170e79 Cycles: Do not show Branched Path integrator for OpenCL.
Branched Path is not supported, neither in the Split nor Megakernel.
2015-05-10 12:59:18 +02:00
Sergey Sharybin
583fd3af65 Cycles: Fix typo in global space version of normal transform
It was using direction transform, which is obviously wrong.
2015-05-10 00:53:32 +05:00
Sergey Sharybin
136d7a4f62 Cycles: Only whitelist AMD GPU devices in the OpenCL section
Only those ones are priority for now, all the rest are still testable
if CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment
variables are set.
2015-05-09 23:40:26 +05:00
Sergey Sharybin
2840a5de8f Cycles: Workaround for AMD compiler crashing building the split kernel
It's a bug in the compiler, but it's nice to have a working kernel until
that bug is fixed.
2015-05-09 19:56:38 +05:00
George Kyriazis
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related to the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More features will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
Sergey Sharybin
f680c1b54a Cycles: Communicate number of closures and nodes feature set to the device
This way device can actually make a decision of how it can optimize the kernel
in order to make it most efficient.
2015-05-09 19:28:00 +05:00
Sergey Sharybin
6fc1669679 Cycles: Initial work towards selective nodes support compilation
The goal is to be able to compile kernel with nodes which are actually needed
to render current scene, hence improving performance of the kernel.

The idea is:

- Have a few node groups, starting with a group which contains nodes which are used
  really often, and then couple of groups which will be extension of this one.

- Have feature-based nodes disabling, so it's possible to disable nodes related
  to features which are not used with the currently used nodes group.

This commit only lays down needed routines for this approach, actual split will
happen later after gathering statistics from bunch of production scenes.
2015-05-09 19:22:16 +05:00
Sergey Sharybin
17c95d0a96 Cycles: Add utility function to count maximum number of closures used by session
This will be used by split kernel in order to compile most optimal kernel.

Maximum number of closures is actually being cached in the session, so viewport
rendering will not trigger kernel re-loading when number of closures goes down.
2015-05-09 19:17:49 +05:00
Sergey Sharybin
5068f7dc01 Cycles: Add utility function to graph to query number of closures used in it
Currently unused but will be needed soon for the split kernel work.
2015-05-09 19:13:32 +05:00
Sergey Sharybin
b3299bace0 Cycles: Pass requested tile size to the device via device task
This is currently unused but crucial for things like calculating amount of
device memory required to deal with the tasks.

Maybe not really best place to store it, but consider it good enough for now.
2015-05-09 19:09:07 +05:00
Sergey Sharybin
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had experimental flag passed to device's load_kernel() which
was all fine. But since we're going to have some extra parameters passed there
it makes sense to wrap them into a single struct, which will make it easier to
pass stuff around.
2015-05-09 19:05:49 +05:00
Sergey Sharybin
d69c80f717 Cycles: Presumably correct workaround for addrspace in camera motion blur 2015-05-09 19:04:19 +05:00
Sergey Sharybin
c9133778cf Cycles: Add CPU compat headers to some of the OSL implementation files
This header was already included in some of the implementation files,
and this change is needed for some upcoming changes in the way how kernel_types.h
works.
2015-05-09 19:04:16 +05:00
Sergey Sharybin
7eac672e4f Cycles: Set default closure values to some of the nodes
Previously it was only set at compilation time which is all fine but does
not let us to check which closure the node corresponds to prior to the
compilation.
2015-05-09 19:04:09 +05:00
Thomas Dinges
900fc43bb4 Cleanup: Remove unused ray type flags.
They were added for completeness, but it seems we don't need them.
2015-05-08 12:10:26 +02:00
Sergey Sharybin
9ca2b76a9f Cycles: Cleanup, make it more clear what endif closes what ifdef 2015-05-07 15:02:43 +05:00
Campbell Barton
165598e49e Correct typo: ifdef'd now, but obviously wrong 2015-05-07 10:12:12 +10:00
Sergey Sharybin
b45ad4b214 Cycles: Fix for wrong clamp usage in fast math 2015-05-06 00:01:40 +05:00
Thomas Dinges
d01b226870 Cleanup: Remove leftover from Distorted Noise node in XML reader. 2015-05-05 10:38:45 +02:00
Sv. Lockal
7201f6d14c Cycles: Use curve approximation for blackbody instead of lookup table
Now we calculate color in range 800..12000 using an approximation a/x+bx+c for R and G and ((at + b)t + c)t + d for B.
Max absolute error for RGB for non-lut function is less than 0.0001, which is enough to get the same 8 bit/channel color as for OSL with a noticeable performance difference.
However there is a slight visible difference between previous non-OSL implementation because of lookup table interpolation and offset-by-one mistake.
The previous implementation gave black color outside of soft range (t > 12000), now it gives the same color as for 12000.

Also blackbody node without input connected is being converted to value input at shader compile time.

Reviewers: dingto, sergey

Reviewed By: dingto

Subscribers: nutel, brecht, juicyfruit

Differential Revision: https://developer.blender.org/D1280
2015-05-05 06:11:54 +00:00
Campbell Barton
e59bd19fa7 Cleanup: style & const's 2015-05-05 05:19:49 +10:00
Thomas Dinges
66f96e555c Cycles: Fix copy / paste mistake in XML reader. 2015-05-04 14:31:20 +02:00
Sergey Sharybin
b7d0ff0ad6 Separate scene simplification into viewport and render
This way it is possible to have viewport simplification bumped all the way up,
making viewport really responsive but still have final render to use highest
subdivision possible.

Reviewers: lukastoenne, campbellbarton, dingto

Reviewed By: campbellbarton, dingto

Subscribers: dingto, nutel, eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D1273
2015-05-04 16:31:10 +05:00
Sergey Sharybin
16794f908f Cycles: Fix possible uninitialized XML read state which might cause crashes 2015-04-30 15:46:09 +05:00
Sergey Sharybin
41d817f15d Fix T44548: Cycles Tube Mapping off / not compatible with BI
Was a typo in original implementation, probably a result of some code reshuffle
happened for optimization reasons.
2015-04-30 14:27:16 +05:00
Thomas Dinges
4eab0e72b3 Cleanup: Update some comments and add ToDo. 2015-04-29 23:56:46 +02:00
Thomas Dinges
b3def11f5b Cycles: Record all possible volume intersections for SSS and camera checks
This replaces sequential ray moving followed with scene intersection with
single BVH traversal, which gives us all possible intersections.

Only implemented for CPU, due to qsort and a bigger memory usage on GPU
which we rather avoid. GPU still uses the regular bvh volume intersection code, while CPU now uses the new code.

This improves render performance for scenes with:
a) Camera inside volume mesh
b) SSS mesh intersecting a volume mesh/domain

In simple volume files (not much geometry) performance is roughly the same
(slightly faster). In files with a lot of geometry, the performance
increase is larger. bmps.blend with a volume shader and camera inside the
mesh, it renders ~10% faster here.

Patch by Sergey and myself.

Differential Revision: https://developer.blender.org/D1264
2015-04-29 23:31:06 +02:00
Sergey Sharybin
7aab5c6ca9 Cycles: Fix wrong termination criteria in SSS volume stack update
Another issue spotted with Thomas.
2015-04-30 01:20:17 +05:00
Sergey Sharybin
e5f3193df3 Cycles: Fix wrong order in object flags calculations
Object flags are depending on bounding box which is only available after
mesh synchronization.

This was broken since 7fd4c44 which happened quite close to the release
and oddly enough was not spotted by anyone. Render test is coming for this.

Was spotted by Thomas Dinges while working on another patch.
2015-04-30 01:09:48 +05:00
Sergey Sharybin
d6b28bbb1d Cycles: Fix crashes when loading cache created with pre-leaf split builds 2015-04-29 15:48:49 +05:00
Sergey Sharybin
2e91bcfb9d Fix T44544: Cached BVH is broken since BVH leaf split
Still need to solve issues with reading old cache with new builds.
2015-04-29 15:38:07 +05:00
Thomas Dinges
5e423775da Cleanup: Move Cycles volume stack update for subsurface into kernel_volume.h. 2015-04-28 11:20:27 +02:00
Thomas Dinges
58a2b10a65 Cycles: Initialize portal variable directly, so we can avoid the one NULL check. 2015-04-27 23:12:53 +02:00
Lukas Stockner
f478c2cfbd Cycles: Added support for light portals
This patch adds support for light portals: objects that help sampling the
environment light, therefore improving convergence. Using them for other
lights in a unidirectional pathtracer is virtually useless.

The sampling is done with the area-preserving code already used for area lamps.
MIS is used both for combination of different portals and for combining portal-
and envmap-sampling.

The direction of portals is considered, they aren't used if the sampling point
is behind them.

Reviewers: sergey, dingto, #cycles

Reviewed By: dingto, #cycles

Subscribers: Lapineige, nutel, jtheninja, dsisco11, januz, vitorbalbio, candreacchio, TARDISMaker, lichtwerk, ace_dragon, marcog, mib2berlin, Tunge, lopataasdf, lordodin, sergey, dingto

Differential Revision: https://developer.blender.org/D1133
2015-04-28 01:30:16 +05:00
Sergey Sharybin
ae7d84dbc1 Cycles: Use native saturate function for CUDA
This is more of a workaround for CUDA optimizer which can't optimize clamp(x, 0, 1)
into a single instruction and uses 4 instructions instead.

Original patch by @lockal with own modification:

  Don't make changes outside of the kernel. They don't make any difference
  anyway and term saturate() has a bit different meaning outside of kernel.

This gives around 2% of speedup in Barcelona file, but in more complex shader
setups with lots of math nodes with clamping speedup could be much nicer.

Subscribers: dingto

Projects: #cycles

Differential Revision: https://developer.blender.org/D1224
2015-04-28 00:38:32 +05:00
Thomas Dinges
bc160d8a85 Cleanup: Code style. 2015-04-26 00:42:26 +02:00
Thomas Dinges
8dd055cd47 Cleanup: Update Lookup table comments. 2015-04-26 00:06:38 +02:00
Lukas Stockner
60c5a2f2d2 Cycles: Add Mirror ball mapping to camera panorama options
The projection code was already in place, so this just exposes the option.

Differential Revision: https://developer.blender.org/D1079
2015-04-25 23:51:56 +02:00
Campbell Barton
b82d571c85 Cleanup: style 2015-04-21 15:53:32 +10:00
Sergey Sharybin
828abaf11c Cycles: Split BVH nodes storage into inner and leaf nodes
This way we can get rid of inefficient memory usage caused by the BVH boundbox
part being unused by leaf nodes but still being allocated for them. This
split saves 6 float4 values per leaf node for QBVH and 3 float4 values
per leaf node for regular BVH.

This translates into the following memory savings for 01.01.01.G rendered
without hair:

                   Device memory size   Device memory peak   Global memory peak
Before the patch:  4957                 5051                 7668
With the patch:    4467                 4562                 7332

The measurements are done against current master. Still need to run speed tests
and it's hard to predict if it's faster or not: on the one hand leaf nodes are
now much more coherent in cache, on the other hand they're not so much coherent
with regular nodes anymore.
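
The per-leaf figures above turn into a byte count with simple arithmetic. The
sketch below is illustrative only: the function name and the leaf count used in
the usage note are assumptions, and only the 6 and 3 float4 values per leaf come
from this commit message.

```cpp
#include <cstddef>

// Hypothetical helper: bytes saved by not allocating bounding-box float4
// values for leaf nodes. Per the commit message, float4_per_leaf is 6 for
// QBVH and 3 for regular BVH.
inline std::size_t leaf_bytes_saved(std::size_t num_leaves,
                                    std::size_t float4_per_leaf)
{
    const std::size_t float4_size = 4 * sizeof(float);
    return num_leaves * float4_per_leaf * float4_size;
}
```

For example, with a made-up count of one million leaf nodes and 4-byte floats,
leaf_bytes_saved(1000000, 6) gives 96000000 bytes for QBVH.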

Reviewers: brecht, juicyfruit

Subscribers: venomgfx, eyecandy

Differential Revision: https://developer.blender.org/D1236
2015-04-20 17:29:51 +05:00
Sergey Sharybin
cd44449578 Cycles: Synchronize images after building mesh BVH
This way memory overhead caused by the BVH building is not so visible and peak
memory usage will be reduced.

Implementing this idea is not so straightforward actually, because we need to
synchronize images used for true displacement before meshes. Detecting whether
an image is used for true displacement is not so straightforward, so for now
all displacement types will synchronize images used for them.

Such change brings memory usage from 4.1G to 4.0G with the 01_01_01_D scene
from gooseberry. With 01_01_01_G scene it's 7.6G vs. 6.8G (before and after
the patch).

Reviewers: campbellbarton, juicyfruit, brecht

Subscribers: eyecandy

Differential Revision: https://developer.blender.org/D1217
2015-04-20 17:29:51 +05:00
Dalai Felinto
394c5318c6 Bake-API: reduce memory footprint when baking more than one object (Fix T41092)
Combine all the highpoly pixel arrays into a single array with a lookup
object_id for each of the highpoly objects.

Note: This changes the Bake API, external engines should refer to the
bake_api.c for the latest API.

Many thanks to Sergey Sharybin for the complete review, suggested
changes and feedback. (you rock!)

Reviewers: sergey

Subscribers: pildanovak, marcclintdion, monio, metalliandy, brecht

Maniphest Tasks: T41092

Differential Revision: https://developer.blender.org/D772
2015-04-17 12:25:37 -03:00
Campbell Barton
d1f9fcaabc Cleanup: style 2015-04-13 22:08:51 +10:00
Sergey Sharybin
35812e65f4 Cycles: Fix compilation error on windows after recent logging changes 2015-04-10 22:35:10 +05:00
Sergey Sharybin
aac0df956f Cycles: Cleanup, make more clear what camera utility functions are private/public 2015-04-10 16:25:35 +05:00
Sergey Sharybin
e073562f80 Cycles: Make transform from viewplane a generic utility function 2015-04-10 15:53:14 +05:00
Sergey Sharybin
2f5dd83759 Cycles: Add some statistics logging
Covers number of entities in the scene (objects, meshes etc), also reports
sizes of textures being allocated.
2015-04-10 15:37:49 +05:00
Sergey Sharybin
7ea4163e1e Cycles: Fix BVH counter on mesh updates 2015-04-09 22:23:59 +05:00
Sergey Sharybin
cca4405437 Cycles: Fix wrong render result in certain configuration of render layer's surface/hair
There was some synchronization missing in cases when only one of those settings
was disabled.

Also added a render test for such configurations now.
2015-04-09 21:22:48 +05:00
Sergey Sharybin
bf11e362c5 Fix T44046: Cycles speed regression in 2.74 (CPU only)
Issue was caused by MSVC not being able to optimize some code out in the same
way as GCC/Clang do, so now those parts of the code are explicitly unfolded in
order to help compilers out.

This makes speed loss much less drastic on my laptop. That's probably as good
as we can do with MSVC without investing an infinite amount of time trying
to work around the optimizer.
2015-04-08 18:47:25 +05:00
Sergey Sharybin
7621ff7e55 Cycles: Code cleanup, indentation. Was wrong in the multiview commit 2015-04-08 15:35:01 +05:00
Sergey Sharybin
09a746b857 Cycles: Cleanup, typos 2015-04-08 01:15:38 +05:00
Sergey Sharybin
858f54f16e Cycles: Cleanup, indentation 2015-04-07 22:41:08 +05:00
Sergey Sharybin
e2354e64d2 Cycles: Cleanup, spaces around assignment operator
Did some bad spacing in recent commits, better to get rid of it so
it does not confuse those who're working on the sources.
2015-04-07 00:25:54 +05:00
Sergey Sharybin
c1d8ddacaf Cycles: Avoid doing paranoid checks in filepath of builtin images
Originally we thought it's needed in order to distinguish builtin file from
filename which starts with '@', but the filepath is actually full path there
and it's unlikely to have file system where '@' is a proper root character.

Surprisingly this does not give visible speed differences, but it's still
nice to get rid of redundant check.
2015-04-07 00:11:47 +05:00
Sergey Sharybin
7c19239bf9 Cycles: Support builtin 3d textures with OSL backend 2015-04-06 23:29:29 +05:00
Sergey Sharybin
d0aae79505 Cycles: More instant feedback on progressive rendering for first sample
Main purpose of this change is to make material preview appearing more
instant after the shader tweaks.
2015-04-06 19:28:25 +05:00
Sergey Sharybin
b5f58c1ad9 Cycles: Experiment with making previews more interactive
There were two major problems with the interactivity of material previews:

- Beckmann tables were re-generated on every material tweak.
  This is because preview scene is not set to be persistent, so re-triggering
  the render leads to the full scene re-sync.

- Images could take rather noticeable time to load with OIIO from the disk
  on every tweak.

This patch addresses these two issues in the following way:

- Beckmann tables are now static on CPU memory.

  They're couple of hundred kilobytes only, so wouldn't expect this to be
  an issue. And they're needed for almost every render anyway.

  This actually also makes blackbody table to be static, but it's even smaller
  than beckmann table.

  Not totally happy with this approach, but others seem to complicate things
  quite a bit with all this render engine life time and so..

- For preview rendering all images are considered to be built-in. This means
  instead of OIIO which re-loads images on every re-render they're coming
  from ImBuf cache which is fully manageable from blender side and unused
  images gets freed later.

  This would make it impossible to have mipmapping with OSL for now, but we'll
  be working on that later anyway and don't think mipmaps are really so crucial
  for the material preview.

  This seems to be a better alternative to making the preview scene persistent,
  because of much better memory control from blender side.

Reviewers: brecht, juicyfruit, campbellbarton, dingto

Subscribers: eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D1132
2015-04-06 19:22:17 +05:00
Dalai Felinto
d5f1b9c222 Multi-View and Stereo 3D
Official Documentation:
http://www.blender.org/manual/render/workflows/multiview.html

Implemented Features
====================
Builtin Stereo Camera
* Convergence Mode
* Interocular Distance
* Convergence Distance
* Pivot Mode

Viewport
* Cameras
* Plane
* Volume

Compositor
* View Switch Node
* Image Node Multi-View OpenEXR support

Sequencer
* Image/Movie Strips 'Use Multiview'

UV/Image Editor
* Option to see Multi-View images in Stereo-3D or its individual images
* Save/Open Multi-View (OpenEXR, Stereo3D, individual views) images

I/O
* Save/Open Multi-View (OpenEXR, Stereo3D, individual views) images

Scene Render Views
* Ability to have an arbitrary number of views in the scene

Missing Bits
============
First rule of Multi-View bug report: If something is not working as it should *when Views is off* this is a severe bug, do mention this in the report.

Second rule is, if something works *when Views is off* but doesn't (or crashes) when *Views is on*, this is an important bug. Do mention this in the report.

Everything else is likely small todos, and may wait until we are sure none of the above is happening.

Apart from that there are these known issues:
* Compositor Image Node poorly working for Multi-View OpenEXR
(this was working perfectly before the 'Use Multi-View' functionality)
* Selecting camera from Multi-View when looking from camera is problematic
* Animation Playback (ctrl+F11) doesn't support stereo formats
* Wrong filepath when trying to play back animated scene
* Viewport Rendering doesn't support Multi-View
* Overscan Rendering
* Fullscreen display modes need to warn the user
* Object copy should be aware of views suffix

Acknowledgments
===============
* Francesco Siddi for the help with the original feature specs and design
* Brecht Van Lommel for the original review of the code and design early on
* Blender Foundation for the Development Fund to support the project wrap up

Final patch reviewers:
* Antony Riakiotakis (psy-fi)
* Campbell Barton (ideasman42)
* Julian Eisel (Severin)
* Sergey Sharybin (nazgul)
* Thomas Dinges (dingto)

Code contributors of the original branch in github:
* Alexey Akishin
* Gabriel Caraballo
2015-04-06 10:40:12 -03:00
Sergey Sharybin
74df307ca4 Cycles: Free unused image buffers when rendering with locked interface
It is still possible to free a bit more memory by detecting builtin images
which are not used by shaders, but that's not going to improve memory usage
that much to bother about this now.

Such change brings peak memory usage from 4.1GB to 3.4GB when rendering
01_01_01_D layout scene from the Gooseberry project. Mainly because of
freeing memory used by rather huge environment map in the viewport.

Reviewers: campbellbarton, juicyfruit

Subscribers: eyecandy

Differential Revision: https://developer.blender.org/D1215
2015-04-06 17:47:08 +05:00
Sergey Sharybin
3639a70eae Fix T44222: Crash using pointiness attribute for volume shaders
This attribute is not really supported for volumes, so it gets converted to
constant 0 at shader compile time.

TODO: We should consider doing the same for tangent attribute in order to save
some annoying checks at tracing time.
2015-04-06 14:11:28 +05:00
Sergey Sharybin
a9bb8d8a73 Cycles: de-duplicate fast/approximate erf function calculation
Our own implementation in fact has the same performance as the one in fast_math from
OpenShadingLanguage, but the implementation from fast_math uses an explicit madd
function, which increases the chance of the compiler deciding to use intrinsics.
2015-04-06 12:49:44 +05:00
Sergey Sharybin
ab2d05d958 Fix T44269: Typo in volume_attribute_float:geom_volume.h
Was rather harmless typo since we either pass both dx,dy or pass both NULL.
2015-04-05 19:07:45 +05:00
Sergey Sharybin
b06962fcfe Cycles: Avoid using lookup table for Beckmann slopes on GPU
This patch is based on some work done in D788 and re-formulation from Beckmann
implementation in OpenShadingLanguage.

Skipping texture lookup helps a lot on GPUs where it's more expensive to access
texture memory than to do some extra calculation in threads.

CPU code still uses lookup-table based approach since this seems to be still
faster (at least on computers i've got access to).

This change gives about 2% speedup on BMW scene with GTX560TI.
2015-04-05 19:07:45 +05:00
Sergey Sharybin
252b36ce77 Cycles: Remove unused Beckmann slope sampling code
It did not preserve stratification too well and the lookup-table approach was
working much better. There are now also some more interesting formulations
from Wenzel and OpenShadingLanguage which should work better than the old code.
2015-04-05 19:07:44 +05:00
Thomas Dinges
e5392069cc Cleanup: Typo fix in HSV code. 2015-04-04 07:50:09 +02:00
Sergey Sharybin
fd2ea3a909 Cycles: Make guarded allocator happy about strict C++ flags 2015-04-02 15:51:43 +05:00
Sergey Sharybin
f1494edf78 Cycles: Make SSS intersection closer to regular triangle intersection 2015-04-01 21:20:04 +05:00
Sergey Sharybin
394b947a50 Cycles: Remove unused direction from triangle intersection functions
This argument was unused and got nicely optimized out. But once it
starts being used, register pressure gets really high,
slowing down the render.
2015-04-01 21:08:12 +05:00
Sergey Sharybin
af399884e1 Fix T44113: Ashikhmin-Shirley distribution of glossy shader at 0 roughness causes artifacts when background uses MIS
Was a division by zero error, solved in the same way as beckmann/ggx
deals with small roughness values.
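
The commit message does not spell out how small roughness values are handled.
One common way to avoid such a division by zero is to clamp the roughness to a
small positive minimum before dividing, sketched below. The function name and
the 1e-4 threshold are assumptions for illustration, not values taken from the
Cycles sources.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical sketch: guard a 1/roughness term against zero roughness.
// The 1e-4f lower bound is an assumed value, not the one used by Cycles.
inline float safe_inv_roughness(float roughness)
{
    const float min_roughness = 1e-4f;
    return 1.0f / std::max(roughness, min_roughness);
}
```

With the clamp in place a roughness of exactly 0 yields a large but finite
value instead of infinity.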
2015-04-01 14:21:21 +05:00
Sergey Sharybin
79918e0577 Cycles: Avoid float/int conversion in few places 2015-03-31 19:52:14 +05:00
Sergey Sharybin
7da4c2637d Cycles: Fix typo in distance heuristic for shadow rays
It's not that bad because this typo could only cause inefficient
BVH traversal, leading to higher render times. It was not
causing render artifacts.
2015-03-31 19:52:14 +05:00
Sergey Sharybin
dd0604c606 Fix T44193: Hair intersection with duplis causes flickering
It was an issue with what bounds to use for BVH node during construction.

Also corrected case when there are all 4 primitive types in the range and
also there're objects in the same range.
2015-03-31 00:24:43 +05:00
Antony Riakiotakis
bfe63bbfc4 Grey out high quality depth of field when it's not supported by GPU 2015-03-30 12:49:05 +02:00
Sergey Sharybin
b663f1f1cf Cycles: Correction to previous commit: non-msvc compilers also should use nullptr 2015-03-30 15:17:09 +05:00
Sergey Sharybin
131912dc73 Cycles: Fix compilation error with MSVC after recent C++11 changes 2015-03-30 15:06:45 +05:00
Sergey Sharybin
afbc45ed93 Cycles: Attempt to fix osl+scons compilation
Defines (and other cflags) are not inherited by scons to the subdirectories,
need to take care of them in all nested SConscripts.
2015-03-30 14:00:03 +05:00
Martijn Berger
3204aff6d0 Fix compilation of cycles network server when logging is enabled 2015-03-29 22:22:53 +02:00
Martijn Berger
f01456aaa4 Optionally use c++11 stuff instead of boost in cycles where possible. We do and continue to depend on boost though
Reviewers: dingto, sergey

Reviewed By: sergey

Subscribers: #cycles

Differential Revision: https://developer.blender.org/D1185
2015-03-29 22:12:40 +02:00
Sergey Sharybin
e1bcc2d779 Cycles: Code cleanup, sky model
For as long as code stays in official folders it should follow
our code style.
2015-03-28 00:28:37 +05:00
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn't cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
3d305b5a37 Cycles: Code cleanup, make strict flags happy about disabled OSL 2015-03-27 19:10:36 +05:00
Sergey Sharybin
6cd82dbf57 CMake: Enable strict flags for C++ 2015-03-27 18:23:31 +05:00
Sergey Sharybin
585dd26120 Cycles: Code cleanup, prepare for strict C++ flags 2015-03-27 18:23:31 +05:00
Jens Verwiebe
9fc1a29de3 Fix 2 typos ( shakin' hands ) 2015-03-25 16:56:51 +01:00
Sergey Sharybin
22dfb50622 Fix T44128: Ray visibility only enables diffuse if glossy is also enabled
Issue was caused by accident in c8a9a56 which not only disabled glossy
reflection if Glossy visibility is disabled, but also Diffuse reflection.

Quite safe and should go to final release branch.
2015-03-25 14:53:20 +05:00
Sergey Sharybin
8d0b104f43 Fix T44064: Reroute two-node loop crash
Issue was caused by cycles in shader graph confusing its
simplification stage. Now we're ignoring links which are
marked as invalid from blender side so we don't run into
such cycles and keep graph code simple.
2015-03-25 13:46:59 +05:00
Sergey Sharybin
87cff57207 Fix T44123: Cycles SSS renders black in recent builds
Issue was introduced in 01ee21f where I didn't notice the *_setup()
function only doing partial initialization, with some of the parameters
expected to be initialized by the caller.

This was hitting only some setups, so tests with benchmark scenes
didn't reveal the issue. Now it should all be fine.

This is to go to the 2.74 branch and we actually might re-AHOY.
2015-03-25 02:33:49 +05:00
Antony Riakiotakis
c48ebb44ae Tidy up the user interface for depth of field based on feedback by
NudelZ on irc, thanks!
2015-03-23 12:48:19 +01:00
Sergey Sharybin
ed7e593a4b Fix T43926: Volume scatter: intersecting objects GPU rendering artifacts
Fix T44007: Cycles Volumetrics: block artifacts with overlapping volumes

The issue was caused by uninitialized parameters of some closures, which
led to unpredictable behavior of shader_merge_closures().
2015-03-23 12:48:33 +05:00
Sergey Sharybin
919a665497 Cycles: Avoid memcpy of intersecting memory
Could happen when assignment happens to self during sorting.
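
For context, memcpy() has undefined behavior when source and destination
overlap, and a self-assignment is the fully overlapping case. A minimal sketch
of one way to guard against it follows. The ItemSketch type and assign_item()
name are hypothetical, and the actual fix in this commit may look different.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical element type standing in for whatever is being sorted.
struct ItemSketch {
    int key;
    float value;
};

// Copy *src into *dst, skipping the copy when both point to the same
// object so memcpy() is never called with overlapping memory.
inline void assign_item(ItemSketch *dst, const ItemSketch *src)
{
    if (dst != src) {
        std::memcpy(dst, src, sizeof(ItemSketch));
    }
}
```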
2015-03-20 21:14:50 +05:00
Antony Riakiotakis
fd559ed94f Missed this last commit. 2015-03-19 21:10:41 +01:00
Antony Riakiotakis
ea12b87afd Fix cycles dof settings 2015-03-19 20:49:18 +01:00
Antony Riakiotakis
3e9947c4d4 Depth of field high quality:
A new checkbox "High quality" is provided in camera settings to enable
this. This creates a depth of field that is much closer to the rendered
result and even supports aperture blades in the effect, but it's more
expensive too. There are optimizations to do here since the technique is
very fill rate heavy.

People, be careful, this -can- lock up your screen if depth of field
blurring is too extreme.

Technical details:

This uses geometry shaders + instancing and is an adaptation of
techniques gathered from

http://bartwronski.com/2014/04/07/bokeh-depth-of-field-going-insane-

 http://advances.realtimerendering.com/s2011/SousaSchulzKazyan%20-
%20in%20Real-Time%20Rendering%20Course).ppt

TODOs:

* Support dithering to minimize banding.
* Optimize fill rate in geometry shader.
2015-03-19 15:18:14 +01:00
Sergey Sharybin
948bc66a00 Cycles: Improve readability of dumped graphs 2015-03-17 21:15:17 +05:00
Sergey Sharybin
a43d00d51e Cycles: Fix displacement code creating cyclic dependencies in graph
Bump result was passed to set_normal node and then set_node was connected
to all unconnected Normal inputs, including the one from original Bump
node, causing cycles.
2015-03-17 19:39:09 +05:00
Sergey Sharybin
60df4d10ff Fix T43999: MIS for environment broken after multi-threading commit
Typo in task start row calculation.
2015-03-16 13:31:27 +05:00
Sergey Sharybin
2ef2f085fb Add an option to mesh.calc_tessface() to get rid of polygons and loops
The purpose of this change is to add extra possibility to render engines and
export scripts to reduce peak memory footprint during their operation.

This new argument should be used with care since it'll leave the mesh in a format
not really compatible with blender, but it's ok to be used on temp meshes.

Unfortunately, it's hard to get scene where it'll show huge benefit because
in my tests with cycles peak memory is reached in MEM_printmemlist_stats().

However, in the file with sintel dragon it gives around 1gig of memory benefit
after removing the polys which would allow other heavy to compute stuff such as
hair (or even pointiness calculation) to not be a peak memory usage.

In any case, this change is nice to have IMO, and only means more parts of
scene export code should be optimized memory-wise.

Reviewers: campbellbarton

Differential Revision: https://developer.blender.org/D1125
2015-03-13 17:39:21 +05:00
Sergey Sharybin
0e18a56432 Cycles: Free caches used by the synchronized objects
The issue this commit addresses is that the particle system and particle modifier
will contain caches once a derived mesh was requested, and this cached data will
never be freed.

This could easily lead to unwanted memory peaks during synchronization stage
of rendering.

The idea is to have RNA function in object which would free caches which can't
be freed otherwise. This function is not intended to deal with derived final
since it might be used by other objects (for example by object with boolean
modifier).

This cache freeing is only happening in the background rendering and locked
interface rendering.

From quick tests with victor file this change reduces peak memory usage by
command line rendering by around 6% (1780MB vs. 1883MB). For rendering from
the interface it's about 12% (1763MB vs. 1998MB).

Reviewers: campbellbarton, lukastoenne

Differential Revision: https://developer.blender.org/D1121
2015-03-13 17:38:03 +05:00
Sergey Sharybin
63ea8dd156 Initial compilation support with C++11 featureset enabled
This commit makes some preliminary fixes and tweaks aimed to make blender
compilable with C++11 feature set. This includes:

- Build system attribute to enable C++11 featureset.

  It's for sure default OFF, but easy to enable to have a play around with
  it and make sure all the stuff is compilable before we go C++11 for real.

- Changes in Compositor to use non-named cl_int structure fields.

  This is because __STRICT_ANSI__ is defined by default by GCC and OpenCL
  does not use named fields in this case.

- Changes to TYPE_CHECK() related to the lack of typeof() in C++11

  This uses decltype() instead with some trickery to make sure returned type
  is not a reference.

- Changes for auto_ptr in Freestyle

  This actually conditionally switches between auto_ptr and unique_ptr since
  auto_ptr is deprecated in C++11. Seems to be not strictly needed but still
  nice to be ready for such an update anyway.

This is all based on changes from the depsgraph_refactor branch apart from the weird
changes which were made in order to support MinGW compilation. Those parts of
change would need to be carefully reviewed again after official move to gcc49
in MinGW.

Tested on Linux with GCC-4.7 and Clang-3.5, other platforms are not tested and
likely need some more tweaks.

Reviewers: campbellbarton, juicyfruit, mont29, lukastoenne, psy-fi, kjym3

Differential Revision: https://developer.blender.org/D1089
2015-03-13 16:47:40 +05:00
Sergey Sharybin
61eab743f1 Cycles: Optimization for CMJ in CUDA kernels
Two things:
- Use intrinsics for clz/ctz (ctz is implemented via ffs()).
- Use a faster sqrt() function whose precision is enough for
  integer values.
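
The relation between ctz and ffs mentioned above is that ffs returns the
1-based index of the lowest set bit, so for non-zero input ctz(x) = ffs(x) - 1.
A portable sketch follows. The function names are hypothetical, and the loop
merely stands in for a hardware intrinsic such as CUDA's __ffs().

```cpp
#include <cstdint>

// Portable stand-in for a find-first-set intrinsic: returns the 1-based
// index of the lowest set bit, or 0 when x is zero.
inline int ffs_sketch(std::uint32_t x)
{
    if (x == 0) {
        return 0;
    }
    int index = 1;
    while ((x & 1u) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}

// Count trailing zeros via ffs. Only meaningful for x != 0.
inline int ctz_sketch(std::uint32_t x)
{
    return ffs_sketch(x) - 1;
}
```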
2015-03-13 12:38:14 +05:00
Thomas Dinges
3db0e1ef6a Cycles: Simplify volume light connect code. 2015-03-13 00:09:13 +01:00
Thomas Dinges
0ed914a194 Cleanup: Use differential helper class. 2015-03-12 23:35:01 +01:00
Sergey Sharybin
dce16d57dc Revert "Fix T43865: Cycles: Watertight rendering produces artifacts on a huge plane"
The fix was really flaky. During speed benchmarks I had an
abort() in the fallback block to be sure it never runs in production
scenes, but that affected the optimization as well. Without this
abort there's quite a bad slowdown of 5-7% on the renders even though
the Pluecker fallback was never run.

This is all weird, so for now reverting the change, which affects
all the production scenes, and will look into alternative fixes for
the original issue with precision loss on huge planes.

This reverts commit 9489205c5c0b9b432d02be4a3d0d15fc62ee6cb9.
2015-03-12 18:24:53 +05:00
Sergey Sharybin
13d443496c Partial fix for T43967: Background is wrong in 2.74
Was missing do-versions code after rotation order change in Cycles.

This is a regression and to be ported to the final release branch.
2015-03-12 18:24:53 +05:00
Thomas Dinges
064fa4baae Cycles / Decoupled Ray Marching: Skip consecutive empty steps.
This merges consecutive empty steps in the decoupled record function,
which can lead to fewer iterations in the scatter functions.

Only helps slightly though (1%), but doesn't hurt to have this.

Differential Revision: https://developer.blender.org/D873
2015-03-12 13:50:12 +01:00
Thomas Dinges
cdb47b9dfc Cycles: Make Background MIS building threaded
Use multiple threads for building the MIS table, if the
resolution is higher than 512.
Also replace division by cdf_total with a multiplication by its inverse,
cdf_total_inv. This gives further speedup.
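
A minimal sketch of the inverse-multiplication idea: compute the reciprocal of
the total once, then multiply each entry by it instead of dividing per entry.
Only the names cdf_total and cdf_total_inv come from this commit message. The
function name and the 1D layout are assumptions, and threading is left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: build a normalized running CDF from non-negative
// weights. The result has weights.size() + 1 entries, starting at 0.
inline std::vector<float> build_cdf_sketch(const std::vector<float> &weights)
{
    std::vector<float> cdf(weights.size() + 1, 0.0f);
    for (std::size_t i = 0; i < weights.size(); i++) {
        cdf[i + 1] = cdf[i] + weights[i];
    }
    const float cdf_total = cdf.back();
    if (cdf_total > 0.0f) {
        // One division up front, then only multiplications in the loop.
        const float cdf_total_inv = 1.0f / cdf_total;
        for (std::size_t i = 1; i < cdf.size(); i++) {
            cdf[i] *= cdf_total_inv;
        }
        // Pin the last entry to exactly 1 to guard against rounding.
        cdf.back() = 1.0f;
    }
    return cdf;
}
```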

On my Macbook (8 CPU threads) this improves the time to build the table:
Resolution 4096: From 0.16s to 0.03s
Resolution 8096: From 0.61s to 0.11s

This especially helps to reduce the scene update time, when tweaking world
shader while viewport rendering is running.

Patch by Sergey and myself.

Differential Revision: https://developer.blender.org/D1159
2015-03-12 13:50:11 +01:00
Sergey Sharybin
d4c1e98dd4 Fix T43484: Motion blur fails in certain circumstances
The issue was caused by mismatch in how aligned triangles storage was
filled in during BVH construction and how it was used during rendering.

Basically, I was leaving uninitialized storage for triangles when
there was deformation motion blur detected for the mesh. Was likely
some sort of optimization, but in fact it's still possible that regular
triangles would be needed for rendering.

So now we're storing aligned storage for all triangle primitives and
only skipping motion triangles (the deformation motion blur flag from
mesh is now ignored).
2015-03-09 14:15:35 +05:00
Sergey Sharybin
b13b900d50 Cycles: Improve logging in object motion detection
Reporting mesh name is not really useful, since its name does not
have any relation to the original object/mesh names.
2015-03-09 13:25:27 +05:00
Sv. Lockal
c32ded3654 Cycles: add better specializations for SSE shuffle function and few more wrappers. 2015-03-07 17:25:21 +00:00
Sv. Lockal
c8fb488b08 Fix T41066: An actual fix for curve intersection on FMA-enabled CPUs 2015-03-07 16:20:34 +00:00
Sergey Sharybin
9489205c5c Fix T43865: Cycles: Watertight rendering produces artifacts on a huge plane
The issue was caused by numerical instability when having ray origin close to a huge
triangle, which could have caused bad ray distance check.

Watertight Woop intersection isn't really addressing such cases, it's dealing with
small triangles far away from the ray origin instead, so it's a bit tricky to make
it work reliably.

While we're quite close to the release it's safer to do a check in Pluecker coordinates
if the ray is close to a huge triangle. Likely this additional check combined with some other
tweaks to the code doesn't cause measurable slowdown in the scenes tested here.

After the release we can play a bit more with this code in order to make it more
stable without the Pluecker fallback.
2015-03-05 18:55:30 +05:00
Campbell Barton
da0176614b Fix T43672: Cycles preview stalls when out of view 2015-03-05 15:42:01 +11:00
Sergey Sharybin
d544bc5cd5 Cycles: Fix embarrassing type remained after getting rid of utility SWAP() 2015-03-04 00:16:21 +05:00
Sergey Sharybin
ed5df50192 Cycles: Fix/workaround for toggling world MIS causing CUDA to fail
Seems it's just another issue with the compiler, worked around by explicitly
telling not to inline some function.

In theory we can unify this with CPU, but we're quite close to the release
so better be safe than sorry.
2015-03-03 18:48:37 +05:00
Thomas Dinges
60679a171d Revert "Cleanup: Simplify camera sample motion blur code."
This reverts commit 8197f0bb645f73f41071daaccf205a7583e695f5.
2015-02-26 13:27:02 +01:00
Thomas Dinges
8197f0bb64 Cleanup: Simplify camera sample motion blur code. 2015-02-26 10:30:01 +01:00