Commit Graph

5021 Commits

Author SHA1 Message Date
Sergey Sharybin
79aa50dc53 Cycles: Enable hair for split kernels when using Intel or NVidia drivers
Apart from simply enabling this features needed changes to the code were done.
Technical change, replacing SD access from "simple" structure to SOA.
2015-05-14 18:48:56 +05:00
Thomas Dinges
0e80eb82e0 Cycles: Resize light_data after possible light removal. 2015-05-14 01:13:40 +02:00
Thomas Dinges
67eb2c7897 Cycles: Remove Emission shaders from the graph if color or strength is 0. 2015-05-14 01:13:40 +02:00
Thomas Dinges
fc31bae66f Cleanup: Avoid temp variable in portal sampling code. 2015-05-13 19:54:52 +02:00
Sergey Sharybin
93867ae549 Cycles: Cleanup: use generic utility function to set kernel arguments 2015-05-13 19:56:24 +05:00
Sergey Sharybin
51a6bc8faa Cycles: Inline sizeof of elements needed for the split kernel
No need to store them in the class, they're unlikely to be changed
and if they do change we're in big trouble anyway.

More appropriate approach would be then to typedef this things in
kernel_types.h, but still use inlined sizeof(),
2015-05-13 19:56:24 +05:00
Thomas Dinges
0a6e32173e Cleanup / Cycles: De-Duplicate Portal data fetch and side check. 2015-05-13 16:05:30 +02:00
Sergey Sharybin
f0f481031c Fix T44616: Cycles crashes loading 42k by 21k textures
Simple integer overflow issue.

TODO(sergey): Check on CPU cubic sampling, it might also need size_t.
2015-05-12 18:48:55 +05:00
Sv. Lockal
c7bccb30bf Cycles: check for F16C support with __cpuid, as we do for BMI and BMI2 2015-05-11 15:49:36 +00:00
Antony Riakiotakis
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
Antony Riakiotakis
e38f914421 Cycles: use vertex buffers when possible to draw tiles on the screen.
Not terribly necessary in this case, since we are just drawing a quad,
but makes blender overall more GL 3.x core ready.
2015-05-11 16:28:41 +02:00
Antony Riakiotakis
5588a51c9c Cycles OpenGL: Don't use full matrix transform when we can just use
simple addition.
2015-05-11 13:10:19 +02:00
Sv. Lockal
d55868c3b2 Cycles: And yet another compilation fix after half-float commit for clang.
Suggested by Brecht, tested with gcc > 4.4 and Clang
2015-05-10 19:32:32 +00:00
Sv. Lockal
3ec168465d Cycles: fix compilation on 32-bit Windows for half-floats
Reported by IRC user HG1.
2015-05-10 19:06:43 +00:00
Sv. Lockal
8db2a9a352 Cycles: Add -mf16c for previous commit for Scons
Thanks to Dingto for noticing!
2015-05-10 17:51:04 +00:00
Sv. Lockal
2ec221aa28 Cycles: Use native float->half conversion instructions for Haswell CPUs.
This makes OCIO viewport color correction a little bit faster (about -0.5s for 100 samples)
Also set max half float value to 65504.0 to conform with IEEE 754.
2015-05-10 16:35:51 +00:00
Sergey Sharybin
3a2c0ccdd0 Cycles: Correction to opencl whitelist check
Was using platform as a device id accidentally.
2015-05-10 20:02:06 +05:00
Thomas Dinges
a47ade34c2 Cycles: Fix tiny greying out inconsistency for Volume settings. 2015-05-10 12:59:18 +02:00
Thomas Dinges
e8be170e79 Cycles: Do not show Branched Path integrator for OpenCL.
Branched Path is not supported, neither in the Split nor Megakernel.
2015-05-10 12:59:18 +02:00
Sergey Sharybin
583fd3af65 Cycles: Fix typo in global space version of normal transform
It was using direction transform, which is obviously wrong.
2015-05-10 00:53:32 +05:00
Sergey Sharybin
136d7a4f62 Cycles: Only whitelist AMD GPU devices in the OpenCL section
Only those ones are priority for now, all the rest are still testable
if CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment
variables are set.
2015-05-09 23:40:26 +05:00
Sergey Sharybin
2840a5de8f Cycles: Workaround for AMD compiler crashing building the split kernel
It's a but in compiler but it's nice to have working kernel for until
that bug is fixed.
2015-05-09 19:56:38 +05:00
George Kyriazis
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
Sergey Sharybin
f680c1b54a Cycles: Communicate number of closures and nodes feature set to the device
This way device can actually make a decision of how it can optimize the kernel
in order to make it most efficient.
2015-05-09 19:28:00 +05:00
Sergey Sharybin
6fc1669679 Cycles: Initial work towards selective nodes support compilation
The goal is to be able to compile kernel with nodes which are actually needed
to render current scene, hence improving performance of the kernel,

The idea is:

- Have few node groups, starting with a group which contains nodes are used
  really often, and then couple of groups which will be extension of this one.

- Have feature-based nodes disabling, so it's possible to disable nodes related
  to features which are not used with the currently used nodes group.

This commit only lays down needed routines for this approach, actual split will
happen later after gathering statistics from bunch of production scenes.
2015-05-09 19:22:16 +05:00
Sergey Sharybin
17c95d0a96 Cycles: Add utility function to count maximum number of closures used by session
This will be used by split kernel in order to compile most optimal kernel.

Maximum number of closures is actually being cached in the session, so viewport
rendering will not trigger kernel re-loading when number of closures goes down.
2015-05-09 19:17:49 +05:00
Sergey Sharybin
5068f7dc01 Cycles: Add utility function to graph to query number of closures used in it
Currently unused but will be needed soon for the split kernel work.
2015-05-09 19:13:32 +05:00
Sergey Sharybin
b3299bace0 Cycles: Pass requested tile size to the device via device task
This is currently unused but crucial for things like calculating amount of
device memory required to deal with the tasks.

Maybe not really best place to store it, but consider it good enough for now.
2015-05-09 19:09:07 +05:00
Sergey Sharybin
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had experimental flag passed to device's load_kernel() which
was all fine. But since we're gonna to have some extra parameters passed there
it makes sense to wrap them into a single struct, which will make it easier to
pass stuff around.
2015-05-09 19:05:49 +05:00
Sergey Sharybin
d69c80f717 Cycles: Presumably correct workaround for addrspace in camera motion blur 2015-05-09 19:04:19 +05:00
Sergey Sharybin
c9133778cf Cycles: Add CPU compat headers to some of the OSL implementation files
This header was already included into some of the implementation files already,
and this change is needed for some upcoming changes in the way how kernel_types.h
works.
2015-05-09 19:04:16 +05:00
Sergey Sharybin
7eac672e4f Cycles: Set default closure values to some of the nodes
Previously it was only set at compilation time which is all fine but does
not let us to check which closure the node corresponds to prior to the
compilation.
2015-05-09 19:04:09 +05:00
Thomas Dinges
900fc43bb4 Cleanup: Remove unused ray type flags.
They were added for completeness, but it seems we don't need them.
2015-05-08 12:10:26 +02:00
Sergey Sharybin
9ca2b76a9f Cycles: Cleanup, make it more clear what endif closes what ifdef 2015-05-07 15:02:43 +05:00
Campbell Barton
165598e49e Correct typo: ifdef'd now, but obviously wrong 2015-05-07 10:12:12 +10:00
c641a5563f Fix T44612: add support for mouse button 6 and 7 on OS X. 2015-05-05 21:52:09 +02:00
Sergey Sharybin
b45ad4b214 Cycles: Fix for wrong clamp usage in fast math 2015-05-06 00:01:40 +05:00
Campbell Barton
962d53e144 Workaround ld.gold failing with msgfmt 2015-05-06 03:23:20 +10:00
Thomas Dinges
d01b226870 Cleanup: Remove leftover from Distorted Noise node in XML reader. 2015-05-05 10:38:45 +02:00
Sv. Lockal
7201f6d14c Cycles: Use curve approximation for blackbody instead of lookup table
Now we calculate color in range 800..12000 using an approximation a/x+bx+c for R and G and ((at + b)t + c)t + d) for B.
Max absolute error for RGB for non-lut function is less than 0.0001, which is enough to get the same 8 bit/channel color as for OSL with a noticeable performance difference.
However there is a slight visible difference between previous non-OSL implementation because of lookup table interpolation and offset-by-one mistake.
The previous implementation gave black color outside of soft range (t > 12000), now it gives the same color as for 12000.

Also blackbody node without input connected is being converted to value input at shader compile time.

Reviewers: dingto, sergey

Reviewed By: dingto

Subscribers: nutel, brecht, juicyfruit

Differential Revision: https://developer.blender.org/D1280
2015-05-05 06:11:54 +00:00
Campbell Barton
e59bd19fa7 Cleanup: style & const's 2015-05-05 05:19:49 +10:00
Thomas Dinges
66f96e555c Cycles: Fix copy / paste mistake in XML reader. 2015-05-04 14:31:20 +02:00
Sergey Sharybin
b7d0ff0ad6 Separate scene simplification into viewport and render
This way it is possible to have viewport simplification bumped all the way up,
making viewport really responsive but still have final render to use highest
subdivision possible.

Reviewers: lukastoenne, campbellbarton, dingto

Reviewed By: campbellbarton, dingto

Subscribers: dingto, nutel, eyecandy, venomgfx

Differential Revision: https://developer.blender.org/D1273
2015-05-04 16:31:10 +05:00
Sergey Sharybin
16794f908f Cycles: Fix possible uninitialized XML read state which might cause crashes 2015-04-30 15:46:09 +05:00
Sergey Sharybin
41d817f15d Fix T44548: Cycles Tube Mapping off / not compatible with BI
Was a typo in original implementation, probably a result of some code reshuffle
happened for optimization reasons.
2015-04-30 14:27:16 +05:00
Thomas Dinges
4eab0e72b3 Cleanup: Update some comments and add ToDo. 2015-04-29 23:56:46 +02:00
Thomas Dinges
b3def11f5b Cycles: Record all possible volume intersections for SSS and camera checks
This replaces sequential ray moving followed with scene intersection with
single BVH traversal, which gives us all possible intersections.

Only implemented for CPU, due to qsort and a bigger memory usage on GPU
which we rather avoid. GPU still uses the regular bvh volume intersection code, while CPU now uses the new code.

This improves render performance for scenes with:
a) Camera inside volume mesh
b) SSS mesh intersecting a volume mesh/domain

In simple volume files (not much geometry) performance is roughly the same
(slightly faster). In files with a lot of geometry, the performance
increase is larger. bmps.blend with a volume shader and camera inside the
mesh, it renders ~10% faster here.

Patch by Sergey and myself.

Differential Revision: https://developer.blender.org/D1264
2015-04-29 23:31:06 +02:00
Sergey Sharybin
7aab5c6ca9 Cycles: Fix wrong termination criteria in SSS volume stack update
Another issue spotted with Thomas.
2015-04-30 01:20:17 +05:00
Sergey Sharybin
e5f3193df3 Cycles: Fix wrong order in object flags calculations
Object flags are depending on bounding box which is only available after
mesh synchronization.

This was broken since 7fd4c44 which happened quite close to the release
and oddly enough was not sopped by anyone. Render test is coming for this.

Was spotted by Thomas Dinges while working on another patch.
2015-04-30 01:09:48 +05:00
Sergey Sharybin
d6b28bbb1d Cycles: Fix crashes when loading cache created with pre-leaf split builds 2015-04-29 15:48:49 +05:00