Commit Graph

131 Commits

Author SHA1 Message Date
Sergey Sharybin
3fba620858 Cycles: Prepare for more image extension types support
Basically just replace boolean periodic flag with extension type enum in the
device API.
2015-07-28 14:14:24 +02:00
Sergey Sharybin
1788293a01 Fix T45381: Crash Blender 2.75 in Win7 x64 AMD card
Previous fix didn't work well enough because on Windows Python has different
environment than Blender ans setting variables in there made no effect from
Blender point of view.
2015-07-23 12:10:38 +02:00
Sergey Sharybin
4bca8a6bc5 Fix T45484: Regression OpenCL split: access violation
That was a primary school error caused by moving statements inside assert()
which effectivly disabled crucial code in release builds.
2015-07-18 23:30:19 +02:00
Sergey Sharybin
45b5bf034b Cycles; Make baking a feature-specific option
This means render devices now might skip building baking kernels in cases when
only actual render-related functionality is used.

For now it's only implemented for OpenCL split kernel device and mainly needed
to work around some compiler-specific bugs which crashes on building the kernel.

Using OpenCL for baking might still crash the driver, but at least there is now
higher probability of that GPU will be usable to render the scene.

Real fix should actually be done in the driver side.
2015-07-18 16:02:08 +02:00
Sergey Sharybin
36a952e3e4 Cycles: Use feature-selective base kernel compilation when using split kernel
The idea is to make all kernels as small as possible to work around possible
issues with buggy drivers which might fail building feature-complete kernels.

It's indeed just a workaround to make at last simple test scenes to render
on OpenCL. Real fix should happen from the driver side.
2015-07-18 16:02:08 +02:00
Sergey Sharybin
5e4a8c6a87 Cycles: Some cleanup if OpenCL base kernel load_kernel()
Hopefully makes it less clumzy, should be no functional changes still.
2015-07-18 16:02:08 +02:00
Sergey Sharybin
025eda57da Cycles: Make OpenCL cache follow out code style a bit closer 2015-07-18 16:02:08 +02:00
Sergey Sharybin
548e650252 Cycles: Merging of patch from OSX went wrong in the previous change
That's what happens when you can't commit from a system you're making
changes at and someone is behind your back...

Sorry for the noise.
2015-07-15 15:12:19 +02:00
Sergey Sharybin
2b97ad348c Cycles: Missed this in the previous commit 2015-07-15 15:11:02 +02:00
Sergey Sharybin
56bf25d219 Cycles: Enable OpenCL rendering on Apple OSX
Requires having latest El Capitan beta 3 OSX due to ome crucial fixes made in the
compiler. Supports same features as NVidia OpenCL apart from CMJ (there's no
experimental feature set support in megakernel yet).

Uses megakernel internally, which works much better than the split kernel. Split
kernel is not supported on OSX still, needs to be investigated still.

Some more details can be found there:

  http://wiki.blender.org/index.php/Dev:2.6/Source/Render/Cycles/OpenCL#AMD_on_OSX
2015-07-15 14:20:59 +02:00
Sergey Sharybin
a79d47b14e Cycles: Add logging to detected OpenCL platforms and devices
Happens on verbosity level 2, should help looking into some of the
bug reports in the tracker.
2015-07-14 09:56:00 +02:00
Sergey Sharybin
3dc86f586c Cycles: Add debug print about CLEW initialization status 2015-07-07 14:37:12 +02:00
Sergey Sharybin
37539962fe Cycles: Add an option to force disable all OpenCL devices
This way it's possible to disable OpenCL devices for AMD devices
which are considered whitelisted.
2015-07-07 14:18:45 +02:00
Sergey Sharybin
36426c3ee2 Cycles: Code cleanup, double semicolon 2015-07-03 15:44:57 +02:00
Sergey Sharybin
c864f5d140 Cycles: Error enqueueing split kernels should no longer cause infinite loop 2015-07-03 12:13:38 +02:00
Sergey Sharybin
78de47ca24 Cycles: Fix zero-size buffer allocation with OpenCL devices
This is not really supported by OpenCL but might happen in certain
configurations. There might be some remained cases when this happens
but so far can not find any,
2015-07-01 11:56:48 +02:00
Sergey Sharybin
09dc470982 Cycles: Rework the way how OpenCL devices are created
It was annoying copy-paste happened across OpenCL device constructor, device
enumeration and split kernel checks. Now those areas are using an utility
function which returns pairs of platform and device IDs for devices which are
supported by Cycles and enumeration is happening inside that list.

This makes it so filtering is happening in a single place, so there's no need
to keep 3 different functions in sync.

This commit also fixes a bug with wrong enumeration of devices caused by recent
fixes. Those fixes were in fact wrong and only happened to appear to be working
on laptop with optimus card on Linux. Root of those issues is in fact in bad
Linux driver for optimus cards.
2015-06-27 15:13:08 +02:00
Campbell Barton
c40759e678 Cleanup: warnings 2015-06-24 18:42:16 +10:00
Sergey Sharybin
4ed6605d65 Cycles: Don't show devices which does not support OpenCL 1.1 in the menu
They'll be checked for the version later and that check will fail anyway,
so better to not allow user to see unsupported device in the list.

Also corrected one more issue with the device enumeration.
2015-06-18 11:26:22 +02:00
Sergey Sharybin
ae3e37b899 Cycles: Fix wrong numbering of OpenCL devices when some of them are skipped
Skipped devices did not reflect in the device number, which might result in bad
array indices.

This might also resolve T45037, and need to be ported to a release branch.
2015-06-17 11:35:39 +02:00
Sergey Sharybin
2ebaa69676 Cycles: Move requested feature conversion to an own function
This way it could be used for the shader/baking kernels easily n the future.
making those kernels more optimal.
2015-06-08 11:15:40 +02:00
Sergey Sharybin
8c2750bc82 Cycles: Remove round-up trickery for max closure in split OpenCL kernel
Round-up was only enabled for viewport render, which was for a long time hardcoded to
use 64 closures. This was done in order to avoid unnecessary kernel re-compilations
when tweaking the shader tree.

We could enable selective closure compilation in the viewport later if it'll give
measurable speed improvements, but even then round-up is to happen outside of the
device level,

This commit also removes early output which happened in cases when max closure did
not change. It was wrong because other requested kernel features might have been
changed.
2015-06-08 11:15:39 +02:00
Sergey Sharybin
27ed75271c Cycles: Make hair, object and motion blur selective compiled into OpenCL
This features are now based on the scene settings, so scenes without those features
used are rendered even faster.

This gives about 30% speedup on the AMD A10 APU here, but at the same time it does
not mean such an improvement will happen on all the hardware. That being said, the
Tonga device here seems to have no measurable difference.

In any case it seems handy to have for the future, when we'll want to support SSS
in the kernel or to port selective compilation/split kernel to CUDA devices.
2015-06-08 11:15:39 +02:00
Sergey Sharybin
28f798f86e Cycles: Initial support for OpenCL capabilities reports
For now it's just generic information, still need to expose memory, workgorup
sizes and so on.
2015-06-05 14:17:30 +02:00
Sergey Sharybin
9d4d55e78b Cycles: Strip meaningless empty output form the MVidia OpenCL compiler 2015-06-01 19:49:53 +05:00
Sergey Sharybin
36ef6d1532 Cycles: Report build flags used for the OpenCL kernel compilation
For now it's reported to the stdout, matching to the CUDA behavior.
In the future we can hide this into GLog logging once the kernels
are considered all stable and so.
2015-06-01 19:49:52 +05:00
Sergey Sharybin
cf19012fb0 Fix T44831: Crash when using Intel OpenCL with split kernel
The issue was caused by underallocation of object motion related arrays,
which happened by accident.
2015-05-26 21:29:21 +05:00
Thomas Dinges
c3ab5b3089 Fix T44830, wrong sample progress number when using split device.
Value was not set, moved it out of the constructor into
device_opencl_create() now.
2015-05-25 00:37:01 +02:00
Sergey Sharybin
2c503d8303 Cycles: Restructure kernel files organization
Since the kernel split work we're now having quite a few of new files, majority
of which are related on the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to follow which files are
actual implementation of Cycles kernel.

Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.

Another change is in the way how split kernel invokes logic. Previously all the
logic was implemented directly in the .cl files, which makes it a bit tricky to
re-use the logic across other devices. Since we'll likely be looking into doing
same split work for CUDA devices eventually it makes sense to move logic from
.cl files to header files. Those files are stored in kernel/split. This does not
mean the header files will not give error messages when tried to be included
from other devices and their arguments will likely be changed, but having such
separation is a good start anyway.

There should be no functional changes.

Reviewers: juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1314
2015-05-22 16:31:34 +05:00
Thomas Dinges
a934730368 Cycles: Remove TM / R and whitespace from OpenCL device names.
Was already done for CPU devices, now we also do this for OpenCL.
2015-05-21 23:43:18 +02:00
Sergey Sharybin
d4c676e81b Cycles: CYCLES_OPRNCL_DEBUG now affects on split kernel as well 2015-05-21 14:30:33 +05:00
Sergey Sharybin
f18d77b874 Cycles: Restore some lost custom cflags passed to the kernel compilation
They were lost during simplification of kernel loading but might be rather
crucial for the performance.

Also made it so cflags are shared across kernels. Surely it might lead to
some unwanted kernel re-compilation but at the same time they might easily
run out of sync with the changes in kernel and so.
2015-05-21 14:05:53 +05:00
Sergey Sharybin
148ed4e05e Cycles: Cleanup, synchronize name across file name, program and kernel names 2015-05-20 23:10:07 +05:00
Sergey Sharybin
6f48df45ee Cycles: Simplify code around kernel loading 2015-05-20 23:10:07 +05:00
Thomas Dinges
105b87a3f7 Cycles: Enable advanced shading on AMD / OpenCL.
That is needed for Motion Blur and Render Passes to work properly.
I hope there are no nasty side effects, but we need to test this.
2015-05-17 19:29:33 +02:00
Thomas Dinges
14c2bc53c0 Cleanup: Typos, typos everywhere. :D 2015-05-17 18:32:31 +02:00
Campbell Barton
daeb3069cf Cleanup: typos 2015-05-17 16:09:32 +10:00
Campbell Barton
31e96cbf96 Cleanup: style, spelling 2015-05-15 23:38:53 +10:00
Sergey Sharybin
c2b9f78415 Cycles: Pass __KERNEL_EXPERIMENTAL__ to OpenCL split kernels
Experimental feature set id currently unavailable for megakernel, it'll
require some changes to the cache system to distinguish cached regular
kernels from cached experimental kernels.

Currently unused, but some features will be enabled soon.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
960d7df56f Cycles: Pass device compute capabilities to kernel via build options
This way it's possible to do device-selective feature disabling/enabling.
Currently only supported for NVidia devices via OpenCL extension.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
03f9d5a4cf Cycles: Cleanup, move build options string calculation into the device class
This way it's easier to access platform name, device ID and other stuff which
might be needed to define build options.
2015-05-15 13:22:47 +05:00
Sergey Sharybin
3c10ec96b5 Cycles: Enable object motion blur on Intel OpenCL platform
This required allocating some memory related on object transform needed
by ShaderData and currently it is done for all the platforms. Since we're
targeting full feature-complete platforms this is rather acceptable at
this point and in the future we'll do selective NO_HAIR/NO_SSS/NO_BLUR
kernels.

This is experimental still and in fact there're some major issues on
NVidia platform and it's not really clear if it's a bug in compiler,
some uninitizlied variable or other kind of issue.
2015-05-15 00:48:12 +05:00
Sergey Sharybin
03565218d5 Cycles: Various fixes
Some stupid fixes like spaces around operator and missing semicolon,
plus fix for wrong detecting of ShaderData SOA size. Thar was harmless
since there's only one closure array, but still better to fix this.
2015-05-15 00:42:05 +05:00
Sergey Sharybin
f6c6dd44de Cycles: Remove meaningless ifdef checks for features in device_opencl
This file was actually checking for features enabled on CPU and surely all
of them were enabled, so removing them does not cause any difference.

ideally we'll need to do runtime feature detection and just pass some stuff
as NULL to the kernel, or maybe also have variadic kernel entry points which
is also possible quite easily.
2015-05-14 23:44:19 +05:00
Sergey Sharybin
93867ae549 Cycles: Cleanup: use generic utility function to set kernel arguments 2015-05-13 19:56:24 +05:00
Sergey Sharybin
51a6bc8faa Cycles: Inline sizeof of elements needed for the split kernel
No need to store them in the class, they're unlikely to be changed
and if they do change we're in big trouble anyway.

More appropriate approach would be then to typedef this things in
kernel_types.h, but still use inlined sizeof(),
2015-05-13 19:56:24 +05:00
Sergey Sharybin
3a2c0ccdd0 Cycles: Correction to opencl whitelist check
Was using platform as a device id accidentally.
2015-05-10 20:02:06 +05:00
Sergey Sharybin
136d7a4f62 Cycles: Only whitelist AMD GPU devices in the OpenCL section
Only those ones are priority for now, all the rest are still testable
if CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment
variables are set.
2015-05-09 23:40:26 +05:00
George Kyriazis
7f4479da42 Cycles: OpenCL kernel split
This commit contains all the work related on the AMD megakernel split work
which was mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else which we're forgetting to mention.

Currently only AMD cards are enabled for the new split kernel, but it is
possible to force split opencl kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.

Not all the features are supported yet, and that being said no motion blur,
camera blur, SSS and volumetrics for now. Also transparent shadows are
disabled on AMD device because of some compiler bug.

This kernel is also only implements regular path tracing and supporting
branched one will take a bit. Branched path tracing is exposed to the
interface still, which is a bit misleading and will be hidden there soon.

More feature will be enabled once they're ported to the split kernel and
tested.

Neither regular CPU nor CUDA has any difference, they're generating the
same exact code, which means no regressions/improvements there.

Based on the research paper:

  https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf

Here's the documentation:

  https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit

Design discussion of the patch:

  https://developer.blender.org/T44197

Differential Revision: https://developer.blender.org/D1200
2015-05-09 19:52:40 +05:00
Sergey Sharybin
2f5dd83759 Cycles: Add some statistics logging
Covers number of entities in the scene (objects, meshes etc), also reports
sizes of textures being allocated.
2015-04-10 15:37:49 +05:00