Commit Graph

53 Commits

Author SHA1 Message Date
Brecht Van Lommel
2e3035dd80 Cycles OpenCL: make displacement and world importance sampling work. 2013-06-21 13:05:08 +00:00
Brecht Van Lommel
16204bd647 Cycles: prepare to make CUDA 5.0 the official version we use
* Add CUDA compiler version detection to cmake/scons/runtime
* Remove noinline in kernel_shader.h and reenable --use_fast_math if CUDA 5.x
  is used, these were workarounds for CUDA 4.2 bugs
* Change max number of registers to 32 for sm 2.x (based on performance tests
  from Martijn Berger and confirmed here), and also for NVidia OpenCL.

Overall it seems that with these changes and the latest CUDA 5.0 download, that
performance is as good as or better than the 2.67b release with the scenes and
graphics cards I tested.
2013-06-19 17:54:23 +00:00
Brecht Van Lommel
0ad88d1001 Fix another windows / msvc build error. 2013-06-01 02:39:34 +00:00
Brecht Van Lommel
4f056d1be7 Fix windows / msvc build error. 2013-06-01 02:28:57 +00:00
Brecht Van Lommel
2d0a586c29 Cycles OpenCL: keep the opencl context and program around for quicker rendering
the second time, as for example Intel CPU startup time is 9 seconds.

* Adds an cache for contexts and programs for each platform and device pair,
  which also ensure now no two threads try to compile and write the binary cache
  file at the same time.
* Change clFinish to clFlush so we don't block until the result is done, instead
  it will block at the moment we copy back memory.
* Fix error in Cycles time_sleep implementation, does not affect any active code
  though.
* Adds some (disabled) debugging code in the task scheduler.

Patch #35559 by Doug Gale.
2013-05-31 16:19:03 +00:00
Thomas Dinges
722680d7cf Cycles / OpenCL:
* Use advanced shading for nvidia as well, works fine on my Geforce 540M with sm_21. 
I tested the files from regression suite.
2013-05-27 17:13:36 +00:00
Brecht Van Lommel
4bdb54a76e Cycles OpenCL: patch #35514 by Doug Gale
* Support using devices from all OpenCL platforms, so that you can use e.g. both
  Intel and NVidia OpenCL implementations if you have them installed.
* Fix compile error due to missing fmodf after recent math node change.
* Enable advanced shading for Intel OpenCL.
* CYCLES_OPENCL_DEBUG environment variable for generating debug symbols so you
  can debug with gdb. This crashes the compiler with Intel OpenCL on Linux though.
  To make this work the preprocessed kernel source code is written out, as gdb
  needs this.
* Show OpenCL compiler warnings even if the build succeeded.
* Some small fixes to initialize cdDevice to NULL, add missing NULL check when
  creating buffer and add missing space at end of build options for Apple OpenCL.
* Fix crash with multi device + opencl, now e.g. CPU + GPU render should work.

I did a few tweaks to the code and also:

* Fix viewport render failing sometimes with Apple CPU OpenCL, was not taking
  workgroup size limits into account properly.
* Add compile error when advanced shading in the Blender binary and OpenCL kernel
  are not in sync.
2013-05-27 16:21:07 +00:00
Thomas Dinges
11707119de Cycles:
* Code cleanup, remove unused "resolution" variable from the DeviceTask class, was never used.
2013-05-14 21:18:20 +00:00
Thomas Dinges
522eeaa6a0 Cycles / OpenCL:
* Remove old comment for sm_13 cards and really check for OpenCL 1.1.
2013-05-09 16:16:41 +00:00
Brecht Van Lommel
d0ffbeec73 Cycles OpenCL: a few fixes to get things compiling after kernel changes,
for Apple OpenCL on OS X 10.8 and simple AO render.

Also environment variable CYCLES_OPENCL_TEST can now be set to CPU, GPU,
ACCELERATOR, DEFAULT or ALL values to test particuler devices.
2013-05-09 14:05:40 +00:00
Thomas Dinges
f146317b09 Cycles:
* CUDA: Make it more clear that sm_12 and below is not supported.
* OpenCL: __KERNEL_SHADING__ was declared twice for nvidia opencl device.
* Some reshuffle of defines in kernel_types.h. No functional changes.
2013-01-15 19:02:17 +00:00
Sergey Sharybin
6eec49ed20 Cycles: memory usage report
This commit adds memory usage information while rendering.

It reports memory used by device, meaning:

- For CPU it'll report real memory consumption
- For GPU rendering it'll report GPU memory consumption, but it'll
  also mean the same memory is used from host side.

This information displays information about memory requested by Cycles,
not memory really allocated on a device. Real memory usage might be
higher because of memory fragmentation or optimistic memory allocator.

There's really nothing we can do against this.

Also in contrast with blender internal's render cycles memory usage
does not include memory used by scene, only memory needed by cycles
itself will be displayed. So don't freak out if memory usage reported
by cycles would be much lower than blender internal's.

This commit also adds RenderEngine.update_memory_stats callback which
is used to tell memory consumption from external engine to blender.
This information is used to generate information line after rendering
is finished.
2012-11-05 08:04:57 +00:00
Sergey Sharybin
3b88a29abf Cycles: progressive refine option
Just makes progressive refine :)

This means the whole image would be refined gradually using as much
threads as it's set in performance settings. Having enough tiles is
required to have this option working as it's expected.

Technically it's implemented by repeatedly computing next sample for
all the tiles before switching to next sample.

This works around 7-12% slower than regular tile-based rendering, so
use this option only if you really need it.

This commit also fixes progressive update of image when Save Buffers
option is enabled.

And one more thing this commit fixes is handling display buffer with
Save Buffers option enabled. If this option is enabled image buffer
wouldn't have neither byte nor float buffer until image is fully
rendered which could backfire in missing image while rendering in
cases color management cache became full.

This issue solved by allocating byte buffer for image buffer from
tile update callback.

Patch was reviewed by Brecht. He also made some minor edits to
original version to patch. Thanks, man!
2012-10-13 12:38:32 +00:00
Lukas Toenne
efaf512406 Revert r50528: "Performance fix for Cycles: Don't wait in the main UI thread when resetting devices."
This commit leads to random freezes in Cycles rendering:
https://projects.blender.org/tracker/index.php?func=detail&aid=32545&group_id=9&atid=498

The goal of this commit was to remove UI lag for OSL, but since that is not officially supported yet, better revert it until a proper fix can be implemented in 2.65.
2012-09-17 12:07:06 +00:00
Brecht Van Lommel
3d38ad1b17 Attempted fix for #32415: tighten up cycles opencl initialization checks to try to
avoid crashes. Don't think these should be needed but maybe it helps.
2012-09-12 11:25:47 +00:00
Lukas Toenne
31ed71cb6b Performance fix for Cycles: Don't wait in the main UI thread when resetting devices.
When the scene is updated Cycles resets the renderer device, cancelling
all existing tasks. The main thread would wait for all running tasks to
finish before continuing. This is ok when tasks can actually cancel in a
timely fashion. For OSL however, this does not work, since the OSL
shader group optimization takes quite a bit of time and can not be
easily be cancelled once running (on my crappy machine in full debug
mode: ~0.12 seconds for simple node trees). This would lead to very
laggy UI behavior and make it difficult to accurately control elements
such as sliders.

This patch removes the wait condition from the device->task_cancel
method. Instead it just sets the do_cancel flag and returns. To avoid
backlog in the task pool of the device it will return early from the
BlenderSession::sync function while the reset is going on (tested in
Session::resetting). Once all existing tasks have finished the do_cancel
flag is finally cleared again (checked in TaskPool::num_decrease).

Care has to be taken to avoid race conditions on the do_cancel flag,
since it can now be modified outside the TaskPool::cancel function
itself. For this purpose the scope of the TaskPool::num_mutex locks has
been extended, in most cases the mutex is now locked by the TaskPool
itself before calling TaskScheduler methods, instead of only locking
inside the num_increase/num_decrease functions themselves. The only
occurrence of a lock outside of the TaskPool methods is in
TaskScheduler::thread_run.

This patch is most useful in combination with the OSL renderer mode, so
it can probably wait until after the 2.64 release. SVM tasks tend to be
cancelled quickly, so the effect is less noticeable.
2012-09-11 11:41:51 +00:00
Brecht Van Lommel
adea12cb01 Cycles: merge of changes from tomato branch.
Regular rendering now works tiled, and supports save buffers to save memory
during render and cache render results.

Brick texture node by Thomas.
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Textures#Brick_Texture

Image texture Blended Box Mapping.
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Textures#Image_Texture
http://mango.blender.org/production/blended_box/

Various bug fixes by Sergey and Campbell.
* Fix for reading freed memory in some node setups.
* Fix incorrect memory read when synchronizing mesh motion.
* Fix crash appearing when direct light usage is different on different layers.
* Fix for vector pass gives wrong result in some circumstances.
* Fix for wrong resolution used for rendering Render Layer node.
* Option to cancel rendering when doing initial synchronization.
* No more texture limit when using CPU render.
* Many fixes for new tiled rendering.
2012-09-04 13:29:07 +00:00
Thomas Dinges
2fcd6827bf Cycles:
* Removed outdated OpenCL comments, kernel features are defined in kernel_types.h now.
2012-08-01 14:56:15 +00:00
Campbell Barton
c6cffe98fa code cleanup: removed/renamed shadow & duplicate variable definitions. 2012-06-09 18:20:40 +00:00
Campbell Barton
0fbb6bff27 style cleanup: block comments 2012-06-09 17:22:52 +00:00
Campbell Barton
857dedbc58 style cleanup 2012-05-27 00:36:50 +00:00
Brecht Van Lommel
dd9c1b7fbf Cycles: OpenCL image texture support, fix an attribute node issue and refactor
feature enabling #defines a bit.
2012-05-13 12:32:44 +00:00
Thomas Dinges
ed47be3bf2 Cycles/OpenCL:
* Reverted the general activation of __KERNEL_SHADING__.
Better to handle this in the device file. This way each platform gets specifically what it is capable of atm. 

* Nvidia has Shading + Multi Closure
* AMD (Apple) has only Clay Render
* AMD (non Apple) has Basic Shading
2012-04-09 17:44:33 +00:00
Thomas Dinges
d024238fb2 Cycles / OpenCL:
* Enable __KERNEL_SHADING__ per default for OpenCL.
This enables basic shading (color, emission, textures...) for AMD cards. 

You need the latest AMD catalyst driver in order to have this work.
2012-04-05 16:19:51 +00:00
Brecht Van Lommel
f4bb31f26b Cycles: tweak for AMD opencl compile of advanced shading, from Daniel Genrich,
still does not work correct but should compile if you have enough memory.
2012-02-24 15:53:19 +00:00
Brecht Van Lommel
dc181ea7e4 Fix: cycles crash with multiple OpenCL platforms installed, tracked down by Sergey. 2012-02-20 14:19:34 +00:00
Brecht Van Lommel
803286dde8 Cycles: render passes for CUDA cards with compute model >= 2.x. 2012-01-26 19:07:01 +00:00
Brecht Van Lommel
d7932ceea8 Cycles: multi GPU rendering support.
The rendering device is now set in User Preferences > System, where you can
choose between OpenCL/CUDA and devices. Per scene you can then still choose
to use CPU or GPU rendering.

Load balancing still needs to be improved, now it just splits the entire
render in two, that will be done in a separate commit.
2012-01-09 16:58:01 +00:00
Brecht Van Lommel
049ab98469 Cycles: device code refactoring, no functional changes. 2012-01-04 18:06:32 +00:00
Brecht Van Lommel
690de79580 Cycles: some tweaks for apple opencl with ATI cards, to get it working up to
the level of ambient occlusion render, shaders still fail. Fixes found with
much help from Jens and Dalai.
2011-12-20 17:36:56 +00:00
Brecht Van Lommel
72d2d05770 Cycles: border rendering support, includes some refactoring in how pixels are
accessed on devices.
2011-12-20 12:25:37 +00:00
Brecht Van Lommel
9e01abf777 Cycles: require Experimental to be set to enable CUDA on cards with shader model
lower than 1.3, since we're not officially supporting these. We're already not
providing CUDA binaries for these, so better make it clear when compiling from
source too.
2011-12-12 22:51:35 +00:00
Brecht Van Lommel
086e4ed825 Cycles: improve error reporting for opencl and cuda, showing error messages in
viewport instead of only console.
2011-11-22 20:49:33 +00:00
Brecht Van Lommel
eb2baf9abc Fix #29274: problem compiling cycles opencl kernel from directory with spaces.
Some drivers don't support passing include paths with spaces in them, nor does
the opencl spec specify anything about how to quote/escape such paths, so for
now we just resolved #includes ourselves. Alternative would have been to use c
preprocessor, but this also resolves all #ifdefs, which we do not want.
2011-11-22 16:38:58 +00:00
Brecht Van Lommel
47853bf6f6 Cycles: OpenCL tweaks
* Reduce kernel arguments size, helps compile for apple nvidia.
* Fix use of unitialized variable in displace kernel.
* Use build flags in opencl kernel md5 hash.
* Reorganize code for kernel feature #defines a bit.
2011-11-22 13:15:19 +00:00
Thomas Dinges
880225db77 OpenCL/Nvidia:
* Enable OpenCL Full Shading on NVIDIA cards.

Notes:
It makes not much sense to use OpenCL on a nVidia card (as it is slower compared to CUDA), but as OpenCL comes without dependencies, it's an good alternative if you don't want to install the CUDA toolkit or the build comes without CUDA kernels.
2011-11-12 22:22:00 +00:00
Brecht Van Lommel
5fd67a3ba5 Cycles: enable multi closure sampling and transparent shadows only on CPU and
CUDA cards with shader model >= 2 for now (GTX 4xx, 5xx, ..). The CUDA compiler
can't handle the increased kernel size currently.
2011-10-16 18:54:27 +00:00
Brecht Van Lommel
e9b967d05b Cycles: remove deprecated strict aliasing flag for opencl, fix missing update
modifying object layer in properties editor, and add memarena utility.
2011-09-19 11:57:31 +00:00
Brecht Van Lommel
66b1dfae89 Cycles: tweaks to properties and nodes
* Passes renamed to samples
* Camera lens radius renamed to aperature size/blades/rotation
* Glass and fresnel nodes input is now index of refraction
* Glossy and velvet fresnel socket removed
* Mix/add closure node renamed to mix/add shader node
* Blend weight node added for shader mixing weights

There is some version patching code for reading existing files, but it's not
perfect, so shaders may work a bit different.
2011-09-16 13:14:02 +00:00
Brecht Van Lommel
28cb4cb957 Cycles: reenable opencl binary caching on mac, it's not the cause of the problem. 2011-09-16 10:29:30 +00:00
Brecht Van Lommel
089abdecf7 Cycles: attempted fixes for OS X preview render problem, and disable
kernel cache there now as well since it seems to give issues there.
2011-09-14 22:26:55 +00:00
Brecht Van Lommel
cfbd6cf154 Cycles:
* OpenCL now only uses GPU/Accelerator devices, it's only confusing if CPU
  device is used, easy to enable in the code for debugging.
* OpenCL kernel binaries are now cached for faster startup after the first
  time compiling.
* CUDA kernels can now be compiled and cached at runtime if the CUDA toolkit
  is installed. This means that even if the build does not have CUDA enabled,
  it's still possible to use it as long as you install the toolkit.
2011-09-09 12:04:39 +00:00
Brecht Van Lommel
7d9d9fa976 Cycles: use workgroup size from opencl, attempt to fix issue with apple opencl. 2011-09-05 12:24:28 +00:00
Brecht Van Lommel
f3ee10ce5c Cycles: remove -Werror from opencl compile options, apple opencl on lion
seems to give "no previous prototype for function" warning, but it doens't
make much sense in opencl, seems like a compiler bug?
2011-09-05 08:23:01 +00:00
Brecht Van Lommel
db1664ed4c Cycles:
* Compute MD5 hash to deal with nvidia opencl compiler cache not recognizing
  changes in #included files, makes it possible to do kernel compile only
  once and remember it for the next time blender is started.
* Kernel tweak to compile with ATI/linux. Enabling any more functionality than
  simple clay render still chokes the compiler though, without a specific error
  message ..
2011-09-03 10:49:54 +00:00
Brecht Van Lommel
67030aaf84 Cycles: optimizations for instances in scene updates before render starts,
should load a non-trivial mesh instanced many times quite a bit faster now.
2011-09-02 16:15:18 +00:00
Brecht Van Lommel
1135875ab1 Cycles:
* Fix crash in light path node
* Fix struct alignment issue for cuda
* Fix issue with instances taking up too much memory
* Fix issue with ray visibility working incorrect on some objects
* Enable OpenCL always and remove option, it has no dependencies so may as well
* Refuse to load kernel if OpenCL version < 1.1, recent drivers are needed
* Better error handling for OpenCL device
* 3D views with rendered draw mode will now revert to wireframe on file load
2011-09-02 14:55:06 +00:00
Brecht Van Lommel
3c7dcd7a47 Cycles: compile opencl kernels in non-blocking thread, and don't crash on
build failure but show error message in status text.
2011-09-02 00:10:03 +00:00
Brecht Van Lommel
1f7ac51c36 Cycles: fix opencl device bug that crashed on windows. Now shows diffuse
meshes here on windows/nvidia.
2011-09-01 22:40:52 +00:00
Brecht Van Lommel
27102bfec4 Cycles: OpenCL library is now dynamically loaded so that blender doesn't crash
if it's not installed on the system.

Code copied from clew.h/clew.c in CLCC:
http://clcc.sourceforge.net/
2011-09-01 19:00:23 +00:00