24 Commits

Author SHA1 Message Date
Sergey Sharybin
e4b910a0aa Cycles: __KERNEL_DEBUG__ wasn't set for compile-time kernels 2014-10-05 21:42:53 +06:00
Thomas Dinges
fb3f32760d Cycles: Add an experimental CUDA kernel.
Now we build 2 .cubins per architecture (e.g. kernel_sm_21.cubin, kernel_experimental_sm_21.cubin).
The experimental kernel can be used by switching to the Experimental Feature Set: http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Experimental_Features

This enables Subsurface Scattering and Correlated Multi Jitter Sampling on GPU, while keeping the stability and performance of the regular kernel.

Differential Revision: https://developer.blender.org/D762
Patch by Sergey and myself.

Developer / Builder Note:
CUDA Toolkit 6.5 is highly recommended for this. Also note that building the experimental kernel requires a lot of system memory (~7-8 GB).
2014-08-26 17:02:26 +02:00
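
The two-cubins-per-architecture naming described in this commit can be sketched as follows. This is only an illustration, not the actual build script: the helper name `cubin_names` and the architecture list passed to it are assumptions, and only the two file-name patterns come from the commit message.

```python
def cubin_names(archs):
    """Return the regular and experimental cubin file names for each architecture."""
    names = []
    for arch in archs:
        # Regular kernel, e.g. kernel_sm_21.cubin
        names.append("kernel_%s.cubin" % arch)
        # Experimental kernel, e.g. kernel_experimental_sm_21.cubin
        names.append("kernel_experimental_%s.cubin" % arch)
    return names


print(cubin_names(["sm_21"]))
```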
Thomas Dinges
603348c56e Cycles: Drop support for CUDA 5.0 Toolkit, only 6.0 and 6.5 (recommended) are supported now. 2014-08-21 23:35:20 +02:00
Sergey Sharybin
283abdf3b2 Fix compilation error with scons and older pythons 2014-06-26 16:03:52 +06:00
Campbell Barton
2dce13d213 Python: Remove deprecated uses of os.popen
T40415 by Lawrence D'Oliveiro
2014-06-20 02:00:46 +10:00
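
A typical replacement for an `os.popen` call is sketched below, assuming the caller only needs the command's standard output. The helper name `run_and_capture` is hypothetical and not taken from the patch; the example runs the current Python interpreter so that it works on any platform.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command given as an argument list and return its stdout as text."""
    output = subprocess.check_output(args)
    return output.decode("utf-8", "replace").strip()


# Roughly equivalent to the older pattern: os.popen("<command>").read()
print(run_and_capture([sys.executable, "-c", "print('hello')"]))
```

Passing an argument list instead of a single command string also avoids going through the shell.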
b33d83bf51 Attempted fix for T40363: CUDA 30% slowdown in testbuilds compared to 2.70.
CMake had this --fast-math flag but scons did not, which makes a big difference
on some files. Slightly slower rendering might still happen, but it should not
be this much.
2014-05-26 16:52:28 +02:00
741f17f05b Cycles CUDA: make CUDA toolkit 6.0 the official supported version.
This also updates the configurations to build kernels for compute capability
5.0 cards; when using an older CUDA toolkit version this will be skipped.

Also includes tweaks to improve performance with this version:
* Increase max registers on sm_30, sm_35 and sm_50
* No longer use texture storage on sm_30
2014-04-30 16:07:27 +02:00
fd99b8d4cf Fix for scons/CUDA build after recent commit. 2014-04-16 21:23:37 +02:00
2851ed4a55 Cycles code refactor: use __launch_bounds__ instead of -maxrregcount for CUDA.
This makes it easier to have a per-kernel number of registers. Also, all the
tunable parameters for this are now in kernel.cu, rather than spread over cmake,
scons and device_cuda.cpp.
2014-04-16 21:05:04 +02:00
Thomas Dinges
297a2223b5 Cycles / CUDA: Increase sm_2x registers to 40.
This fixes the ptxas "ACCESS_VIOLATION" error and should allow our Linux and Windows build bots to compile again.
Unfortunately this comes with a performance penalty on sm_2x cards, so this is only a workaround for now. Branched Path is still globally disabled on GPU.
2014-04-08 23:25:54 +02:00
73299516fa Fix scons CUDA build and compile error with more strict compilers. 2014-03-29 15:57:27 +01:00
84470a1190 Cycles code refactor: move geometry related kernel files into own directory. 2014-03-29 13:03:45 +01:00
Martijn Berger
184294782e patch by liblib (lid b)
Default installation path of CUDA nvcc.exe contains spaces

Reviewers: juicyfruit

Differential Revision: https://developer.blender.org/D239
2014-01-27 11:43:41 +01:00
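
One common way to cope with spaces in the compiler path is to build the command as an argument list rather than a single string, as sketched below. This is an assumption about the general technique, not a copy of the patch: the helper `nvcc_command`, the file names, and the example install path are illustrative only.

```python
def nvcc_command(nvcc_path, source, output, arch):
    """Build the nvcc invocation as an argument list so no manual quoting is needed."""
    return [nvcc_path, "-arch=" + arch, "--cubin", source, "-o", output]


# Illustrative path only; a default Windows install location contains spaces.
nvcc = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin\nvcc.exe"
cmd = nvcc_command(nvcc, "kernel.cu", "kernel_sm_21.cubin", "sm_21")

# The path stays a single argument even though it contains spaces.
print(cmd[0])
```

A list like this can be handed to `subprocess.call` directly, which keeps the path intact as one argument.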
Martijn Berger
21d587d9fc Added option to have a separate environment for executing nvcc
This can be used to compile CUDA kernels with Visual Studio 2010 while
the rest of Blender is compiled with MSVC 12.0 / 2013
2013-12-29 14:57:21 +01:00
Thomas Dinges
b5a5773fa9 Cycles / CUDA:
* Remove support for CUDA Toolkit 4.x, only Toolkit 5.0 and above are supported now.
* Remove support for sm_1x cards (< Fermi) for good. We didn't officially support those cards for a few releases already, now remove some special code that was still there.
2013-10-08 15:29:28 +00:00
Brecht Van Lommel
8d6e5e2fee Cycles: update build configurations to include CUDA sm_35 architecture. When using
a compiler older than CUDA 5.0 it will give a warning and skip this architecture.
2013-06-20 13:10:47 +00:00
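
The warn-and-skip behaviour described here might look roughly like the sketch below. The function name `supported_archs`, the version tuple representation, and the warning text are assumptions for illustration; only the rule (skip sm_35 when the compiler is older than CUDA 5.0) comes from the commit message.

```python
import warnings


def supported_archs(archs, cuda_version):
    """Drop sm_35 when the toolkit is older than 5.0, warning about the skip."""
    result = []
    for arch in archs:
        if arch == "sm_35" and cuda_version < (5, 0):
            warnings.warn("CUDA %d.%d is too old for %s, skipping"
                          % (cuda_version[0], cuda_version[1], arch))
            continue
        result.append(arch)
    return result


print(supported_archs(["sm_20", "sm_30", "sm_35"], (4, 2)))
```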
Brecht Van Lommel
16204bd647 Cycles: prepare to make CUDA 5.0 the official version we use
* Add CUDA compiler version detection to cmake/scons/runtime
* Remove noinline in kernel_shader.h and reenable --use_fast_math if CUDA 5.x
  is used, these were workarounds for CUDA 4.2 bugs
* Change max number of registers to 32 for sm 2.x (based on performance tests
  from Martijn Berger and confirmed here), and also for NVidia OpenCL.

Overall it seems that with these changes and the latest CUDA 5.0 download, that
performance is as good as or better than the 2.67b release with the scenes and
graphics cards I tested.
2013-06-19 17:54:23 +00:00
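
Compiler version detection of this kind can be sketched by parsing the text printed by `nvcc --version`. The helper `parse_nvcc_version` is hypothetical, and the sample string is an assumed example of the usual "release X.Y" line rather than output captured from a real build.

```python
import re


def parse_nvcc_version(text):
    """Extract (major, minor) from nvcc version output, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Assumed sample of the relevant line in the version output.
sample = "Cuda compilation tools, release 5.0, V0.2.1221"
version = parse_nvcc_version(sample)
print(version, version >= (5, 0))
```

Returning a tuple keeps comparisons such as `version >= (5, 0)` straightforward in the build scripts.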
Bastien Montagne
ab2c273b12 Added GPL header to sconscripts!
Also changed shebang to '#!/usr/bin/env python', this is more portable across unixes...
2012-12-17 08:01:43 +00:00
Brecht Van Lommel
dbd44e3bf5 Fix scons not installing closure/ directory for runtime compiles of CUDA kernel. 2012-12-03 16:51:05 +00:00
Thomas Dinges
d3c6c6babd Cycles / CUDA:
* Remove -use_fast_math flag from scons as well.
2012-05-28 19:49:26 +00:00
Thomas Dinges
7c630aac80 Scons/CUDA
* Added missing bitness info to the nvcc_flags. 
This makes sure that the nvcc compiler builds the correct cubins.
2011-12-08 19:16:43 +00:00
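
Adding bitness information to the compiler flags can be sketched as below, assuming the bitness of the running Python interpreter matches the target build. The helper `bitness_flag` and the other entries in `nvcc_flags` are illustrative, not taken from the commit; `-m32` and `-m64` are nvcc's machine options.

```python
import struct


def bitness_flag():
    """Return -m32 or -m64 based on the pointer size of the running interpreter."""
    bits = struct.calcsize("P") * 8
    return "-m%d" % bits


# Hypothetical flag list with the bitness flag appended.
nvcc_flags = ["--cubin", bitness_flag()]
print(nvcc_flags)
```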
Thomas Dinges
b7649ea757 Cycles / CUDA Kernel compile:
* Added option "WITH_BF_CYCLES_CUDA_THREADED_COMPILE" for people who have a lot of RAM (8 GB or more) and can compile several kernels at the same time. If enabled, it uses the general BF_NUMJOBS flag.
* The option is off by default.
2011-12-06 16:00:57 +00:00
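
The effect of such an option can be sketched as a switch between one worker and several. This is an assumption about the general idea, not the scons implementation: `compile_kernels`, its parameters, and the stand-in `fake_compile` function are hypothetical, with `threaded` and `num_jobs` playing the roles of WITH_BF_CYCLES_CUDA_THREADED_COMPILE and BF_NUMJOBS.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_kernels(archs, compile_one, threaded=False, num_jobs=1):
    """Compile one kernel per architecture, in parallel only when threaded is set."""
    # With threading off, a single worker compiles one kernel at a time.
    workers = max(1, num_jobs) if threaded else 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compile_one, archs))


def fake_compile(arch):
    """Stand-in for a real nvcc invocation; just returns the output file name."""
    return "kernel_%s.cubin" % arch


print(compile_kernels(["sm_20", "sm_21"], fake_compile, threaded=True, num_jobs=2))
```

Keeping the default at one worker mirrors the memory concern: each kernel compile can use a lot of RAM, so parallelism is opt-in.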
Brecht Van Lommel
cd1e78f1b7 Cycles: scons cuda kernel compile now does one kernel at a time, to reduce memory
usage.
2011-12-06 12:29:54 +00:00
Brecht Van Lommel
45de380771 Cycles
* Compile all of cycles with -ffast-math again
* Add scons compilation of cuda binaries, tested on mac/linux.
* Add UI option for supported/experimental features, to make it
  more clear what is supported, opencl/subdivision is experimental.
* Remove cycles xml exporter, was just for testing.
2011-12-01 16:33:21 +00:00