Commit Graph

186 Commits

Author SHA1 Message Date
Brecht Van Lommel
60e5abe71f Fix a few issues reported by coverity scan. 2013-09-03 22:39:21 +00:00
Brecht Van Lommel
29f6616d60 Cycles: viewport render now takes scene color management settings into account,
except for curves, that's still missing from the OpenColorIO GLSL shader.

The pixels are stored in a half float texture, converterd from full float with
native GPU instructions and SIMD on the CPU, so it should be pretty quick.
Using a GLSL shader is useful for GPU render because it avoids a copy through
CPU memory.
2013-08-30 23:49:38 +00:00
Brecht Van Lommel
6785874e7a Fix #36137: cycles render not using all GPU's when the number of GPU's is larger
than the number of CPU threads
2013-08-30 23:09:22 +00:00
Thomas Dinges
a51f8e4353 Cycles / Standalone:
* Standalone can now be compiled without the GUI, making the glut dependency optional. 

Added WITH_CYCLES_STANDALONE_GUI cmake flag.
2013-08-30 17:34:27 +00:00
Thomas Dinges
ff4e018753 Cycles / Standalone:
* Rename test to standalone.

Note: New CMAKE flag is WITH_CYCLES_STANDALONE.
2013-08-27 02:37:48 +00:00
Brecht Van Lommel
b9ce231060 Cycles: relicense GNU GPL source code to Apache version 2.0.
More information in this post:
http://code.blender.org/

Thanks to all contributes for giving their permission!
2013-08-18 14:16:15 +00:00
Brecht Van Lommel
d43682d51b Cycles: Subsurface Scattering
New features:

* Bump mapping now works with SSS
* Texture Blur factor for SSS, see the documentation for details:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Shaders#Subsurface_Scattering

Work in progress for feedback:

Initial implementation of the "BSSRDF Importance Sampling" paper, which uses
a different importance sampling method. It gives better quality results in
many ways, with the availability of both Cubic and Gaussian falloff functions,
but also tends to be more noisy when using the progressive integrator and does
not give great results with some geometry. It works quite well for the
non-progressive integrator and is often less noisy there.

This code may still change a lot, so unless you're testing it may be best to
stick to the Compatible falloff function.

Skin test render and file that takes advantage of the gaussian falloff:
http://www.pasteall.org/pic/show.php?id=57661
http://www.pasteall.org/pic/show.php?id=57662
http://www.pasteall.org/blend/23501
2013-08-18 14:15:57 +00:00
Campbell Barton
b8c3efc8c3 code cleanup: compiler warnings 2013-07-21 16:40:34 +00:00
Thomas Dinges
636b314677 Code cleanup / Cycles:
* Remove useless else branch, after recent changes.
2013-07-21 10:30:22 +00:00
Thomas Dinges
9732c6283e Cycles / CPU Rendering:
* "Auto Detect" now again uses the umber of cores, instead number of cores + 1.

This was added before we had Tile rendering and benchmarks on several systems showed that there is no gain with this now. There might be some slight difference (0.5% or so) slower/faster depending on the scene, but this is negligible.
2013-07-20 00:40:03 +00:00
Brecht Van Lommel
3d847ed6e6 Fix #36064: cycles direct/indirect light passes with materials that have zero
RGB color components gave non-grey results when you might no expect it.

What happens is that some of the color channels are zero in the direct light
pass because their channel is zero in the color pass. The direct light pass is
defined as lighting divided by the color pass, and we can't divide by zero. We
do a division after all samples are added together to ensure that multiplication
in the compositor gives the exact combined pass even with antialiasing, DoF, ..

Found a simple tweak here, instead of setting such channels to zero it will set
it to the average of other non-zero color channels, which makes the results look
like the expected grey.
2013-07-08 23:31:45 +00:00
Thomas Dinges
c6ce8de20e Code cleanup / Cycles:
* Some cleanup for castings.
2013-06-27 15:48:16 +00:00
Brecht Van Lommel
7902fa57b6 Code cleanup: cycles
* Reshuffle SSE #ifdefs to try to avoid compilation errors enabling SSE on 32 bit.
* Remove CUDA kernel launch size exception on Mac, is not needed.
* Make OSL file compilation quiet like c/cpp files.
2013-06-26 23:29:33 +00:00
Thomas Dinges
15f5da4cd4 Cycles / SSE2:
* kernel_sse2 was built without actual SSE2 intrinsics on x86 systems.
2013-06-26 22:12:23 +00:00
Brecht Van Lommel
240fb6fa26 Cycles: ensure any SSE data is allocated 16 byte aligned, happens automatically
on many platforms but is not assured everywhere.
2013-06-22 14:35:09 +00:00
Brecht Van Lommel
8d6e5e2fee Cycles: update build configurations to include CUDA sm_35 architecture. When using
a compiler older than CUDA 5.0 it will give a warning and skip this architecture.
2013-06-20 13:10:47 +00:00
Brecht Van Lommel
f811e6e3ae Cycles: optimized SSE BVH traversal now also works with SSE2 CPUs, so all the
way back to Pentium 4, using a slightly less efficient instruction.

Also ensure /Ox is used for Visual Studio for RelWithDebInfo builds.
2013-06-19 17:54:26 +00:00
Brecht Van Lommel
16204bd647 Cycles: prepare to make CUDA 5.0 the official version we use
* Add CUDA compiler version detection to cmake/scons/runtime
* Remove noinline in kernel_shader.h and reenable --use_fast_math if CUDA 5.x
  is used, these were workarounds for CUDA 4.2 bugs
* Change max number of registers to 32 for sm 2.x (based on performance tests
  from Martijn Berger and confirmed here), and also for NVidia OpenCL.

Overall it seems that with these changes and the latest CUDA 5.0 download, that
performance is as good as or better than the 2.67b release with the scenes and
graphics cards I tested.
2013-06-19 17:54:23 +00:00
Brecht Van Lommel
649dd6f648 Fix cycles crash on some processors. We actually need S-SSE3 support for this
new BVH traversal code, not just SSE3.
2013-06-18 16:52:02 +00:00
Brecht Van Lommel
484d765bd4 Cycles: attempt to fix internal compile error with some visual studio builds 2013-06-18 13:19:16 +00:00
Jürgen Herrmann
5fc1d9205a Cycles BVH Build fix for MSVC 2012.
needs to include intrin.h for _BitScanForward and _BitScanReverse.
2013-06-18 12:32:43 +00:00
Brecht Van Lommel
d57c6748c4 Cycles: optimization for BVH traveral on CPU's with SSE3, using code from Embree.
On the BMW scene, this gives roughly a 10% speedup overall with clang/gcc, and 30%
speedup with visual studio (2008). It turns out visual studio was optimizing the
existing code quite poorly compared to pretty good autovectorization by clang/gcc,
but hand written SSE code also gives a smaller speed boost there.

This code isn't enabled when using the hair minimum width feature yet, need to
make that work with the SSE code still.
2013-06-18 09:36:06 +00:00
Campbell Barton
9161a4daa5 fix for own error in recent solitify refactor (r57402), face flip check was incorrect. 2013-06-14 16:10:32 +00:00
Thomas Dinges
9020df976c Cycles / Wavelength to RGB node:
* Added a node to convert wavelength (in nanometers, from 380nm to 780nm) to RGB values. This can be useful to match real world colors easier.

* Code cleanup:
** Moved color functions (xyz and hsv) into dedicated utility files.
** Remove svm_lerp(), use interp() instead. 

Documentation:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/More#Wavelength

Example render:
http://www.pasteall.org/pic/show.php?id=53202

This is part of my GSoC 2013. (revisions 57322, 57326, 57335 and 57367 from soc-2013-dingto).
2013-06-10 21:55:41 +00:00
Antony Riakiotakis
b0d5555a06 Fix compilation of cycles for MinGW-w64 after recent commits. MinGW-w64 has conflicting redefinitions of the SSE functions in windows.h, so we will be using this header instead, since we can't always avoid including it instead of the sse headers. 2013-06-08 21:48:01 +00:00
Brecht Van Lommel
b20a7e01d0 Cycles: experimental correlated multi-jittered sampling pattern that can be used
instead of sobol. So far one doesn't seem to be consistently better or worse than
the other for the same number of samples but more testing is needed.

The random number generator itself is slower than sobol for most number of samples,
except 16, 64, 256, .. because they can be computed faster. This can probably be
optimized, but we can do that when/if this actually turns out to be useful.

Paper this implementation is based on:
http://graphics.pixar.com/library/MultiJitteredSampling/

Also includes some refactoring of RNG code, fixing a Sobol correlation issue with
the first BSDF and < 16 samples, skipping some unneeded RNG calls and using a
simpler unit square to unit disk function.
2013-06-07 16:06:22 +00:00
Brecht Van Lommel
d835d2f4e6 Code cleanup: avoid some warnings due to implicit uint/int/float/double conversion. 2013-06-07 16:06:17 +00:00
Thomas Dinges
9e4914e055 Cycles:
* Revert r57203 (len() renaming)
There seems to be a problem with nVidia OpenCL after this and I haven't figured out the real cause yet. 
Better to selectively enable native length() later, after figuring out what's wrong. 

This fixes [#35612].
2013-06-04 17:20:00 +00:00
Brecht Van Lommel
3fe117bd3d Fix build error on non-x86 architectures as pointed out by Jochen Schmitt. 2013-06-04 11:21:13 +00:00
Thomas Dinges
c5ed6765b9 Cycles / Math functions:
* Rename some math functions:
len -> length
len_squared -> length_squared
normalize_len -> normalize_length

* This way OpenCL uses its inbuilt length() function, rather than our own. The other two functions have been renamed for consistency. 
* Tested CPU, CUDA and OpenCL compile, should be no functional changes.
2013-06-02 20:39:32 +00:00
Brecht Van Lommel
2d0a586c29 Cycles OpenCL: keep the opencl context and program around for quicker rendering
the second time, as for example Intel CPU startup time is 9 seconds.

* Adds an cache for contexts and programs for each platform and device pair,
  which also ensure now no two threads try to compile and write the binary cache
  file at the same time.
* Change clFinish to clFlush so we don't block until the result is done, instead
  it will block at the moment we copy back memory.
* Fix error in Cycles time_sleep implementation, does not affect any active code
  though.
* Adds some (disabled) debugging code in the task scheduler.

Patch #35559 by Doug Gale.
2013-05-31 16:19:03 +00:00
Brecht Van Lommel
4bdb54a76e Cycles OpenCL: patch #35514 by Doug Gale
* Support using devices from all OpenCL platforms, so that you can use e.g. both
  Intel and NVidia OpenCL implementations if you have them installed.
* Fix compile error due to missing fmodf after recent math node change.
* Enable advanced shading for Intel OpenCL.
* CYCLES_OPENCL_DEBUG environment variable for generating debug symbols so you
  can debug with gdb. This crashes the compiler with Intel OpenCL on Linux though.
  To make this work the preprocessed kernel source code is written out, as gdb
  needs this.
* Show OpenCL compiler warnings even if the build succeeded.
* Some small fixes to initialize cdDevice to NULL, add missing NULL check when
  creating buffer and add missing space at end of build options for Apple OpenCL.
* Fix crash with multi device + opencl, now e.g. CPU + GPU render should work.

I did a few tweaks to the code and also:

* Fix viewport render failing sometimes with Apple CPU OpenCL, was not taking
  workgroup size limits into account properly.
* Add compile error when advanced shading in the Blender binary and OpenCL kernel
  are not in sync.
2013-05-27 16:21:07 +00:00
Thomas Dinges
38dc85f296 Math Node:
* Added a Modulo operation to the math node, available in Compositor, Shader and Texture Nodes.
2013-05-20 14:38:47 +00:00
Thomas Dinges
75e36650e3 Code cleanup / Cycles:
* Simplify shaperadius() function a bit to avoid castings.
* Style cleanup 1.f -> 1.0f, to follow rest of Cycles code.
2013-05-18 11:04:29 +00:00
Campbell Barton
f334df5624 code cleanup: double promotion warnings. 2013-05-16 17:20:56 +00:00
Thomas Dinges
d76b758f23 Cycles:
* Fix compile error, when building with __KERNEL_SSE__
2013-05-13 15:31:59 +00:00
Thomas Dinges
7636aeffe1 Cycles / Math:
* Add M_2PI_F and M_4PI_F constants and use them inside the codebase.
2013-05-12 14:13:29 +00:00
Brecht Van Lommel
d0ffbeec73 Cycles OpenCL: a few fixes to get things compiling after kernel changes,
for Apple OpenCL on OS X 10.8 and simple AO render.

Also environment variable CYCLES_OPENCL_TEST can now be set to CPU, GPU,
ACCELERATOR, DEFAULT or ALL values to test particuler devices.
2013-05-09 14:05:40 +00:00
Thomas Dinges
872a8ed1bf Cycles / Hair rendering:
* Enable hair rendering on the GPU.

Patch by Stuart Broadfoot, with small tweaks by me, to only enable it on sm_20 and above.
2013-05-08 17:33:25 +00:00
Brecht Van Lommel
34707c19f2 Fix 34764: cycles issue rendering instanced mesh with NaN coordinates. 2013-04-09 20:48:53 +00:00
Brecht Van Lommel
cf3ec257a2 Fix #34880: cycles motion blur render issue with some compilers. Actually is a bigger
problem where accessing float4 members with [] stops working due to optimizer, will
check that later.
2013-04-05 23:03:10 +00:00
Brecht Van Lommel
de9dffc61e Cycles: initial subsurface multiple scattering support. It's not working as
well as I would like, but it works, just add a subsurface scattering node and
you can use it like any other BSDF.

It is using fully raytraced sampling compatible with progressive rendering
and other more advanced rendering algorithms we might used in the future, and
it uses no extra memory so it's suitable for complex scenes.

Disadvantage is that it can be quite noisy and slow. Two limitations that will
be solved are that it does not work with bump mapping yet, and that the falloff
function used is a simple cubic function, it's not using the real BSSRDF
falloff function yet.

The node has a color input, along with a scattering radius for each RGB color
channel along with an overall scale factor for the radii.

There is also no GPU support yet, will test if I can get that working later.

Node Documentation:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Nodes/Shaders#BSSRDF

Implementation notes:
http://wiki.blender.org/index.php/Dev:2.6/Source/Render/Cycles/Subsurface_Scattering
2013-04-01 20:26:52 +00:00
Thomas Dinges
d4c97029ef Cycles GPU Rendering:
* Make Cycles aware of sm_35 (Tesla K20, GeForce GTX TITAN).

The CUDA Toolkit 5.0 is needed for that and this is not officially used yet, but people with access to such cards can start testing. (just build sm_35 kernels).
2013-02-21 17:16:32 +00:00
Campbell Barton
0528162eb6 patch [#34320] Cross compiling with mingw-w64 on ubuntu
from Martijn Berger (juicyfruit)

applying since this is only corrects header case which is ignored on windows anyway.
2013-02-19 12:05:38 +00:00
Brecht Van Lommel
6e03b70def Fix cycles hair curves with NaN values not rendering with dynamic BVH. These NaN
values were breaking the bounding box computation, now they should have no influence.
2013-02-14 21:40:28 +00:00
Brecht Van Lommel
4061f96d94 Fix cycles issue with BVH cache created with 64 bits and used for 32 bits binary,
and vice versa.
2013-02-13 11:02:51 +00:00
Brecht Van Lommel
7c9d993347 Fix cycles intersection issue with overlapping faces on windows 32 bit and CPU
without SSE3 support, due to 80 bit precision float register being used for one
bounding box but not the one next to it.
2013-02-04 16:12:37 +00:00
Brecht Van Lommel
38c94e9194 Fix cycles crash that happened with mesh emission and diffuse/glossy ray
visibility disabled on some objects.
2013-01-25 02:00:57 +00:00
Antony Riakiotakis
f2891d3731 For non-windows systems, check for CUDA compiler during runtime 2013-01-14 19:33:16 +00:00
Sergey Sharybin
e5179bfefc Remove usage WITH_CYCLES_CUDA_BINARIES in code, use check for
precompiled cubins instead,

Logic here is following now:
- If there're precompiled cubins, assume CUDA compute is available,
  otherwise
- If cuda toolkit found, assume CUDA compute is available
- In all other cases CUDA compute is not available

For windows there're still check for only precompiled binaries,
no runtime compilation is allowed.

Ended up with such decision after discussion with Brecht. The thing
is, if we'll support runtime compilation on windows we'll end up
having lots of reports about different aspects of something doesn't
work (you need particular toolkit version, msvc installed, environment
variables set properly and so) and giving feedback on such reports
will waste time.
2013-01-14 17:30:33 +00:00