682 Commits

Author SHA1 Message Date
Martijn Berger
184294782e patch by liblib (lid b)
Default installation path of CUDA nvcc.exe contains spaces

Reviewers: juicyfruit

Differential Revision: https://developer.blender.org/D239
2014-01-27 11:43:41 +01:00
Sv. Lockal
62f6d5351f Revert "Cycles: mix hair minimum width code with SSE intersection code"
Code is not equivalent in the min/max part (SSE works with NaNs differently); this results in black dots with cardinal_curve hair.

This reverts commit b886c26d1f70d512b4f68975142372e3bee81c89.
2014-01-20 00:23:17 +04:00
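The NaN difference mentioned in this revert can be illustrated with a small sketch. It assumes the usual definitions: the SSE `minps` instruction computes `a < b ? a : b` per lane, so the second operand is returned whenever either input is NaN, while C's `fminf` returns the non-NaN operand. The function names are hypothetical, and the commit does not state which scalar form the reverted code relied on.

```python
import math


def sse_style_min(a, b):
    # Emulates the per-lane rule of SSE minps: a < b ? a : b.
    # Any comparison with NaN is false, so b is returned when either input is NaN.
    return a if a < b else b


def fmin_style_min(a, b):
    # Emulates C fminf: if exactly one operand is NaN, return the other one.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b
```

With these definitions, `sse_style_min(1.0, nan)` yields NaN while `fmin_style_min(1.0, nan)` yields `1.0`, which is the kind of mismatch that can let NaNs leak into an intersection test.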
Thomas Dinges
da523185fb Fix compilation of Cycles AVX kernel with cmake. 2014-01-16 18:32:54 +01:00
Thomas Dinges
de28a4d4b2 Cycles: Add an AVX kernel for CPU rendering.
* AVX is available on Intel Sandy Bridge and newer and AMD Bulldozer and newer.
* We don't use dedicated AVX intrinsics yet, but gcc auto vectorization gives a 3% performance improvement for Caminandes. Tested on an i5-3570, Linux x64.
* No change for Windows yet, MSVC 2008 does not support AVX.

Reviewed by: brecht
Differential Revision: https://developer.blender.org/D216
2014-01-16 17:04:11 +01:00
d9e52ac98b Code cleanup: move half float functions to separate header file. 2014-01-15 15:29:22 +01:00
8af782ad22 Code cleanup: some reshuffling of SIMD defines moving more code to util_optimization.h. 2014-01-15 15:11:50 +01:00
Sergey Sharybin
5cd321203e Fix compilation error with strict GCC flags 2014-01-15 16:21:53 +06:00
Martijn Berger
0f3fed2970 OS X linker does not like an empty compilation unit by itself in a library. Scons creates one library (.a) per kernel. This fixes that. 2014-01-14 22:48:31 +01:00
Martijn Berger
993b946681 DingTo forgot to make sure kernel_sse41 is compiled in even when empty 2014-01-14 21:49:48 +01:00
Thomas Dinges
9351ac0d85 Cycles: Skip the compilation of the dedicated SSE2 kernel on x86-64, we can assume SSE2 here, so just re-use the regular one. Saves 500kb in the blender binary.
Reviewed by: brecht
Differential Revision: https://developer.blender.org/D199
2014-01-14 20:39:54 +01:00
Sv. Lockal
1c49eb0072 Cycles, Code cleanup: simplify code for color linear interpolation and float math
Reviewed By: brecht

Differential Revision: https://developer.blender.org/D215
2014-01-14 22:55:02 +04:00
Thomas Dinges
e9984653a8 Cycles: Fix Wave texture difference between OSL and SVM, OSL wasn't using the "Scale" properly for distortion. 2014-01-13 22:01:39 +01:00
Thomas Dinges
6b61f7f755 Code cleanup / Cycles: Don't pass scale to texture functions, do the multiplication in the function call already. 2014-01-13 21:17:55 +01:00
Sv. Lockal
9cf6946d31 Fix cycles texture crash on win x86-64 + msvc 11
Use union for __m128 aliasing; while gcc supports no-strict-aliasing attribute, unions are the most common way to deal with __m128 in msvc.
2014-01-13 18:31:02 +04:00
Sv. Lockal
d6c022d6d7 Fix compilation for OpenCL (and small style fixes) 2014-01-12 18:18:43 +04:00
Sv. Lockal
47c5898fa1 Cycles: SSE for Voronoi textures (targeted for Haswell CPUs)
Gives up to 15% speedup for scenes with voronoi-based textures (up to 25% with volumes) on Haswell. The performance change for other CPUs is much smaller: 1-2%.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D203
2014-01-12 18:14:00 +04:00
Sv. Lockal
da3fdf0b4b Code Cleanup: in Cycles SSE replace macros with templates, skip unused code with preprocessor, simplify casts 2014-01-11 22:20:03 +04:00
Sv. Lockal
b886c26d1f Cycles: mix hair minimum width code with SSE intersection code
Gives 6.5% speedup for hair.blend from testsuite.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D197
2014-01-11 20:47:30 +04:00
8c0f9365c0 Fix T38134: missing cycles update when removing world volume scatter shader. 2014-01-09 01:26:43 +01:00
Sv. Lockal
20b046d763 Cycles: workaround for noise performance regression in CUDA 5.5
Use manual ternary operation widening in grad(). Without it nvcc 5.5 produces multiple branch splits with very big branches (because of inlining). This solves a 19% performance regression for BMW1M-MikePan.blend.

Also remove one redundant instruction in perlin SSE (when h == 12 or h == 14, then h is always >= 4).

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D190
2014-01-08 22:25:55 +04:00
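The redundancy noted in the second paragraph of this commit can be checked with a small sketch. The commit does not show the code, so this assumes the widely published reference `grad()` from Perlin's improved noise; `select` and `grad_select` are hypothetical names that mimic a mask-and-blend formulation.

```python
def grad(h, x, y, z):
    # Reference gradient selection; only the low 4 bits of the hash are used.
    h &= 15
    u = x if h < 8 else y
    v = y if h < 4 else (x if h in (12, 14) else z)
    return (u if (h & 1) == 0 else -u) + (v if (h & 2) == 0 else -v)


def select(mask, a, b):
    # Blend-style select: mask is 1 to pick a, 0 to pick b.
    return a * mask + b * (1 - mask)


def grad_select(h, x, y, z):
    # Same computation written as masks and selects, without nested branches.
    h &= 15
    lt8 = int(h < 8)
    lt4 = int(h < 4)
    # No extra "h >= 4" test is needed here: 12 and 14 are already >= 4,
    # and the outer select on lt4 overrides this mask for h < 4 anyway.
    is_12_or_14 = int(h == 12 or h == 14)
    u = select(lt8, x, y)
    v = select(lt4, y, select(is_12_or_14, x, z))
    sign_u = 1 - 2 * (h & 1)  # +1 or -1
    sign_v = 1 - (h & 2)      # +1 or -1
    return sign_u * u + sign_v * v
```

Both functions agree for all 16 hash values, which is why the additional comparison can be dropped in a blend-based version.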
7b0a46b1ff Fix CUDA/OpenCL compile errors in scattering commit. 2014-01-07 15:48:04 +01:00
01df756bd1 Cycles Volume Render: scattering support.
This is done by adding a Volume Scatter node. In many cases you will want to
add together a Volume Absorption and Volume Scatter node with the same color
and density to get the expected results.

This should work with branched path tracing, mixing closures, overlapping
volumes, etc. However, there are still various optimizations needed for sampling.
The main missing thing from the volume branch is the equiangular sampling for
homogeneous volumes.

The heterogeneous scattering code was arranged such that we can use a single
stratified random number for distance sampling, which gives less noise than
pseudo random numbers for each step. For volumes where the color is textured
there still seems to be something off, needs to be investigated.
2014-01-07 15:03:41 +01:00
Jens Verwiebe
a0b424aa4c Take back last header copy, as it is for native only; this must be a runtime solution. TODO: do it by definitions 2014-01-06 20:43:54 +01:00
Jens Verwiebe
48d8faeb79 Cmake: fix kernelcompile after introduction of util_simd.h 2014-01-06 20:26:02 +01:00
Sv. Lockal
acc90b40bf Cycles: Minor optimization (~1%) for texture access on CPU 2014-01-06 22:05:31 +04:00
Sv. Lockal
4817da0df4 Cleanup: use blend() in perlin noise (gives 12 fewer instructions on SSE4.1) 2014-01-06 21:24:28 +04:00
Sv. Lockal
96903508bc Cycles: SSE optimization for sRGB conversion (gives 7% speedup on CPU for pavillon_barcelone scene)
Thanks brecht/dingto/juicyfruit et al. for testing and reviewing this patch in T38034.
2014-01-06 20:03:30 +04:00
Campbell Barton
64fc94e93f Code Cleanup: osl style 2014-01-06 13:58:33 +11:00
Campbell Barton
c3bc2fd941 CMake: cleanup and add include 2014-01-04 13:17:07 +11:00
975c048ecd Fix gcc compile error in last commit. 2014-01-03 19:24:55 +01:00
ca7060662d Fix cycles OSL volume render crash with multiple closures. 2014-01-03 18:57:38 +01:00
bb0a0315e2 Code refactor: move random number and MIS variables into PathState.
This makes it easier to pass this state around, and wraps some common RNG
dimension computations in utility functions.
2014-01-03 18:57:38 +01:00
d0c6f14c73 Fix T38033: cycles volume emission changes with step size. 2014-01-02 21:34:22 +01:00
9cd2b19999 Cycles Volume Render: generated texture coordinates for volume render.
This does not support staying fixed while the surface deforms, but for static
meshes it should match up with the surface texture coordinates. Implemented
as a matrix transform from object space to mesh texture space.

Making this work for deforming surfaces would be quite complicated, you might
need something like harmonic coordinates as used in the mesh deform modifier,
probably will not be possible anytime soon.
2013-12-31 17:38:26 +01:00
889d77e6f6 Cycles Volume Render: heterogeneous (textured) volumes support.
Volumes can now have textured colors and density. There is a Volume Sampling
panel in the Render properties with these settings:

* Step size: distance between volume shader samples when rendering the volume.
  Lower values give more accurate and detailed results but also increased render
  time.
* Max steps: maximum number of steps through the volume before giving up, to
  protect from extremely long render times with big objects or small step sizes.

This is much more compute intensive than homogeneous volume, so when you are not
using a texture you should enable the Homogeneous Volume option in the material
or world for faster rendering.

One important missing feature is that Generated texture coordinates are not yet
working in volumes, and they are the default coordinates for nearly all texture
nodes. So until that works you need to plug in object texture coordinates or a
world space position.

This is work by "storm", Stuart Broadfoot, Thomas Dinges and myself.
2013-12-30 00:04:02 +01:00
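The role of the Step size and Max steps settings described in commit 889d77e6f6 can be sketched as a simple ray march. This is a minimal illustration under stated assumptions, not the Cycles kernel: `march_transmittance` and `density_at` are hypothetical names, the density is sampled at the midpoint of each step, and marching simply stops once the step budget is used up.

```python
import math


def march_transmittance(density_at, ray_length, step_size, max_steps):
    """Estimate transmittance along a ray through a textured volume.

    density_at: callable mapping distance along the ray to extinction density.
    """
    optical_depth = 0.0
    t = 0.0
    for _ in range(max_steps):
        if t >= ray_length:
            break
        # Clamp the last step so we do not march past the end of the ray.
        dt = min(step_size, ray_length - t)
        # Sample the volume shader once per step, here at the step midpoint.
        optical_depth += density_at(t + 0.5 * dt) * dt
        t += dt
    return math.exp(-optical_depth)
```

A smaller `step_size` means more shader samples per ray (more accurate, slower), while `max_steps` caps the cost for large objects: with a constant density of 2.0 over a ray of length 1.0, a budget of 5 steps of size 0.1 only accounts for half the ray.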
af128c4c96 Fix cycles volume emission not working with OSL. 2013-12-30 00:04:02 +01:00
30aa0c2482 Code refactor: better distinguish scatter and absorption for volume integration. 2013-12-30 00:04:02 +01:00
Martijn Berger
21d587d9fc Added option to have a separate environment for executing nvcc
This can be used to compile CUDA kernels with Visual Studio 2010 while
the rest of blender is compiled with MSVC 12.0 / 2013
2013-12-29 14:57:21 +01:00
3f39af9cc2 Fix cycles volume render crash when trying to access primitive attributes
like generated texture coordinates or tangents.
2013-12-28 23:39:15 +01:00
fe222643b4 Cycles Volume Render: add volume emission support.
This is done using the existing Emission node and closure (we may add a volume
emission node, not clear yet if it will be needed).

Volume emission only supports indirect light sampling which means it's not very
efficient to make small or far away bright light sources. Using direct light
sampling and MIS would be tricky and probably won't be added anytime soon. Other
renderers don't support this either as far as I know, lamps and ray visibility
tricks may be used instead.
2013-12-28 23:20:53 +01:00
Sv. Lockal
077fe03eaf Use ccl_device_inline for SSE perlin noise
msvc ignores inline hint here and generates a bunch of push/lea
2013-12-28 23:26:42 +04:00
2b39214c4d Cycles Volume Render: add support for overlapping volume objects.
This works pretty much as you would expect: overlapping volume objects give
a denser volume. What did change is that world volume shaders are now
active everywhere, they are no longer excluded inside objects.

This may not be desirable and we need to think of better control over this.
In some cases you clearly want it to happen, for example if you are rendering
a fire in a foggy environment. In other cases like the inside of a house you
may not want any fog, but it doesn't seem possible in general for the renderer
to automatically determine what is inside or outside of the house.

This is implemented using a simple fixed size array of shader/object ID pairs,
limited to max 15 overlapping objects. The closures from all shaders are put
into a single closure array, exactly the same as if an add shader was used to
combine them.
2013-12-28 20:12:11 +01:00
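The fixed-size array of shader/object ID pairs described in commit 2b39214c4d can be sketched roughly as follows. The class and method names are hypothetical, and what happens when the limit of 15 is exceeded is an assumption here (extra entries are ignored); the commit message does not say.

```python
MAX_VOLUME_STACK_SIZE = 15  # limit quoted in the commit message


class VolumeStack:
    """Tracks which volume shaders are active at the current point on a path."""

    def __init__(self):
        self.entries = []  # list of (shader_id, object_id) pairs

    def enter(self, shader_id, object_id):
        # Assumption: entries beyond the fixed capacity are dropped.
        if len(self.entries) >= MAX_VOLUME_STACK_SIZE:
            return False
        self.entries.append((shader_id, object_id))
        return True

    def leave(self, object_id):
        # Remove the entry for the object the ray is exiting, if present.
        for i, (_, obj) in enumerate(self.entries):
            if obj == object_id:
                del self.entries[i]
                return True
        return False

    def combined_density(self, shader_density):
        # Overlapping volumes add up, as if joined by an add shader.
        return sum(shader_density[shader] for shader, _ in self.entries)
```

For example, entering two objects whose shaders have densities 0.5 and 1.5 gives a combined density of 2.0 in the overlap region, and leaving one of them drops it back to the remaining shader's density.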
e369a5c485 Cycles Volume Render: support for rendering of homogeneous volume with absorption.
This is the simplest possible volume rendering case, constant density inside
the volume and no scattering or emission. My plan is to tweak, verify and commit
more volume rendering effects one by one, doing it all at once makes it
difficult to verify correctness and track down bugs.

Documentation is here:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Materials/Volume

Currently this hooks into path tracing in 3 ways, which should get us pretty
far until we add more advanced light sampling. These 3 hooks are repeated in
the path tracing, branched path tracing and transparent shadow code:

* Determine active volume shader at start of the path
* Change active volume shader on transmission through a surface
* Light attenuation over line segments between camera, surfaces and background

This is work by "storm", Stuart Broadfoot, Thomas Dinges and myself.
2013-12-28 16:57:10 +01:00
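The light attenuation over line segments listed in commit e369a5c485 follows the Beer-Lambert law for a homogeneous, absorption-only volume. A minimal sketch, assuming a per-channel absorption coefficient `sigma_a` (a hypothetical name, not taken from the commit):

```python
import math


def absorption_transmittance(sigma_a, distance):
    """Fraction of light surviving a segment of the given length.

    sigma_a: per-channel (R, G, B) absorption coefficients, constant in the volume.
    """
    return tuple(math.exp(-s * distance) for s in sigma_a)
```

Because the density is constant, attenuation over consecutive segments multiplies: the transmittance over a distance of 3.0 equals the product of the transmittances over 1.0 and 2.0.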
133f770ab3 Code cleanup: move shadow_blocked function into separate file. 2013-12-28 16:57:10 +01:00
37c4d6a50a Cycles Volume Render: add flags to quickly detect when objects have a volume shader. 2013-12-28 16:57:10 +01:00
a35db17cee Cycles Volume Render: work on nodes and closures.
* Henyey-Greenstein scattering closure implementation.
* Rename transparent to absorption node and isotropic to scatter node.
* Volume density is folded into the closure weights.
* OSL support for volume closures and nodes.
* This commit has no user visible changes, there is no volume render code yet.

This is work by "storm", Stuart Broadfoot, Thomas Dinges and myself.
2013-12-28 16:57:02 +01:00
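The Henyey-Greenstein scattering closure named in commit a35db17cee is based on a standard phase function. The sketch below uses the textbook formula, with `cos_theta` taken as the cosine between the incoming propagation direction and the outgoing direction; the sign convention used inside Cycles is not stated in the commit, and the function names are hypothetical.

```python
import math


def henyey_greenstein(cos_theta, g):
    """Henyey-Greenstein phase function, with anisotropy g in (-1, 1)."""
    denom = 1.0 + g * g - 2.0 * g * cos_theta
    return (1.0 - g * g) / (4.0 * math.pi * denom * math.sqrt(denom))


def integrate_over_sphere(g, n=20000):
    # Midpoint rule over cos_theta in [-1, 1]; the azimuth contributes 2*pi.
    d = 2.0 / n
    total = 0.0
    for i in range(n):
        cos_theta = -1.0 + (i + 0.5) * d
        total += henyey_greenstein(cos_theta, g) * d
    return 2.0 * math.pi * total
```

With `g = 0` the function reduces to the isotropic value `1 / (4 * pi)`, positive `g` favors forward scattering, and the integral over the sphere stays at 1 for any valid `g`.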
Thomas Dinges
1578b55c27 Cycles: Move SIMD utility functions into its own file.
Recently added SSE macros for noise texture can be moved here as well, but I leave this for later.
2013-12-27 21:30:21 +01:00
Thomas Dinges
a92abf5089 Cycles / Perlin Noise: Optimize noise calculation by using SIMD instructions on CPU.
This makes scenes with a Noise Texture render faster, the BMW file is 12-15% faster now.

Patch by Sv. Lockal, many thanks! :)
2013-12-27 18:48:37 +01:00
Thomas Dinges
40f79cf6e7 Cycles / Hair: Avoid duplicate calculations and remove redundant if branch, instead add the condition to the one above. 2013-12-26 21:52:46 +01:00
Thomas Dinges
03fed41e59 Cycles / Hair: Further cleanup of UI and internals.
* UI: Remove deprecated condition (CURVE_RIBBONS) and hide backface property, when it's hardcoded in C (Curve/Line segments && Ribbons).

* Remove "use_tangent_normal" and "CURVE_KN_TANGENTGNORMAL" as it's unused (follow up for last commit).
2013-12-26 03:25:30 +01:00