554 Commits

Author SHA1 Message Date
b5a58507f2 Fix Cycles OSL compilation based on modified time not working. 2016-11-12 17:33:07 +01:00
Lukas Stockner
4e68f48227 Cycles: Initialize the RNG state from the kernel instead of the host
This saves a memory copy, which will be particularly useful for network rendering.

Reviewers: sergey, brecht, dingto, juicyfruit, maiself

Differential Revision: https://developer.blender.org/D2323
2016-10-30 11:51:20 +01:00
Lukas Stockner
26bf230920 Cycles: Add optional probabilistic termination of light samples based on their expected contribution
In scenes with many lights, some of them might have a very small contribution to some pixels, but the shadow rays are traced anyway.
To avoid that, this patch adds probabilistic termination to light samples - if the contribution before checking for shadowing is below a user-defined threshold, the sample will be discarded with probability (1 - (contribution / threshold)) and otherwise kept, but weighted more to remain unbiased.
This is the same approach that's also used in path termination based on length.

Note that the rendering remains unbiased with this option, it just adds a bit of noise - but if the setting is used moderately, the speedup gained easily outweighs the additional noise.
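
The rule described above can be sketched as follows. This is a minimal illustration, not the kernel code from the patch; the function name and parameters are hypothetical, and the random number is assumed to be uniform in [0, 1).

```cpp
// Hypothetical helper, not the actual Cycles kernel code.
// contribution: contribution of the light sample before the shadow test (>= 0)
// threshold:    user-defined threshold (> 0)
// rand:         uniform random number in [0, 1)
// Returns the weight to apply to the sample; 0 means "discard, trace no shadow ray".
inline float light_sample_termination_weight(float contribution, float threshold, float rand)
{
    if (contribution <= 0.0f) {
        return 0.0f;  // Nothing to gain from tracing a shadow ray.
    }
    if (contribution >= threshold) {
        return 1.0f;  // At or above the threshold: always keep, unweighted.
    }
    const float keep_probability = contribution / threshold;
    if (rand >= keep_probability) {
        return 0.0f;  // Discarded with probability 1 - contribution / threshold.
    }
    // Kept: scale up so that the expected value equals the original contribution.
    return 1.0f / keep_probability;
}
```

The expected weighted contribution is keep_probability * (contribution / keep_probability) = contribution, which is why the estimate stays unbiased.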

Reviewers: #cycles

Subscribers: sergey, brecht

Differential Revision: https://developer.blender.org/D2217
2016-10-30 11:31:28 +01:00
Lukas Stockner
1272ee455e Cycles: Implement texture coordinates for Point, Spot and Area Lamps
When using the Normal output of the Texture Coordinate node on Point and Spot lamps, the coordinates now depend on the rotation of the lamp.
On Area lamps, the Parametric output of the Geometry node now returns UV coordinates on the area lamp.

Credit for the Area lamp part goes to Stefan Werner (from D1995).
2016-10-29 19:24:08 +02:00
Sergey Sharybin
f11298692b Cycles: More workarounds for weird crashes on AVX2
Oh man, is it a compiler bug? Or are we doing something stupid?

For now more crap to prevent crashes. During the conference will talk to
Maxym about how we can troubleshoot such weird issues.
2016-10-27 12:51:03 +02:00
Sergey Sharybin
7e380ad4c0 Cycles: Another attempt to fix crashes on AVX2 processors
Basically don't use rcp() in areas which seem to be critical after a
second look. Also disabled some multiplication operators, not sure
yet why they might be a problem.

Tomorrow will be setting up a full test with all cases which were
buggy in our farm to see if this fix is complete.
2016-10-26 22:14:41 +02:00
Sergey Sharybin
35f152358b Cycles: Completely disable transform SSE for now
Was causing issues on another frame.

On a tight schedule, disabling for now so artists are happy.

Still looking into root of the issue!
2016-10-26 15:23:58 +02:00
Sergey Sharybin
7c7d23691f Cycles: Fix crashes after recent optimization commits
There are some precision issues for big magnitude coordinates which started
to give weird behavior in release builds. There is also some weird memory usage
in BVH which is tricky to nail down because it only happens in release builds
and GDB reports all variables as optimized out when trying to use RelWithDebInfo.

There are two things in this commit:

- Attempt to make vectorized code closer to the original one, hoping that it'll
  eliminate the precision issue.
  This seems to work for transform_point().
- A similar trick did not work for transform_direction() even though the
  absolute error here is much smaller. For now that function is disabled; it
  needs a more careful look.
2016-10-26 14:30:25 +02:00
Sergey Sharybin
064caae7b2 Cycles: BVH-related SSE optimization
Several ideas here:

- Optimize calculation of near_{x,y,z} in a way that does not require
  3 if() statements per update, which avoids the negative effect of
  branch misprediction.

- Optimization of direction clamping for BVH.

- Optimization of point/direction transform.

Brings ~1.5% speedup, again depending on the scene (unfortunately, this
speedup can't be summed across all previous commits because the speedup of
each of the changes varies from scene to scene, but it still seems to
be a nice solid speedup of a few percent on Linux, and a bigger speedup was
reported on Windows).

Once again, thanks Maxym for the inspiration!

Still TODO: We have multiple places where we need to calculate near
x,y,z indices in BVH, for now it's only done for main BVH traversal.
Will try to move this calculation to a utility function and see if
that can be easily re-used across all the BVH flavors.
2016-10-25 14:47:34 +02:00
Sergey Sharybin
af411d918e Cycles: Implement SSE-optimized path of util_max_axis()
The idea here is to avoid if statements which could cause branch
misprediction.

Gives a small but measurable speedup of up to ~1%. Still nice :)
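
As an illustration of picking the largest axis without if statements, here is a scalar stand-in. It is not the SSE path from this commit; the function name is hypothetical, and whether the result is truly branch-free is up to the compiler.

```cpp
#include <algorithm>

// Hypothetical scalar sketch: returns 0, 1 or 2 for the axis holding the
// largest of (x, y, z), using comparison results as integers instead of ifs.
inline int max_axis_sketch(float x, float y, float z)
{
    const int xy_axis = !(x > y);          // 0 if x is larger than y, 1 otherwise
    const float xy_max = std::max(x, y);   // larger of the first two components
    const int xy_wins = xy_max > z;        // 1 if x or y beats z, 0 otherwise
    return xy_wins * xy_axis + (1 - xy_wins) * 2;
}
```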

Inspired by Maxym Dmytrychenko, thanks!
2016-10-25 13:54:17 +02:00
Sergey Sharybin
cde18cf3b3 Cycles: Fix static initialization order fiasco
Initialization order of global stats and node types was not strictly
defined and it was possible to have node types initialized first and
stats after that. This will zero out memory which was allocated from
the statistics causing assert failure when de-initializing node types.
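
One common way to pin down the order is the construct-on-first-use idiom, sketched below. This is not necessarily how the commit fixes it; `Stats`, `global_stats()` and `NodeTypeRegistry` are hypothetical stand-ins.

```cpp
#include <cstddef>

// Hypothetical stand-in for the global memory statistics.
struct Stats {
    std::size_t mem_used = 0;
    void mem_alloc(std::size_t size) { mem_used += size; }
    void mem_free(std::size_t size) { mem_used -= size; }
};

// A function-local static is constructed on first call, so any global whose
// constructor calls this function always sees a fully constructed Stats.
inline Stats &global_stats()
{
    static Stats stats;
    return stats;
}

// Hypothetical stand-in for the node types, which allocate through the stats.
struct NodeTypeRegistry {
    std::size_t bytes;
    explicit NodeTypeRegistry(std::size_t size) : bytes(size)
    {
        global_stats().mem_alloc(bytes);
    }
    ~NodeTypeRegistry()
    {
        global_stats().mem_free(bytes);
    }
};
```

Because the stats object finishes construction before the registry's constructor returns, it is also destroyed after the registry, so the de-initialization check sees consistent numbers.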
2016-10-24 13:47:39 +02:00
Sergey Sharybin
0ddb8d9b13 Cycles: Disable optimization of operator / for float3
This was giving some speedup but made intersection tests fail
from a watertightness point of view.

Needs deeper investigation, but it needs to be fixed quickly for
the studio.
2016-10-14 13:53:26 +02:00
Sergey Sharybin
22cdf44101 Cycles: Use const reference for register variables in non-OpenCL code
This is something tested by @LazyDodo and suggested by Maxym to make
MSVC happier.
2016-10-12 14:48:59 +02:00
Sergey Sharybin
e588106d45 Cycles: Use more SSE intrinsics for float3 type
This gives about 5% speedup on AVX2 kernels (other kernels still
have SSE disabled for math operations) and this solves the slowdown
of the koro scene mentioned in the previous commit.

The title says it all actually. This commit also contains
changes to pass float3 as const reference in affected functions.

This should make MSVC happier without breaking OpenCL because it's
only done in areas which are ifdef-ed for non-OpenCL.

Another patch based on inspiration from Maxym Dmytrychenko, thanks!
2016-10-12 14:43:00 +02:00
Sergey Sharybin
6a4ec3ca43 Cycles: Add new avxf vectorized data type
Based on existing ssef data type and to my knowledge it's also what happens in
Embree nowadays.

Inspired by Maxym Dmytrychenko and required for the upcoming triangle
intersection commit.

Hopefully the copyright message is correct.
2016-10-12 13:54:13 +02:00
a3abb020e3 Fix Cycles CUDA performance on CUDA 8.0.
Mostly this is making inlining match CUDA 7.5 in a few performance critical
places. The end result is that performance is now better than before, possibly
due to less register spilling or other CUDA 8.0 compiler improvements.

On benchmarks scenes, there are 3% to 35% render time reductions. Stack memory
usage is reduced a little too.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2269
2016-10-03 22:15:25 +02:00
lazydodo
3ee5ce155c [Windows/Cycles/Clang] Fix compilation error with clang-cl on windows 2016-10-02 14:01:23 -06:00
Sergey Sharybin
31ebbe40a0 Cycles: Improve OpenCL line information handling
Previously it was falling back to just a path after #include
statement was finished. Now we fall back to a proper current
file name after dealing with the preprocessor statement.
2016-09-29 10:20:24 +02:00
Sergey Sharybin
d15899cca7 Cycles: Fix compilation error after recent commits 2016-09-12 16:06:50 +02:00
Sergey Sharybin
91e0a16f2f Cycles: Use XDG's .cache folder for cached kernels
Basically just moves cached kernels from ~/.config/blender/BLENDER_VERSION to
~/.cache/cycles/kernels. This has the following benefits:

- Follows the XDG specification more closely,
  not as if it's totally crucial or measurable by users, but still nice.

- Prevents unexpected sizes of the config folder and makes disk space usage
  more predictable for users.

- Allows sharing kernels across multiple Blender versions,
  which makes debugging easier at times close to a release.

- "Copy Previous Settings" operator will no longer be copying possibly
  gigabytes of cached kernels, which used to lead to really nasty disk usage
  and annoying delays when copying settings.

- In the future we can have some smart logic to clear old unused cached
  kernels.

Currently only done for Linux and OSX. Windows still follows old "cache"
folder logic, but it's not really important for now because we don't
support kernel compilation on this platform yet.
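
A rough sketch of how such a path could be resolved. Only the ~/.cache/cycles/kernels location comes from this commit; honoring XDG_CACHE_HOME is an assumption taken from the XDG Base Directory Specification, and the function name is hypothetical.

```cpp
#include <string>

// Hypothetical helper. Pass in the values of the XDG_CACHE_HOME and HOME
// environment variables (either may be null). Returns an empty string if
// neither is usable.
inline std::string kernel_cache_dir_sketch(const char *xdg_cache_home, const char *home)
{
    std::string base;
    if (xdg_cache_home != nullptr && xdg_cache_home[0] != '\0') {
        base = xdg_cache_home;  // Assumption: an explicit XDG cache dir wins.
    }
    else if (home != nullptr && home[0] != '\0') {
        base = std::string(home) + "/.cache";  // Default location per XDG.
    }
    else {
        return std::string();
    }
    return base + "/cycles/kernels";
}
```

A caller would typically pass `std::getenv("XDG_CACHE_HOME")` and `std::getenv("HOME")` from `<cstdlib>`.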

Reviewers: dingto, juicyfruit, brecht

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2197
2016-09-12 09:39:05 +02:00
Sergey Sharybin
18ae1504ea Fix T49286: Compilation error with XCode 7.0
Weirdly enough, this version of XCode seems to have static_assert()
even when NOT using C++11. This is totally weird and counterintuitive
since static_assert() is supposed to be a C++11-only feature.

Can XCode stop using future, please? :)
2016-09-08 09:27:51 +02:00
Thomas Dinges
5c0a67b325 Cycles: Add single channel texture support for OpenCL.
This way OpenCL devices can also benefit from a smaller memory footprint, when using e.g. bumpmaps (greyscale, 1 channel).

Additional target for my GSoC 2016.
2016-08-14 20:21:08 +02:00
Thomas Dinges
9d236ac06c Cycles: Enable half float support (4 channels and 1 channel) on CUDA.
Atm OpenEXR half files benefit from this and will use only 1/2 of the memory now. More space for HDRs!

Part of my GSoC 2016.
2016-08-11 22:47:53 +02:00
Thomas Dinges
5ac7ef873b Cycles: Change code order for Image Data Types.
Now we have the 4 component ones first (float4, byte4, half4) followed by the 1 component ones (float, byte, half).
Makes code a bit more consistent and also reduces code a bit when enabling half support on GPU in next commit.

This also exposed a typo in half CPU images for 3D textures, which wasn't used yet, but good to have that one fixed anyway.
2016-08-11 22:30:03 +02:00
Mai Lavelle
013a5c27a5 Cycles: Remove odd definition from CMake file
This was causing Cycles standalone to fail to build from Blender repo.
Hopefully nothing breaks from removing this.
2016-08-11 14:35:43 -04:00
Sergey Sharybin
fdc43f993d Cycles: Use static assert to control structures alignment 2016-08-11 10:12:06 +02:00
Lukas Stockner
bbbc079a6c Cycles: Correct maximum number of textures on pre-Kepler CUDA cards
Commit c96ae81160ad added three data textures and therefore removed three image texture slots, but the value in util_textures.h wasn't updated.
2016-08-10 17:19:16 +02:00
Alexander Gavrilov
a7f6f900f3 Cycles: avoid making NaNs in Vector Math node by normalizing zero vectors.
Since inputs are user controlled, the node can't assume they aren't zero.
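
A minimal sketch of a normalize that tolerates zero-length input. `Vec3` and `safe_normalize_sketch` are hypothetical names, not the types or functions used in the kernel.

```cpp
#include <cmath>

// Hypothetical 3-component vector, standing in for the kernel's vector type.
struct Vec3 {
    float x, y, z;
};

// Returns v scaled to unit length, or v unchanged when its length is zero,
// so a zero vector stays zero instead of turning into NaNs (0 / 0).
inline Vec3 safe_normalize_sketch(const Vec3 &v)
{
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (len == 0.0f) {
        return v;
    }
    const float inv_len = 1.0f / len;
    return Vec3{v.x * inv_len, v.y * inv_len, v.z * inv_len};
}
```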
2016-08-09 13:20:22 +03:00
Mai Lavelle
0b68c68006 Cycles microdisplacement: Support for Catmull-Clark subdivision via OpenSubdiv
Enables Catmull-Clark subdivision meshes with support for creases and attribute
subdivision. Still waiting on OpenSubdiv to fully support face varying
interpolation for subdividing UV coordinates though. Also there may be some
inconsistencies with Blender's subdivision which will be resolved at a
later time.

Code for reading patch tables and creating patch maps is borrowed
from OpenSubdiv.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2111
2016-08-07 11:13:11 -04:00
Alexander Gavrilov
1f19fba566 Cycles: hide particles with broken motion blur traces.
Currently cycles cannot correctly render motion blur for objects that appear or
disappear during the shutter window. Until that can be fixed properly, it may be
better to hide such particles rather than let them render as if they were
stationary for half of the frame.

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2125
2016-08-05 01:00:41 +02:00
Sergey Sharybin
6353ecb996 Cycles: Tweaks to support CUDA 8 toolkit
All the changes are mainly giving explicit hints on inlining functions,
so they match how inlining worked with the previous toolkit.

This makes kernels compiled by CUDA 8 render on average at the same speed
as previous kernels. Some scenes are somewhat faster, some of them are
somewhat slower. But the slowdown is within 1% so far.

On the positive side it allows us to enable newer generation cards on
buildbots (so GTX 10x0 will be officially supported soon).
2016-08-01 15:54:29 +02:00
b2e16c5700 Fix Cycles OpenCL compile error on Windows. 2016-07-31 02:02:28 +02:00
9b6ed3a42b Cycles: refactor kernel closure storage to use structs per closure type.
Reviewed By: dingto, sergey

Differential Revision: https://developer.blender.org/D2127
2016-07-31 02:34:43 +02:00
1e2efbc908 Cycles OpenCL: use #line directives for better error messages. 2016-07-30 18:25:52 +02:00
Mai Lavelle
c96ae81160 Cycles microdisplacement: ngons and attributes for subdivision meshes
This adds support for ngons and attributes on subdivision meshes. Ngons are
needed for proper attribute interpolation as well as correct Catmull-Clark
subdivision. Several changes are made to achieve this:

- new primitive `SubdFace` added to `Mesh`
- 3 more textures are used to store info on patches from subd meshes
- Blender export uses loop interface instead of tessface for subd meshes
- `Attribute` class is updated with a simplified way to pass primitive counts
  around and to support ngons.
- extra points for ngons are generated for O(1) attribute interpolation
- curves are temporarily disabled on subd meshes to avoid various bugs with
  the implementation
- old unneeded code is removed from `subd/`
- various fixes and improvements

Reviewed By: brecht

Differential Revision: https://developer.blender.org/D2108
2016-07-29 03:36:30 -04:00
Lukas Stockner
d9281a6332 Cycles: Fix three numerical issues in the fresnel, normal map and Beckmann code
- In fresnel_dielectric, the differentials calculation sometimes divided by zero.
- When the normal map was (0.5, 0.5, 0.5), the code would try to normalize a zero vector. Now, it just uses the regular normal as a fallback.
- The approximate error function used in Beckmann sampling sometimes overflowed to inf while calculating r^16. The final value is 1 - 1/r^16, however,
  so now it just returns 1 if the computation would overflow otherwise.
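
The last point can be sketched as below. This is only an illustration of the early-out, assuming r >= 1; the cutoff value and the function name are hypothetical and not taken from the patch.

```cpp
// Hypothetical sketch: computes 1 - 1/r^16 for r >= 1 without letting r^16
// overflow. In single precision r^16 overflows at r = 256 (2^128 exceeds the
// largest float). The cutoff is placed well below that, at a point where
// 1/r^16 is already far too small to change the result, so returning 1 there
// is exact in float.
inline float one_minus_inv_pow16_sketch(float r)
{
    const float cutoff = 128.0f;  // Assumed cutoff, safely below the overflow point.
    if (r >= cutoff) {
        return 1.0f;
    }
    const float r2 = r * r;
    const float r4 = r2 * r2;
    const float r8 = r4 * r4;
    const float r16 = r8 * r8;
    return 1.0f - 1.0f / r16;
}
```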
2016-07-16 20:54:14 +02:00
Sergey Sharybin
cb3b19730c Cycles: Use utility define for restrict pointers
This way restrict can be used for CUDA and OpenCL as well.

From quick tests in the areas I've been testing, this might give a
barely measurable speedup, but it increases register pressure.

So use of this qualifier is still really limited.
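
Such a utility define could look roughly like this. The macro name `my_restrict` is hypothetical, and the exact set of compiler checks in the actual code may differ.

```cpp
// Hypothetical portable "restrict" define. Each compiler family spells the
// qualifier differently, so the define hides the difference.
#if defined(__OPENCL_VERSION__)
#  define my_restrict restrict      // OpenCL C keeps the C99 keyword.
#elif defined(__CUDACC__) || defined(__GNUC__) || defined(__clang__)
#  define my_restrict __restrict__
#elif defined(_MSC_VER)
#  define my_restrict __restrict
#else
#  define my_restrict               // Unknown compiler: fall back to nothing.
#endif

// The qualifier promises the compiler that dst, a and b never alias,
// which can allow more aggressive load/store scheduling.
inline void add_arrays_sketch(float *my_restrict dst,
                              const float *my_restrict a,
                              const float *my_restrict b,
                              int n)
{
    for (int i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```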
2016-07-11 13:58:47 +02:00
Sergey Sharybin
a62967787c Fix T48808: Regression: Cycles OpenCL broken after Hair BVH commit 2016-07-08 09:41:36 +02:00
Sergey Sharybin
b03e66e75f Cycles: Implement unaligned nodes BVH builder
This is a special builder type which is allowed to orient nodes to the
strand direction, hence minimizing their surface area in comparison
with axis-aligned nodes. Such nodes are much more efficient for hair
rendering.

The implementation of the BVH builder is based on Embree, and the general
idea there is to calculate the axis-aligned SAH and the oriented SAH, and if
the SAH of the oriented node is smaller than the axis-aligned SAH we create
an unaligned node.

We store both aligned and unaligned nodes in the same tree (which
seems to be different from what Embree is doing) so we don't have
any extra calculations needed to set up the hair ray for BVH
traversal, hence avoiding any possible negative effect of this new
BVH node type.

This new builder is currently not in use, still need to make BVH
traversal code aware of unaligned nodes.
2016-07-07 17:25:48 +02:00
Sergey Sharybin
4a641e3cbc Cycles: Fix corner case of human readable number returning empty string 2016-06-27 13:49:25 +05:00
Lukas Stockner
23c276832b Cycles: Add multi-scattering, energy-conserving GGX as an option to the Glossy, Anisotropic and Glass BSDFs
This commit adds a new distribution to the Glossy, Anisotropic and Glass BSDFs that implements the
multiple-scattering microfacet model described in the paper "Multiple-Scattering Microfacet BSDFs with the Smith Model".

Essentially, the improvement is that unlike classical GGX, which only models single scattering and assumes
the contribution of multiple bounces to be zero, this new model performs a random walk on the microsurface until
the ray leaves it again, which ensures perfect energy conservation.

In practise, this means that the "darkening problem" - GGX materials becoming darker with increasing
roughness - is solved in a physically correct and efficient way.

The downside of this model is that it has no (known) analytic expression for evaluation. However, it can be
evaluated stochastically, and although the correct PDF isn't known either, the properties of MIS and the
balance heuristic guarantee an unbiased result at the cost of slightly higher noise.

Reviewers: dingto, #cycles, brecht

Reviewed By: dingto, #cycles, brecht

Subscribers: bliblubli, ace_dragon, gregzaal, brecht, harvester, dingto, marcog, swerner, jtheninja, Blendify, nutel

Differential Revision: https://developer.blender.org/D2002
2016-06-23 22:57:26 +02:00
Thomas Dinges
6311a9ff23 Cycles: Support half and half4 textures.
This is an initial commit for half texture support in Cycles.
It adds the basic infrastructure inside of the ImageManager and support for these textures on CPU.

Supported:
* Half Float OpenEXR images (can be used for e.g HDRs or Normalmaps) now use 1/2 the memory, when loaded via disk (OIIO).

ToDo:
Various things like support for inbuilt half textures, GPU... will come later, step by step.

Part of my GSoC 2016.
2016-06-19 17:31:16 +02:00
Lukas Stockner
7a5a02509b Cycles: Use faster ray-quad-intersection test
The original quad intersection test works by just testing against the two triangles that define the quad.
However, in this case it's actually faster to use the same test that's also used for portals: Determining
the distance to the plane in which the quad lies, calculating the hitpoint and checking whether it's in the
quad by projecting onto the sides.
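
The plane-based test described above can be sketched like this. It is a simplified illustration, not the code from the patch: the names are hypothetical, the quad is given by its center, two orthogonal edge vectors and its normal, and back-face handling is left out.

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
};

inline Vec3 operator+(const Vec3 &a, const Vec3 &b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(const Vec3 &a, const Vec3 &b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(const Vec3 &a, float s) { return Vec3{a.x * s, a.y * s, a.z * s}; }
inline float dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// quad_center: center of the quad
// quad_u/v:    full edge vectors of the quad (assumed orthogonal)
// quad_n:      normal of the plane the quad lies in
// On a hit, *t_out receives the distance along the ray.
inline bool ray_quad_intersect_sketch(const Vec3 &ray_origin, const Vec3 &ray_dir, float t_max,
                                      const Vec3 &quad_center, const Vec3 &quad_u,
                                      const Vec3 &quad_v, const Vec3 &quad_n, float *t_out)
{
    // Distance to the plane in which the quad lies.
    const float denom = dot(ray_dir, quad_n);
    if (denom == 0.0f) {
        return false;  // Ray is parallel to the plane.
    }
    const float t = dot(quad_center - ray_origin, quad_n) / denom;
    if (t <= 0.0f || t >= t_max) {
        return false;
    }
    // Hit point relative to the quad center, projected onto both sides.
    const Vec3 hit = ray_origin + ray_dir * t - quad_center;
    const float u = dot(hit, quad_u) / dot(quad_u, quad_u);
    if (std::fabs(u) > 0.5f) {
        return false;
    }
    const float v = dot(hit, quad_v) / dot(quad_v, quad_v);
    if (std::fabs(v) > 0.5f) {
        return false;
    }
    *t_out = t;
    return true;
}
```

Compared with two ray-triangle tests this needs a single division for the plane distance and two projections, which is where the saving comes from.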

Reviewers: brecht, sergey, dingto

Reviewed By: dingto

Differential Revision: https://developer.blender.org/D2045
2016-06-06 23:38:50 +02:00
Sergey Sharybin
c276480b0f Fix compilation error on 32 bit Windows 2016-06-06 14:01:49 +02:00
Sergey Sharybin
b277ba5c0d Cycles: Fix compilation error on OSX 2016-06-06 13:52:57 +02:00
Sergey Sharybin
b62faa54de Cycles: Add support of processor groups
Currently for windows only, this is an initial commit towards native
support of NUMA.

Current commit makes it so Cycles will use all logical processors on
Windows running on system with more than 64 threads.

Reviewers: juicyfruit, dingto, lukasstockner97, maiself, brecht

Subscribers: LazyDodo

Differential Revision: https://developer.blender.org/D2049
2016-06-06 09:14:37 +02:00
Mai Lavelle
4388b29e98 Cycles: Add human readable sizes to debug output
Some of these values can get quite large and are hard to read, adding this
makes it easy to read them at a glance.

Reviewed By: sergey

Differential Revision: https://developer.blender.org/D2039
2016-05-31 06:13:54 -04:00
Thomas Dinges
dc07a5561f Cleanup: Further tweaks for consistency and simplifications.
Now I can start adding half float without adding even bigger mess to all these functions. ;)
2016-05-27 23:35:29 +02:00
Thomas Dinges
2ee063868d Cleanup: Shorten texture variables, tex and image was kinda redundant.
Also make prefix consistent, so it starts with either TEX_NUM or TEX_START, followed by texture type and architecture.
2016-05-27 22:58:33 +02:00
f7c28a66e2 Fix Cycles compile errors with GCC due to double promotion as errors. 2016-05-22 19:17:22 +02:00
ec51175f1f Code refactor: add generic Cycles node infrastructure.
Differential Revision: https://developer.blender.org/D2016
2016-05-22 17:29:24 +02:00
9d19533117 Fix T48472: issue in array refactor, causing performance regression in BVH build. 2016-05-20 10:58:11 +02:00
Thomas Dinges
c9f1ed1e4c Cycles: Add support for bindless textures.
This adds support for CUDA Texture objects (also known as Bindless textures) for Kepler GPUs (Geforce 6xx and above).
This is used for all 2D/3D textures, data still uses arrays as before.

User benefits:
* No more limits of image textures on Kepler.
 We had 5 float4 and 145 byte4 slots there before, now we have 1024 float4 and 1024 byte4.
 This can be extended further if we need to (just change the define).

* Single channel textures slots (byte and float) are now supported on Kepler as well (1024 slots for each type).

ToDo / Issues:
* 3D textures don't work yet, at least they don't show up during render. I have no idea what's wrong yet.
* Dynamically allocate bindless_mapping array?

I hope Fermi still works fine, but that should be tested on a Fermi card before pushing to master.

Part of my GSoC 2016.

Reviewers: sergey, #cycles, brecht

Subscribers: swerner, jtheninja, brecht, sergey

Differential Revision: https://developer.blender.org/D1999
2016-05-19 13:14:37 +02:00
9dc5367c89 Cleanup code style inconsistency in last commits. 2016-05-17 23:41:45 +02:00
93e4ae84ad Code refactor: add some array utility methods, fix leak in assignment operator. 2016-05-17 21:39:16 +02:00
Thomas Dinges
3c85e1ca1a Cycles: Add support for single channel byte textures.
This way, we also save 3/4th of memory for single channel byte textures (e.g. Bump Maps).

Note: In order for this to work, the texture *must* have 1 channel only.
In Gimp you can e.g. do that via the menu: Image -> Mode -> Grayscale
2016-05-12 14:51:42 +02:00
Thomas Dinges
cde10e774c Fix array bounds compile warning. 2016-05-12 14:20:12 +02:00
Thomas Dinges
8de3303a03 Cleanup: Fix typo. 2016-05-12 02:11:36 +02:00
Thomas Dinges
4a4f043bc4 Cycles: Add support for single channel float textures on CPU.
Until now, single channel textures were packed into a float4, wasting 3 floats per pixel. Memory usage of such textures is now reduced by 3/4.
Voxel Attributes such as density, flame and heat benefit from this, but also Bumpmaps with one channel.
This commit also includes some cleanup and code deduplication for image loading.

Example Smoke render from Cosmos Laundromat: http://www.pasteall.org/pic/show.php?id=102972
Memory here went down from ~600MB to ~300MB.

Reviewers: #cycles, brecht

Differential Revision: https://developer.blender.org/D1981
2016-05-11 21:58:34 +02:00
Sergey Sharybin
92774ff792 Cycles: Use explicit qualifier for single-argument constructors
In almost all cases we want such constructors to be explicit, there are
exceptions but only in a few places.
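
A small illustration of what the qualifier prevents. `BufferSize` and `allocate_sketch` are made-up names for the example.

```cpp
#include <type_traits>

// Hypothetical wrapper type with a single-argument constructor.
struct BufferSize {
    explicit BufferSize(int bytes) : bytes(bytes) {}
    int bytes;
};

inline int allocate_sketch(const BufferSize &size)
{
    return size.bytes;
}

// With "explicit" there is no silent int -> BufferSize conversion:
//   allocate_sketch(64);              // does not compile
//   allocate_sketch(BufferSize(64));  // OK, the intent is visible at the call site
static_assert(!std::is_convertible<int, BufferSize>::value,
              "explicit constructor must not allow implicit conversion");
```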
2016-05-11 16:51:14 +02:00
Thomas Dinges
76481eaeff Cycles: Add support for float4 textures on OpenCL.
Title says it all, this adds OpenCL float4 texture support.

There is still a bug in the code, I get an "Out of resources" error on NVIDIA hardware here, not sure what's wrong yet.
Will investigate further, but maybe someone else has an idea. :)

Reviewers: #cycles, brecht

Subscribers: brecht, candreacchio

Differential Revision: https://developer.blender.org/D1983
2016-05-10 02:53:50 +02:00
Thomas Dinges
9a1e11260c Cleanup: More byte -> byte4 renaming for consistency. 2016-05-09 02:22:01 +02:00
Thomas Dinges
734d1aec3f Cycles: Make CUDA adaptive feature compile a Debug flag.
If the CUDA Toolkit is installed and the user is on Linux,
adaptive, feature based CUDA runtime compile is now possible to enable via:

* Environment flag CYCLES_CUDA_ADAPTIVE_COMPILE or
* Debug menu (Debug value 256) in the Cycles UI.
2016-05-06 23:13:33 +02:00
Thomas Dinges
3807bcb3a8 Cleanup: Rename texture slots to float4 and byte, to distinguish from future float (single channel) and half_float slots.
Should be no functional changes, tested CPU and CUDA.
2016-05-06 14:37:35 +02:00
Sergey Sharybin
42fd1b9abe Cycles: Fix issues with stack allocator in MSVC
Couple of issues here:

- There was a bug in heap memory allocation when running out
  of allowed stack memory.

- Debug MSVC was failing because it uses a separate
  allocator for some sort of internal proxy thing,
  which seems to be unable to use stack memory
  because the allocator is being created in a non-persistent
  stack location.
2016-04-25 13:50:27 +02:00
Sergey Sharybin
02213b867e Cycles: Stop rendering when bad_alloc happens
This is an attempt to gracefully handle out-of-memory events
and stop rendering with an error message instead of a crash.

It uses the bad_alloc exception, and usually I'm not really fond
of exceptions, but for such limited use for errors from which
we can't recover it should be fine.

Ideally we'll need to stop the full Cycles Session, so viewport
render and persistent images free all the memory, but that
we can support later, since it mainly comes down to telling
Blender what to do.

General rules are:

- Use as few exception handlers as possible, try to find the
  most generic place to handle those.

  For example, ccl::Session.

- Threads need their own handling, an exception trap in one thread
  will not catch exceptions from other threads.

  That's why BVH build needs its own thing.
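
The rules above amount to one generic trap per entry point. A minimal sketch, with a hypothetical function name and error text; the real handling lives in the session and BVH build code and will look different:

```cpp
#include <functional>
#include <new>
#include <string>

// Hypothetical wrapper: runs a unit of work and turns an out-of-memory
// exception into an error message instead of a crash. Returns an empty
// string on success.
//
// A try/catch only covers the thread it runs on, so every thread entry
// function (for example a BVH build task) would need to go through a
// wrapper like this on its own.
inline std::string run_guarded_sketch(const std::function<void()> &work)
{
    try {
        work();
    }
    catch (const std::bad_alloc &) {
        return "Out of memory";
    }
    return std::string();
}
```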

Reviewers: brecht, juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1898
2016-04-20 16:19:49 +02:00
Sergey Sharybin
e3544c9e28 Cycles: Throw bad_alloc exception when custom allocators failed to allocate memory
This mimics behavior of default allocators in STL and allows all the routines
to catch out-of-memory exceptions and hopefully recover from that situation.
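
For illustration, a minimal C++11 allocator that reports failure the same way `std::allocator` does. `MallocAllocator` is a hypothetical stand-in built on malloc, not the guarded allocator from the Cycles sources.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical malloc-based allocator that throws std::bad_alloc on failure
// instead of handing a null pointer to the container.
template<typename T> struct MallocAllocator {
    typedef T value_type;

    MallocAllocator() {}
    template<typename U> MallocAllocator(const MallocAllocator<U> &) {}

    T *allocate(std::size_t n)
    {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) {
            throw std::bad_alloc();
        }
        void *mem = std::malloc(n * sizeof(T));
        if (mem == nullptr && n != 0) {
            throw std::bad_alloc();
        }
        return static_cast<T *>(mem);
    }

    void deallocate(T *p, std::size_t)
    {
        std::free(p);
    }
};

// All instances are interchangeable, as required for stateless allocators.
template<typename T, typename U>
bool operator==(const MallocAllocator<T> &, const MallocAllocator<U> &) { return true; }
template<typename T, typename U>
bool operator!=(const MallocAllocator<T> &, const MallocAllocator<U> &) { return false; }
```

A container such as `std::vector<int, MallocAllocator<int> >` then propagates the exception to whoever wraps the operation in a try/catch.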
2016-04-20 15:49:52 +02:00
Thomas Dinges
557544f2c4 Cycles: Refactor Image Texture limits.
Instead of treating Fermi GPU limits as default,
and overriding them for other devices,
we now nicely set them for each platform.

* Due to setting values for all platforms,
we don't have to offset the slot id for OpenCL anymore,
as the image manager won't add float images for OpenCL now.

* Bugfix: TEX_NUM_FLOAT_IMAGES was always 5, even for CPU,
so the code in svm_image.h clamped float textures with alpha on CPU after the 5th slot.

Reviewers: #cycles, brecht

Reviewed By: #cycles, brecht

Subscribers: brecht

Differential Revision: https://developer.blender.org/D1925
2016-04-16 20:49:59 +02:00
Thomas Dinges
9c916b0172 Cleanup: Move texture definitions to util, to avoid bad level include. 2016-04-15 23:02:44 +02:00
Sergey Sharybin
3165e8740b Fix T48139: Checker texture strange behavior in cycles
It seems particular CUDA implementations have some precision issues,
which made the integer coordinate (which was expected to always be
positive) go negative.
2016-04-15 15:30:30 +02:00
Thomas Dinges
c8e2cc21ab Cleanup string includes after versioning commits 2016-04-13 09:45:32 +02:00
Thomas Dinges
3156055e27 Show version number in UI as well 2016-04-13 09:45:30 +02:00
Sergey Sharybin
bd7e4d2a3d Tweaks to the version string formation
Couple of things:

- No need to use string streams to format the version string,
  we can do it at compile time and don't bother with anything
  at runtime.

- Function declaration was wrong and would have caused linking
  conflicts in cases when util_version.h was included from
  multiple places.

We should have a utility function to get the Cycles version so
applications which are linked to Cycles dynamically can query
the version, but that can't be done as an inlined function in
header and would need to be a function properly exported to a
global symbol table (aka, be implemented in a .cpp file).
2016-04-13 09:45:26 +02:00
Thomas Dinges
ed050753ce Add a version number to Cycles standalone
Now Cycles has its own versioning, that is mainly interesting for external projects, which integrate the engine.

We start with version 1.7.0. Reasons for that:

* The engine is too mature for a 1.0 release.
* We assume that Cycles inside of Blender 2.61 was version 0.1. We count upwards in 0.1 steps, therefore Cycles inside of Blender 2.77 would be 1.7.

We use a common versioning scheme here, with 3 decimals for the major, minor and patch level.

At the moment cycles --version can be used to display the version, easy to parse for external projects. The info will be added to the UI later as well.
2016-04-13 09:45:23 +02:00
Sergey Sharybin
84c68dcb3f Cycles: Minor cleanup, whitespace around keyword and preprocessor indent 2016-04-13 08:58:52 +02:00
Sergey Sharybin
3a80d5e1d0 Cycles: Fix rare dead-locks on TaskScheduler::exit()
When the Moon is full it was possible to have a dead-lock in task
scheduler's exit() method.

Similar problem was fixed in Blender's task scheduler 3 years ago
in bae2a2c.
2016-04-10 21:18:54 +02:00
Sergey Sharybin
be2186ad62 Cycles: Solve possible issues with running out of stack memory allocator
Policy here is a bit more complicated: if the tree becomes too deep we're
forced to create a leaf node and the size of that leaf wouldn't be so well
predicted, which means it's quite tricky to use a single stack array for
that.

Made it a more official feature that StackAllocator will fall back to
heap when running out of stack memory.

It's still much better than always using heap allocator.
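
The fallback policy can be sketched with a simplified arena. This is not the STL-compatible StackAllocator from the sources; `StackArena` and its methods are hypothetical and only show the "stack first, heap when full" behavior.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Hypothetical fixed-size arena that lives wherever it is declared
// (typically on the stack) and falls back to malloc when it runs out.
template<std::size_t SIZE> class StackArena {
public:
    StackArena() : used_(0) {}

    void *allocate(std::size_t bytes)
    {
        if (bytes == 0) {
            bytes = 1;  // Keep every returned pointer distinct and in range.
        }
        if (bytes <= SIZE - used_) {
            void *mem = buffer_ + used_;
            // Round up so the next allocation stays maximally aligned.
            const std::size_t align = alignof(std::max_align_t);
            used_ += (bytes + align - 1) / align * align;
            if (used_ > SIZE) {
                used_ = SIZE;
            }
            return mem;
        }
        // Out of stack memory: fall back to the heap.
        void *mem = std::malloc(bytes);
        if (mem == nullptr) {
            throw std::bad_alloc();
        }
        return mem;
    }

    void deallocate(void *mem)
    {
        // Buffer memory is released all at once when the arena goes away;
        // only heap fallbacks need an explicit free.
        if (!is_from_buffer(mem)) {
            std::free(mem);
        }
    }

    bool is_from_buffer(const void *mem) const
    {
        // std::less gives a well-defined order even for unrelated pointers.
        const unsigned char *p = static_cast<const unsigned char *>(mem);
        std::less<const unsigned char *> less;
        return !less(p, buffer_) && less(p, buffer_ + SIZE);
    }

private:
    alignas(std::max_align_t) unsigned char buffer_[SIZE];
    std::size_t used_;
};
```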
2016-04-04 14:13:19 +02:00
Sergey Sharybin
5ab3a97dbb Cycles: Log overall time spent on building object's BVH
We had per-tree statistics already, but it's a bit tricky to see overall
time because trees could be building in parallel.

In fact, we can now print statistics for any TaskPool.
2016-04-04 13:43:19 +02:00
e02d0de36e Fix T47505: Cycles OpenCL rendering crash on Windows.
Restore the boost bug workaround, but without changing the locale.
2016-04-01 20:39:07 +02:00
Sergey Sharybin
f318e8322f Cycles: Report thread ID from worker thread to callbacks
Main use case of this ID will be to emulate TLS which otherwise
would require having some platform-specific implementations which
is not always really optimal.

See notes about the argument in util_task.h.
2016-04-01 15:25:35 +02:00
Sergey Sharybin
4738ae085d Cycles: Fix for missing pthread's spin on OSX 2016-04-01 09:16:46 +02:00
Sergey Sharybin
e2059380de Cycles: Add easy to use spin lock primitive
Currently unused, but will be handy for upcoming changes.

It'll also be nice to be able to do scoped_lock() for both
Mutex and Spin, but currently it's not really easy to do,
need some changes in typedefs and such, will happen as a
separate commit.
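
For illustration, a spin lock with the same lock()/unlock() shape as a mutex can be used with the standard scoped guards. This sketch swaps in std::atomic rather than the pthread spin primitives; the class and helper names are hypothetical.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical spin lock built on std::atomic<bool>.
class SpinLock {
public:
    SpinLock() : locked_(false) {}

    void lock()
    {
        // Busy-wait until the previous value was "unlocked".
        while (locked_.exchange(true, std::memory_order_acquire)) {
        }
    }

    bool try_lock()
    {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock()
    {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_;
};

// Because SpinLock provides lock() and unlock(), std::lock_guard works
// with it just as it does with a mutex.
inline int locked_increment_sketch(SpinLock &lock, int &counter)
{
    std::lock_guard<SpinLock> guard(lock);
    return ++counter;
}
```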
2016-03-31 10:22:11 +02:00
Sergey Sharybin
7fd71338f9 Cycles: Expose array's capacity via getter function
This way it's possible to query capacity of an array, which then
could be used for some smart re-allocation and reserve policies.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
ffe59c54cb Cycles: Add STL allocator which uses stack memory
At this point we might want to rename allocator files to
util_allocator_foo.c so they stay nicely grouped in the folder.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
65b375e798 Cycles: Move non-vectorized bitscan() to util
This way we can use bitscan() from both vectorized and non-vectorized
code, which applies to both kernel and host code.
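
A portable, non-vectorized bit scan might look like this. It assumes the function returns the index of the lowest set bit, which is a guess at the semantics rather than something stated here; the name is hypothetical.

```cpp
#include <cstdint>

// Hypothetical portable bit scan: index of the lowest set bit of value.
// Returns 32 when no bit is set.
inline std::uint32_t bitscan_sketch(std::uint32_t value)
{
    if (value == 0u) {
        return 32u;
    }
    std::uint32_t index = 0u;
    while ((value & 1u) == 0u) {
        value >>= 1;
        ++index;
    }
    return index;
}
```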
2016-03-31 10:06:21 +02:00
Sergey Sharybin
0b6b094a8c Cycles: Aligned vector was not covered by guarded stat
This was making stats printed by the logging wrong: they did not
include such memory as BVH storage.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
e4a265f058 Cycles: Add an option to build single kernel only which fits current CPU
This seems quite useful for development, so you don't need to wait for
all the kernels to be re-compiled when working on a new feature, which
speeds up iteration.

Marked as an advanced option, so if it doesn't work so well in practice
it's safe to revert anyway.
2016-03-25 16:09:05 +01:00
Sergey Sharybin
700722f686 Cycles: Cleanup, indent nested preprocessor directives
Quite straightforward, main trick is happening in path_source_replace_includes().

Reviewers: brecht, dingto, lukasstockner97, juicyfruit

Differential Revision: https://developer.blender.org/D1794
2016-03-25 13:55:42 +01:00
Sergey Sharybin
21f31e6054 Fix T47856: Cycles problem when running from multi-byte path
This is a mix of regression and old unsupported configuration.

Regression was caused by some checks added on Blender side which were
checking whether python function returned error or not. This made it
impossible to enable Cycles when running from a file path which can't
be encoded with MBCS codepage.

Non-regression issue was that it wasn't possible to use pre-compiled
CUDA kernels when running from a path with non-ascii multi-byte
characters.

This commit fixes regression and CUDA parts, but OSL still can't be
used from a non-ascii location because it uses non-widechar API to
work with file paths by the looks of it. Not sure we can solve this
just from our side by using some codepage trick (UTF-16?) since even
oslc fails to compile shader when there are non-ascii characters in
the path.
2016-03-23 13:58:31 +01:00
Thomas Dinges
79c8eed843 Cleanup: Update Cycles standalone copyright info. 2016-02-27 13:07:32 +01:00
Sergey Sharybin
b30ab24fb8 Cycles: Avoid re-definition of math constants with MSVC 2016-02-20 14:06:05 +05:00
Sergey Sharybin
3857b4600f Cycles: Don't silence unused macro, remove the macro instead
It's not really handy to silence something unused hoping it'll be
used in the future. We can end up with quite some silencing then.

Also made this flag, which I find rather useless, NOT cause -Werror
in Cycles code.
2016-02-17 12:40:56 +01:00
Campbell Barton
d4e5e94ec7 Cleanup: unused define warning 2016-02-17 21:39:23 +11:00
Sergey Sharybin
d0246d5f30 Cycles: Some cleanup, should be no functional changes
Addressing meaningful feedback from coverity.
2016-02-16 15:33:00 +01:00
Sergey Sharybin
63b60be6d7 Fix T47427: Crash caused by OSL 2016-02-16 13:38:07 +01:00
Sergey Sharybin
06743f4018 Cycles: Make guarded allocator compatible with MSVC2015 2016-02-15 18:33:36 +05:00
Sergey Sharybin
6371fccdbe Cycles: Fix guarded allocator issues on Windows
The issue was caused by static vectors allocating some internal
data using rebound element allocator for them, which was causing
access to a non-initialized statistics objects and was failing a
lot when switching Blender to a fully guarded allocation.

Additionally, we were not able to free that internal memory before
Blender exits, which was causing false-positive memory leak prints.

Now we're not using GuardedAllocator for those proxy containers.

Ideally this should be done as a GuardedAllocator::rebind, but
it didn't work for vector<bool> because it seems some internal
parts are converting bool to char32_t, which either makes it so
we can't use GuardedAllocator for those vectors or the compiler
gets confused when we're trying to explicitly allow GuardedAllocator
for rebind<char32_t>.

With the current approach we should be fine for the release.
2016-02-15 11:46:13 +01:00
Sergey Sharybin
7d85da882b Cycles: Fix infinite recursion of md5 calculation on Windows
Was caused by some safety code making sure we've got a NULL
terminator for the buffer when doing mbs<->wcs conversion, but
it turns out this simply confuses std::string and it can no
longer have a proper .size(). Let's assume behavior of string
allocation is the same all over the std, and we can avoid having
that extra null-terminator allocated.
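
The effect is easy to reproduce in isolation. The helper names below are hypothetical; they only show how counting the terminator into the length changes what std::string reports.

```cpp
#include <cstddef>
#include <string>

// Builds a string that includes the trailing '\0' as a real character.
// size() is then one larger than the visible text.
inline std::string from_buffer_with_terminator(const char *buffer, std::size_t length_with_nul)
{
    return std::string(buffer, length_with_nul);
}

// Builds a string from the same buffer but leaves the terminator out;
// std::string keeps track of its own terminator internally.
inline std::string from_buffer_without_terminator(const char *buffer, std::size_t length_with_nul)
{
    return std::string(buffer, length_with_nul - 1);
}
```

With the buffer "abc" and a length of 4, the first variant has size 4 and no longer compares equal to "abc", while the second has size 3 and does.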
2016-02-14 21:08:11 +05:00
Thomas Dinges
6a593aba44 Cleanup: Move Cycles sky model data to util. 2016-02-13 13:41:40 +01:00
Sergey Sharybin
89b1f042cf Cycles: Fix compilation error on Windows 2016-02-13 13:29:13 +01:00