Commit Graph

452 Commits

Author SHA1 Message Date
Thomas Dinges
c9f1ed1e4c Cycles: Add support for bindless textures.
This adds support for CUDA Texture objects (also known as Bindless textures) for Kepler GPUs (Geforce 6xx and above).
This is used for all 2D/3D textures, data still uses arrays as before.

User benefits:
* No more limits of image textures on Kepler.
 We had 5 float4 and 145 byte4 slots there before, now we have 1024 float4 and 1024 byte4.
 This can be extended further if we need to (just change the define).

* Single channel textures slots (byte and float) are now supported on Kepler as well (1024 slots for each type).

ToDo / Issues:
* 3D textures don't work yet, at least don't show up during render. I have no idea whats wrong yet.
* Dynamically allocate bindless_mapping array?

I hope Fermi still works fine, but that should be tested on a Fermi card before pushing to master.

Part of my GSoC 2016.

Reviewers: sergey, #cycles, brecht

Subscribers: swerner, jtheninja, brecht, sergey

Differential Revision: https://developer.blender.org/D1999
2016-05-19 13:14:37 +02:00
9dc5367c89 Cleanup code style inconsistency in last commits. 2016-05-17 23:41:45 +02:00
93e4ae84ad Code refactor: add some array utility methods, fix leak in assignment operator. 2016-05-17 21:39:16 +02:00
Thomas Dinges
3c85e1ca1a Cycles: Add support for single channel byte textures.
This way, we also save 3/4th of memory for single channel byte textures (e.g. Bump Maps).

Note: In order for this to work, the texture *must* have 1 channel only.
In Gimp you can e.g. do that via the menu: Image -> Mode -> Grayscale
2016-05-12 14:51:42 +02:00
Thomas Dinges
cde10e774c Fix array bounds compile warning. 2016-05-12 14:20:12 +02:00
Thomas Dinges
8de3303a03 Cleanup: Fix typo. 2016-05-12 02:11:36 +02:00
Thomas Dinges
4a4f043bc4 Cycles: Add support for single channel float textures on CPU.
Until now, single channel textures were packed into a float4, wasting 3 floats per pixel. Memory usage of such textures is now reduced by 3/4.
Voxel Attributes such as density, flame and heat benefit from this, but also Bumpmaps with one channel.
This commit also includes some cleanup and code deduplication for image loading.

Example Smoke render from Cosmos Laundromat: http://www.pasteall.org/pic/show.php?id=102972
Memory here went down from ~600MB to ~300MB.

Reviewers: #cycles, brecht

Differential Revision: https://developer.blender.org/D1981
2016-05-11 21:58:34 +02:00
Sergey Sharybin
92774ff792 Cycles: Use explicit qualifier for single-argument constructors
Almost in all cases we want such constructors to be explicit, there are
exceptions but only in few places.
2016-05-11 16:51:14 +02:00
Thomas Dinges
76481eaeff Cycles: Add support for float4 textures on OpenCL.
Title says it all, this adds OpenCL float4 texture support.

There is a bug in the code still, I get a "Out of ressources error" on nvidia hardware here, not sure whats wrong yet.
Will investigate further, but maybe someone else has an idea. :)

Reviewers: #cycles, brecht

Subscribers: brecht, candreacchio

Differential Revision: https://developer.blender.org/D1983
2016-05-10 02:53:50 +02:00
Thomas Dinges
9a1e11260c Cleanup: More byte -> byte4 renaming for consistency. 2016-05-09 02:22:01 +02:00
Thomas Dinges
734d1aec3f Cycles: Make CUDA adaptive feature compile a Debug flag.
If the CUDA Toolkit is installed and the user is on Linux,
adaptive, feature based CUDA runtime compile is now possible to enable via:

* Environment flag CYCLES_CUDA_ADAPTIVE_COMPILE or
* Debug menu (Debug value 256) in the Cycles UI.
2016-05-06 23:13:33 +02:00
Thomas Dinges
3807bcb3a8 Cleanup: Rename texture slots to float4 and byte, to distinguish from future float (single channel) and half_float slots.
Should be no functional changes, tested CPU and CUDA.
2016-05-06 14:37:35 +02:00
Sergey Sharybin
42fd1b9abe Cycles: Fix issues with stack allocator in MSVC
Couple of issues here:

- Was a bug in heap memory allocation when run out
  of allowed stack memory.

- Debug MSVC was failing because it uses separate
  allocator for some sort of internal proxy thing,
  which seems to be unable to be using stack memory
  because allocator is being created in non-persistent
  stack location.
2016-04-25 13:50:27 +02:00
Sergey Sharybin
02213b867e Cycles: Stop rendering when bad_alloc happens
This is an attempt to gracefully handle out-of-memory events
and stop rendering with an error message instead of a crash.

It uses bad_alloc exception, and usually i'm not really fond
of exceptions, but for such limited use for errors from which
we can't recover it should be fine.

Ideally we'll need to stop full Cycles Session, so viewport
render and persistent images frees all the memory, but that
we can support later, since it'll mainly related on telling
Blender what to do.

General rules are:

- Use as less exception handles as possible, try to find a
  most geenric pace where to handle those.

  For example, ccl::Session.

- Threads needs own handling, exception trap from one thread
  will not catch exceptions from other threads.

  That's why BVH build needs own thing.

Reviewers: brecht, juicyfruit, dingto, lukasstockner97

Differential Revision: https://developer.blender.org/D1898
2016-04-20 16:19:49 +02:00
Sergey Sharybin
e3544c9e28 Cycles: Throw bad_alloc exception when custom allocators failed to allocate memory
This mimics behavior of default allocators in STL and allows all the routines
to catch out-of-memory exceptions and hopefully recover from that situation/
2016-04-20 15:49:52 +02:00
Thomas Dinges
557544f2c4 Cycles: Refactor Image Texture limits.
Instead of treating Fermi GPU limits as default,
and overriding them for other devices,
we now nicely set them for each platform.

* Due to setting values for all platforms,
we don't have to offset the slot id for OpenCL anymore,
as the image manager wont add float images for OpenCL now.

* Bugfix: TEX_NUM_FLOAT_IMAGES was always 5, even for CPU,
so the code in svm_image.h clamped float textures with alpha on CPU after the 5th slot.

Reviewers: #cycles, brecht

Reviewed By: #cycles, brecht

Subscribers: brecht

Differential Revision: https://developer.blender.org/D1925
2016-04-16 20:49:59 +02:00
Thomas Dinges
9c916b0172 Cleanup: Move texture definitions to util, to avoid bad level include. 2016-04-15 23:02:44 +02:00
Sergey Sharybin
3165e8740b Fix T48139: Checker texture strange behavior in cycles
Seems particular CUDA implementations has some precision issues,
which made integer coordinate (which was expected to always be
positive) to go negative.
2016-04-15 15:30:30 +02:00
Thomas Dinges
c8e2cc21ab Cleanup string includes after versioning commits 2016-04-13 09:45:32 +02:00
Thomas Dinges
3156055e27 Show version number in UI as well 2016-04-13 09:45:30 +02:00
Sergey Sharybin
bd7e4d2a3d Tweaks to the version string formation
Couple of things:

- No need to use string streams to format the version string,
  we can do it at compile time and don't bother with anything
  at runtime.

- Function declaration was wring and would have caused linking
  conflicts in cases when util_version.h was included from
  multiple places.

We should have an utility function to get Cycles version so
applications which are linked to Cycles dynamically can query
the version, but that can't be done as an inlined function in
header and would need to be a function properly exported to a
global symbol table (aka, be implemented in a .cpp file).
2016-04-13 09:45:26 +02:00
Thomas Dinges
ed050753ce Add a version number to Cycles standalone
Now Cycles has its own versioning, that is mainly interesting for external projects, which integrate the engine.

We start with version 1.7.0. Reasons for that:

* The engine is too mature for a 1.0 release.
* We assume that Cycles inside of Blender 2.61 was version 0.1. We count upwards in 0.1 steps, therefore Cycles inside of Blender 2.77 would be 1.7.

We use a common versioning scheme here, with 3 decimals for the major, minor and patch level.

At the moment cycles --version can be used to display the version, easy to parse for external projects. The info will be added to the UI later aswell.
2016-04-13 09:45:23 +02:00
Sergey Sharybin
84c68dcb3f Cycles: Minor cleanup, whitespace around keyword and preprocessor indent 2016-04-13 08:58:52 +02:00
Sergey Sharybin
3a80d5e1d0 Cycles: Fix rare dead-locks on TaskScheduler::exit()
When the Moon is full it was possible to have a dead-lock in task
scheduler's  exit() method.

Similar problem was fixed in Blender's task scheduler 3 years ago
in bae2a2c.
2016-04-10 21:18:54 +02:00
Sergey Sharybin
be2186ad62 Cycles: Solve possible issues with running out of stack memory allocator
Policy here is a bit more complicated, if tree becomes too deep we're
forced to create a leaf node and size of that leaf wouldn't be so well
predicted, which means it's quite tricky to use single stack array for
that.

Made it more official feature that StackAllocator will fall-back to
heap when running out of stack memory.

It's still much better than always using heap allocator.
2016-04-04 14:13:19 +02:00
Sergey Sharybin
5ab3a97dbb Cycles: Log overall time spent on building object's BVH
We had per-tree statistics already, but it's a bit tricky to see overall
time because trees could be building in parallel.

In fact, we can now print statistics for any TaskPool.
2016-04-04 13:43:19 +02:00
e02d0de36e Fix T47505: Cycles OpenCL rendering crash on Windows.
Restore the boost bug workaround, but without changing the locale.
2016-04-01 20:39:07 +02:00
Sergey Sharybin
f318e8322f Cycles: Report thread ID from worker thread to callbacks
Main use case of this ID will be to emulate TLS which otherwise
would require having some platform-specific implementations which
is not always really optimal.

See notes about the argument in util_task.h.
2016-04-01 15:25:35 +02:00
Sergey Sharybin
4738ae085d Cycles: Fix for missing pthread's spin on OSX 2016-04-01 09:16:46 +02:00
Sergey Sharybin
e2059380de Cycles: Add easy to use spin lock primitive
Currently unused, but will be handy for an upcoming changes.

It'll also be nice to be able to do scoped_lock() for both
Mutex and Spin, but currently it's not really easy to do,
need some changes in typedefs and such, will happen as a
separate commit.
2016-03-31 10:22:11 +02:00
Sergey Sharybin
7fd71338f9 Cycles: Expose array's capacity via getter function
This way it's possible to query capacity of an array, which then
could be used for some smart re-allocation and reserve policies.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
ffe59c54cb Cycles: Add STL allocator which uses stack memory
At this point we might want to rename allocator files to
util_allocator_foo.c so the stay nicely grouped in the folder.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
65b375e798 Cycles: Move non-vectorized bitscan() to util
This way we can use bitscan() from both vectorized and non-vectorized
code, which applies to both kernel and host code.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
0b6b094a8c Cycles: Aligned vector was not covered by guarded stat
This was making stats printed by the logging being wrong: they did not
include such memory as BVH storage.
2016-03-31 10:06:21 +02:00
Sergey Sharybin
e4a265f058 Cycles: Add an option to build single kernel only which fits current CPU
This seems quite useful for the development, so you don't need to wait
all the kernels to be re-compiled when working on a new feature, which
speeds up re-iteration.

Marked as an advanced option, so if it doesn't work so well in practice
it's safe to revert anyway.
2016-03-25 16:09:05 +01:00
Sergey Sharybin
700722f686 Cycles: Cleanup, indent nested preprocessor directives
Quite straightforward, main trick is happening in path_source_replace_includes().

Reviewers: brecht, dingto, lukasstockner97, juicyfruit

Differential Revision: https://developer.blender.org/D1794
2016-03-25 13:55:42 +01:00
Sergey Sharybin
21f31e6054 Fix T47856: Cycles problem when running from multi-byte path
This is a mix of regression and old unsupported configuration.

Regression was caused by some checks added on Blender side which was
checking whether python function returned error or not. This made it
impossible to enable Cycles when running from a file path which can't
be encoded with MBCS codepage.

Non-regression issue was that it wasn't possible to use pre-compiled
CUDA kernels when running from a path with non-ascii multi-byte
characters.

This commit fixes regression and CUDA parts, but OSL still can't be
used from a non-ascii location because it uses non-widechar API to
work with file paths by the looks of it. Not sure we can solve this
just from our side by using some codepage trick (UTF-16?) since even
oslc fails to compile shader when there are non-ascii characters in
the path.
2016-03-23 13:58:31 +01:00
Thomas Dinges
79c8eed843 Cleanup: Update Cycles standalone copyright info. 2016-02-27 13:07:32 +01:00
Sergey Sharybin
b30ab24fb8 Cycles: Avoid re-definition of math cnstants with MSVC 2016-02-20 14:06:05 +05:00
Sergey Sharybin
3857b4600f Cycles: Don't silence unused macro, remove the macro instead
It's not really handy to silence something unused hoping for it'll be
used in the future. We can end up with quite some silencing then.

Also made this flag which i find rather useless to NOT cause -Werror
in Cycles code.
2016-02-17 12:40:56 +01:00
Campbell Barton
d4e5e94ec7 Cleanup: unused define warning 2016-02-17 21:39:23 +11:00
Sergey Sharybin
d0246d5f30 Cycles: Some cleanup, should be no functional changes
Addressing meaningful feedback from coverity.
2016-02-16 15:33:00 +01:00
Sergey Sharybin
63b60be6d7 Fix T47427: Crash caused by OSL 2016-02-16 13:38:07 +01:00
Sergey Sharybin
06743f4018 Cycles: Make guarded allocator compatible with MSVC2015 2016-02-15 18:33:36 +05:00
Sergey Sharybin
6371fccdbe Cycles: Fix guarded allocator issues on Windows
The issue was caused by static vectors allocating some internal
data using rebound element allocator for them, which was causing
access to a non-initialized statistics objects and was failing a
lot when switching Blender to a fully guarded allocation.

Additionally, we were not able to free that internal memory before
Blender exits, which was causing false-positive memory leak prints.

Now we're not using GuardedAllocator for those proxy containers.

Ideally this should be done as a GuardedAllocator::rebind, but
it didn't work for vector<bool> because it seems some internal
parts are converting bool to char32_t, which either makes it so
we can't use GuardedAllocator for those vectors or the compiler
get's confused when we're trying explicitly allow GuardedAllocator
for rebind<char32_t>.

This with current approach we should be fine for the release.
2016-02-15 11:46:13 +01:00
Sergey Sharybin
7d85da882b Cycles: Fix infinite recursion of md5 calculation on Windows
Was caused by some safety things of making sure we've for NULL
terminator for the buffer when doing mbs<->wcs conversion, but
it turns out this simply confuses str::string and it can no
longer have proper .size(). Let's assume behavior of string
allocation is same all over the std, and we can avoid having
that extra null-terminator allocated.
2016-02-14 21:08:11 +05:00
Thomas Dinges
6a593aba44 Cleanup: Move Cycles sky model data to util. 2016-02-13 13:41:40 +01:00
Sergey Sharybin
89b1f042cf Cycles: Fix compilation error on Windows 2016-02-13 13:29:13 +01:00
Sergey Sharybin
ae635771b2 Cycles: Fix crash caused by the guarded allocation commit
C++ requires specific alignment of the allocations which was not an
issue when using GCC but uncovered issue when using Clang on OSX.
Perhaps some versions of Clang might show errors on other platforms
as well.
2016-02-13 12:35:33 +01:00
Sergey Sharybin
1477028ea3 Cycles: Fix compilation error with MinGW
Was using some C++0 which we don't officially enabled yet.
2016-02-12 20:21:13 +01:00