Commit Graph

125 Commits

Author SHA1 Message Date
Mike Erwin
291afea8cc OpenGL: clean up use of old extensions 2015-11-24 02:21:07 -05:00
Sergey Sharybin
c08727ebab Cycles: Quick experiment with using feature-adaptive kernels for CUDA
Gives few percent of memory improvement for regular feature set kernel
and could give significant memory improvement for Experimental kernel.
It could also give some degree of performance improvement, but this I
didn't really measure reliably yet.

Code is ifdef-ed for now, since it's only working on Linux and requires
CUDA toolkit to be installed (other platform only use precompiled
kernels).

This is just an experiment for now and a base for the proper feature
support in the future (with runtime compilation using CUDA 7?).
2015-11-21 22:16:01 +05:00
Sergey Sharybin
52d7bc624a Cycles: Make CUDA device internally operate with requested features
This just replaces internal argument `experimental` with `requested_features`
making it possible to access particular requested settings when building
kernels.
2015-11-21 21:49:00 +05:00
Sergey Sharybin
4690281b17 Cycles: Add implementation of clip extension mode
For now there's no OpenCL support, it'll come later.
2015-07-28 14:36:08 +02:00
Sergey Sharybin
3fba620858 Cycles: Prepare for more image extension types support
Basically just replace boolean periodic flag with extension type enum in the
device API.
2015-07-28 14:14:24 +02:00
Sergey Sharybin
4d74180b9f Cycles: Fix for wrong device enumeration in CUDA
it is the same issue as described in the previous commit, original changes
in this area were wrong and only worked on a bugger optimus driver which
simply appeared to work by co-incident and in fact used wrong device..
2015-06-27 15:13:08 +02:00
Sergey Sharybin
63dd554ff1 Cycles: Don't show pre-sm_20 CUDA cards in the device list 2015-06-20 17:34:12 +02:00
Sergey Sharybin
28f798f86e Cycles: Initial support for OpenCL capabilities reports
For now it's just generic information, still need to expose memory, workgorup
sizes and so on.
2015-06-05 14:17:30 +02:00
Sergey Sharybin
399a27b261 Cycles: Code cleanup, spaces around keyword and brace 2015-06-01 19:49:52 +05:00
Sergey Sharybin
2c503d8303 Cycles: Restructure kernel files organization
Since the kernel split work we're now having quite a few of new files, majority
of which are related on the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to follow which files are
actual implementation of Cycles kernel.

Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.

Another change is in the way how split kernel invokes logic. Previously all the
logic was implemented directly in the .cl files, which makes it a bit tricky to
re-use the logic across other devices. Since we'll likely be looking into doing
same split work for CUDA devices eventually it makes sense to move logic from
.cl files to header files. Those files are stored in kernel/split. This does not
mean the header files will not give error messages when tried to be included
from other devices and their arguments will likely be changed, but having such
separation is a good start anyway.

There should be no functional changes.

Reviewers: juicyfruit, dingto

Differential Revision: https://developer.blender.org/D1314
2015-05-22 16:31:34 +05:00
Sergey Sharybin
2ab909a88c Cycles: Make experimental kernel build option more generic
Previously it was explicitly mentioning it's NVidia kernel related option,
but in fact it's also handy for the OpenCL kernel.
2015-05-15 13:22:47 +05:00
Antony Riakiotakis
4fc3188112 Cycles: Get rid of one more OpenGL matrix manipulation/push/pop. 2015-05-11 16:41:18 +02:00
Antony Riakiotakis
e38f914421 Cycles: use vertex buffers when possible to draw tiles on the screen.
Not terribly necessary in this case, since we are just drawing a quad,
but makes blender overall more GL 3.x core ready.
2015-05-11 16:28:41 +02:00
Antony Riakiotakis
5588a51c9c Cycles OpenGL: Don't use full matrix transform when we can just use
simple addition.
2015-05-11 13:10:19 +02:00
Sergey Sharybin
0e4ddaadd4 Cycles: Change the way how we pass requested capabilities to the device
Previously we only had experimental flag passed to device's load_kernel() which
was all fine. But since we're gonna to have some extra parameters passed there
it makes sense to wrap them into a single struct, which will make it easier to
pass stuff around.
2015-05-09 19:05:49 +05:00
Sergey Sharybin
2f5dd83759 Cycles: Add some statistics logging
Covers number of entities in the scene (objects, meshes etc), also reports
sizes of textures being allocated.
2015-04-10 15:37:49 +05:00
Sergey Sharybin
5ff132182d Cycles: Code cleanup, spaces around keywords
This inconsistency drove me totally crazy, it's really confusing
when it's inconsistent especially when you work on both Cycles and
Blender sides.

Shouldn;t cause merge PITA, it's whitespace changes only, Git should
be able to merge it nicely.
2015-03-28 00:15:15 +05:00
Sergey Sharybin
585dd26120 Cycles: Code cleanup, prepare for strict C++ flags 2015-03-27 18:23:31 +05:00
Sergey Sharybin
7ea7c2aab2 Cycles: Fix inconsistent command line used for runtime kernel compilation
Basically build-time compiled kernels were using --fast-math (which is correct)
but run-time compiled did not.
2015-02-02 15:00:21 +05:00
Sergey Sharybin
77e6f2212f Cycles: Allow paths customization via environment variables
This is for development and test environment setup only, not for
regular users usage hence no mentioning in the man page needed.
2015-02-02 02:02:10 +05:00
Sergey Sharybin
a922be9270 Cycles: Repot CPU and CUDA capabilities to system info operator
For CPU it gives available instructions set (SSE, AVX and so).

For GPU CUDA it reports most of the attribute values returned by
cuDeviceGetAttribute(). Ideally we need to only use set of those
which are driver-specific (so we don't clutter system info with
values which we can get from GPU specifications and be sure they
stay the same because driver can't affect on them).
2015-01-06 14:13:21 +05:00
Sergey Sharybin
9e2e408323 Cycles: Add logging to OSL and CUDA initialization/compilation
This is what was handy troubleshooting issues in the studio,
plus this is exactly the same thing which would be helpful
when solving issues with paths to compiled shaders and cubins
for standalone repository.
2015-01-01 01:31:08 +05:00
Thomas Dinges
ee36e75b85 Cleanup: Fix Cycles Apache header.
This was already mixed a bit, but the dot belongs there.
2014-12-25 02:50:24 +01:00
Campbell Barton
c07f6c02b3 Docs: reference the new manual 2014-12-08 11:18:58 +01:00
Thomas Dinges
e3a6f1c152 Cycles: Remove workaround for missing sm_52 kernel, now we require it for Maxwell cards. 2014-12-02 13:45:39 +01:00
Thomas Dinges
4ff8744669 Cycles / CUDA: Better fix for missing sm_52 kernel, in case user compiles himself. 2014-10-30 11:42:59 +01:00
Sergey Sharybin
e556670b36 Cycles: Do cuda pointer arithmetic in integers, don't use pointer arithmetic
This should hopefully fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=765187
2014-10-14 17:54:41 +02:00
Sergey Sharybin
0106b94f9d Cycles: Fix for debug kernel not working with CUDA 2014-10-05 15:31:48 +06:00
Thomas Dinges
a613290775 Cycles / CUDA: Workaround to make sm_52 (Maxwell) cards work.
* sm_52 can run a sm_50 kernel, so tell runtime detection to use that until we build a dedicated sm_52 kernel.
2014-10-05 04:13:40 +02:00
Sergey Sharybin
fbed2047c8 Fix wrong track of the memory when doing device vector resize before freeing it
This is rather legit case which happens i.e. when having persistent images enabled
and session is updating the lookup tables.

Now device_memory keeps track of amount of memory being allocated on the device,
which makes freeing using the proper allocated size, not the CPU side buffer
size.
2014-09-04 17:25:12 +06:00
Thomas Dinges
fb3f32760d Cycles: Add an experimental CUDA kernel.
Now we build 2 .cubins per architecture (e.g. kernel_sm_21.cubin, kernel_experimental_sm_21.cubin).
The experimental kernel can be used by switching to the Experimental Feature Set: http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Experimental_Features

This enables Subsurface Scattering and Correlated Multi Jitter Sampling on GPU, while keeping the stability and performance of the regular kernel.

Differential Revision: https://developer.blender.org/D762
Patch by Sergey and myself.

Developer / Builder Note:
CUDA Toolkit 6.5 is highly recommended for this, also note that building the experimental kernel requires a lot of system memory (~7-8GB).
2014-08-26 17:02:26 +02:00
Thomas Dinges
603348c56e Cycles: Drop support for CUDA 5.0 Toolkit, only 6.0 and 6.5 (recommended) are supported now. 2014-08-21 23:35:20 +02:00
Dalai Felinto
8d3cc431d7 Fix T41471 Cycles Bake: Setting small tile size results in wrong bake with stripes rather than the expected noise pattern
This problem was introduced in 983cbafd1877f8dbaae60b064a14e27b5b640f18
Basically the issue is that we were not getting a unique index in the
baking routine for the RNG (random number generator).

Reviewers: sergey

Differential Revision: https://developer.blender.org/D749
2014-08-19 11:40:33 +02:00
Dalai Felinto
2c5b6859d9 Revert "Fix T41222 Blender gives weird output when baking (4096*4096) resolution on GPU"
This reverts commit a48b372b04421b00644a0660bfdf42229b5ffceb.

Leaving only the part that fix device_multi.cpp
2014-08-15 11:27:42 +02:00
Dalai Felinto
a48b372b04 Fix T41222 Blender gives weird output when baking (4096*4096) resolution on GPU
In collaboration with Sergey Sharybin.

Also thanks to Wolfgang Faehnle (mib2berlin) for help testing the
solutions.

Reviewers: sergey

Differential Revision: https://developer.blender.org/D690
2014-08-05 13:50:50 -03:00
Sergey Sharybin
77b7e1fe9a Deduplicate CUDA and OpenCL wranglers
For now it was mainly about OpenCL wrangler being duplicated
between Cycles and Compositor, but with OpenSubdiv work those
wranglers were gonna to be duplicated just once again.

This commit makes it so Cycles and Compositor uses wranglers
from this repositories:

  - https://github.com/CudaWrangler/cuew
  - https://github.com/OpenCLWrangler/clew

This repositories are based on the wranglers we used before
and they'll be likely continued maintaining by us plus some
more players in the market.

Pretty much straightforward change with some tricks in the
CMake/SCons to make this libs being passed to the linker
after all other libraries in order to make OpenSubdiv linked
against those wranglers in the future.

For those who're worrying about Cycles being less standalone,
it's not truth, it's rather more flexible now and in the future
different wranglers might be used in Cycles. For now it'll
just mean those libs would need to be put into Cycles repository
together with some other libs from Blender such as mikkspace.

This is mainly platform maintenance commit, should not be any
changes to the user space.

Reviewers: juicyfruit, dingto, campbellbarton

Reviewed By: juicyfruit, dingto, campbellbarton

Differential Revision: https://developer.blender.org/D707
2014-08-05 13:57:50 +06:00
Campbell Barton
9c3025cd26 Spelling 2014-08-02 16:53:52 +10:00
Dalai Felinto
fc55c41bba Cycles Bake: show progress bar during bake
Baking progress preview is not possible, in parts due to the way the API
was designed. But at least you get to see the progress bar while baking.

Reviewers: sergey

Differential Revision: https://developer.blender.org/D656
2014-07-25 11:42:53 -03:00
Martijn Berger
bae2b3a688 Switch to Cuda 4.0 style api for kernel invocation. This is a small clean-up that has no functional changes but makes code a bit more readable.
Differential revision: https://developer.blender.org/D659

Reviewed by: Sergey Sharybin, Thomas Dinges
2014-07-25 13:33:19 +02:00
Thomas Dinges
9acabc13de Cleanup: Typo fixes. 2014-07-05 14:25:34 +02:00
Thomas Dinges
5898abe99d Cycles: Update CUDA error messages, based on Toolkit 6.0.
* Removed deprecated erros, and added some new ones, which might help to figure out problems in the future.
2014-07-02 01:50:42 +02:00
Thomas Dinges
4800c52700 Cleanup: Remove unused checks in CUDA device code. 2014-07-02 01:12:13 +02:00
e4e58d4612 Fix T40370: cycles CUDA baking timeout with high number of AA samples.
Now baking does one AA sample at a time, just like final render. There is
also some code for shader antialiasing that solves T40369 but it is disabled
for now because there may be unpredictable side effects.
2014-06-06 15:39:04 +02:00
865dfa8a7e Fix T40228: cycles CUDA multi GPU + world MIS giving error. 2014-06-05 18:10:32 +02:00
69c7522b24 Fix T40379: world MIS causing too much CUDA memory usage.
The kernel for baking the world texture was the same as the one used for
baking. Now that's separate which allows the kernel to reserve much less
memory.
2014-05-27 15:11:32 +02:00
3b53fffb77 Cycles: revert async CUDA changes, these are giving too much trouble still.
Fixes T40027. This means we get more CPU usage again when using multiple CUDA,
but the impact on performance is too big a problem with the current code.
2014-05-19 19:33:09 +02:00
Thomas Dinges
c08c931fb6 Cycles / CUDA: Increase maximum image textures on GPU.
Instead of 95, we can use 145 images now. This only affects Kepler and above (sm30, sm_35 and sm_50).

This can be increased further if needed, but let's first test if this does not come with a performance impact.

Originally developed during my GSoC 2013.
2014-05-11 03:38:39 +02:00
Thomas Dinges
fd26a32aa5 Fix T40119, CUDA Toolkit version mismatch 2014-05-10 01:26:04 +02:00
Campbell Barton
dc13969e48 Style cleanup: indentation, braces 2014-05-05 02:19:08 +10:00
Campbell Barton
1618329b00 Code cleanup: style, require ; for cuda_assert, opencl_assert 2014-05-04 03:57:50 +10:00