BVH traversal is not really that much a geometry and we've got
quite some traversals now. Makes sense to keep them separate in
the name of source structure clarity.
For non-branched path tracing with a GTX 960 and CUDA 7.5, this gives a small reduction
in stack usage but mainly: 8% faster render on BMW, 5% on pabellon, 13% on classroom.
This is an initial commit for half texture support in Cycles.
It adds the basic infrastructure inside of the ImageManager and support for these textures on CPU.
Supported:
* Half Float OpenEXR images (can be used for e.g HDRs or Normalmaps) now use 1/2 the memory, when loaded via disk (OIIO).
ToDo:
Various things like support for inbuilt half textures, GPU... will come later, step by step.
Part of my GSoC 2016.
Compile time per kernel increased alot after recent image commits, re-shuffle some code to fix this.
Patch by "LazyDodo".
Differential Revision: https://developer.blender.org/D2012
This way, we also save 3/4th of memory for single channel byte textures (e.g. Bump Maps).
Note: In order for this to work, the texture *must* have 1 channel only.
In Gimp you can e.g. do that via the menu: Image -> Mode -> Grayscale
Until now, single channel textures were packed into a float4, wasting 3 floats per pixel. Memory usage of such textures is now reduced by 3/4.
Voxel Attributes such as density, flame and heat benefit from this, but also Bumpmaps with one channel.
This commit also includes some cleanup and code deduplication for image loading.
Example Smoke render from Cosmos Laundromat: http://www.pasteall.org/pic/show.php?id=102972
Memory here went down from ~600MB to ~300MB.
Reviewers: #cycles, brecht
Differential Revision: https://developer.blender.org/D1981
* When Baking wasn't used we got an error.
* On top of Volume Nodes (NODES_FEATURE_VOLUME), we now also check if we need volume sampling code,
so we can disable that as well and save some further compilation time.
This seems quite useful for the development, so you don't need to wait
all the kernels to be re-compiled when working on a new feature, which
speeds up re-iteration.
Marked as an advanced option, so if it doesn't work so well in practice
it's safe to revert anyway.
The idea is to switch from allocating separate buffers for shader data's
structure of arrays to allocating one huge memory block and do some index
trickery to make it accessed as SOA.
This saves quite reasonable amount of lines of code in device_opencl and
also makes it possible to get rid of special declaration of ShaderData
structure.
As a side effect it also makes it easier to experiment with SOA vs. AOS
for split kernel.
Works fine here on NVidia GTX580, Intel CPU amd AMD Fiji cards.
Reviewers: #cycles, brecht, juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1593
It was mainly unfinished code for volume in a split kernel which
should be done differently anyway to avoid such a code copy-paste.
The code didn't really work, so likely nobody will cry.
Use KernelGlobals to access all the global arrays for the intermediate
storage instead of passing all this storage things explicitly.
Tested here with Intel OpenCL, NVIDIA GTX580 and AMD Fiji, didn't see
any artifacts, so guess it's all good.
Reviewers: juicyfruit, dingto, lukasstockner97
Differential Revision: https://developer.blender.org/D1736
The combined pass is built with the contributions the user finds fit.
It is useful for lightmap baking, as well as non-view dependent effects
baking.
The manual will be updated once we get closer to the 2.77 release.
Meanwhile the new page can be found here:
http://dalaifelinto.com/blender-manual/render/cycles/baking.html
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D1674
This commit changes the way how we pass bounce information to the Light
Path node. Instead of manualy copying the bounces into ShaderData, we now
directly pass PathState. This reduces the arguments that we need to pass
around and also makes it easier to extend the feature.
This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similar to the Transparent Depth Output: Replace a
Transmission lightpath after X bounces with another shader, e.g a Diffuse
one. This can be used to avoid black surfaces, due to low amount of max
bounces.
Reviewed by Sergey and Brecht, thanks for some hlp with this.
I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
While SCons building system was serving us really good for ages it's no longer
having much attention by the developers and started to become quite a difficult
task to maintain.
What's even worse -- there started to be quite serious divergence between SCons
and CMake which was only accumulating over the releases now. The fact that none
of the active developers are really using SCons and that our main studio is also
using CMake spotting bugs in the SCons builds became quite a difficult task and
we aren't always spotting them in time.
Meanwhile CMake became really mature building system which is available on every
platform we support and arguably it's also easier and more robust to use.
This commit includes:
- Removal of actual SCons building system
- Removal of SCons git submodule
- Removal of documentation which is stored in the sources and covers SCons
- Tweaks to the buildbot master to stop using SCons submodule
(this change requires deploying to the server)
- Tweaks to the install dependencies script to skip installing or mentioning
SCons building system
- Tweaks to various helper scripts to avoid mention of SCons folders/files
as well
Reviewers: mont29, dingto, dfelinto, lukastoenne, lukasstockner97, brecht, Severin, merwin, aligorith, psy-fi, campbellbarton, juicyfruit
Reviewed By: campbellbarton, juicyfruit
Differential Revision: https://developer.blender.org/D1680
This makes it possible to move some parts of evaluation from host to the device
and hopefully reduce memory usage by avoid having full RGBA buffer on the host.
Reviewers: juicyfruit, lukasstockner97, brecht
Reviewed By: lukasstockner97, brecht
Differential Revision: https://developer.blender.org/D1702
Main goal is to make kernel signatures editing easier and less prone to the
errors caused by missing function signature update or so.
This will also make it easier to add new CPU architectures.
Reviewers: juicyfruit, dingto, lukasstockner97, brecht
Reviewed By: dingto, lukasstockner97, brecht
Differential Revision: https://developer.blender.org/D1703
Currently only two mappings are supported by API, which is Repeat (old behavior)
and new Clip behavior. Internally this extension is being converted to periodic
flag which was already supported but wasn't exposed.
There's no support for OpenCL yet because of the way how we pack images into a
single texture.
Those settings are not exposed to UI or anywhere else and there should be no
functional changes so far.
Code there started becoming a bit too big, by splitting it up it'll make it
easier to do improvements or extending the features in there.
The layout is not totally final yet, would need to try de-duplicating parts
of code from split kernel with non-split integrators,
This commit re-shuffles code in split kernel once again and makes it so common
parts which is in the headers is only responsible to making all the work needed
for specified ray index. Getting ray index, checking for it's validity and
enqueuing tasks are now happening in the device specified part of the kernel.
This actually makes sense because enqueuing is indeed device-specified and i.e.
with CUDA we'll want to enqueue kernels from kernel and avoid CPU roundtrip.
TODO:
- Kernel comments are still placed in the common header files, but since queue
related stuff is not passed to those functions those comments might need to
be split as well.
Just currently read them considering that they're also covering the way how
all devices are invoking the common code path.
- Arguments might need to be wrapped into KernelGlobals, so we don't ened to
pass all them around as function arguments.
Since the kernel split work we're now having quite a few of new files, majority
of which are related on the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to follow which files are
actual implementation of Cycles kernel.
Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.
Another change is in the way how split kernel invokes logic. Previously all the
logic was implemented directly in the .cl files, which makes it a bit tricky to
re-use the logic across other devices. Since we'll likely be looking into doing
same split work for CUDA devices eventually it makes sense to move logic from
.cl files to header files. Those files are stored in kernel/split. This does not
mean the header files will not give error messages when tried to be included
from other devices and their arguments will likely be changed, but having such
separation is a good start anyway.
There should be no functional changes.
Reviewers: juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1314