This patch adds the "Hilbert Spiral", a custom-designed continuous space-filling curve, as a tile order for rendering in Cycles.
It essentially works by dividing the tiles into tile blocks which are processed in a spiral outwards from the center. Inside each
block, the tiles are processed in a regular Hilbert curve pattern. By rotating that pattern according to the spiral direction,
a continuous curve is obtained, which helps with cache coherency and therefore rendering speed.
The curve is a compromise between the faster-rendering Bottom-to-Top etc. orders and the Center order, which is a bit slower,
but starts with the more important areas. The Hilbert Spiral also starts in the center (unless huge tiles are used) and is still
marginally slower than Bottom-to-Top, but noticeably faster than Center.
Reviewers: sergey, #cycles, dingto
Reviewed By: #cycles, dingto
Subscribers: iscream, gregzaal, sergey, mib2berlin
Differential Revision: https://developer.blender.org/D1166
Patch from be28706 made it so integrator will use last shader's transparent
shadow flag, which is wrong since last shader might not have transparent
shadow while shaders prior to it might have one.
This commit changes the way how we pass bounce information to the Light
Path node. Instead of manualy copying the bounces into ShaderData, we now
directly pass PathState. This reduces the arguments that we need to pass
around and also makes it easier to extend the feature.
This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similar to the Transparent Depth Output: Replace a
Transmission lightpath after X bounces with another shader, e.g a Diffuse
one. This can be used to avoid black surfaces, due to low amount of max
bounces.
Reviewed by Sergey and Brecht, thanks for some hlp with this.
I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
The issue was caused by OSL using TLS which is required to be freed before the
Cycles session is freed. This is quite tricky to do in Cycles because different
render session are sharing the same task scheduler, so when one session is being
freed TLS might need to be active still.
In order to solve this, we are now doing JIT optimization ahead of the time
which ensures either TLS of JIT is freed before the render on multi-core system
or freed on OSLRenderSession destroy on single-core system.
This might increase synchronization time due to JIT of unused function, but
that we can solve later with some smart idea,
This commit overrides the user's choice of tile order in the case of viewport rendering and always uses bottom-to-top instead.
This was already done until the TileManager redesign, but since it removed the distinction between viewport and regular rendering
in the manager, the viewport was now also using the selected order. Since this requires sorting of the generated tiles,
it slows down rendering a bit. With the forced bottom-to-top order, this sorting step can now be avoided again.
Since the tile order is invisible anyways for viewport rendering, this commit won't have any impact on users (apart from a slight speedup).
This commit adds "Bands Saw" and "Rings Saw" to the options for the Wave texture node in Cycles, behaving similar to the Saw option in BI textures.
Requested by @cekuhnen on BA.
Reviewers: dingto, sergey
Subscribers: cekuhnen
Differential Revision: https://developer.blender.org/D1699
This is an attempt to emulate real CMOS cameras which reads sensor by scanlines
and hence different scanlines are sampled at a different moment in time, which
causes so called rolling shutter effect. This effect will, for example, make
vertical straight lines being curved when doing horizontal camera pan.
This is controlled by the Shutter Type option in the Motion Blur panel.
Additionally, since scanline sampling is not instantaneous it's possible to have
motion blur on top of rolling shutter.
This is controlled by the Rolling Shutter Time slider which controls balance
between pure rolling shutter effect and pure motion blur effect.
Reviewers: brecht, juicyfruit, dingto, keir
Differential Revision: https://developer.blender.org/D1624
Vector mapping node was doing some weird mapping of both original and mapped
coordinates. Mapping of original coordinates was caused by the clamping nature
of the LUT generated from the node. Mapping of the mapped value again was quite
totally obscure -- one needed to constantly keep in mind that actual value will
be scaled up and moved down.
This commit makes it so values in the vector curve mapping are always absolute.
In fact, it is now behaving quite the same as RGB curve mapping node and the
code could be de-duplicated. Keeping the code duplicated for a bit so it's more
clear what exact parts of the node changed.
Reviewers: brecht
Subscribers: bassamk
Differential Revision: https://developer.blender.org/D1672
This gives about 2x speedup (3.2sec vs. 11.9sec with 32716 handled nodes) when
updating shader for the shader tree.
Reviewers: brecht, juicyfruit, dingto, lukasstockner97
Differential Revision: https://developer.blender.org/D1700
The idea is to have separate sets per node name in order to speed up the
comparison process. This will use a bit more memory and slow down simple
shaders, but this extra memory is not so much huge and time penalty is
not really measurable (at least from initial tests).
This saves orders of magnitude seconds when de-duplicating 17K nodes and
overall process now takes 0.01sec on my laptop,
Use Summary structure to collect all summary related on the shader compilation
process which then could be either simply reported to the log or be passed to
some user interface or so.
This is type of the summary / report which is most flexible and useful and
something we could use for other parts like shader optimization.
The idea of this commit is to merge nodes which has identical settings
and matching inputs into a single node in order to minimize number of
SVM instructions.
This is quite simple bottom-top graph traversal and the trickiest part
is how to compare node settings without too much trouble which seems to
be solved is quite clean way.
Still possibilities for further improvements:
- Support comparison of BSDF nodes
- Support comparison of volume nodes
- Support comparison of curve mapping/ramp nodes
Reviewers: brecht, juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1673
- When rendering in the Viewport, next_tile is sometimes called after a reset has been performed, but before
new tiles were generated. In that case, the tile list would be invalid, causing Blender to crash randomly.
- When generating new tiles, the TileManager would not clear the tile lists before re-generating them, leading
to some tiles being skipped during viewport rendering.
- When popping the next tile from a tile list, a reference to the just-deleted object would be returned, now the
object is copied before deleting it.
This way socket type conversions (such as color to float, or float to vector) do not stop the folding process.
Example: http://www.pasteall.org/pic/show.php?id=96803 (selected nodes are folded).
This commit modifies the TileManager to sort render tiles once after tiling the image,
instead of searching the next tile every time a new tile is acquired by a device.
This makes acquiring a tile run in constant time, therefore the render time is linear
w.r.t. the amount of tiles, instead of the quadratic dependency before.
Furthermore, each (logical) device now has its own Tile list, which makes acquiring
a tile for a specific device easier.
Also, some code in the TileManager was deduplicated.
Reviewers: dingto, sergey
Differential Revision: https://developer.blender.org/D1684
This is actually intended behavior to return NULL when the socket is not
found. It's used in certain BSDF nodes to query whether some inputs exists
or not.
Perhaps we can be more explicit here and have dedicated logic to query
socket existance and keep assert in place.
In any case, even if we lost assert() for the constant fold now it's
still somewhat better than duplicated code. Perhaps.
This way, connecting Value or RGB node to e.g. a Math node will still allow folding.
Note: The same should be done for the ConvertNode, but I leave that for another day.
Previously RGB Curves node will clamp input to 0..1 which is rather useless
when one wants to use HDR image textures and do bit of correction on them.
Now kernel code supports extrapolation of baked LUT based on first/last two
table points and performs linear extrapolation.
The only tricky part is to guess the range to bake the LUT for. Currently
it's using simple approach -- minmax of the input curves. While this behaves
ok for the simple cases it's easy to trick the system up causing incorrect
results.
Not sure we can solve those issues in a general case and since the new code
is giving more expected results it's not that bad actually. In the worst
case artist migh always create explicit point to make sure LUT is created
for the needed HDR range.
Reviewers: brecht, juicyfruit
Subscribers: sebastian_k
Differential Revision: https://developer.blender.org/D1658
This reduces stress on the the stack memory which could be really handy
on certain operation systems which applies strict limits on the stack.
Reviewers: brecht, juicyfruit, dingto
Reviewed By: brecht, juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1656
Previously render nodes will be always created with just a VECTOR socket
type and then those sockets will try to be set as all point, vector and
normal to work around lack of such a subtype distinguishing in blender.
This change makes it so subtype is being queried from OSL itself and
proper subtupe is being used for socket.
It's still not in use for the official builds because it requires changes
applied recently on the 1.7 branch of OSL:
https://github.com/imageworks/OpenShadingLanguage/commit/f70e58f
This solves artists confusion reported in T46117.
Reviewers: #cycles, juicyfruit
Reviewed By: #cycles, juicyfruit
Subscribers: juicyfruit
Differential Revision: https://developer.blender.org/D1627
Graph::disconnect() actually modifies links, needs to create a copy to iterate
if disconnect happens form inside the loop.
Question tho whether we can control this somehow..
Reported by BzztPloink in IRC, thanks!
Seems set_intersection() requires passing explicit comparator if non-default
one is used for the sets. A bit weird, but can't really find another explanation
here about whats' going on here.
The issue was caused by not really optimal graph traversal for gathering nodes
dependencies which could have exponential complexity with a long tree branches
connected with multiple connections between them.
Now we optimize the depth traversal and perform early output if the node was
already traversed.
Please note that this adds some limitations to the use of SVM compiler's
find_dependencies() in the cases when skip_node is not NULL and one wants to
perform dependencies find sequentially with the same set. This doesn't happen
in the code, but one should be aware of this.
The issue was than nodes dependencies were stored as set<ShaderNode*> which
is actually a so called "strict weak ordered", meaning order of nodes in
the set is strictly defined, but based on the ShaderNode pointer. This means
that between different render invokations order of original nodes could be
different due to different pointers allocated for ShaderNode.
This commit makes it so dependencies and maps used for ShaderNodes are based
on the node->id which has much more predictable order. It's still possible
to trick the system by doing some crazy edits during viewport rendfer and
cause difference between viewport and final render stacks.
Reviewers: brecht
Reviewed By: brecht
Subscribers: LazyDodo
Differential Revision: https://developer.blender.org/D1630
This is sort of extension of existing Use Environment option which now allows to
disable AO on the render layer basis.
Useful in cases like disabling AO for the background because it might make it
too flat and so.
Reviewers: juicyfruit, dingto, brecht
Reviewed By: brecht
Subscribers: eyecandy, venomgfx
Differential Revision: https://developer.blender.org/D1633
This gives few percent extra memory saving for the CUDA kernel when
using regular path tracing.
Still more like an experiment, but will be handy in the future.
Summary: By calculating the Camera-to-Screen-Matrix first, one inversion can be saved in the Camera sync.
It won't really improve speed and/or precision, it's mainly a small cleanup.
Reviewers: sergey, dingto
Subscribers:
Basically we can not use sharp closure as a substitude when filter glossy is
used. This is because we can not blur sharp reflection/refraction.
This is quite quick and not really clean implementation. Not really happy
with manual handling of original settings, but this is as good as we can do
in the quick patch. It's a good acknowledgment and we now can re-consider
some aspects of graph simplification to make such cases more natively
supported.
P.S. This failure would have been shown by our regression tests, so please,
bother a bit to run Cycles's test sweep before doing such optimizations.