This gives about 2x speedup (3.2sec vs. 11.9sec with 32716 handled nodes) when
updating shader for the shader tree.
Reviewers: brecht, juicyfruit, dingto, lukasstockner97
Differential Revision: https://developer.blender.org/D1700
This makes it possible to move some parts of evaluation from host to the device
and hopefully reduce memory usage by avoid having full RGBA buffer on the host.
Reviewers: juicyfruit, lukasstockner97, brecht
Reviewed By: lukasstockner97, brecht
Differential Revision: https://developer.blender.org/D1702
Main goal is to make kernel signatures editing easier and less prone to the
errors caused by missing function signature update or so.
This will also make it easier to add new CPU architectures.
Reviewers: juicyfruit, dingto, lukasstockner97, brecht
Reviewed By: dingto, lukasstockner97, brecht
Differential Revision: https://developer.blender.org/D1703
The idea is to have separate sets per node name in order to speed up the
comparison process. This will use a bit more memory and slow down simple
shaders, but this extra memory is not so much huge and time penalty is
not really measurable (at least from initial tests).
This saves orders of magnitude seconds when de-duplicating 17K nodes and
overall process now takes 0.01sec on my laptop,
Use Summary structure to collect all summary related on the shader compilation
process which then could be either simply reported to the log or be passed to
some user interface or so.
This is type of the summary / report which is most flexible and useful and
something we could use for other parts like shader optimization.
The idea of this commit is to merge nodes which has identical settings
and matching inputs into a single node in order to minimize number of
SVM instructions.
This is quite simple bottom-top graph traversal and the trickiest part
is how to compare node settings without too much trouble which seems to
be solved is quite clean way.
Still possibilities for further improvements:
- Support comparison of BSDF nodes
- Support comparison of volume nodes
- Support comparison of curve mapping/ramp nodes
Reviewers: brecht, juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1673
Issue was that dispatchEvent might call removeWindowEvents/
removeTypeEvents which will delete the event before we can do so.
To address this, handled events are now put in a separate list.
Reported by psy-fi and reviewed by brecht in IRC.
The events are allocated on the heap, then pushed on a stack. Before
being processed, they are popped from the stack, and deleted after
processing is done. When the manager is destroyed (e.g. application
closing), any remaining event in the stack is detroyed.
Issue is that when the "application closing" event is processed, it is
never freed, because the manager gets destroyed before the call to
`delete` is made and the event is not on the stack anymore.
Now events are left on the stack while they are processed, and only
popped and deleted after processing is done.
As a slight bonus refactor: use void as return type for dispatch events
functions, as no caller is checking the return value, and it is not
clear what it means (suggested by the reviewer).
Reviewers: brecht
Differential Revision: https://developer.blender.org/D1695
Historically blender had an audio sample rate of 44.1 kHz as default which is mostly popular because it's the sample rate of audio CDs. Audaspace kept using this default from the pre 2.5 era. It was about time to change to 48 kHz, which is a more widespread standard nowadays, especially in video. It is the recommended sampling rate of the Audio Engineering Society.
Further reading: https://en.wikipedia.org/wiki/44,100_Hz#Status
- When rendering in the Viewport, next_tile is sometimes called after a reset has been performed, but before
new tiles were generated. In that case, the tile list would be invalid, causing Blender to crash randomly.
- When generating new tiles, the TileManager would not clear the tile lists before re-generating them, leading
to some tiles being skipped during viewport rendering.
- When popping the next tile from a tile list, a reference to the just-deleted object would be returned, now the
object is copied before deleting it.
This way socket type conversions (such as color to float, or float to vector) do not stop the folding process.
Example: http://www.pasteall.org/pic/show.php?id=96803 (selected nodes are folded).
This commit modifies the TileManager to sort render tiles once after tiling the image,
instead of searching the next tile every time a new tile is acquired by a device.
This makes acquiring a tile run in constant time, therefore the render time is linear
w.r.t. the amount of tiles, instead of the quadratic dependency before.
Furthermore, each (logical) device now has its own Tile list, which makes acquiring
a tile for a specific device easier.
Also, some code in the TileManager was deduplicated.
Reviewers: dingto, sergey
Differential Revision: https://developer.blender.org/D1684
This is actually intended behavior to return NULL when the socket is not
found. It's used in certain BSDF nodes to query whether some inputs exists
or not.
Perhaps we can be more explicit here and have dedicated logic to query
socket existance and keep assert in place.
In any case, even if we lost assert() for the constant fold now it's
still somewhat better than duplicated code. Perhaps.
Use float in moto instead of double for MT_Scalar.
This switch allow future optimization like SSE.
Additionally, it changes the OpenGL calls to float versions as they are
very bad with doubles.
Reviewers: campbellbarton, moguri, lordloki
Reviewed By: lordloki
Subscribers: brecht, lordloki
Differential Revision: https://developer.blender.org/D1610
Performance is about the same or slightly better for typical IK chains.
In extreme cases with many bones and multiple targets, of which some are
unreachable, I've seen 2x speedups.
Maybe this is pedantic but I read it’s best to explicitly set the
desired component size.
Also append “_ARB” to float texture formats since those need an
extension in GL 2.1.