The issue was caused by possible use of object->derivedFinal from the render
thread, The patch tries to eliminate (or at least minimize, huh) amount of
access to the derivedFinal of a source object. It's still possible that in
the case of particle source derived mesh will be still unsafely used, but
with the patch applied we can easily change runtime part of the code and
cache derived mesh on the preparation stage.
Some ideas for the future:
- Check whether cache() was called on the point density node when calling
calc().
- Cache derivedMesh in the runtime part of point density node to avoid
possible remained thread conflicts.
- NULL the runtime part of the node on .blend load
Reviewers: campbellbarton, plasmasolutions
Reviewed By: plasmasolutions
Differential Revision: https://developer.blender.org/D1614
Seems set_intersection() requires passing explicit comparator if non-default
one is used for the sets. A bit weird, but can't really find another explanation
here about whats' going on here.
This mimics OpenMP's 'firstprivate' feature. It is sometimes handy to have some persistent local data during a whole chunk.
Reviewers: sergey
Reviewed By: sergey
Subscribers: campbellbarton
Differential Revision: https://developer.blender.org/D1635
The issue was caused by not really optimal graph traversal for gathering nodes
dependencies which could have exponential complexity with a long tree branches
connected with multiple connections between them.
Now we optimize the depth traversal and perform early output if the node was
already traversed.
Please note that this adds some limitations to the use of SVM compiler's
find_dependencies() in the cases when skip_node is not NULL and one wants to
perform dependencies find sequentially with the same set. This doesn't happen
in the code, but one should be aware of this.
The issue was than nodes dependencies were stored as set<ShaderNode*> which
is actually a so called "strict weak ordered", meaning order of nodes in
the set is strictly defined, but based on the ShaderNode pointer. This means
that between different render invokations order of original nodes could be
different due to different pointers allocated for ShaderNode.
This commit makes it so dependencies and maps used for ShaderNodes are based
on the node->id which has much more predictable order. It's still possible
to trick the system by doing some crazy edits during viewport rendfer and
cause difference between viewport and final render stacks.
Reviewers: brecht
Reviewed By: brecht
Subscribers: LazyDodo
Differential Revision: https://developer.blender.org/D1630
This gives much lower stack usage on GPU and reduces kernel memory size to
around 448MB on GTX560Ti (comparing to 652MB with previous commit and 946MB
with official release). There's also a barely measurable speedup of around
5%, but this is to be confirmed still.
At this stage we're using only ~3% for the experimental kernel and SSS
rendering seems to be faster by 40% and after some further testing we might
consider making SSS and CMJ official features and remove experimental
precompiled kernels.
The idea is to delay shooting indirect rays for the SSS sampling and
trace them after the main integration loop was finished.
This reduces GPU stack usage even further and brings it down to around
652MB (comparing to 722MB before the change and 946MB with previous
stable release).
This also solves the speed regression happened in the previous commit
and now simple SSS scene (SSS suzanne on the floor) renders in 0:50
(comparing to 1:16 with previous commit and 1:03 with official release).
This commit introduces a SSS-oriented intersection structure which is replacing
old logic of having separate arrays for just intersections and shader data and
encapsulates all the data needed for SSS evaluation.
This giver a huge stack memory saving on GPU. In own experiments it gave 25%
memory usage reduction on GTX560Ti (722MB vs. 946MB).
Unfortunately, this gave some performance loss of 20% which only happens on GPU.
This is perhaps due to different memory access pattern. Will be solved in the
future, hopefully.
Famous saying: won in memory - lost in time (which is also valid in other way
around).
Input array length is implicitly set at link time, based on the geometry
shader's layout. Specifying the wrong value here is an error; specifying
no value is the same as getting it right. (inspired by a recent codegen
change)
the paint cursor drawing when a stroke is underway; this information is
already available from the stroke itself.
Should improve performance when "show brush" is enabled (which is
always, for sane people). Finding an intersection is not always cheap,
especially on heavy meshes.
Cudos to the dyntopo test thread for the report and the sculpt love :)
yet another bug introduced by recent shadowing changes -- q and r CCG arrays
were overwritten by the temporary evaluation because the code was changed to
use global pointers instead of the local ones.
Naming of the variables could be changed to be a bit more accurate.
This is sort of extension of existing Use Environment option which now allows to
disable AO on the render layer basis.
Useful in cases like disabling AO for the background because it might make it
too flat and so.
Reviewers: juicyfruit, dingto, brecht
Reviewed By: brecht
Subscribers: eyecandy, venomgfx
Differential Revision: https://developer.blender.org/D1633
This patch allows the game engine to keep running while performing things like PNG compression and disk I/O.
As an example, my crowd simulation rasterizer saves a screenshot for every frame. This now takes up 13 msec per frame, which was 31 msec before this patch. Effectively, it allows the simulation to save every frame and still run at 60 FPS.
Reviewers: lordloki, moguri, panzergame
Reviewed By: moguri, panzergame
Projects: #game_engine
Differential Revision: https://developer.blender.org/D1507
This commit adds '--build-foo' options to force the script to build relevant libraries
instead of trying to use packages from the distribution.
In addition, it also now offers (with those '--build-foo' options) the possibility
to build libraries on distributions that are not fully supported.
This is limited, but should still help people once they have installed themselves
the basics of dependencies - boost, llvm, osl/osd etc. are not libraries that are
really easy to build.
DISCLAIMER: I did not take the 20 (or more) hours needed to test all combinations
over all distributions, and given the size of the changes, bad sneaky typos are quite
probable, so please report if you get some errors!
GLSL 130, 140, 150 with extensions as needed.
Similar logic to my recent gpu_extensions changes.
Partially fixes T46706. Matcaps now work with OpenSubdiv, as do basic
materials. Anything with UV coordinates is still broken.