Keep index using the outer scope for GHASH iter macros,
while its often nice, in some cases to declare in the for loop,
it means you cant use as a counter after the loop exits, and in some cases signed/unsigned may matter.
API changes should really be split off in their own commits too.
GPU_buffer no longer has a fallback to client vertex arrays, so remove
comments about it.
Changed a few internal structs/function interfaces to use bool where
appropriate.
Use for-loop scope and flexible declaration placement. PBVH does the
same thing but needs ~150 fewer lines to do it!
The change to BLI_ghashIterator_init is admittedly hackish but makes
GHASH_ITER_INDEX nicer to use.
In gpu lib:
- GPU_glsl_support() always returns true
- internal cleanup & comments
Outside gpu lib:
- remove check from various code, remove the “else” path
- sprinkled a few C99-isms
We can remove GPU_glsl_support() when BGE stops calling it.
It uses edge extrapolation code from Gimp's antialias plugin now,
which has advantage of giving symmetrical results, which makes it
possible to add two antialiased ID masks and have a constant 1.0
all over the frame. But has difference from the old implementation
because it uses 3x3 matrix only, which doesn't give so much smooth
looking edges. Perhaps it's not so bad, since if edges are really
need to be smooth one might use Blur node.
Another advantage is that the node is now nicely threaded.
Reviewers: campbellbarton
Reviewed By: campbellbarton
Subscribers: ania
Differential Revision: https://developer.blender.org/D1617
Previously render nodes will be always created with just a VECTOR socket
type and then those sockets will try to be set as all point, vector and
normal to work around lack of such a subtype distinguishing in blender.
This change makes it so subtype is being queried from OSL itself and
proper subtupe is being used for socket.
It's still not in use for the official builds because it requires changes
applied recently on the 1.7 branch of OSL:
https://github.com/imageworks/OpenShadingLanguage/commit/f70e58f
This solves artists confusion reported in T46117.
Reviewers: #cycles, juicyfruit
Reviewed By: #cycles, juicyfruit
Subscribers: juicyfruit
Differential Revision: https://developer.blender.org/D1627
Graph::disconnect() actually modifies links, needs to create a copy to iterate
if disconnect happens form inside the loop.
Question tho whether we can control this somehow..
Reported by BzztPloink in IRC, thanks!
This makes it possible to use scenes as a kind of
multi-user meta-strip (with their own time).
Currently this supports rendering & drawing nested strips,
but no convenient way to tab-enter into a scene strip.
When a user tries to load a non-existing blend file from the CLI, the path
is set and Blender acts as if it is loaded. This allows the user to create
a new file by typing 'blender filename.blend' and then saving. This
behaviour is common in other tooling, such as vim and Mypaint.
Blender's current behaviour (print an error message and open the default
scene) doesn't make much sense. It ignores the filename passed on the CLI,
whereas with this patch that filename is actually remembered, and used when
saving.
Reviewers: campbellbarton, sergey, mont29
The issue was caused by possible use of object->derivedFinal from the render
thread, The patch tries to eliminate (or at least minimize, huh) amount of
access to the derivedFinal of a source object. It's still possible that in
the case of particle source derived mesh will be still unsafely used, but
with the patch applied we can easily change runtime part of the code and
cache derived mesh on the preparation stage.
Some ideas for the future:
- Check whether cache() was called on the point density node when calling
calc().
- Cache derivedMesh in the runtime part of point density node to avoid
possible remained thread conflicts.
- NULL the runtime part of the node on .blend load
Reviewers: campbellbarton, plasmasolutions
Reviewed By: plasmasolutions
Differential Revision: https://developer.blender.org/D1614
Seems set_intersection() requires passing explicit comparator if non-default
one is used for the sets. A bit weird, but can't really find another explanation
here about whats' going on here.
This mimics OpenMP's 'firstprivate' feature. It is sometimes handy to have some persistent local data during a whole chunk.
Reviewers: sergey
Reviewed By: sergey
Subscribers: campbellbarton
Differential Revision: https://developer.blender.org/D1635
The issue was caused by not really optimal graph traversal for gathering nodes
dependencies which could have exponential complexity with a long tree branches
connected with multiple connections between them.
Now we optimize the depth traversal and perform early output if the node was
already traversed.
Please note that this adds some limitations to the use of SVM compiler's
find_dependencies() in the cases when skip_node is not NULL and one wants to
perform dependencies find sequentially with the same set. This doesn't happen
in the code, but one should be aware of this.
The issue was than nodes dependencies were stored as set<ShaderNode*> which
is actually a so called "strict weak ordered", meaning order of nodes in
the set is strictly defined, but based on the ShaderNode pointer. This means
that between different render invokations order of original nodes could be
different due to different pointers allocated for ShaderNode.
This commit makes it so dependencies and maps used for ShaderNodes are based
on the node->id which has much more predictable order. It's still possible
to trick the system by doing some crazy edits during viewport rendfer and
cause difference between viewport and final render stacks.
Reviewers: brecht
Reviewed By: brecht
Subscribers: LazyDodo
Differential Revision: https://developer.blender.org/D1630
This gives much lower stack usage on GPU and reduces kernel memory size to
around 448MB on GTX560Ti (comparing to 652MB with previous commit and 946MB
with official release). There's also a barely measurable speedup of around
5%, but this is to be confirmed still.
At this stage we're using only ~3% for the experimental kernel and SSS
rendering seems to be faster by 40% and after some further testing we might
consider making SSS and CMJ official features and remove experimental
precompiled kernels.
The idea is to delay shooting indirect rays for the SSS sampling and
trace them after the main integration loop was finished.
This reduces GPU stack usage even further and brings it down to around
652MB (comparing to 722MB before the change and 946MB with previous
stable release).
This also solves the speed regression happened in the previous commit
and now simple SSS scene (SSS suzanne on the floor) renders in 0:50
(comparing to 1:16 with previous commit and 1:03 with official release).
This commit introduces a SSS-oriented intersection structure which is replacing
old logic of having separate arrays for just intersections and shader data and
encapsulates all the data needed for SSS evaluation.
This giver a huge stack memory saving on GPU. In own experiments it gave 25%
memory usage reduction on GTX560Ti (722MB vs. 946MB).
Unfortunately, this gave some performance loss of 20% which only happens on GPU.
This is perhaps due to different memory access pattern. Will be solved in the
future, hopefully.
Famous saying: won in memory - lost in time (which is also valid in other way
around).
Input array length is implicitly set at link time, based on the geometry
shader's layout. Specifying the wrong value here is an error; specifying
no value is the same as getting it right. (inspired by a recent codegen
change)
the paint cursor drawing when a stroke is underway; this information is
already available from the stroke itself.
Should improve performance when "show brush" is enabled (which is
always, for sane people). Finding an intersection is not always cheap,
especially on heavy meshes.
Cudos to the dyntopo test thread for the report and the sculpt love :)
yet another bug introduced by recent shadowing changes -- q and r CCG arrays
were overwritten by the temporary evaluation because the code was changed to
use global pointers instead of the local ones.
Naming of the variables could be changed to be a bit more accurate.