This code allows to push a set of different operations all based on
iterations over a range of indices, and then process them all at once
over multiple threads.
This commit also adds unit tests for both old un-pooled, and new pooled
`task_parallel_range` family of functions, as well as some basic
performances tests.
This is mainly interesting for relatively low amount of individual
tasks, as expected.
E.g. performance tests on a 32 threads machine, for a set of 10
different tasks, shows following improvements when using pooled version
instead of ten sequential calls to `BLI_task_parallel_range()`:
| Num Items | Sequential | Pooled | Speed-up |
| --------- | ---------- | ------- | -------- |
| 10K | 365 us | 138 us | 2.5 x |
| 100K | 877 us | 530 us | 1.66 x |
| 1000K | 5521 us | 4625 us | 1.25 x |
Differential Revision: https://developer.blender.org/D6189
This new function is part of the 'parallel for loops' functions. It
takes an iterator callback to generate items to be processed, in
addition to the usual 'process' func callback.
This allows to use common code from BLI_task for a wide range of custom
iteratiors, whithout having to re-invent the wheel of the whole tasks &
data chuncks handling.
This supports all settings features from `BLI_task_parallel_range()`,
including dynamic and static (if total number of items is knwon)
scheduling, TLS data and its finalize callback, etc.
One question here is whether we should provide usercode with a spinlock
by default, or enforce it to always handle its own sync mechanism.
I kept it, since imho it will be needed very often, and generating one
is pretty cheap even if unused...
----------
Additionaly, this commit converts (currently unused)
`BLI_task_parallel_listbase()` to use that generic code. This was done
mostly as proof of concept, but performance-wise it shows some
interesting data, roughly:
- Very light processing (that should not be threaded anyway) is several
times slower, which is expected due to more overhead in loop management
code.
- Heavier processing can be up to 10% quicker (probably thanks to the
switch from dynamic to static scheduling, which reduces a lot locking
to fill-in the per-tasks chunks of data). Similar speed-up in
non-threaded case comes as a surprise though, not sure what can
explain that.
While this conversion is not really needed, imho we should keep it
(instead of existing code for that function), it's easier to have
complex handling logic in as few places as possible, for maintaining and
for improving it.
Note: That work was initially done to allow for D5372 to be possible... Unfortunately that one proved to be not better than orig code on performances point of view.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D5371
The function now allows custom return types defined
by the callbacks. This can be useful when a user of the
data structure has to implement some custom behavior.
This commit adds some new hashing based data structures to blenlib.
All of them use open addressing with probing currently.
Furthermore, they support small object optimization, but it is not
customizable yet. I'll add support for this when necessary.
The following main data structures are included:
**Set**
A collection of values, where every value must exist at most once.
This is similar to a Python `set`.
**SetVector**
A combination of a Set and a Vector. It supports fast search for
elements and maintains insertion order when there are no deletes.
All elements are stored in a continuous array. So they can be
iterated over using a normal `ArrayRef`.
**Map**
A set of key-value-pairs, where every key must exist at most once.
This is similar to a Python `dict`.
**StringMap**
A special map for the case when the keys are strings. This case is
fairly common and allows for some optimizations. Most importantly,
many unnecessary allocations can be avoided by storing strings in
a single buffer. Furthermore, the interface of this class uses
`StringRef` to avoid unnecessary conversions.
This commit is a continuation of rB369d5e8ad2bb7.
These two data structures reference strings somewhere in memory.
They do not own the referenced string. The string is considered
const.
A string referenced by StringRefNull can be expected to be
null-terminated. That is not the case for StringRef.
This commit is a continuation of rB369d5e8ad2bb7c2.
Many generic C++ data structures have been developed in the
functions branch. This commit merges a first chunk of them into
master. The following new data structures are included:
Array: Owns a memory buffer with a fixed size. It is different
from std::array in that the size is not part of the type.
ArrayRef: References an array owned by someone else. All elements
in the referenced array are considered to be const. This should
be the preferred parameter type for functions that take arrays
as input.
MutableArrayRef: References an array owned by someone else. The
elements in the referenced array can be changed.
IndexRange: Specifies a continuous range of integers with a start
and end index.
IntrusiveListBaseWrapper: A utility class that allows iterating
over ListBase instances where the prev and next pointer are
stored in the objects directly.
Stack: A stack implemented on top of a vector.
Vector: An array that can grow dynamically.
Allocators: Three allocator types are included that can be used
by the container types to support different use cases.
The Stack and Vector support small object optimization. So when
the amount of elements in them is below a certain threshold, no
memory allocation is performed.
Additionally, most methods have unit tests.
I'm merging this without normal code review, after I checked the
code roughly with Sergey, and after we talked about it with Brecht.
Currently unused, but will allow to keep of an owner of the depsgraph.
Could also simplify other APIs in the future by avoiding to pass bmain
explicitly to relation update functions and things like that.
Bugs were: (1) needed an epsilon test in CCW test in order to
handle new costraint edge that intersects an existing point
but only within epsilon; (2) the "valid bmesh" output mode
sometimes left a face that included outside frame point.
The most common use of the text editor seems to be for scripting. Having
line numbers and syntax highlighting enabled by default seems sensible.
Syntax highlighting is now enabled by default, but is automatically
disabled when the datablock has a non-highlighted extension.
Highlighting is enabled for filenames like:
- Text
- Text.001
- somefile.py
and is automatically disabled when the datablock has an extension for
which Blender has no syntax highlighter registered.
Reviewers: billreynish, campbellbarton
Subscribers: brecht, billreynish
Differential Revision: https://developer.blender.org/D5472
See Design task T68277, and patch D5423.
This commit includes edits by @ideasman42 to patch in
branch temp-D5423-update, plus responses to his comments.
Nothing special to mention about regression test itself, it basically
mimics the one for `BLI_task_parallel_mempool()`...
Basic performances benchmarks do not tell us much, besides the fact that
for very light processing of listbase, even with 100k items,
single-thread remains an order of magnitude faster than threaded code.
Synchronization is just way too expensive in that case with current
code. This should be partially solvable with much bigger (and
configurable) chunk sizes though (current ones are just ridiculous
for such cases ;) )...
The task_scheduler was not being explicitly freed, leading to
unpredictable behavior when the process was exiting. The test
would pass, but would sometimes segfault at process shutdown.
Use CMake's target_link_libraries instead of manually maintaining
library dependencies in a single list.
In practice adding new libraries often ended up being guess-work,
now each library lists the libraries it uses.
This was used for the game player executable so libraries
could optionally link to stubs.
If we need this functionality it can be done using target-properties
as described in T46725.
No functional change, this adds LIB definition and args to cmake files.
Without this it's difficult to migrate away from 'BLENDER_SORTED_LIBS'
since there are many platforms/configurations that could break when
changing linking order.
Manually add and enable WITHOUT_SORTED_LIBS to try building
without sorted libs (currently fails since all variables are empty).
This check will eventually be removed.
See T46725.
While it looks more longer, but also contains more comments
about what's going on.
Surely, this function almost never breaks and investing time
into maintaining its tests is not that important, but we should
have a good, clean, understandable tests so they act as a nice
example of how they are to be written. Especially important to
show correct language usage, without old school macros magic.
Doing this at a lunch breaks, so will be series of some updates
in the area.
The `BLI_path_frame_strip` function was completely broken, unless the
number of digits in the sequence number was the same as the length of
the extension. In other words, it would work fine for `file.0001.abc` (4
digit `0001` and 4 char `.abc`), but other combinations would truncate
to the shortest (`file.001.abc` would become `file.###.ab` and
`file.00001.a` would become `file.##.a`). The dependency between the
sequence number and the file extension is now removed.
The behaviour has changed a little bit in the case where there are no
numbers in the filename. Previously, `path="filename.abc"` would result
in `path="filename.abc"` and `ext=""`, but now it results in
`path="filename"` and `ext=".abc"`. This way `ext` always contains the
extension, and the behaviour is consistent regardless of whether there
were any numbers found.
Furthermore, I've removed the `bool set_frame_char` parameter, because
it was unclear, probably also buggy, and most importantly, never used.
I've also added a unit test for the `BLI_path_frame_strip` function.
The new data structure uses open addressing instead of chaining to resolve collisions in the hash table.
This new structure was never slower than the old implementation in my tests. Code that first inserts all edges and then iterates through all edges (e.g. to remove duplicates) benefits the most, because the `EdgeHashIterator` becomes a simple for loop over a continuous array.
Reviewer: campbellbarton
Differential Revision: D4050
We already had a BKE_main.h header, no reason not to put there
Main-specific functions, BKE_library has already more than enough to
handle with IDs and library management!
Simple find_nearest relies on a heuristic for efficient culling of
the BVH tree, which involves a fast callback that always updates the
result, and the caller reusing the result of the previous find_nearest
to prime the process for the next vertex.
If the callback is slow and/or applies significant restrictions on
what kind of nodes can qualify for the result, the heuristic can't
work. Thus for such tasks it is necessary to order and prune nodes
before the callback at BVH tree level using a priority queue.
Since, according to code history, for simple find_nearest the
heuristic approach is faster, this mode has to be an option.
This reverts reverting commit rB55324b8a2e6799300, and do proper 'cleanup' (sigh)
in gtest as well.
Sorry for the noise, did not understood what had happened here
immediately. :/