The issue was caused by an integer overflow in the `FFMPEGReader::seek`, which
lead to different results depending on an exact compiler version and platform.
The overflow happened in the equation which calculates `m_position`, in its
`packet.pts - st_time` part, which is effectively `int64_t - uint64_t` which is
expected to result in `uint64_t` due to the way how upcasting works in C++.
With the values `packet.pts=0` and `st_time=353600` from the test file the
calculation resulted in a negative value, which then wrapped around due to the
unsigned nature of the type, resulting in very big value on arm64. The value
was wrong, but it did not manifest itself as an error, since the seek code
believed it found the place.
However, when built on x86 platform, the result value was very large negative
value. It is possible that the type upcasting went different, due to the
multiplication with a double value later in the formula. And this large negative
value lead to the seeking code to read the entire file, trying to make it so
the `m_position` is not less than the requested position.
Fortunately, the `AVStream::start_time` is `int64_t`, which leads to a simple
fix which avoids signed/unsigned cast, making the seeking code to properly
calculate `m_position` on both arm64 and x64 platforms, regardless of the
compiler.
Pull Request: https://projects.blender.org/blender/blender/pulls/122149
Use COM smart pointer (`ComPtr`) to simplify memory management in WASAPI driver.
This reduces chances of calling `Release()` when a COM object has not been allocated.
Pull Request: https://projects.blender.org/blender/blender/pulls/121828
`xxHash` is a fast non-cryptographic hashing library. It significantly outperforms
md5 which we use in some places currently while also having great collision
resistance if not attacked explicitly.
The library is added to `extern` because that was the easiest way to do it and has
the least impact on others. I expect this library to become a required dependency
instead of an optional one. It's licence is `BSD 2-Clause` which seems to be the
first of its kind in Blender (there is `BSD 3-Clause` a couple of times).
For now, I used the library only for data deduplication when baking geometry nodes
where the same geometry is generated for each frame. The bake time in my test
goes down from >6s to <1s (note that this includes more than just the hashing time).
Pull Request: https://projects.blender.org/blender/blender/pulls/120139
After 91eb50ec2f436c026233f184a2dfc7141d6f3b44, we would get random crashes/asserts from the pulseaudio library like:
`Assertion 'e->mainloop->n_enabled_defer_events > 0' failed at ../pulseaudio-17.0/src/pulse/mainloop.c:261, function mainloop_defer_enable(). Aborting.`
It seems like we would run into a race condition if we didn't guard the
pulseaudio flush command with the pulseaudio mutex.
This is probably because the pulseaudio thread would try to read the
buffer for a tiny bit even after pausing the playback.
Sadly the only way to reproduce this is to playback any scene (seem to happen more often if A/V sync is on) and spam play/pause.
Note that I could not reproduce this on every computer I tested this on.
But by expanding the main pulseaudio mutex lock, I can't seem to reproduce this anymore.
So I think that is the correct solution.
Pull Request: https://projects.blender.org/blender/blender/pulls/120072
This patch ports the GLSL SMAA library to the CPU compositor in order to
unify the anti-aliasing behavior between the CPU and GPU compositor.
Additionally, the SMAA texture generator was removed since it is now
unused.
Previously, we used an external C++ library for SMAA anti-aliasing,
which is itself a port of the GLSL SMAA library. However, the code
structure and results of the library were different, which made it quite
difficult to match results between CPU and GPU, hence the decision to
port the library ourselves.
The port was performed through a complete copy of the library to C++,
retaining the same function and variable names, even if they are
different from Blender's naming conversions. The necessary code changes
were done to make it work in C++, including manually doing swizzling
which changes the code structure a bit.
Even after porting the library, there were still major differences
between CPU and GPU, due to different arithmetic precision. To fix this
some of the bilinear samplers used in branches and selections were
carefully changed to use point samplers to avoid discontinuities around
branches, also resulting in a nice performance improvement. Some slight
differences still exist due to different bilinear interpolation, but
they shall be looked into later once we have a baseline implementation.
The new implementation is slower than the existing implementation, most
likely due to the liberal use of bilinear interpolation, since it is
quite cheap on GPUs and the code even does more work to use bilinear
interpolation to avoid multiple texture fetches, except this causes a
slow down on CPUs. Some of those were alleviated as mentioned in the
previous section, but we can probably look into optimizing it further.
Pull Request: https://projects.blender.org/blender/blender/pulls/119414
This file is generated by the attribution builder for every release
and it's up to the release team to check the log for extern and others to make sure this is up to date.
Exact an exact match with Clang broke building when the compiler ID
was "AppleClang", reverting parts of [0].
[0]: 6549019ae19cecbea524782054dca0e99cb833b8
* Only works on machines with a Qualcomm Snapdragon 8cx Gen3 or above.
Older generation devices are not and will not be supported due to
some driver issues
* Requires VS2022 for building.
* Uses new MSVC preprocessor for sse2neon compatibility.
* SIMD is not enabled, waiting on conversion of blenlib to C++.
Ref #119126
Pull Request: https://projects.blender.org/blender/blender/pulls/117036
New ("fullframe") CPU compositor backend is being used now, and all the code
related to "tiled" CPU compositor is just never used anymore. The new backend
is faster, uses less memory, better matches GPU compositor, etc.
TL;DR: 20 thousand lines of code gone.
This commit:
- Removes various bits and pieces related to "tiled" compositor (execution
groups, one-pixel-at-a-time node processing, read/write buffer operations
related to node execution groups).
- "GPU" (OpenCL) execution device, that was only used by several nodes of
the tiled compositor.
- With that, remove CLEW external library too, since nothing within Blender
uses OpenCL directly anymore.
Pull Request: https://projects.blender.org/blender/blender/pulls/118819
Chrome would only drop into windows supporting v5 of the XDND spec,
resolve by bumping the version and no other changes.
From looking into the changes between v3 & v5 they mainly relate to
dragging files between windows & dragging onto the root window.
Since Blender does neither, bump the version to allow dragging
links from google-chrome.
Update from version 4.65 (2009) to latest 23.01 (2023).
LZMA is only used for pointcache "Heavy" compression option. With
updated SDK, testing in a 100k particle system point cache
(Windows/Ryzen5950X), average times to read or write a cached frame:
- Read a bit faster, 24.3 -> 23.2ms
- Write/bake about 2x faster, 237 -> 103ms
Pull Request: https://projects.blender.org/blender/blender/pulls/118433
Previously in Audaspace there was choice between linear resampler (okay
for preview, but not great for final mix), or "extremely high quality
preset" for rendering the final mix; with nothing in between. I have just
landed "medium" and "low" resampler quality levels in upstream Audaspace
(see https://github.com/audaspace/audaspace/pull/18 with details and
quality spectograms, also comparison with Audacity resampler).
This PR updates Audaspace to latest upstream, and switches to use the newly
added "medium" quality resampler. There's no audible difference (nor visible
one in spectrograms), as far as I can tell.
Timings, rendering out frames 1000-3000 of Sprite Fright Edit blender
studio data set:
- Windows (Ryzen 5950X, VS2022): 92 -> 73 sec
- Mac (M1 Max, clang 15): 70 -> 62 sec
i.e. using a faster audio resampler makes the _whole render process_ be
10-20% faster (however, this from VSE where it combines already pre-rendered
image strips).
Pull Request: https://projects.blender.org/blender/blender/pulls/116059
ROCm 6 brings some changes to the HIP API. This pull request is meant to be
backward and forward compatible.
That is Blender could be compiled with either ROCM 6 or 5 and run on either.
The main change is the hipMemoryType enum, which we check based on the
runtime version to use the correct enum values.
Without this, HIP will not work on Windows with upcoming 23.40 driver.
Pull Request: https://projects.blender.org/blender/blender/pulls/116713
* Different fix for Mantaflow linker warnings that works with new OpenVDB.
* Use new linker for arm64 as it no longer produces warnings with latest
Xcode. Still use the old one for x86_64 as some warnings remain.
* Fix wrong x86_64 build target in deps builder.
For the upcoming 4.1 libraries.
Ref #113157
Pull Request: https://projects.blender.org/blender/blender/pulls/116708
NDEBUG is part of the C standard and disables asserts. Only this will
now be used to decide if asserts are enabled.
DEBUG was a Blender specific define, that has now been removed.
_DEBUG is a Visual Studio define for builds in Debug configuration.
Blender defines this for all platforms. This is still used in a few
places in the draw code, and in external libraries Bullet and Mantaflow.
Pull Request: https://projects.blender.org/blender/blender/pulls/115774
The last good commit was 8474716abb0db3b06838a57f7217bc945638d8df.
After this commits from main were pushed to blender-v4.0-release. These are
being reverted.
Commits a4880576dc from to b26f176d1a that happend afterwards were meant for
4.0, and their contents is preserved.
With current main code, trying to build a full debug build with ASAN
enabled will fail at linking, due to out-of-bound memory references
within the binary file, e.g.:
```
mold: error: /usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o:(.text):
relocation R_X86_64_32S against .tm_clone_table out of range:
2147741344 is not in [-2147483648, 2147483648)
```
This commit works around the issue by disabling ASAN by default for the
`extern` dependencies. A new `WITH_COMPILER_ASAN_EXTERN` CMake
option is added to enable it if needed - some other components need to be
disabled then.
NOTE: This is more of an emergency band-aid to the general binary size issue,
see #113892 for on-going discussions about better solutions in the long term.
Pull Request: https://projects.blender.org/blender/blender/pulls/113891
The sound equalizer is using the Audaspace FFT Convolver.
The blender part creates an array of descriptions of power per "band"
and orders the creation of Equalizer (ISound) in the Audaspace.
Modifier can be created on sound strips. It lets you define
amplification or attenuation over frequency range from 30Hz to 20 kHz.
The power is limited to -30 db - 30 db. This is done using curve
mapping widget.
Co-authored-by: menda <alguien@aqui.es>
Co-authored-by: Richard Antalik <richardantalik@gmail.com>
Pull Request: https://projects.blender.org/blender/blender/pulls/105613
Listing the "Blender Foundation" as copyright holder implied the Blender
Foundation holds copyright to files which may include work from many
developers.
While keeping copyright on headers makes sense for isolated libraries,
Blender's own code may be refactored or moved between files in a way
that makes the per file copyright holders less meaningful.
Copyright references to the "Blender Foundation" have been replaced with
"Blender Authors", with the exception of `./extern/` since these this
contains libraries which are more isolated, any changed to license
headers there can be handled on a case-by-case basis.
Some directories in `./intern/` have also been excluded:
- `./intern/cycles/` it's own `AUTHORS` file is planned.
- `./intern/opensubdiv/`.
An "AUTHORS" file has been added, using the chromium projects authors
file as a template.
Design task: #110784
Ref !110783.