Motivation: When discussing with @Jeroen-Bakker and @aras_p about how to
approach sorting when rendering Gaussian splats in Blender, we realised that
compute shaders could help there (and have many other use cases), and that also
due to Blender 4.0 being on OpenGL >= 4.3, we can now rely on compute shaders
existing.
This PR is an initial pass for that functionality. It comes with a Python example, which
runs a compute shader and saves the output to a texture, which is then rendered in the
viewport.
There is no exposed support for storage buffers yet, but I expect I'll be able to work on
them soon if this is accepted.
The newly added parts are:
1. `gpu.compute.dispatch()`
2. a way set the compute source to `GPUShaderCreateInfo`: `shader_info.compute_source()`
3. a way to set the image store for `GPUShaderCreateInfo`: `shader_info.image()`
4. a way to set the `local_group_size` for `GPUShaderCreateInfo`: `shader_info.local_group_size(x,y,z)`
5. a way to get `max_work_group_size` from capabilities: `gpu.capabilities.max_work_group_size_get(index)`
6. a way to get `max_work_group_count` from capabilities: `gpu.capabilities.max_work_group_count_get(index)`
Pull Request: https://projects.blender.org/blender/blender/pulls/114238
No functional changes.
Rename the `eAutokey_Flag` to `eKeyInsert_Flag`
to indicate that it is not only used for auto keying.
Also rename the enum items to better reflect what they are used for.
Pull Request: https://projects.blender.org/blender/blender/pulls/115295
These drivers crash on startup caused by a driver bug. This include the
latest drivers for legacy Intel CPUs with a HD 4000/HD 5000 series GPU.
To be on the safe side all drivers with version 20.19.15.51* will be marked
unsupported as we don't have the platforms to identify the precise driver
versions that fail.
See #113124 for more information.
Pull Request: https://projects.blender.org/blender/blender/pulls/115228
Since b1526dd2c6, viewport renderer sets `G.is_rendering`, but VSE scene
rendering function used this to decide whether to do offscreen opengl
render or use render API. This logic was quite weak.
Use `SeqRenderData::for_render` instead of `G.is_rendering`, since it
explicitly defines whether strip rendering is invoked by F12 render job.
Since offscreen gl rendering rely on `BLI_thread_is_main()` being
true, use this to initialize `do_seq_gl` variable for clarity.
The use of render job path was further conditioned on `G.is_rendering`,
with exception of running headless, but this condition was incorrect.
This condition was reformulated as precondition, which if true, returns
`nullptr` instead of crashing.
Pull Request: https://projects.blender.org/blender/blender/pulls/115079
Add a function that copies selected values to groups of values in the
result array. Add a runtime-typed version and a version for affecting
all attributes. Also make the "gather_group_to_group" follow the
same pattern.
This allows for some optimization when we know
some effects are not present. Also this is needed
to detect the case when world contains absorption
in order to disable distant lighting.
Related #114062
Add an "Index Switch" node which is meant as a simpler version of
the "Menu Switch" from #113445 that doesn't allow naming items
or displaying them in a dropdown, but still allows choosing between
an arbitrary number of items, unlike the regular "Switch" node.
Even when the Menu Switch is included (which should be in the
same release as this), it may still be helpful to have explicit mapping
of indices, and a fair amount of the internals can be shared anyway.
Pull Request: https://projects.blender.org/blender/blender/pulls/115250
Currently we have options to transfer the paint mask, face sets, and
color attributes to the new mesh created by voxel remesh. This doesn't
make use of the generic attribute design, where it would be clearer to
transfer all attributes with the same methods. That's reflected in the
code as well-- we do duplicate work for the mask and vertex colors, for
example.
This commit replaces the transfer options with a single checkbox for
all attributes. All attribute types on all domains are transferred with
basically the same method as the "Sample Nearest" node from geometry
nodes-- they take the value from the nearest source element of the same
domain. Face corners are handled differently than before. Instead of
retrieving the mixed value of all the corners from the nearest source
vertex, the value from the nearest corner of the nearest face.
---
Some timing information, showing that when interpolating the same
data, the attribute propagation is significantly faster than before.
Edge and corner attributes would add some cost (edges more than
corners), but might not always be present.
Before
```
voxel_remesh_exec: 3834.63 ms
BKE_shrinkwrap_remesh_target_project: 1141.17 ms
BKE_mesh_remesh_reproject_paint_mask: 689.35 ms
BKE_remesh_reproject_sculpt_face_sets: 257.14 ms
BKE_remesh_reproject_vertex_paint: 54.64 ms
BKE_mesh_smooth_flag_set: 0.15 ms
```
After
```
voxel_remesh_exec: 3339.32 ms
BKE_shrinkwrap_remesh_target_project: 1158.76 ms
mesh_remesh_reproject_attributes: 517.52 ms
```
Pull Request: https://projects.blender.org/blender/blender/pulls/115116
Code assumed that lookdev shaders could be deferred compiled, but
that is never the case. This PR cleans up some related code and
mechanisms that were introduced because we thought it was the case.
Pull Request: https://projects.blender.org/blender/blender/pulls/114368
* Rename "Type" to "Shape" in user interface. RDNA already used
the term Shape (Still need to push this from office)
* Show LightProbe overlay settings
* Rename "Cubemap" to "Sphere"
* Rename "Planar" to "Plane"
Pull Request: https://projects.blender.org/blender/blender/pulls/114406
When the new rotation socket is connected to another rotation socket
already, a conversion node is unnecessary. Check this in addition to the
existing "non-self" implicit conversions.
In Simple Solidify's mode edge weights created by "Bevel Convex"
disappear, if edge weight is added to original mesh. From technical
side problem is identical to the one in #114860.
If original mesh doesn't have layer "bevel_weight_edge" option "Bevel
Convex" works fine. All bevels disappear after one edge gets bevel
weight. Except ones corresponding newly added weight. In patched
version weights created by "Convex Bevel" stay. Also manually added
weight can be observed on the edge of inner cube.
Second problem is when am original plane has one vertex with bevel
weight 1.0, but no bevel in result. If to change Solidify's mode from
"Simple" to "Complex" bevel appears. Patch adds this behavior to the
"Simple" mode.
Pull Request: https://projects.blender.org/blender/blender/pulls/115258
This patch rewrites the Inpaint node in the Realtime Compositor. The old
method suffered from discontinuities and singularities in the inpainting
regions. Furthermore, it ignored semi-transparent areas.
The new method is inspired by a two pass method described by the paper:
Rosner, Jakub, et al. "Fast GPU-based image warping and inpainting for
frame interpolation." International Conferences on Computer Graphics,
Vision and Mathematics. 2010.
In particular, we first fill the inpainting region using jump flooding,
then we apply a variable size blur pass whose size is proportional to
the distance to the inpainting boundary. The smoothed region is then
mixed with the input using its alpha.
The new method is much closer to the Bertalmio-style diffusion-based
inpainting methods, and thus can more accurately close holes than
existing methods.
The aforementioned method requires variable size blur, which is quite
expensive for this use case, so a new implementation was added that
approximates the method using a separable implementation, which provides
a visually pleasing result assuming a sufficiently smooth radius field,
which is true for our case since the field is an SDF.
Fixes: #114422
Pull Request: https://projects.blender.org/blender/blender/pulls/114849
Caused by error in refactor commit 4d668e6825ac3. Offsets are drawn
outside of strip boundary, but strip content range in drawing context
is clamped by handles, so nothing was drawn.
Also this feature was not working, when using solo feature. This was
caused by another refactor e39cc7831387d452. `special_seq_update` global
variable was declared in both files after splitting them, but set only
in one of them. Use `ED_sequencer_special_preview_get()` instead of
accessing variable directly.
Similar to [0], but also including the further optimizations from [1],
parallelize the vertex to edge, vertex to face, and edge to face
topology map index creation (the first part of the maps was already
parallelized). Vertex to face index creation went from 9.8 ms to 1.7 ms
on average, for a 1 million vertex grid. That's nice, since this map
is used for vertex normals after [2].
[0]: 226359ec48c820df59eb948d6d32a5560bffd685
[1]: 98e33adac24082d43b2439ea4ea264b10329704b
[2]: 5052e0d4073d74cdc8d513034a208004fd44d593
- Only calculate normals on the necessary domain
- Functions for exporting generic data
- Parallelize export of multiple submeshes
- Parallelize export within a single submesh
- Resize vectors to correct size to avoid reallocation
- Simplify hot loops to improve performance
- Optimize single material case to avoid index remapping
`write_submeshes` timing information (average of many runs)
| Test | Before | After |
| ------------------------------ | --------- | --------- |
| 6 million vert mesh | 791.99 ms | 130.75 ms |
| 1.5 million vert 100 materials | crash | 48.27 ms |
| Mr. Elephant test file | 778.95 ms | 277.06 ms |
Pull Request: https://projects.blender.org/blender/blender/pulls/113412
This loop might be 7x faster (not whole node).
All other code is already parallel, not sure why this was disabled.
Potentially, this was missed after some cleanup.
Pull Request: https://projects.blender.org/blender/blender/pulls/115246
Remove unnecessary N^2\n complexity. Disjoint set will join all
elements in list, even without fully-related joins. Usually cost is
small (10%~ for this specific function), but some certain files might
be 10000x slower. But that is very corner case.
Pull Request: https://projects.blender.org/blender/blender/pulls/115245
Ensure all Closures are filled with correct values,
like the principled bsdf node already does.
The main reason is that the new AgX color transform doesn't play well
with negative values (see #113220), but it's probably best to ensure we
use sanitized values in the rendering code as a whole.
Pull Request: https://projects.blender.org/blender/blender/pulls/115059
This add the displacement option to EEVEE materials.
This unifies the option for Cycles and EEVEE.
The displacement only option is not matching cycles
and not particularly useful. So we decided to not
support it and revert to displacement + bump.
Pull Request: https://projects.blender.org/blender/blender/pulls/113979
Expose the ID type identifier as defined by the `rna_enum_id_type_items`
enum items as `ID.id_type` in RNA.
Add some test to `id_management` ensuring that all ID types exposed in
`bpy.data` have a valid `id_type` value, i.e. that they have a matching
entry in `rna_enum_id_type_items`.
This will hopefully prevent future cases like #115151 .
Especially on windows, direct output to `cout` via `<<` is very expensive.
Instead, use fmtlib to do all formatting into a no-alloc `fmt::memory_buffer`,
and output that with one call to `cout`.
timeit utilities are not used much by default, but during development or
profiling one often uncomments macros like `DEBUG_TIME` that then enable
`SCOPED_TIMER` or `SCOPED_TIMER_AVERAGED`.
Having one `SCOPED_TIMER_AVERAGED` inside sequencer `draw_channels`, with
empty timeline and all default channels; the overhead in % of `draw_channels`
duration of said scoped timer before and after this change:
- Windows: 29% -> 5%
- Mac: 5.0% -> 4.4%
Pull Request: https://projects.blender.org/blender/blender/pulls/115233
This uses the principles outlined in
Screen Space Indirect Lighting with Visibility Bitmask
to compute local and distant diffuse lighting.
This implements it inside the ray-tracing module as a fallback when the
surface is too rough. The threshold for blending between technique is
available to the user.
The implementation first setup a radiance buffer and a view normal
buffer. These buffer are tracing resolution as the lighting quality is
less important for rough surfaces. These buffers are necessary to avoid
re-projection on a per sample basis, and finding and rotating the
surface normal.
The processing phase scans the whole screen in 2 directions and outputs
local incomming lighting from neighbor pixels and the remaining
occlusion for everything that is outside the view.
The final steps filters the result of the previous phase while applying
the occlusion on the probe radiance to have an energy conserving mix.
Related #112979
Pull Request: https://projects.blender.org/blender/blender/pulls/114259
Before #112901, subsurface scattering was skipped if its radius was
less than 1 pixel.
Aside from an optimization, this also avoided divisions by zero in
`burley_eval`.
This just brings back the early return when `pixel_footprint < 1.0`.
Pull Request: https://projects.blender.org/blender/blender/pulls/114928