While building the AVX kernel, util_avxf.h/avxb.h were using some AVX2 intrinsics,
these were never called, so it wasn't a run-time issue, but the intrinsics headers
on centos excluded the AVX2 prototypes when building the AVX kernel causing build errors.
This commit cleans up the improper usage of the AVX2 intrinsics and provides AVX
fallback implementations for future use.
Differential Revision: https://developer.blender.org/D3696
This isn't really possible to do the shuffle which was attempted to do.
While it's possible to achieve expected behavior, the function needs to
be rewritten. Since it's not used anyway, it's simpler to remove it for
now.
Failed as in did not allocate due to possibly weight cutoff.
Tryign to allocated Extra storage for closure in such situation
will consfuse Cycles and cause crashes later one due to obscure
values in ShaderData.
* WITH_SYSTEM_OPENJPEG is removed and is now always on, this was already
the case for macOS and Windows.
* This should not break existing Linx builds. If there is no new enough
OpenJPEG installed, CMake will no find libopenjp2 and WITH_IMAGE_OPENJPEG
will be disabled.
* install_deps.sh was updated with new package names, since distributions
put this version in a new package.
Differential Revision: https://developer.blender.org/D3663
The assembler template was backing up and restoring ebx, which is
fair enough. However, this did not prevent compiler for putting
result variables to ebx. This was causing data corruption.
In order to prevent this easiest solution is to list ebx in clobbers
for the assembly.
The script clearly states:
This makes the presumption that you are include al.h like
#include "al.h"
and not
#include <AL/al.h>
The reason for this is that the latter is not entirely portable.
Windows/Creative Labs does not by default put their headers in AL/ and
OS X uses the convention <OpenAL/al.h>.
This commit makes default precompiled OpenAL to be properly detected
and also removes hack on MacOS which was finding the OpenAL package but
then was overwriting include directory.
Note, that new audaspace in 2.8 is using expected #include <al.h>.
This is an initial implementation of BVH8 optimization structure
and packated triangle intersection. The aim is to get faster ray
to scene intersection checks.
Scene BVH4 BVH8
barbershop_interior 10:24.94 10:10.74
bmw27 02:41.25 02:38.83
classroom 08:16.49 07:56.15
fishy_cat 04:24.56 04:17.29
koro 06:03.06 06:01.45
pavillon_barcelona 09:21.26 09:02.98
victor 23:39.65 22:53.71
As memory goes, peak usage raises by about 4.7% in a complex
scenes.
Note that BVH8 is disabled when using OSL, this is because OSL
kernel does not get per-microarchitecture optimizations and
hence always considers BVH3 is used.
Original BVH8 patch from Anton Gavrikov.
Batched triangles intersection from Victoria Zhislina.
Extra work and tests and fixes from Maxym Dmytrychenko.
With small tiles, the repeated allocations on GPUs can actually slow down the denoising quite a lot.
Allocating the buffer just once reduces rendertime for the default cube with 16x16 tiles and denoising on a mobile 1050 from 22.7sec to 14.0sec.
Building the CUDA kernels takes quite a bit of memory, and when building all of
them the combined usage can be too much on some systems (especially VMs).
Therefore, this patch adds an option to force the build system to build them
sequentially by making each build step depend on the previous kernel.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3623
This is in preparation of upgrading our library dependencies, some of which
need C++11. We already use C++11 in blender2.8 and for Windows and macOS, so
this just affects Linux.
On many distributions this will not require any changes, on some
install_deps.sh will need to be run again to rebuild libraries.
Differential Revision: https://developer.blender.org/D3568
Just basic algebra - because all vectors have the same z coordinate, a lot of terms end up cancelling out.
Not exactly a massive improvement, but it's measurable with Branched PT and a high sample count on the lamp.
Reviewers: brecht, sergey
Reviewed By: brecht
Subscribers: swerner
Differential Revision: https://developer.blender.org/D3540
Gathers information about object geometry and textures. Very basic at
this moment, but need to start somewhere.
Things which needs to be included still:
- "Runtime" information, like BVH. While it is not directly controllable
by artists, it's still important to know.
- Device array sizes. Again, not under artists control, but is added to
the overall size.
- Memory peak at different synchronization stages.
At this point it simply prints info to the stdout after F12 is done,
need better control over that too.
Reviewers: brecht
Differential Revision: https://developer.blender.org/D3566
There is no reason or justification to have helper functions as
class methods: they do not depend on anything in the class itself.
There are probably more cases like that.
While changing the shading normal is a great way to add additional detail to a model, there are some problems with it.
One of them is that at grazing angles and/or strong changes to the normal, the reflected ray can end up pointing into the actual geometry, which results in a black spot.
This patch helps avoid this by automatically reducing the strength of the bump/normal map if the reflected direction would end up too shallow or inside the geometry.
Differential Revision: https://developer.blender.org/D2574
The old springs with damping 1.0 operate in a special way that
is more similar to plastic deformation than a spring. Some users
rely on that, so let the user choose which implementation to use.
This also restores full backward compatibility with 2.79.
Reviewers: sergof
Differential Revision: https://developer.blender.org/D3544
This increases stack memory usage some, and ideally we'd support a dynamic
size. But this is quite difficult on the GPU and hopefully 32 is enough even
for very complex cases.
This is a physically-based, easy-to-use shader for rendering hair and fur,
with controls for melanin, roughness and randomization.
Based on the paper "A Practical and Controllable Hair and Fur Model for
Production Path Tracing".
Implemented by Leonardo E. Segovia and Lukas Stockner, part of Google
Summer of Code 2018.
Features to get the 2nd, 3rd, 4th closest point instead of the closest, and
various distance metrics. No viewport/Eevee support yet.
Patch by Michel Anders, Charlie Jolly and Brecht Van Lommel.
Differential Revision: https://developer.blender.org/D3503
This works for Cycles, Eevee, texture nodes and compositing. It helps to
reduce the number of math nodes required in various node setups.
Differential Revision: https://developer.blender.org/D3537
The X resource database is to be explicitly destroyed. This fixes 46 bytes
leak per every window DPI query (which happens a lot on window move/resize
and even on areas resize).
Unfortunately, this does not fully fix the leak since the known leak:
https://bugs.freedesktop.org/show_bug.cgi?id=94604
Textures in 16 bit integer format are sometimes used for displacement, bump and normal maps and can be exported by tools like Substance Painter. Without this patch, Cycles would promote those textures to single precision floating point, causing them to take up twice as much memory as needed.
Reviewers: #cycles, brecht, sergey
Reviewed By: #cycles, brecht, sergey
Subscribers: sergey, dingto, #cycles
Tags: #cycles
Differential Revision: https://developer.blender.org/D3523
This deduplicates the calls for tile (un)mapping and allows to have a target buffer that is different from the source buffer (needed for baking and animation denoising).
The latest clang compiler (at least the one in Xcode 9.4.1) warns about the register keyword and macro expansions using defined().
Since these warnings come from third party code, we can't address them directly in Blender. Silencing them via #pramgas will
at least keep the warnings during a build down to the ones that are relevant to Blender code.
This means the shader can now be used for procedural texturing. New
settings on the node are Samples, Inside, Local Only and Distance.
Original patch by Lukas with further changes by Brecht.
Differential Revision: https://developer.blender.org/D3479
I've limited it to just the RGB<->XYZ stuff for now, correct image handling is the next step.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3478
The automatic mode checks all Enviroment Texture nodes and picks the largest image's resolution.
If there are no Enviroment Textures, it just uses the old default.
Also, the sampling map now isn't limited to square shapes. The automatic detection uses the exact image size,
the manual UI option now halves the value to get the height.
A default aspect ratio of 2:1 makes sense since this is what most HDRIs use.
Reviewers: brecht, sergey
Differential Revision: https://developer.blender.org/D3477
There is one legit place in the code where memcpy was used as an
optimization trick. Was needed for older version of GCC, but now
it should be re-evaluated and checked if it still helps to have
that trick.
In other places it's somewhat lazy programming to zero out all
object members. That is absolutely unsafe, at the moment when
less trivial class is used as a member in that object things
will break.
Other cases were using memcpy into an object which comes from
an external library. We don't control that object, and we can
not guarantee it will always be safe for such memory tricks
and debugging bugs caused by such low level access is far fun.
Ideally we need to use more proper C++, but needs to be done with
big care, including benchmarks of each change, For now do
annoying but simple cast to void*.
In C++ it is not really safe to memcpy objects, and newer GCC will warn
about this. However, we don't use our vector for unsafe-to-memcpy objects,
so just explicitly silence that warning.
All keyboard events were sending double key events (including modifiers)
when xinput was enabled with gnome (causing much confusion!).
I cant test if XIM works,
but this isn't useful to send double events, so disabling for now.
Not sure why exactly it is called a cleanup, the code was much more clear
and robust against possible missing return statements which are MANDATORY.
Missing return statement will:
- Cause two different BVH traversals to be run.
Not is happening currently, but if more BVH layouts are added, it will
become a problem.
- It is already causing assert() statements to fail, since functions are
no longer returning when they are supposed to.
If there is any measurable reason to keep this change, let me know.
Otherwise just stick to reliable/tested/robust code.
This reverts commit ba65f7093b39a8e5f1fb869cbc347fb810a05ab9.
This commit contains the minimum to make clang build/work with blender, asan and ninja build support is forthcoming
Things to note:
1) Builds and runs, and is able to pass all tests (except for the freestyle_stroke_material.blend test which was broken at that time for all platforms by the looks of it)
2) It's slightly faster than msvc when using cycles. (time in seconds, on an i7-3370)
victor_cpu
msvc:3099.51
clang:2796.43
pavillon_barcelona_cpu
msvc:1872.05
clang:1827.72
koro_cpu
msvc:1097.58
clang:1006.51
fishy_cat_cpu
msvc:815.37
clang:722.2
classroom_cpu
msvc:1705.39
clang:1575.43
bmw27_cpu
msvc:552.38
clang:561.53
barbershop_interior_cpu
msvc:2134.93
clang:1922.33
3) clang on windows uses a drop in replacement for the Microsoft cl.exe (takes some of the Microsoft parameters, but not all, and takes some of the clang parameters but not all) and uses ms headers + libraries + linker, so you still need visual studio installed and will use our existing vc14 svn libs.
4) X64 only currently, X86 builds but crashes on startup.
5) Tested with llvm/clang 6.0.0
6) Requires visual studio integration, available at https://github.com/LazyDodo/llvm-vs2017-integration
7) The Microsoft compiler spawns a few copies of cl in parallel to get faster build times, clang doesn't, so the build time is 3-4x slower than with msvc.
8) No openmp support yet. Have not looked at this much, the binary distribution of clang doesn't seem to include it on windows.
9) No ASAN support yet, some of the sanitizers can be made to work, but it was decided to leave support out of this commit.
Reviewers: campbellbarton
Differential Revision: https://developer.blender.org/D3304
This patch adds support for IES files, a file format that is commonly used to store the directional intensity distribution of light sources.
The new IES node is supposed to be plugged into the Strength input of the Emission node of the lamp.
Since people generating IES files do not really seem to care about the standard, the parser is flexible enough to accept all test files I have tried.
Some common weirdnesses are distributing values over multiple lines that should go into one line, using commas instead of spaces as delimiters and adding various useless stuff at the end of the file.
The user interface of the node is similar to the script node, the user can either select an internal Text or load a file.
Internally, IES files are handled similar to Image textures: They are stored in slots by the LightManager and each unique IES is assigned to one slot.
The local coordinate system of the lamp is used, so that the direction of the light can be changed. For UI reasons, it's usually best to add an area light,
rotate it and then change its type, since especially the point light does not immediately show its local coordinate system in the viewport.
Reviewers: #cycles, dingto, sergey, brecht
Reviewed By: #cycles, dingto, brecht
Subscribers: OgDEV, crazyrobinhood, secundar, cardboard, pisuke, intrah, swerner, micah_denn, harvester, gottfried, disnel, campbellbarton, duarteframos, Lapineige, brecht, juicyfruit, dingto, marek, rickyblender, bliblubli, lockal, sergey
Differential Revision: https://developer.blender.org/D1543
The GPU kernel needs to use atomics for accumulation since all offsets are processed in
parallel, but on CPUs that's not the case, so we can disable them there for a considerable speedup.
The Math node currently has the normal atan() function, but for
actual angles this is fairly useless without additional nodes to handle the signs.
Since the node has two inputs anyways, it only makes sense to add an arctan2 option.
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D3430
Some years-old deprecated stuff has now been removed.
Correct solution is probably to use valid defines etc. in own code, but
this is more FFMEPG maintainer task (since it also may change how old
FFMPEG we do support...).
The new constraint is slower and not backward compatible, but should
be better, especially in the damping side. The new constraint also
has a different valid range of the damping coefficient, and a limit
implementation that bounces instead of making the object stationary.
Reviewers: sergof
Differential Revision: https://developer.blender.org/D3125
Better fix for T54457. It seems Debian compiles OpenVDB without ABI 3
compatibility, while Arch does enable it as is the default in the OpeVDB
CMake build system.
So now there's an option that the distribution can set depending on how
they compile their OpenVDB package.
- See `--log` help message for usage.
- Supports enabling categories.
- Color severity.
- Optionally logs to a file.
- Currently use to replace printf calls in wm module.
See D3120 for details.
Roughness baking previously defaulted to 1.0 for all diffuse materials,
now we also bake roughness values of Oren-Nayer and Principled Diffuse.
Differential Revision: https://developer.blender.org/D3115
Use C++11 threads when available, and native critical section on Windows.
Later on we can remove pthread code when C+11 becomes required.
Differential Revision: https://developer.blender.org/D3116
Random numbers for step offset were correlated, now use stratified samples
which reduces noise as well for some types of volumes, mainly procedural
ones where the step size is bigger than the volume features.
With better directory layout and more proper include
statements we can avoid several local modifications,
such as changing config.h for Windows Glog and the
ones related on pass-through statements in logging
headers in Glog.
This commit also makes unused functions not-a-warning
for external code.
Increasing the samplig dimensions like this is not optimal, I'm looking
into some deeper changes to reuse the random number and change the RR
probabilities, but this should fix the bug for now.
Unity itself is deprecated, but the API is also supported by KDE and the GNOME Dock extension,
which means that it will be useful for a wide variety of distributions.
To get a progress bar, the system must have a blender.desktop file and libunity installed.
The need for libunity is annoying, but the only alternative would be to integrate a DBus library...
Reviewers: campbellbarton, brecht
Differential Revision: https://developer.blender.org/D3106
This save a little memory and copying in the kernel by storing only a 4x3
matrix instead of a 4x4 matrix. We already did this in a few places, and
those don't need to be special exceptions anymore now.
This is in preparation of making Transform affine only, and also gives us
a little extra type safety so we don't accidentally treat it as a regular
4x4 matrix.
The purpose of the previous code refactoring is to make the code more readable,
but combined with this change benchmarks also render about 2-3% faster with an
NVIDIA Titan Xp.
This is an issue with which value to trust: fps vs. tbr. They both cam be
somewhat broken. Currently the idea is:
- If file was saved with FFmpeg AND we are decoding with FFmpeg we trust tbr.
- If we are decoding with Libav we use fps (there does not seem to be tbr in
Libav, unless i'm missing something).
- All other cases we use fps.
Seems to work all good for files from T53857, T54148 and T51153. Ideally we
would need to collect some amount of regression files to make further tweaks
more scientific.
Reviewers: mont29
Reviewed By: mont29
Differential Revision: https://developer.blender.org/D3083
around the volume.
We generate a tight mesh around the active voxels of the volume in order
to effectively skip empty space, and start volume ray marching as close
to interesting volume data as possible. See code comments for details on
how the mesh generation algorithm works.
This gives up to 2x speedups in some scenes.
Reviewed by: brecht, dingto
Reviewers: #cycles
Subscribers: lvxejay, jtheninja, brecht
Differential Revision: https://developer.blender.org/D3038
This is more important now that we will have tigther volume bounds that
we hit multiple times. It also avoids some noise due to RR previously
affecting these surfaces, which shouldn't have been the case and should
eventually be fixed for transparent BSDFs as well.
For non-volume scenes I found no performance impact on NVIDIA or AMD.
For volume scenes the noise decrease and fixed artifacts are worth the
little extra render time, when there is any.
This is used to determine which voxels are to be considered empty space.
Previously it was hardcoded for converting dense grids to OpenVDB grids
to reduce disk space usage.
This value is also useful for rendering engines to know, i.e. to
optimize ray marching.
Some other software cannot handle grid names with spaces in them. We still check for names with spaces so as to not break old
files.
This fixes T53802.
Similar to the Principled BSDF, this should make it easier to set up volume
materials. Smoke and fire can be rendererd with just a single principled
volume node, the appropriate attributes will be used when available. The node
also works for simpler homogeneous volumes like water or mist.
Differential Revision: https://developer.blender.org/D3033
We now continue transparent paths after diffuse/glossy/transmission/volume
bounces are exceeded. This avoids unexpected boundaries in volumes with
transparent boundaries. It is also required for MIS to work correctly with
transparent surfaces, as we also continue through these in shadow rays.
The main visible changes is that volumes will now be lit by the background
even at volume bounces 0, same as surfaces.
Fixes T53914 and T54103.
It seems to be useful still in cases where the particle are distributed in
a particular order or pattern, to colorize them along with that. This isn't
really well defined, but might as well avoid breaking backwards compatibility
for now.
This is like the only way to add variety to hair which is created
using simple children. Used here for the hair.
Maybe not ideal, but the time will show.
Burley SSS uses a bit of strange thing where the albedo and closure weight are
different, which makes the subsurface color act a bit like a subsurface radius
indirectly by the way the Burley SSS profile works.
This can't work for random walk SSS though, and it's not clear to me that this
is actually a good idea since it's really the subsurface radius that is supposed
to control this. For now I'll leave Burley SSS working the same to not break
backwards compatibility.
It is basically brute force volume scattering within the mesh, but part
of the SSS code for faster performance. The main difference with actual
volume scattering is that we assume the boundaries are diffuse and that
all lighting is coming through this boundary from outside the volume.
This gives much more accurate results for thin features and low density.
Some challenges remain however:
* Significantly more noisy than BSSRDF. Adding Dwivedi sampling may help
here, but it's unclear still how much it helps in real world cases.
* Due to this being a volumetric method, geometry like eyes or mouth can
darken the skin on the outside. We may be able to reduce this effect,
or users can compensate for it by reducing the scattering radius in
such areas.
* Sharp corners are quite bright. This matches actual volume rendering
and results in some other renderers, but maybe not so much real world
objects.
Differential Revision: https://developer.blender.org/D3054
This brings separate initialization for libcuda and libnvrtc, which
fixes Cycles nvrtc compilation not working on build machines without
CUDA hardware available.
Differential Revision: https://developer.blender.org/D3045
We should actually be using CL_DEVICE_MEM_BASE_ADDR_ALIGN for sub buffers,
previous change in this code was incorrect. Renamed the function now to
make the specific purpose of this alignment clear, it's not required for
data types in general.
This patch changes the huge list of projects in visual studio into a nice tree matching the source folder structure. see D2823 for details.
Differential Revision: http://developer.blender.org/D2823
nvcc is very picky regarding compiler versions, severely limiting the compiler we can use, this commit adds a nvrtc based compiler that'll allow us to build the cubins even if the host compiler is unsupported. for details see D2913.
Differential Revision: http://developer.blender.org/D2913
This adds midlevel and object/world space for displacement, and a
vector displacement node with tangent/object/world space, midlevel
and scale.
Note that tangent space vector displacement still is not exactly
compatible with maps created by other software, this will require
changes to the tangent computation.
Differential Revision: https://developer.blender.org/D1734