Needed by incoming changes in pbvh.c.
Note that we make it much simpler than for other primitives in this file - think
we could revise its content to make it simpler one day...
The issue was caused by different AABBs being used by Cycles and the texture sampler.
Instead of trying to keep these two functions in sync, we now have a
utility call in the point density node to query the AABB.
simulations.
This commit implements OpenVDB as an extra cache format in the Point
Cache system for smoke simulations. Compilation with the library is
turned off by default for now, and shall be enabled when the library is
present.
Documentation of the feature is available here:
http://wiki.blender.org/index.php/User:Kevindietrich/OpenVDBSmokeExport
A guide to compiling OpenVDB (Linux) can be found here:
http://wiki.blender.org/index.php?title=Dev:Doc/Building_Blender/Linux/Dependencies_From_Source#OpenVDB
Reviewers: sergey, lukastoenne, brecht, campbellbarton
Reviewed By: brecht, campbellbarton
Subscribers: galenb, Blendify, robocyte, Lapineige, bliblubli,
jtheninja, lukasstockner97, dingto, brecht
Differential Revision: https://developer.blender.org/D1721
This changes the following defaults:
- Render settings:
* Samples: 100
* Preview Samples: 50
* Filter: Blackman-Harris
* Tile Order: Hilbert Spiral
- Lamp settings:
* Use MIS: On
- Material settings:
* Volume Sampling: Multiple Importance
Old files are not affected, I tested the versioning code back and forth.
More changes are to come (World, BVH...) but that needs a bit more work.
Fix T47213.
There was actually no real bug here; this just clarifies in the UI that Mesh, World and Lamp samples only have an effect if we sample all lights (direct or indirect).
The issue was discontinuity in logic when importing vertices from blender
and then importing data layers regardless of how we split the face. Quite
interesting we didn't notice this issue before.
Thanks Bastien for the investigation. This is based on D1742, but redone to make
the patch a bit easier to follow.
There are no function pointers in the OpenCL specification. For as long
as we want to support this platform we should follow the specification.
While the code is not totally optimal now, it should not be a huge
performance issue on the CPU, since the compiler generates jump tables just
fine, so there is not much extra computation here.
Now images will be kept open while the render session is active; this is
needed to keep the image cache working correctly. Stopping the render
should now release all file descriptors.
Displacement shader was not updating motion vertex positions.
Current solution is not totally correct because it applies same offset
for all time steps. Ideally we'll need to evaluate displacement shader
for every time offset separately, but currently we don't have subframe
image access.
For the time being will consider this a TODO.
Compiling OSL scripts with errors in them would cause Blender to crash since the OSL version
bump to 1.6.9 instead of printing the error to the console as it did before.
With version 1.6.2, OSL added a pointer to an OpenImageIO ErrorHandler as an argument to the
OSLCompiler constructor. However, since it defaults to the NULL pointer, Blender still compiled
fine after the OSL version bump.
It turns out, though, that this pointer is used without further checks inside the OSL code, which
makes it crash when it tries to report an error unless a valid ErrorHandler pointer is specified.
Therefore, this commit simply passes a pointer to the static default handler that OIIO offers,
which prints the error to the console just like OSL did before.
Using this feature for a more advanced error handling and displaying from the Blender side would
be possible and seems reasonable, but for now it's not really relevant for fixing this bug.
The combined pass is built with the contributions the user finds fit.
It is useful for lightmap baking, as well as non-view dependent effects
baking.
The manual will be updated once we get closer to the 2.77 release.
Meanwhile the new page can be found here:
http://dalaifelinto.com/blender-manual/render/cycles/baking.html
Reviewers: sergey, brecht
Differential Revision: https://developer.blender.org/D1674
This commit removes the experimental CUDA kernel, making SSS and CMJ
regular features.
Several improvements have been made in the past few
weeks (thanks Sergey!) which make SSS render several times faster (2-3x
compared to 2.76b) on the GPU, and the increased VRAM usage has also been
fixed. Therefore the experimental kernel is no longer needed.
Differential Revision: https://developer.blender.org/D1726
The manual has been updated too:
https://www.blender.org/manual/render/cycles/features.html
The goal is to make Experimental kernel closer in performance to the
official kernel, avoiding spills and such.
There should not be a big impact on the official kernel; my own tests showed
a performance drop of a few percent on a laptop GPU. CPU speed was always the
same on the AVX, AVX2 and SSE4.1 CPUs I've been testing here.
This seems to be the last essential step before we can get rid of
Experimental kernel and enable SSS officially on GPU without causing
some major performance issues.
Surely some more tweaks may be required, but we can keep doing those
until the cows come home anyway.
There should be no functional changes at all; this just speeds up re-compilation
when some features need to be disabled for development purposes.
For example, when running lots of Valgrind it's handy to disable any
GPU devices because otherwise you'll be wasting quite some time in
the driver while enumerating devices.
Reviewers: dingto, lukasstockner97, brecht, juicyfruit
Differential Revision: https://developer.blender.org/D1730
Previously several areas were calling TEST_SHARED_PTR_SUPPORT and
TEST_UNORDERED_MAP_SUPPORT, which isn't that bad on its own but
was causing quite verbose output, with the same information line
printed multiple times. Additionally, and what's worse, define flags
for Ceres were duplicated in the main CMakeLists and Ceres's CMakeLists.
Now we've got a single place where checks for those classes are
happening and other areas are simply checking for variables set by
those check macros, keeping CMake output clean and nice.
The main purpose of such linking is to make Blender compatible with
NVidia's debuggers and profilers which are doing some LD_PRELOAD
magic to intercept some function calls. Such magic conflicts with
our CUDA wrangler magic and causes segmentation faults.
The option is disabled by default, so there's no effect on
artists.
In order to make Blender linked directly against CUDA library use
the WITH_CUDA_DYNLOAD CMake option (it's marked as advanced).
This panel is only visible when debug_value is set to 256 and has no
effect in other cases. If the debug value is not set to this
value, environment variables are used to control which features
are enabled, so in fact there are no visible changes for anyone.
There are some changes needed to prevent devices re-enumeration on
every Cycles session create.
Reviewers: juicyfruit, lukasstockner97, dingto, brecht
Reviewed By: lukasstockner97, dingto
Differential Revision: https://developer.blender.org/D1720
Although the code made it impossible to use time_start_ uninitialized, GCC
still produced multiple warnings about it.
Since time_dt() is an extremely cheap operation and functionality does not change in any way when
removing the check in the constructor, this commit removes the check and therefore the warning.
This patch adds the "Hilbert Spiral", a custom-designed continuous space-filling curve, as a tile order for rendering in Cycles.
It essentially works by dividing the tiles into tile blocks which are processed in a spiral outwards from the center. Inside each
block, the tiles are processed in a regular Hilbert curve pattern. By rotating that pattern according to the spiral direction,
a continuous curve is obtained, which helps with cache coherency and therefore rendering speed.
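For illustration, the regular Hilbert pattern used inside one block can be sketched as below. This is the textbook index-to-coordinate conversion written in Python, not the code from the patch; the function name is hypothetical, and the outward spiral over blocks and the per-block rotation are not shown.

```python
def hilbert_index_to_xy(n, d):
    """Map index d (0 <= d < n*n) to a tile (x, y) on an n x n Hilbert curve.

    n must be a power of two. Textbook algorithm, shown as an illustration only.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so that sub-curves join up continuously.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# Order in which the 16 tiles of a 4x4 block would be visited.
block_order = [hilbert_index_to_xy(4, d) for d in range(16)]
```

Consecutive entries of block_order are always direct neighbours, which is the continuity property the tile order relies on for cache coherency.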
The curve is a compromise between the faster-rendering Bottom-to-Top etc. orders and the Center order, which is a bit slower,
but starts with the more important areas. The Hilbert Spiral also starts in the center (unless huge tiles are used) and is still
marginally slower than Bottom-to-Top, but noticeably faster than Center.
Reviewers: sergey, #cycles, dingto
Reviewed By: #cycles, dingto
Subscribers: iscream, gregzaal, sergey, mib2berlin
Differential Revision: https://developer.blender.org/D1166
This change is for a few reasons:
- it works with color, and (therefore) will need to be color managed, at
some point. This will be much easier to do if the code is closer to the
actual color management code (in Blender's core, so to speak).
- it has nothing to do with the actual fire simulation, as it is just
used to create a lookup table
- it can be reused for other purposes (e.g. in the Blender internal
renderer, if people are interested in a blackbody node à la Cycles)
- cleanup: some functions (`constrain_rgb`, `xyz_to_rgb`) already exist
in BLI
Reviewers: brecht
Reviewed By: brecht
Subscribers: brecht
Differential Revision: https://developer.blender.org/D1719
If anyone finds OS X UI drawing glitches with different graphics cards please
report them and I'll add an exception specifically for Intel, but in theory this
should work fine for all graphics cards.
While the previous code was already compiling with OSL 1.6, it was using some symbols
which are considered deprecated upstream.
This commit adds some ifdefs, but we'll get rid of all of them rather soon
with the upcoming OIIO/OSL update.
Patch from be28706 made it so integrator will use last shader's transparent
shadow flag, which is wrong since last shader might not have transparent
shadow while shaders prior to it might have one.
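The difference between the two behaviours can be sketched as follows. This is a minimal illustration in Python, not the Cycles code: the Shader class and both function names are hypothetical. Taking the flag of the last shader drops a transparent shadow declared by an earlier one, while accumulating over all shaders keeps it.

```python
from dataclasses import dataclass


@dataclass
class Shader:
    # Hypothetical stand-in for a shader with a transparent shadow flag.
    name: str
    has_transparent_shadow: bool


def flag_from_last_shader(shaders):
    # Buggy variant: the flag is overwritten on every iteration,
    # so only the last shader decides the result.
    flag = False
    for shader in shaders:
        flag = shader.has_transparent_shadow
    return flag


def flag_from_any_shader(shaders):
    # Fixed variant: the flag is set if any shader needs it.
    return any(shader.has_transparent_shadow for shader in shaders)


shaders = [Shader("glass", True), Shader("diffuse", False)]
```

With the list above, the first function reports no transparent shadows even though the glass shader has one.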
CYCLES_OPENCL_TEST was removed, as there was an inconsistency between
opencl_kernel_use_split() and opencl_get_usable_devices().
From now on, to test non whitelisted devices please use either
CYCLES_OPENCL_MEGA_KERNEL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST.
This commit changes how we pass bounce information to the Light
Path node. Instead of manually copying the bounces into ShaderData, we now
pass PathState directly. This reduces the number of arguments that we need to
pass around and also makes it easier to extend the feature.
This commit also exposes the Transmission Bounce Depth to the Light Path
node. It works similarly to the Transparent Depth output: replace a
Transmission light path after X bounces with another shader, e.g. a Diffuse
one. This can be used to avoid black surfaces caused by a low number of max
bounces.
Reviewed by Sergey and Brecht, thanks for some help with this.
I tested compilation and usage on CPU (SVM and OSL), CUDA, OpenCL Split
and Mega kernel. Hopefully this covers all devices. :)
While the SCons build system served us really well for ages, it no longer
gets much attention from the developers and has become quite difficult
to maintain.
What's even worse, there has been quite serious divergence between SCons
and CMake, which has only accumulated over the releases. The fact that none
of the active developers really uses SCons, and that our main studio also
uses CMake, means that spotting bugs in the SCons builds became quite difficult,
and we aren't always spotting them in time.
Meanwhile CMake has become a really mature build system which is available on every
platform we support, and arguably it's also easier and more robust to use.
This commit includes:
- Removal of actual SCons building system
- Removal of SCons git submodule
- Removal of documentation which is stored in the sources and covers SCons
- Tweaks to the buildbot master to stop using SCons submodule
(this change requires deploying to the server)
- Tweaks to the install dependencies script to skip installing or mentioning
SCons building system
- Tweaks to various helper scripts to avoid mention of SCons folders/files
as well
Reviewers: mont29, dingto, dfelinto, lukastoenne, lukasstockner97, brecht, Severin, merwin, aligorith, psy-fi, campbellbarton, juicyfruit
Reviewed By: campbellbarton, juicyfruit
Differential Revision: https://developer.blender.org/D1680
The issue was caused by OSL using TLS which is required to be freed before the
Cycles session is freed. This is quite tricky to do in Cycles because different
render sessions share the same task scheduler, so when one session is being
freed the TLS might still need to be active.
In order to solve this, we are now doing JIT optimization ahead of time,
which ensures that the TLS used by the JIT is freed either before the render on
multi-core systems or when the OSLRenderSession is destroyed on single-core systems.
This might increase synchronization time due to JIT of unused functions, but
we can solve that later with some smart idea.
This commit overrides the user's choice of tile order in the case of viewport rendering and always uses bottom-to-top instead.
This was already done until the TileManager redesign, but since it removed the distinction between viewport and regular rendering
in the manager, the viewport was now also using the selected order. Since this requires sorting of the generated tiles,
it slows down rendering a bit. With the forced bottom-to-top order, this sorting step can now be avoided again.
Since the tile order is invisible anyways for viewport rendering, this commit won't have any impact on users (apart from a slight speedup).
This commit adds "Bands Saw" and "Rings Saw" to the options for the Wave texture node in Cycles, behaving similar to the Saw option in BI textures.
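As a rough illustration of what a saw profile means compared to the default sine profile, here is a small Python sketch. It is an assumption about the general shape only: the function names are hypothetical, and the coordinate scaling and distortion of the actual node are left out.

```python
import math


def sine_profile(n):
    # Smooth wave in the range 0..1.
    return 0.5 + 0.5 * math.sin(n)


def saw_profile(n):
    # Linear ramp from 0 to 1 that restarts every full period (2*pi).
    return (n / (2.0 * math.pi)) % 1.0


def bands_saw(x, y, z):
    # Hypothetical "bands" input: the wave travels along the x+y+z diagonal.
    return saw_profile(x + y + z)


def rings_saw(x, y, z):
    # Hypothetical "rings" input: the wave depends on distance from the origin.
    return saw_profile(math.sqrt(x * x + y * y + z * z))
```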
Requested by @cekuhnen on BA.
Reviewers: dingto, sergey
Subscribers: cekuhnen
Differential Revision: https://developer.blender.org/D1699
This is an attempt to emulate real CMOS cameras, which read the sensor by scanlines,
so different scanlines are sampled at different moments in time. This
causes the so-called rolling shutter effect, which will, for example, make
straight vertical lines appear curved during a horizontal camera pan.
This is controlled by the Shutter Type option in the Motion Blur panel.
Additionally, since scanline sampling is not instantaneous it's possible to have
motion blur on top of rolling shutter.
This is controlled by the Rolling Shutter Time slider which controls balance
between pure rolling shutter effect and pure motion blur effect.
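One way to picture the balance controlled by the slider is a linear blend between a per-scanline time and the regular motion blur sample time. The Python sketch below is an assumption made for illustration, not necessarily what the kernel does; the function name, the parameter names and the top-to-bottom readout direction are all hypothetical.

```python
def rolling_shutter_ray_time(sample_time, y, height, duration):
    """Return the time in 0..1 at which a camera ray is sampled.

    sample_time: regular motion blur sample time in 0..1.
    y, height:   scanline of the pixel and image height in pixels.
    duration:    0 gives a pure rolling shutter, 1 gives pure motion blur.
    """
    # Assumption: scanlines are read out from the top of the image downwards.
    scanline_time = 1.0 - y / height
    # Motion blur contribution, centered around zero and scaled by duration.
    blur = (sample_time - 0.5) * duration
    # The scanline contribution takes the remaining part of the shutter interval.
    return blur + (scanline_time - 0.5) * (1.0 - duration) + 0.5
```

With duration 0 the result depends only on the scanline, and with duration 1 it is exactly the motion blur sample time.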
Reviewers: brecht, juicyfruit, dingto, keir
Differential Revision: https://developer.blender.org/D1624
The vector mapping node was doing some weird mapping of both original and mapped
coordinates. Mapping of the original coordinates was caused by the clamping nature
of the LUT generated from the node. Mapping of the mapped value again was rather
obscure: one needed to constantly keep in mind that the actual value would
be scaled up and moved down.
This commit makes it so values in the vector curve mapping are always absolute.
In fact, it is now behaving quite the same as RGB curve mapping node and the
code could be de-duplicated. Keeping the code duplicated for a bit so it's more
clear what exact parts of the node changed.
Reviewers: brecht
Subscribers: bassamk
Differential Revision: https://developer.blender.org/D1672
This gives about 2x speedup (3.2sec vs. 11.9sec with 32716 handled nodes) when
updating shader for the shader tree.
Reviewers: brecht, juicyfruit, dingto, lukasstockner97
Differential Revision: https://developer.blender.org/D1700
This makes it possible to move some parts of evaluation from the host to the device
and hopefully reduce memory usage by avoiding a full RGBA buffer on the host.
Reviewers: juicyfruit, lukasstockner97, brecht
Reviewed By: lukasstockner97, brecht
Differential Revision: https://developer.blender.org/D1702
Main goal is to make kernel signatures editing easier and less prone to the
errors caused by missing function signature update or so.
This will also make it easier to add new CPU architectures.
Reviewers: juicyfruit, dingto, lukasstockner97, brecht
Reviewed By: dingto, lukasstockner97, brecht
Differential Revision: https://developer.blender.org/D1703
The idea is to have separate sets per node name in order to speed up the
comparison process. This will use a bit more memory and slow down simple
shaders, but the extra memory is not that large and the time penalty is
not really measurable (at least in initial tests).
This cuts the time for de-duplicating 17K nodes by orders of magnitude, and
the overall process now takes 0.01sec on my laptop.
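The effect of keeping one candidate set per node name can be sketched like this. It is a simplified Python illustration with hypothetical names, where a node is just a (name, settings) pair: a node is only compared against earlier nodes with the same name, instead of against every node seen so far.

```python
from collections import defaultdict


def find_duplicates(nodes):
    """Return a dict mapping the index of each duplicate node to the index
    of the earlier node it can be merged into.

    nodes: list of (name, settings) pairs; settings must support ==.
    """
    candidates = defaultdict(list)  # node name -> indices of unique nodes
    merged = {}
    for index, (name, settings) in enumerate(nodes):
        for other in candidates[name]:
            if nodes[other][1] == settings:
                merged[index] = other
                break
        else:
            # No identical node with this name yet, so keep it as a candidate.
            candidates[name].append(index)
    return merged


example_nodes = [
    ("math", {"op": "add"}),
    ("mix", {"fac": 0.5}),
    ("math", {"op": "add"}),
    ("math", {"op": "mul"}),
]
```

The extra dictionary is the memory cost mentioned above, and for tiny shaders the bucketing is pure overhead.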
Use a Summary structure to collect all summary information related to the shader
compilation process, which can then be either simply reported to the log or passed
to some user interface.
This is the type of summary / report which is most flexible and useful, and
something we could use for other parts such as shader optimization.
The idea of this commit is to merge nodes which have identical settings
and matching inputs into a single node in order to minimize the number of
SVM instructions.
This is a quite simple bottom-top graph traversal, and the trickiest part
is how to compare node settings without too much trouble, which seems to
be solved in quite a clean way.
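The traversal can be sketched as follows. This is a simplified Python illustration, not the Cycles implementation: the node layout and names are hypothetical, and a dictionary lookup stands in for the settings comparison. Nodes are visited inputs-first, so by the time a node is examined its inputs have already been replaced by their surviving duplicates.

```python
def deduplicate(nodes):
    """Return a dict mapping every node id to the id of the node that survives.

    nodes: list of dicts with keys "id", "type", "settings" (a hashable tuple)
    and "inputs" (ids of upstream nodes), sorted so that inputs come first.
    """
    survivor = {}    # node id -> id of the node it is merged into
    first_seen = {}  # (type, settings, merged inputs) -> first node id
    for node in nodes:
        merged_inputs = tuple(survivor[i] for i in node["inputs"])
        key = (node["type"], node["settings"], merged_inputs)
        survivor[node["id"]] = first_seen.setdefault(key, node["id"])
    return survivor


graph = [
    {"id": "a", "type": "tex", "settings": ("a.png",), "inputs": []},
    {"id": "b", "type": "tex", "settings": ("a.png",), "inputs": []},
    {"id": "c", "type": "math", "settings": ("add",), "inputs": ["a"]},
    {"id": "d", "type": "math", "settings": ("add",), "inputs": ["b"]},
    {"id": "e", "type": "out", "settings": (), "inputs": ["c", "d"]},
]
result = deduplicate(graph)
```

In this example b is merged into a, which in turn makes d identical to c, so only three of the five nodes remain.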
Still possibilities for further improvements:
- Support comparison of BSDF nodes
- Support comparison of volume nodes
- Support comparison of curve mapping/ramp nodes
Reviewers: brecht, juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1673
Issue was that dispatchEvent might call removeWindowEvents/
removeTypeEvents which will delete the event before we can do so.
To address this, handled events are now put in a separate list.
Reported by psy-fi and reviewed by brecht in IRC.
The events are allocated on the heap, then pushed on a stack. Before
being processed, they are popped from the stack, and deleted after
processing is done. When the manager is destroyed (e.g. application
closing), any remaining event in the stack is destroyed.
Issue is that when the "application closing" event is processed, it is
never freed, because the manager gets destroyed before the call to
`delete` is made and the event is not on the stack anymore.
Now events are left on the stack while they are processed, and only
popped and deleted after processing is done.
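The ownership change can be sketched in Python, with a list of freed events standing in for `delete`. The class and method names are hypothetical. The point is that the event stays on the stack while its handler runs, so a manager destroyed from inside the handler still finds the event and frees it.

```python
from collections import deque


class EventManager:
    # Hypothetical sketch; the "freed" list stands in for calls to delete.
    def __init__(self):
        self.events = deque()
        self.freed = []

    def push(self, event):
        self.events.appendleft(event)

    def dispatch_one(self, handler):
        event = self.events[-1]  # peek only: the event stays on the stack
        handler(event)
        # Pop and free only if the handler did not already tear things down.
        if self.events and self.events[-1] is event:
            self.events.pop()
            self.freed.append(event)

    def destroy(self):
        # Free whatever is still on the stack.
        while self.events:
            self.freed.append(self.events.pop())


manager = EventManager()
manager.push("application closing")
manager.dispatch_one(lambda event: manager.destroy())
```

After the dispatch above, the closing event has been freed exactly once and the stack is empty.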
As a slight bonus refactor: use void as return type for dispatch events
functions, as no caller is checking the return value, and it is not
clear what it means (suggested by the reviewer).
Reviewers: brecht
Differential Revision: https://developer.blender.org/D1695
Historically Blender had an audio sample rate of 44.1 kHz as default, which is mostly popular because it's the sample rate of audio CDs. Audaspace kept using this default from the pre-2.5 era. It was about time to change to 48 kHz, which is a more widespread standard nowadays, especially in video. It is the recommended sampling rate of the Audio Engineering Society.
Further reading: https://en.wikipedia.org/wiki/44,100_Hz#Status
- When rendering in the Viewport, next_tile is sometimes called after a reset has been performed, but before
new tiles were generated. In that case, the tile list would be invalid, causing Blender to crash randomly.
- When generating new tiles, the TileManager would not clear the tile lists before re-generating them, leading
to some tiles being skipped during viewport rendering.
- When popping the next tile from a tile list, a reference to the just-deleted object would be returned. Now the
object is copied before deleting it.
This way socket type conversions (such as color to float, or float to vector) do not stop the folding process.
Example: http://www.pasteall.org/pic/show.php?id=96803 (selected nodes are folded).
This commit modifies the TileManager to sort render tiles once after tiling the image,
instead of searching the next tile every time a new tile is acquired by a device.
This makes acquiring a tile run in constant time, therefore the render time is linear
w.r.t. the amount of tiles, instead of the quadratic dependency before.
Furthermore, each (logical) device now has its own Tile list, which makes acquiring
a tile for a specific device easier.
Also, some code in the TileManager was deduplicated.
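The complexity argument can be illustrated with a short Python sketch. The function names and the center-first sort key are hypothetical, chosen only as an example of a tile order: tiles are sorted once after tiling, and acquiring the next tile is then a constant-time pop from the front of a queue.

```python
from collections import deque


def generate_tiles_center_first(tiles_x, tiles_y):
    # Build all tiles once and sort them by distance to the image center.
    center_x = (tiles_x - 1) / 2
    center_y = (tiles_y - 1) / 2
    tiles = [(x, y) for y in range(tiles_y) for x in range(tiles_x)]
    tiles.sort(key=lambda t: (t[0] - center_x) ** 2 + (t[1] - center_y) ** 2)
    return deque(tiles)


def next_tile(queue):
    # Constant time: no search over the remaining tiles is needed.
    return queue.popleft() if queue else None


queue = generate_tiles_center_first(3, 3)
first = next_tile(queue)
```

Searching for the best remaining tile on every request costs time proportional to the number of tiles left, which is where the quadratic total came from.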
Reviewers: dingto, sergey
Differential Revision: https://developer.blender.org/D1684
Returning NULL when the socket is not found is actually intended
behavior. It's used in certain BSDF nodes to query whether some inputs exist
or not.
Perhaps we can be more explicit here and have dedicated logic to query
socket existence and keep the assert in place.
In any case, even if we lost assert() for the constant fold now it's
still somewhat better than duplicated code. Perhaps.
Use float in moto instead of double for MT_Scalar.
This switch allows future optimizations like SSE.
Additionally, it changes the OpenGL calls to their float versions, as the
double versions perform very badly.
Reviewers: campbellbarton, moguri, lordloki
Reviewed By: lordloki
Subscribers: brecht, lordloki
Differential Revision: https://developer.blender.org/D1610
Performance is about the same or slightly better for typical IK chains.
In extreme cases with many bones and multiple targets, of which some are
unreachable, I've seen 2x speedups.
Maybe this is pedantic but I read it’s best to explicitly set the
desired component size.
Also append “_ARB” to float texture formats since those need an
extension in GL 2.1.
Use new GPU_legacy_support() function.
Determine GLSL version once instead of per shader.
For Texture Buffers, allow ARB or EXT version of the extension. Either
one will do.
In practice this gives us a context that is *compatible* with GL 2.1. On
my machine it gives a GL 3.3 or 4.3 compatibility profile context,
depending on graphics card installed.
Also fixed enum for core profile (not used yet).
Also added option for GL 3.2 compatibility profile. This will be useful
during Blender 2.8 development, until we are able to use the core
profile. On my machine this gives exactly a GL 3.2 compatibility profile
context, not 3.3 or 4.
My previous edit to this check was too lax.
OSD's shader for the Transform Feedback evaluator declares itself
#version 410 so disable the feature if user's GL < 4.1.
This way, connecting Value or RGB node to e.g. a Math node will still allow folding.
Note: The same should be done for the ConvertNode, but I leave that for another day.
Previously the RGB Curves node would clamp its input to 0..1, which is rather useless
when one wants to use HDR image textures and do a bit of correction on them.
Now the kernel code supports linear extrapolation of the baked LUT, based on
the first/last two table points.
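The extrapolation can be sketched as follows. This is a minimal Python illustration with hypothetical names, assuming the table samples the curve uniformly over a known range; it is not the kernel code. Outside the baked range, the slope between the first or last two table points is continued linearly.

```python
def lut_lookup(table, range_min, range_max, x):
    """Evaluate a baked curve at x.

    table samples the curve uniformly on [range_min, range_max] and needs
    at least two entries. Inside the range the lookup interpolates linearly,
    outside it extrapolates from the first or last two table points.
    """
    count = len(table)
    step = (range_max - range_min) / (count - 1)
    if x < range_min:
        slope = (table[1] - table[0]) / step
        return table[0] + slope * (x - range_min)
    if x > range_max:
        slope = (table[-1] - table[-2]) / step
        return table[-1] + slope * (x - range_max)
    position = (x - range_min) / step
    index = min(int(position), count - 2)
    fraction = position - index
    return table[index] * (1.0 - fraction) + table[index + 1] * fraction


identity_table = [0.0, 0.5, 1.0]
```

With the identity table above, an HDR input of 2.0 maps to 2.0 instead of being clamped to 1.0.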
The only tricky part is to guess the range to bake the LUT for. Currently
it's using simple approach -- minmax of the input curves. While this behaves
ok for the simple cases it's easy to trick the system up causing incorrect
results.
Not sure we can solve those issues in a general case and since the new code
is giving more expected results it's not that bad actually. In the worst
case an artist might always create an explicit point to make sure the LUT is created
for the needed HDR range.
Reviewers: brecht, juicyfruit
Subscribers: sebastian_k
Differential Revision: https://developer.blender.org/D1658
This must have happened months ago, but as I did not `make clean` any build folder since then,
I only noticed it today.
The issue is the same as the dirty patch we have to apply to the OSL sources before building them in install_deps.sh: for
some mysterious reason, it has become impossible to compile .osl files into .oso ones without
giving an explicit output file name (otherwise it just produces a `.oso` file, which is utterly stupid and useless).
We could probably fix that in our own OSL source, but I think being explicit here does not hurt anyway, so...
Let's go the easy way.
This reduces stress on the stack memory, which could be really handy
on certain operating systems which apply strict limits on the stack.
Reviewers: brecht, juicyfruit, dingto
Reviewed By: brecht, juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1656