Root of the issue goes back to the on-fly normals commit and the
latest fix for it wasn't actually correct. I've mixed two fixes
in there.
So the idea here goes back to storing negative scaled object flag
and flip runtime-calculated normal if this flag is set, which is
pretty much the same as the original fix for the issue from me.
The issue with motion blur wasn't caused by the rumtime normals
patch and it had issues before, because it already did runtime
normals calculation. Now made it so motion triangles takes the
negative scale flag into account.
This actually makes code more clean imo and avoids rather confusing
flipping code in mesh.cpp.
This is rather workaround solution for now, which seems to
work and it's not that huge to maintain (one liner apart from
the comment).
Idea is to make sure PeekMessage peeks the message when window
proc receives WM_MOUSEWHEEL (some touchpad drivers seems to
swallow the messages making it so PeekMessage doesn't get
anything).
We can't really make CPU and GPU results look the same in all possible
circumstances, but here we can make them look close enough to each other
by making it so sobol pattern for bounce number is the smae for both
CPU and GPU.
This makes CPU and GPU render results look the same with low number of
samples, high number of samples was never an issue.
This commit merges the code in the pie-menu branch.
As per decisions taken the last few days, there are no pie menus
included and there will be an official add-on including overrides of
some keys with pie menus. However, people will now be able to use the
new code in python.
Full Documentation is in http://wiki.blender.org/index.php/Dev:Ref/
Thanks:
Campbell Barton, Dalai Felinto and Ton Roosendaal for the code review
and design comments
Jonathan Williamson, Pawel Lyczkowski, Pablo Vazquez among others for
suggestions during the development.
Special Thanks to Sean Olson, for his support, suggestions, testing and
merciless bugging so that I would finish the pie menu code. Without him
we wouldn't be here. Also to the rest of the developers of the original
python add-on, Patrick Moore and Dan Eicher and finally to Matt Ebb, who
did the research and first implementation and whose code I used to get
started.
- Forgot to handle command line arguments
- Because of the fact we need to be able to
use stdout and stderr we need to use regular
console application for the wrapper.
- Because of using regular application for the
wrapper we need to check forparent PID in the
isStartedFromCommandPrompt().
I really hope it's not gonna to become any more
complicated.
In collaboration with Sergey Sharybin.
Also thanks to Wolfgang Faehnle (mib2berlin) for help testing the
solutions.
Reviewers: sergey
Differential Revision: https://developer.blender.org/D690
For now it was mainly about OpenCL wrangler being duplicated
between Cycles and Compositor, but with OpenSubdiv work those
wranglers were gonna to be duplicated just once again.
This commit makes it so Cycles and Compositor uses wranglers
from this repositories:
- https://github.com/CudaWrangler/cuew
- https://github.com/OpenCLWrangler/clew
This repositories are based on the wranglers we used before
and they'll be likely continued maintaining by us plus some
more players in the market.
Pretty much straightforward change with some tricks in the
CMake/SCons to make this libs being passed to the linker
after all other libraries in order to make OpenSubdiv linked
against those wranglers in the future.
For those who're worrying about Cycles being less standalone,
it's not truth, it's rather more flexible now and in the future
different wranglers might be used in Cycles. For now it'll
just mean those libs would need to be put into Cycles repository
together with some other libs from Blender such as mikkspace.
This is mainly platform maintenance commit, should not be any
changes to the user space.
Reviewers: juicyfruit, dingto, campbellbarton
Reviewed By: juicyfruit, dingto, campbellbarton
Differential Revision: https://developer.blender.org/D707
The issue was caused by the render engine loading edit mesh, which re-allocates
mesh array which might be referenced by other object's derived meshed.
Worst thing about this is that updating render engine happens from the end of
scene update function, after all the objects are updated and so. This is needed
so render engine gets the update objects which is correct.
The only proper way to solve the issue is to make it so viewport engine does not
leave objects in inconsistent state, meaning nobody will reference to freed data.
In order to reach this we do edit mesh loading before running objects update so
all the objects which uses that mesh will have proper references in the derived
mesh.
This also solves old creepyness which happened before when having single object
in edit mode. tweaking it will calculate derived mesh as a part of scene update,
then this derived mesh will be freed by edit mesh loading and viewport will be
creating derived mesh again.
Now render engine is expected to do nothing with meshes which are in edit mode,
but they still need to load edit data for non0meshes. It's not really easy to
do from the BKE level because needed functions are implemented in the editor.
Thanks Campbell for the review!
Differential Revision: https://developer.blender.org/D697
(original patch by Sergey Sharybin)
Note: RNA API can't use size_t at the moment. Once it does this patch
can be tweaked a bit to fully benefit from size_t larger dimensions.
(right now num_pixels is passed as int)
Reviewed By: sergey, campbellbarton
Differential Revision: https://developer.blender.org/D688
uses old behavior again. OSX menu and CTL-CMD-F still work as lion fullscreen as well as right-upper corner fs window-icon
- We must investigate here why double promotion happens from op calls ( dispatchEvents on redraw cause duplicated calls here )
- The actual op calls cause fs to be in a wrong state, so also mousehandles fail and CTX_wm_window(C) is not valid.
- similar problem is with quit op, which does not close the app right ( totblocks )
- i would prefer to try getting direct os function call here rather
Baking progress preview is not possible, in parts due to the way the API
was designed. But at least you get to see the progress bar while baking.
Reviewers: sergey
Differential Revision: https://developer.blender.org/D656
Also extended the size of buf[] in print_error() to prevent mem_printmemlist_pydict_script[]
from getting truncated when MEM_printmemlist_pydict() is used.
Differential revision: https://developer.blender.org/D675
Reviewed by: Campbell Barton
Fix T41115: Motion Blur renders Objects Black - But not in Viewport Preview
This actually extends previous fix to normals and makes it all much nicer now.
Worth doing some intense testing, quick one worked just fine but there always
could be some corner cases.
Fix T41079: Solid black render of object with negative scale and smooth shading
In both cases the issue was caused by negative scaled objects with single mesh
users for which scale gets applied when using static BVH.
Since the on-fly normals calculation land normals for such cases weren't flipped
leading them to point to a wrong direction.
Added a special object flag for this, which is a bit of a bummer because now
we've got less bits for real useful things, but this is the only way to get
proper normals without adding more complexity in the on-fly calculations.
This is related to Task T34861 to increase up & track axis options for TrackTo actuator. I've just added it to differential to facilitate an easier review.
With the patch applied you can select X, Y and Z axis for the Up axis, and X, Y, Z, -X, -Y and -Z for the track axis.
Related to the implementation I have used the algorithm from Trackto constrain placed in constrain.c but adapted to be used with MOTO library.
The wiki docs are here (http://wiki.blender.org/index.php/User:Lordloki/Doc:2.6/Manual/Game_Engine/Logic/Actuators/Edit_Object#Trackto_Actuator).
Test file is here: {F97623}
I have also uploaded 2 screenshots showing the UI modifications to the TrackTo actuator:
{F91992} {F91990}
Reviewers: moguri, dfelinto
Reviewed By: moguri
CC: Genome36
Differential Revision: https://developer.blender.org/D565
Fix T41013: OSL and Crash
Fix T40989: Intermittent crash clicking material color selector
Issue was caused by not enough precision for inversion threshold.
Use double precision for this threshold now. We might want to
investigate this code a bit more further, stock implementation
uses doubles for all computation. Using floats might be a reason
of bad rows distribution in theory.
material is applied to multiple Dupli instances of an object.
One of the random_id initialization lines for cycles objects slipped
into the basic update part in this commit:
rBb98ff5cb5b2c14c33b16e3b129e1e08810e90a6c
This would constantly re-shuffle the random_id ...
Fix T39286: Display percentage ignored in Cycles viewport.
The threaded depsgraph update changes included a cleanup of the global
is_rendering flag, which was replaced by a general EvalContext being
passed to dupli functions.
Problem is that the global flag was true for viewport duplis before
(ugly hack), which was used as a check for generating dupli orco/UV from
mesh data layers. The new flag is stricter and only true for actual
renders, which disables these attributes and breaks the Cycles
Texture Coordinates and UVMap nodes.
The solution is to extend the simple for_render boolean to an enum:
* VIEWPORT: OpenGL viewport drawing (dupli tex coords omitted)
* PREVIEW: Viewport preview render (simplified modifiers)
* RENDER: Full render with all details and attributes
There are still some areas that need to be examined, in particular
modifiers seem to totally ignore the EvaluationContext!
Instead they generally execute without render params from the depsgraph
(BKE_object_handle_update_ex) and are built with render settings
explicitly.
Differential Revision: https://developer.blender.org/D613
* malloc() is used now, which is supported since sm_20: http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations The performance of this needs to be tested on various cards still.
* This also works for Heterogeneous Decoupled Ray Marching, but in this case I get sporadic "Illegal Address" errors on my Geforce 540, therefore I did not remove the GPU check in kernel_volume_use_decoupled() yet.
I would appreciate some tests from people who compile themselves, enable Volumetrics in kernel_types.h.
* Put Normal Settings beneath the other ones, wild button jumping should be avoided.
* Remove Cage prefix for Object and Extrusion, it's clear from the button placement, the former UI was too squeezed...
* CUDA can be compiled with Volume support again, change line 78 kernel_types.h for that.
Volumes are still fragile on GPU though, got some Memory/Address CUDA errors in tests.. needs to be investigated more deeply.
The problem is that the render window keeps keybord input focus even after it has been
lowered. Windows maintains the Z-order of windows, so the present solution is to raise
the window that has been just below the render window.
Differential revision: https://developer.blender.org/D594
Reviewed by: campbellbarton
Volume scatter might happen before path termination, so
need to check transparent bounces and consider shadow an
opaque when max transparent bounces are reached.
TODO: CPU code seems to have different branching in conditions
which made me thinking it does different things with volume
attenuation, but from the render results it seems the same
exact things are happening there. Worth looking into making
simplifying code a bit here to improve readability.
It turns out that the new Beckmann sampling function doesn't work well with
Quasi Monte Carlo sampling, mainly near normal incidence where it can be worse
than the previous sampler. In the new sampler the random number pattern gets
split in two, warped and overlapped, which hurts the stratification, see the
visualization in the differential revision.
Now we use a precomputed table, which is much better behaved. GGX does not seem
to benefit from using a precomputed table.
Disadvantage is that this table adds 1MB of memory usage and 0.03s startup time
to every render (on my quad core CPU).
Differential Revision: https://developer.blender.org/D614
This way util_simd.cpp would not require modifications
if/when SSE2 is suddenly supported on 32bit platforms.
This also allowed to unleash some issues with util_simd.h
related on the fact that there size_t and int are actually
the same types.
* Anisotropic BSDF now supports GGX and Beckmann distributions, Ward has been
removed because other distributions are superior.
* GGX is now the default distribution for all glossy and anisotropic nodes,
since it looks good, has low noise and is fast to evaluate.
* Ashikhmin-Shirley is now available in the Glossy BSDF.
* Ashikhmin-Shirley anisotropic BSDF was added as closure
* Anisotropic BSDF node now has two distributions
Reviewers: brecht, dingto
Differential Revision: https://developer.blender.org/D549
Samples render slower than before, but hopefully this is made up for with
reduced noise in most cases. The main slowdown comes from samples that would
previously be wasted and turn out black, which are now continued.
GGX sampling is about the same speed as before, while for Beckmann it is slower
still. Perhaps optimizations are still possible there, but didn't find anything
easy.
Code from this paper, which comes with sample code:
Importance Sampling Microfacet-Based BSDFs using the Distribution of Visible Normals.
E. Heitz and E. d'Eon, EGSR 2014
Differential Revision: https://developer.blender.org/D572
This gives you "Multiple Importance", "Distance" and "Equiangular" choices.
What multiple importance sampling does is make things more robust to certain
types of noise at the cost of a bit more noise in cases where the individual
strategies are always better.
So if you've got a pretty dense volume that's lit from far away then distance
sampling is usually more efficient. If you've got a light inside or near the
volume then equiangular sampling is better. If you have a combination of both,
then the multiple importance sampling will be better.
* Volume multiple importace sampling support to combine equiangular and distance
sampling, for both homogeneous and heterogeneous volumes.
* Branched path "Sample All Direct Lights" and "Sample All Indirect Lights" now
apply to volumes as well as surfaces.
Implementation note:
For simplicity this is all done with decoupled ray marching, the only case we do
not use decoupled is for distance only sampling with one light sample. The
homogeneous case should still compile on the GPU because it only requires fixed
size storage, but the heterogeneous case will be trickier to get working.
* Added support for uchar4 attributes to Cycles' attribute system.
* This is used for Vertex Colors now, which saves some memory (4 unsigned characters, instead of 4 floats).
* GPU Texture Limit on sm_20 and sm_21 decreased from 95 to 94, because we need a new texture for the uchar4 attributes. This is no problem for sm_30 or newer.
Part of my GSoC 2014.
This kernel is compiled with AVX2, FMA3, and BMI compiler flags. At the moment only Intel Haswell benefits from this, but future AMD CPUs will have these instructions as well.
Makes rendering on Haswell CPUs a few percent faster, only benchmarked with clang on OS X though.
Part of my GSoC 2014.
Instead of pre-calculation and storage, we now calculate the face normal during render.
This gives a small slowdown (~1%) but decreases memory usage, which is especially important for GPUs,
where you have limited VRAM.
Part of my GSoC 2014.
This makes the code a bit easier to understand, and might come in handy
if we want to reuse more Embree code.
Differential Revision: https://developer.blender.org/D482
Code by Brecht, with fixes by Lockal, Sergey and myself.
This gives around 30% of speedup for gaussian blur node.
Pretty much straightforward implementation inside the node
itself, but needed to implement some additional things:
- Aligned malloc. It's needed to load data onto SSE registers
faster. based on the aligned_malloc() from Libmv with
some additional trickery going on to support arbitrary
alignment (this magic is needed because of MemHead).
In the practice only 16bit alignment is supported because
of the lack of aligned malloc with arbitrary alignment
for OSX. Not a bit deal for now because we need 16 bytes
alignment at this moment only. Could be tweaked further
later.
- Memory buffers in compositor are now aligned to 16 bytes.
Should be harmless for non-SSE cases too. just mentioning.
Reviewers: campbellbarton, lukastoenne, jbakker
Reviewed By: campbellbarton
CC: lockal
Differential Revision: https://developer.blender.org/D564
This means packed images and movies are now supported when using OSL
backend for material shading.
Uses special file name to distinguish whether image is builtin or not.
This part might become a bit smarted or optimized a bit, but it's good
enough with this implementation already.
Suggestion by Andy Davies (metalliandy) to conform with industry standard (custom cage is something else apparently)
Note: this is the last bake related commit I plan for 2.71/rc (unless
everyone agrees that we could squeeze in D546 - custom UVs, which would
be really nice to add for 2.71 scripters)
Note 2: I'll update the wiki docs shortly
There is a new option to select whether you want to use cage or not.
When not using cage the results will be more similar with Blender
Internal, where the inwards rays (trying to hit the highpoly objects)
don't always come from smooth normals. So if the active object has sharp
edges and an EdgeSplit modifier you get bad corners.
This is useful, however, to bake to planes without the need of adding
extra loops around the edges.
When cage is "on" the user can decide on setting a cage extrusion or to
pick a Custom Cage object. The cage extrusion option works in a
duplicated copy of the active object with EdgeSplit modifiers removed to
inforce smooth normals. The custom cage option takes an object with the
same number of faces as the active object (and the same face ordering).
The custom cage now controls the direction and the origin of the
rays casted to the highpoly objects. The direction is a ray from the
point in the cage mesh to the equivalent point to the base mesh. That
means the face normals are entirely ignored when using a cage object.
For developers:
When using an object cage the ray is calculated from the cage mesh to
the base mesh. It uses the barycentric coordinate from the base mesh UV,
so we expect both meshes to have the same primitive ids (which won't be
the case if the cage gets edited in a destructive way).
That fixes T40023 (giving the expected result when 'use_cage' is false).
Thanks for Andy Davies (metalliandy) for the consulting with normal
baking workflow and extensive testing. His 'stress-test' file will be
added later to our svn tests folder. (The file itself is not public yet
since he still has to add testing notes to it).
Many thanks for the reviewers.
More on cages:
http://wiki.polycount.com/NormalMap/#Working_with_Cages
Reviewers: campbellbarton, sergey
CC: adriano, metalliandy, brecht, malkavian
Differential Revision: https://developer.blender.org/D547
Now baking does one AA sample at a time, just like final render. There is
also some code for shader antialiasing that solves T40369 but it is disabled
for now because there may be unpredictable side effects.
Cycles expects to find all node sockets with their correct names, but
this can be changed via the API (see bug report discussion).
Solution for now is to let cycles accept this case gracefully instead
of crashing. The shader will simply use the internal default values for
inputs and any connections will be ignored.
Would be nice to report the error somewhere, but cycles doesn't have a
proper logging system for this purpose yet.
it's possible that runtime optimizer would call get_attribute
with NULL renderstate. As per documentation, it's valid to
return false in that cases and in worst case we'll just miss
some possible optimization.
Supporting such cases would require some bigger changes to
Cycles since attributes are only set to up for the kernel
after shader compilation.
Thanks Brecht for review!
Transparent objects could become subtly visible by the different sampling
patterns for pixels covered and not covered by the object. It still converged
to the right solution but that can take a while. Now we try to use the same
sampling pattern here.
This reverts commit 12abe94de827d9ae9c0dd6cc49bc6c3e377842ad.
After a long discussion in the bug tracker we decided baking should use
the faces normals for glossy (and combined). This is what Blender
Internal is doing, and one of the more predictable way of yielding
predictable results.
That also means the result will not match the render perfectly, but this
is preferrable over the alternatives at hand.
Conflicts:
intern/cycles/kernel/kernel_bake.h
Comments from Brecht Van Lommel:
"""
Currently the viewing direction for each pixel is set to the normal, so
at every pixel glossy is evaluated as if you're looking straight at it.
Blender Internal works the same.
"""
This patch makes baking glossy as viewed from the camera.
Reviewers: brecht
CC: zanqdo
Differential Revision: https://developer.blender.org/D555
The kernel for baking the world texture was the same as the one used for
baking. Now that's separate which allows the kernel to reserve much less
memory.
CMake had this --fast-math flag but scons not, makes a big difference on some
files. Slightly slower rendering might still happen though, but it should not
be this much.
The remaining functions in blender_python.cpp changed from using the
MACRO to use python_thread_state_save/python_thread_state_restore
Since this bug only happens when 'Persistent Images' is on it was
introduced in some of the early merges with master and I never caught
it.
Thanks Daniel Salazar for helping with the bug hunting.
If we are using a mix node we still need to evaluate the BSDF lighting
even if scattering is successful.
Note: this was working for branched path (probably an oversight when
branched path support was introduced for baking, a good oversight though
;)
The bug was caused by using negative numbers as the end for playing forever (or until the end of the sound is reached) in the library. This was used with speaker objects which have an end of FLT_MAX now instead and the negative number interpretation was removed. I hope this doesn't break anything else.
Fixes T40027. This means we get more CPU usage again when using multiple CUDA,
but the impact on performance is too big a problem with the current code.
Instead of 95, we can use 145 images now. This only affects Kepler and above (sm30, sm_35 and sm_50).
This can be increased further if needed, but let's first test if this does not come with a performance impact.
Originally developed during my GSoC 2013.
This fixes the SSS Direct/Indirect passes as well as the Combined pass.
Patch reviewed and with fixes and contributions from Brecht van Lommel.
Note: displacement/bump map (related to the report) will be handled separately
Reviewers: brecht
Differential Revision: https://developer.blender.org/D503
Now the COMBINED pass includes the Ambient Occlusion.
This was not reported anywhere, but while working in the Subsurface Scattering I realize we needed this fix for combined.
This was faster for my AMD system but slower for Intel.
However with gcc4.9,-O3 I was able to get roughly the same speed before/after.
Revert since this isnt giving such clear benefits on most systems.
Expand Cycles to use the new baking API in Blender.
It works on the selected object, and the panel can be accessed in the Render panel (similar to where it is for the Blender Internal).
It bakes for the active texture of each material of the object. The active texture is currently defined as the active Image Texture node present in the material nodetree. If you don't want the baking to override an existent material, make sure the active Image Texture node is not connected to the nodetree. The active texture is also the texture shown in the viewport in the rendered mode.
Remember to save your images after the baking is complete.
Note: Bake currently only works in the CPU
Note: This is not supported by Cycles standalone because a lot of the work is done in Blender as part of the operator only, not the engine (Cycles).
Documentation:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Bake
Supported Passes:
-----------------
Data Passes
* Normal
* UV
* Diffuse/Glossy/Transmission/Subsurface/Emit Color
Light Passes
* AO
* Combined
* Shadow
* Diffuse/Glossy/Transmission/Subsurface/Emit Direct/Indirect
* Environment
Review: D421
Reviewed by: Campbell Barton, Brecht van Lommel, Sergey Sharybin, Thomas Dinge
Original design by Brecht van Lommel.
The entire commit history can be found on the branch: bake-cycles
Probably will not be noticed in most scenes. This helps reduce noise when you
have multiple lamps with MIS enabled, at the cost of some performance, but from
testing some scenes this seems better.
The was actually caused by the way how Cycles uses objects
layers. It's not possible to rely on the fact that layers
are flushed from Base to Object. It's only valid when rendering
active scene.
Now made it so layers are used from the base.
This also updates the configurations to build kernels for compute capability
5.0 cards, when using and older CUDA toolkit version this will be skipped.
Also includes tweaks to improve performance with this version:
* Increase max registers on sm_30, sm_35 and sm_50
* No longer use texture storage on sm_30
The formula was not consistent across Blender and behaved strangely, now it is
a simple linear blend between color1 and min(color1, color2).
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D489
This caused a couple of fireflies in koro_final.blend. The wrong normal would
cause the shading point to be set as backfacing, which triggered another bug
with hair BSDFs on the backface of hair curves. That one is not fixed yet but
there's a comment in the code about it now.
Old algorithm:
Raytrace from one transparent surface to the next step by step. To minimize
overhead in cases where we don't need transparent shadows, we first trace a
regular shadow ray. We check if the hit primitive was potentially transparent,
and only in that case start marching. this gives extra ray cast for the cases
were we do want transparency.
New algorithm:
We trace a single ray. If it hits any opaque surface, or more than a given
number of transparent surfaces is hit, then we consider the geometry to be
entirely blocked. If not, all transparent surfaces will be recorded and we
will shade them one by one to determine how much light is blocked. This all
happens in one scene intersection function.
Recording all hits works well in some cases but may be slower in others. If
we have many semi-transparent hairs, one intersection may be faster because
you'd be reinteresecting the same hairs a lot with each step otherwise. If
however there is mostly binary transparency then we may be recording many
unnecessary intersections when one of the first surfaces blocks all light.
We found that this helps quite nicely in some scenes, on koro.blend this can
give a 50% reduction in render time, on the pabellon barcelona scene and a
forest scene with transparent leaves it was 30%. Some other files rendered
maybe 1% or 2% slower, but this seems a reasonable tradeoff.
Differential Revision: https://developer.blender.org/D473
This was the original code to get things working on old GPUs, but now it is no
longer in use and various features in fact depend on this to work correctly to
the point that enabling this code is too buggy to be useful.
This can for example be useful if you want to manually terminate the path at
some point and use a color other than black.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D454
for one of the input shaders is zero.
This gives about 5% speedup for koro_final.blend. In general this is important
so you can design shaders that run faster for shadows, diffuse bounces, etc, for
example by skipping procedural textures or even using a single fixed color.
Otherwise devices used for display will lock up the UI too much. This means
you might still get 100% CPU for the display device, but for others CPU usage
should be low still.
The check to see if a device is used for display may not be entirely reliable,
it checks if there is a watchdog timeout on the device, but I'm not entirely
sure that always exists for display devices or is disabled for non-display
devices, though some tools like cuda-gdb seem to make the same assumption.
Ref T39559
This makes it easier to have per kernel number of registers. Also, all the
tunable parameters for this are now in kernel.cu, rather than spread over cmake,
scons and device_cuda.cpp.
The problem here was that animation buffers got initialized with zeros in the beginning for unknown parts. Now it gets initialized with the first known value.
The bug's result was that the animation of the pitch started with 0 on first playback and thus any seeking while the pitch is zero resulted in seeking to the beginning.
Nice little mistake, since the invalid mem access only happened once (the first time),
was close to valid mem, and was only used to read, it would not crash often...
Idea and code by Brecht, many thanks!
Reviewers: brecht
Reviewed By: brecht
CC: campbellbarton, dingto
Differential Revision: https://developer.blender.org/D369
This fixes the ptaxs "ACCESS_VIOLATION" error and should allow our Linux and Windows build bots to compile again.
Unfortunately this comes with a performance penalty on sm_2x cards, so this is only a workaround for now. Branched Path is still globally disabled on GPU.
In practice this means that if you don't connect a texture to your volume nodes
it will figure that out and render the node faster, rather than you having to
specify it manually.
Main weakness is custom OSL nodes where we have to assume it is heterogeneous
because we don't know what kind of data the node accesses.
This now uses decoupled ray marching, and removes the probalistic scattering.
What this means is that each AA sample will be slower but contain less noise,
hopefully giving less render time to reach the same noise levels.
For those following along, there's still a bunch of volume sampling improvements
to do: all-light sampling, multiple importance sampling, transmittance threshold,
better indirect light handling, multiple scatter approximation.
This basically records all volumes steps, which can then later be used multiple
time to take scattering samples, without having to step through the volume
again. From the paper:
"Importance Sampling Techniques for Path Tracing in Participating Media"
This works only on the CPU, due to usage of malloc/free.
Similar to surfaces, this will now always scatter rather than probabilistically
scattering or not depending on the transmittance.
This also makes calculation of branched path throughput non-probalistic, which
makes thing slower too. That's to be solved by decoupled ray marching later.
This removes a few optimizations to avoid exp() calls and division, they will be
added back later, at the moment it's more important to make the code easier to
understand and refactor.
Rather use random point in each step instead of giving the steps random sizes.
Gives a bit more accurate results with large step sizes, but also convenient
convention for later changes.
These can currently be accessed by adding an Attribute node and specifying one
of those three names. A Smoke/Fire node should be added at some point to make
this more convenient.
These values might change still before the release, in particular for flame the
meaning seems unclear, it's just values in the 0..1 range. This is useful for
color ramps, but it might be good if this was also available as temperature in
kelvin so it can be plugged into the blackbody node. But I couldn't figure out
from the smoke code if or how this corresponds to a physical unit.
Here's a (quite poor) example file for a fire + smoke setup:
http://www.pasteall.org/blend/27990
These are internally stored as a 3D image textures, but accessible like e.g.
UV coordinates though the attribute node and getattribute().
This is convenient for rendering e.g. smoke objects where data like density is
really a property of the mesh, and it avoids having to specify the smoke object
in a texture node, instead the material will work with any smoke domain.
Notes:
* The motion steps only affect deformation motion blur.
* The actual number of steps is 2^(steps - 1). This avoids having to sample at
many different times for object with more/fewer steps, now the times overlap.
* Deformation motion blur is enabled by default in existing files that have
motion blur enabled. If the object is not deforming, this will be detected at
export time, so raytracing performance will not be affected.
Part of the code is from the summer of code project by Gavin Howard.
This now supports multiple steps and subframe sampling of motion.
There is one difference for object and camera transform motion blur. It still
only supports two steps there, but the transforms are now sampled at subframe
times instead of the previous and next frame and then interpolated/extrapolated.
This will give different render results in some cases but it's more accurate.
Part of the code is from the summer of code project by Gavin Howard, but it has
been significantly rewritten and extended.
added in rB74518b28267e9b18199212fbaa3c689fa018d20c.
No special bind/unbind needed for standalone viewer, so can just use a
static stub in the display callback.
Issue was caused by the wrong usage of OCIO GLSL binding API. To make it
work properly on pre-GLSL-1.3 drivers shader is to be enabled after the
texture is binded to the opengl context. Otherwise it wouldn't know the
proper texture size.
This is actually a regression in 2.70 and to be ported to 'a'.
There are a couple of bugs that come together here:
* Particle hacks: extra modifier stack evaluation just for particles in
rna_Object_create_duplilist. This is where the primary issue stems from,
the "for_render" setting replaced the G.is_rendering flag in threaded
depsgraph. This causes particles to recalculate the entire modifier
stack with _render_ settings instead of viewport settings now. Fixed by
taking the 'preview' parameter in Cycles into account.
* Buggy skin modifier: The skin modifier generates a different amount of
vertices and faces **on every execution**. This must be looked at
separately, but it could be another reason why cycles constantly
restarted the sync process.
* Particles get re-distributed randomly every time (changing seed). This
could be caused just by the broken skin modifier, but might still be an
issue when simply rendering with cycles, since the psys will be
evaluated for render settings, if just temporarily.
Gives ~11% speedup for hair.blend, ~10% for koro_final.blend
Also extract few common subexpressions in hair calculation.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D318