* Remove tex_* and pixels_* functions, replace by mem_*.
* Add MEM_TEXTURE and MEM_PIXELS as memory types recognized by devices.
* No longer create device_memory and call mem_* directly, always go
through device_only_memory, device_vector and device_pixels.
Steps to reproduce:
- Create shader Image texture -> Diffuse BSDF -> Output. Do NOT select image yet!
- Start viewport render.
- Select image from the ID browser of Image Texture node.
Thing is: with the memory manager we always need to inform device that memory
was freed.
Image textures were being packed into a single buffer for OpenCL, which
limited the amount of memory available for images to the size of one
buffer (usually 4gb on AMD hardware). By packing textures into multiple
buffers that limit is removed, while simultaneously reducing the number
of buffers that need to be passed to each kernel.
Benchmarks were within 2%.
Fixes T51554.
Differential Revision: https://developer.blender.org/D2745
This commit unifies the flattened texture slot names for bindless and regular CUDA textures. Texture indices are now identical across all CUDA architectures, where before Fermi used different indices, which lead to problems when rendering on multi-GPU setups mixing Fermi with newer hardware.
The issue was caused by unlimited textures commit, root of the issue is that
displacement code updates some of the image slots directly, so it needs to
ensure device vectors are all proper size.
Not sure if this is a proper fix, but was getting frequent crashes, so
committing this real quick just to make master sable again. Can be
reverted later if there's a better fix. The changes to images really
need a closer look...
Previous fix did not work for mixed textures. This one will over-allocate
information array, but it's better than not being able to render at all.
Some more cleanup and improvement is coming.
This patch allows for an unlimited number of textures in Cycles where the hardware allows. It replaces a number static arrays with dynamic arrays and changes the way the flat_slot indices are calculated. Eventually, I'd like to get to a point where there are only flat slots left and textures off all kinds are stored in a single array.
Note that the arrays in DeviceScene are changed from containing device_vector<T> objects to device_vector<T>* pointers. Ideally, I'd like to store objects, but dynamic resizing of a std:vector in pre-C++11 calls the copy constructor, which for a good reason is not implemented for device_vector. Once we require C++11 for Cycles builds, we can implement a move constructor for device_vector and store objects again.
The limits for CUDA Fermi hardware still apply.
Reviewers: tod_baudais, InsigMathK, dingto, #cycles
Reviewed By: dingto, #cycles
Subscribers: dingto, smellslikedonkey
Differential Revision: https://developer.blender.org/D2650
The idea is to make include statements more explicit and obvious where the
file is coming from, additionally reducing chance of wrong header being
picked up.
For example, it was not obvious whether bvh.h was refferring to builder
or traversal, whenter node.h is a generic graph node or a shader node
and cases like that.
Surely this might look obvious for the active developers, but after some
time of not touching the code it becomes less obvious where file is coming
from.
This was briefly mentioned in T50824 and seems @brecht is fine with such
explicitness, but need to agree with all active developers before committing
this.
Please note that this patch is lacking changes related on GPU/OpenCL
support. This will be solved if/when we all agree this is a good idea to move
forward.
Reviewers: brecht, lukasstockner97, maiself, nirved, dingto, juicyfruit, swerner
Reviewed By: lukasstockner97, maiself, nirved, dingto
Subscribers: brecht
Differential Revision: https://developer.blender.org/D2586
This is a way to avoid possible memory corruption when render threads works
in parallel with UI thread.
Not guarantees complete safe, but makes things easier to check anyway.
Main intention is to give some quick way to control scene's memory
usage by clamping textures which are too big. This is really handy
on the early production stages when you first create really nice
looking hi-res textures and only when it all works and approved
start investing time on optimizing your scene.
This is a new option in Scene Simplify panel and it acts as
following: when texture size is bigger than the given value it'll
be scaled down by half for until it fits into given limit.
There are various possible improvements, such as:
- Use threaded scaling using our own task manager.
This is actually one of the main reasons why image resize is
manually-implemented instead of using OIIO's resize. Other
reason here is that API seems limited to construct 3D texture
description easily.
- Vectorization of uchar4/float4/half4 textures.
- Use something smarter than box filter.
Was playing with some other filters, but not sure they are
really better: they kind of causes more fuzzy edges.
Even with such a TODOs in the code the option is already quite
useful.
Reviewers: brecht
Reviewed By: brecht
Subscribers: jtheninja, Blendify, gregzaal, venomgfx
Differential Revision: https://developer.blender.org/D2362
The code was templated already, so don't see big reason to have
3 versions of templated functions. It was giving some extra code
to maintain and in fact already had divergency for support of huge
image resolution (missing size_t cast in byte image loading).
There should be no changes visible by artists.
Note that volume rendering is not supported yet, this is a step towards that.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D2299
The title says it all actually. From tests with barber shop scene here
gives 2-3x speedup for shader compilation on my oldie i7 machine. The
gain is mainly due to textures metadata query from jpeg files (which
seems to requite de-compression before metadata can be read). But in
theory could give nice improvements for scenes with huge node trees
as well (i'm talking about node trees of complexity of fractal which
we had reports about in the past).
Reviewers: juicyfruit, dingto, lukasstockner97, brecht
Reviewed By: brecht
Subscribers: monio, Blendify
Differential Revision: https://developer.blender.org/D2215
This way OpenCL devices can also benefit from a smaller memory footprint, when using e.g. bumpmaps (greyscale, 1 channel).
Additional target for my GSoC 2016.
Now we have the 4 component ones first (float4, byte4, half4) followed by the 1 component ones (float, byte, half).
Makes code a bit more consistent and also reduces code a bit when enabling half support on GPU in next commit.
This also exposed a typo in half CPU images for 3D textures, which wasn't used yet, but good to have that one fixed anyway.
This is an initial commit for half texture support in Cycles.
It adds the basic infrastructure inside of the ImageManager and support for these textures on CPU.
Supported:
* Half Float OpenEXR images (can be used for e.g HDRs or Normalmaps) now use 1/2 the memory, when loaded via disk (OIIO).
ToDo:
Various things like support for inbuilt half textures, GPU... will come later, step by step.
Part of my GSoC 2016.