This enables VEX encoding in the AVX kernel for Windows MSVC builds and gives a 5-10% speedup, depending on the scene.
Reviewers: juicyfruit, dingto, brecht
Reviewed By: brecht
CC: brecht
Differential Revision: https://developer.blender.org/D284
On Linux/Mac OS X, simply type "make cycles" inside the Blender source directory to get a standalone build of the engine.
Reviewed by: Brecht
Differential Revision: https://developer.blender.org/D228
* AVX is available on Intel Sandy Bridge and newer and AMD Bulldozer and newer.
* We don't use dedicated AVX intrinsics yet, but GCC auto-vectorization gives a 3% performance improvement for Caminandes. Tested on an i5-3570, Linux x64.
* No change for Windows yet, since MSVC 2008 does not support AVX.
Reviewed by: brecht
Differential Revision: https://developer.blender.org/D216
This code can't actually be enabled for building and is incomplete, but it's
here because we know we want to support this at some point and there's not much
reason to have it in a separate branch if a simple #ifdef can disable it.
This is mostly work towards enabling the __KERNEL_SSE__ option to start using
SIMD operations for vector math operations. The SSE 4.1 kernel performs about 8%
faster with that option but overall is still slower than without the option.
WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the CMake flag for testing this kernel.
Alignment of int3, int4, float3, float4 to 16 bytes seems to give a slight 1-2%
speedup on tested systems with the current kernel already, so it is enabled now.
The issue is caused by missing SSE flags for Clang compilers;
these flags were only set for GNU C compilers.
Added an if branch for Clang, which contains the same
flags apart from -mfpmath=sse, because Clang
reports it as an unused argument.
OSX probably needs some further checks since it
also uses Clang. I have no idea why it could have
worked on OSX before.
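The per-compiler branch described above could look roughly like the following sketch. The flag list is a placeholder for illustration only (the actual flags are not listed in this message), and the function name and compiler identifiers are hypothetical; the only behaviour taken from the text is that Clang gets the GNU flags minus -mfpmath=sse.

```python
# Placeholder flag list for illustration; the real flags are not given here.
GNU_SSE_FLAGS = ["-msse", "-msse2", "-mfpmath=sse"]


def sse_flags(compiler_id):
    """Hypothetical helper: pick SSE flags for a given compiler family."""
    if compiler_id == "GNU":
        return list(GNU_SSE_FLAGS)
    if compiler_id == "Clang":
        # Same flags as GNU, except -mfpmath=sse, which Clang reports as unused.
        return [flag for flag in GNU_SSE_FLAGS if flag != "-mfpmath=sse"]
    # Other compilers are left without extra flags in this sketch.
    return []
```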
precompiled cubins instead,
The logic is now as follows:
- If there are precompiled cubins, assume CUDA compute is available,
otherwise
- If the CUDA toolkit is found, assume CUDA compute is available
- In all other cases CUDA compute is not available
On Windows there is still a check for precompiled binaries only;
no runtime compilation is allowed.
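The decision order above can be sketched as a small function. This is only an illustration of the described rules, not the actual build or runtime code; the function name and its parameters are hypothetical.

```python
def cuda_compute_available(has_precompiled_cubins, toolkit_found, is_windows):
    """Hypothetical helper mirroring the decision order described above."""
    # Precompiled cubins are sufficient on every platform.
    if has_precompiled_cubins:
        return True
    # Windows only checks for precompiled binaries; no runtime compilation.
    if is_windows:
        return False
    # Elsewhere, a found CUDA toolkit allows runtime compilation.
    return bool(toolkit_found)
```

For example, a Linux machine with the toolkit installed but no bundled cubins would report CUDA as available, while the same setup on Windows would not.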
This was decided after a discussion with Brecht. The thing
is, if we supported runtime compilation on Windows we would end up
with lots of reports about various things not working
(you need a particular toolkit version, MSVC installed, environment
variables set properly and so on), and giving feedback on such reports
would waste time.
* Compile all of cycles with -ffast-math again
* Add scons compilation of CUDA binaries, tested on Mac/Linux.
* Add UI option for supported/experimental features, to make it
clearer what is supported; OpenCL and subdivision are experimental.
* Remove cycles xml exporter, was just for testing.
* Fix excessive fireflies in Velvet BSDF (patch by David).
* Disable some unused SSE code
* Remove RTTI disabling flags for now, this is giving some compile issues and
was only needed for OSL, which we're not using yet.
* Add back option to bundle CUDA kernel binaries with builds.
* Disable runtime CUDA kernel compilation on Windows. Couldn't get this working,
since it seems to depend on Visual Studio being installed, even though it
shouldn't be needed for this particular case. CMake only at the moment.
* Runtime compilation on Linux/Mac should now work if nvcc is not installed in
the default location, but available in PATH.
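The nvcc lookup from the last item might be sketched like this. It is a rough illustration, not the actual implementation: the default install path is an assumption (the message does not state it), and the function name is hypothetical. The `exists` and `which` parameters are there only so the lookup can be exercised without a real CUDA install.

```python
import os
import shutil

# Assumed default install location; the real default is not stated here.
DEFAULT_NVCC = "/usr/local/cuda/bin/nvcc"


def find_nvcc(default_path=DEFAULT_NVCC, exists=os.path.exists, which=shutil.which):
    """Hypothetical helper: return a path to nvcc, or None if not found."""
    # Prefer the default install location when it is present.
    if exists(default_path):
        return default_path
    # Otherwise fall back to whatever nvcc is reachable through PATH.
    return which("nvcc")
```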