* Reduce kernel arguments size, helps compile for apple nvidia.
* Fix use of unitialized variable in displace kernel.
* Use build flags in opencl kernel md5 hash.
* Reorganize code for kernel feature #defines a bit.
* OpenCL now only uses GPU/Accelerator devices, it's only confusing if CPU
device is used, easy to enable in the code for debugging.
* OpenCL kernel binaries are now cached for faster startup after the first
time compiling.
* CUDA kernels can now be compiled and cached at runtime if the CUDA toolkit
is installed. This means that even if the build does not have CUDA enabled,
it's still possible to use it as long as you install the toolkit.
* Add alpha pass output, to use set Transparent option in Film panel.
* Add Holdout closure (OSL terminology), this is like the Sky option in the
internal renderer, objects with this closure show the background / zero
alpha.
* Add option to use Gaussian instead of Box pixel filter in the UI.
* Remove camera response curves for now, they don't really belong here in
the pipeline, should be moved to compositor.
* Output full float values for rendering now, previously was only byte precision.
* Add a patch from Thomas to get a preview passes option, but still disabled
because it isn't quite working right yet.
* CUDA: don't compile shader graph evaluation inline.
* Convert tabs to spaces in python files.