The previous fix didn't work well enough because on Windows Python has a
different environment than Blender, and setting variables there had no effect
from Blender's point of view.
Currently only two mappings are supported by the API: Repeat (the old behavior)
and the new Clip behavior. Internally this extension is converted to the
periodic flag, which was already supported but wasn't exposed.
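As a rough illustration (types and names below are made up for the example,
not the actual Cycles API), the conversion boils down to treating only Repeat
as periodic:

  /* Hypothetical sketch of mapping the extension setting onto the
   * pre-existing periodic flag; names are illustrative. */
  enum ExtensionType {
    EXTENSION_REPEAT,  /* old behavior: tile the image */
    EXTENSION_CLIP,    /* new behavior: black outside the [0, 1] range */
  };

  inline bool extension_to_periodic(ExtensionType extension)
  {
    /* Only Repeat maps to the periodic sampling that was already supported. */
    return extension == EXTENSION_REPEAT;
  }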
There's no support for OpenCL yet because of the way we pack images into a
single texture.
Those settings are not exposed in the UI or anywhere else, so there should be
no functional changes so far.
This means render devices may now skip building baking kernels when only
actual render-related functionality is used.
For now it's only implemented for the OpenCL split kernel device and is mainly
needed to work around compiler-specific bugs which crash when building the
kernel.
Using OpenCL for baking might still crash the driver, but at least there is
now a higher probability that the GPU will be usable to render the scene.
The real fix should actually be done on the driver side.
The idea is to make all kernels as small as possible to work around possible
issues with buggy drivers which might fail to build feature-complete kernels.
It's indeed just a workaround to get at least simple test scenes rendering on
OpenCL. The real fix should happen on the driver side.
Requires the latest OSX El Capitan beta 3 due to some crucial fixes made in the
compiler. Supports the same features as NVidia OpenCL apart from CMJ (there's no
experimental feature set support in the megakernel yet).
Uses the megakernel internally, which works much better than the split kernel.
The split kernel is still not supported on OSX; this needs to be investigated.
Some more details can be found here:
http://wiki.blender.org/index.php/Dev:2.6/Source/Render/Cycles/OpenCL#AMD_on_OSX
This is not really supported by OpenCL but might happen in certain
configurations. There might be some remaining cases where this happens,
but so far I cannot find any.
It is the same issue as described in the previous commit: the original changes
in this area were wrong and only worked on a buggy Optimus driver which
simply appeared to work by coincidence and in fact used the wrong device.
There was annoying copy-paste across the OpenCL device constructor, device
enumeration and split kernel checks. Now those areas use a utility
function which returns pairs of platform and device IDs for devices which are
supported by Cycles, and enumeration happens within that list.
This makes filtering happen in a single place, so there's no need to keep
3 different functions in sync.
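A minimal sketch of what such a utility could look like (OpenCL host API
only; the actual filtering criteria in Cycles are more involved):

  #include <utility>
  #include <vector>
  #include <CL/cl.h>

  /* Collect (platform, device) pairs once, so the device constructor,
   * enumeration and split kernel checks all iterate the same list. */
  std::vector<std::pair<cl_platform_id, cl_device_id> > opencl_usable_devices()
  {
    std::vector<std::pair<cl_platform_id, cl_device_id> > usable;
    cl_uint num_platforms = 0;
    if (clGetPlatformIDs(0, NULL, &num_platforms) != CL_SUCCESS || num_platforms == 0)
      return usable;
    std::vector<cl_platform_id> platforms(num_platforms);
    clGetPlatformIDs(num_platforms, &platforms[0], NULL);
    for (size_t p = 0; p < platforms.size(); p++) {
      cl_uint num_devices = 0;
      if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 0, NULL, &num_devices) != CL_SUCCESS ||
          num_devices == 0)
        continue;
      std::vector<cl_device_id> devices(num_devices);
      clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, num_devices, &devices[0], NULL);
      for (size_t d = 0; d < devices.size(); d++) {
        /* Real code would additionally check device and driver capabilities here. */
        usable.push_back(std::make_pair(platforms[p], devices[d]));
      }
    }
    return usable;
  }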
This commit also fixes a bug with wrong enumeration of devices caused by recent
fixes. Those fixes were in fact wrong and only appeared to work on a laptop
with an Optimus card on Linux. The root of those issues is in fact a bad
Linux driver for Optimus cards.
They'll be checked for the version later and that check will fail anyway,
so it's better not to let the user see unsupported devices in the list.
Also corrected one more issue with device enumeration: skipped devices were
not reflected in the device number, which might result in bad array indices.
This might also resolve T45037, and needs to be ported to a release branch.
The round-up was only enabled for viewport render, which was for a long time
hardcoded to use 64 closures. This was done in order to avoid unnecessary
kernel re-compilations when tweaking the shader tree.
We could enable selective closure compilation in the viewport later if it
gives measurable speed improvements, but even then the round-up should happen
outside of the device level.
This commit also removes the early output which happened when the max closure
count did not change. It was wrong because other requested kernel features
might have changed.
These features are now based on the scene settings, so scenes which don't use
those features render even faster.
This gives about a 30% speedup on the AMD A10 APU here, but that does not mean
such an improvement will happen on all hardware. That being said, the
Tonga device here shows no measurable difference.
In any case it seems handy to have for the future, when we'll want to support SSS
in the kernel or to port selective compilation/split kernel to CUDA devices.
For now it's reported to stdout, matching the CUDA behavior.
In the future we can hide this behind GLog logging once the kernels
are all considered stable.
Since the kernel split work we now have quite a few new files, the majority
of which are related to the kernel entry points. Keeping those files in the
root kernel folder will eventually make it really hard to tell which files are
the actual implementation of the Cycles kernel.
Those files are now moved to kernel/kernels/<device_type>. This way adding extra
entry points will be less noisy. It is also nice to have all device-specific
files grouped together.
Another change is in the way the split kernel invokes its logic. Previously all
the logic was implemented directly in the .cl files, which made it a bit tricky
to re-use across other devices. Since we'll likely be looking into doing the
same split work for CUDA devices eventually, it makes sense to move the logic
from the .cl files to header files. Those files are stored in kernel/split.
This does not mean the header files won't give error messages when included
from other devices, and their arguments will likely change, but having such
separation is a good start anyway.
There should be no functional changes.
Reviewers: juicyfruit, dingto
Differential Revision: https://developer.blender.org/D1314
They were lost during the simplification of kernel loading but might be rather
crucial for performance.
Also made the cflags shared across kernels. Surely it might lead to some
unwanted kernel re-compilation, but otherwise they might easily run out of
sync with changes in the kernel.
The experimental feature set is currently unavailable for the megakernel; it'll
require some changes to the cache system to distinguish cached regular
kernels from cached experimental kernels.
Currently unused, but some features will be enabled soon.
This required allocating some memory related to the object transform needed
by ShaderData, and currently it is done for all platforms. Since we're
targeting fully feature-complete platforms this is rather acceptable at
this point, and in the future we'll do selective NO_HAIR/NO_SSS/NO_BLUR
kernels.
This is still experimental, and in fact there are some major issues on the
NVidia platform; it's not really clear if it's a compiler bug,
an uninitialized variable or some other kind of issue.
Some trivial fixes like spaces around operators and a missing semicolon,
plus a fix for wrong detection of the ShaderData SOA size. That was harmless
since there's only one closure array, but still better to fix.
This file was actually checking for features enabled on the CPU, and surely all
of them were enabled, so removing them makes no difference.
Ideally we'll need to do runtime feature detection and just pass some stuff
as NULL to the kernel, or maybe also have variadic kernel entry points, which
is also quite easy to do.
No need to store them in the class; they're unlikely to change,
and if they do change we're in big trouble anyway.
A more appropriate approach would then be to typedef these things in
kernel_types.h, but still use an inlined sizeof().
Only those are a priority for now; all the rest are still testable
if the CYCLES_OPENCL_TEST or CYCLES_OPENCL_SPLIT_KERNEL_TEST environment
variables are set.
This commit contains all the work related to the AMD split kernel, which was
mainly done by Varun Sundar, George Kyriazis and Lenny Wang, plus
some help from Sergey Sharybin, Martijn Berger, Thomas Dinges and likely
someone else whom we're forgetting to mention.
Currently only AMD cards are enabled for the new split kernel, but it is
possible to force the split OpenCL kernel to be used by setting the following
environment variable: CYCLES_OPENCL_SPLIT_KERNEL_TEST=1.
Not all features are supported yet; specifically there is no motion blur,
camera blur, SSS or volumetrics for now. Also, transparent shadows are
disabled on AMD devices because of a compiler bug.
This kernel also only implements regular path tracing; supporting the
branched one will take a bit. Branched path tracing is still exposed in the
interface, which is a bit misleading and will be hidden soon.
More features will be enabled once they're ported to the split kernel and
tested.
Neither regular CPU nor CUDA shows any difference; they generate the
exact same code, which means no regressions/improvements there.
Based on the research paper:
https://research.nvidia.com/sites/default/files/publications/laine2013hpg_paper.pdf
Here's the documentation:
https://docs.google.com/document/d/1LuXW-CV-sVJkQaEGZlMJ86jZ8FmoPfecaMdR-oiWbUY/edit
Design discussion of the patch:
https://developer.blender.org/T44197
Differential Revision: https://developer.blender.org/D1200
This is currently unused but crucial for things like calculating the amount of
device memory required to deal with the tasks.
Maybe not really the best place to store it, but consider it good enough for now.
Previously we only had an experimental flag passed to the device's
load_kernel(), which was all fine. But since we're going to have some extra
parameters passed there, it makes sense to wrap them into a single struct,
which will make it easier to pass stuff around.
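A sketch of the direction this takes (field names below are illustrative,
not necessarily the exact struct in the code):

  /* Requested kernel features wrapped into one struct, so that adding a new
   * parameter doesn't change every load_kernels() signature again. */
  class DeviceRequestedFeatures {
   public:
    bool experimental;  /* enable the experimental feature set */
    int max_closure;    /* maximum number of closures in ShaderData */
    bool use_baking;    /* whether baking kernels are needed at all */

    DeviceRequestedFeatures() : experimental(false), max_closure(64), use_baking(false) {}
  };

  /* Devices would then accept the whole struct:
   *   bool load_kernels(const DeviceRequestedFeatures& requested_features);
   */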
This inconsistency drove me totally crazy; it's really confusing,
especially when you work on both the Cycles and Blender sides.
Shouldn't cause merge PITA; it's whitespace changes only, and Git should
be able to merge it nicely.
For CPU it gives the available instruction sets (SSE, AVX and so on).
For CUDA GPUs it reports most of the attribute values returned by
cuDeviceGetAttribute(). Ideally we should only use the subset of those
which are driver-specific (so we don't clutter the system info with
values which we can get from GPU specifications and be sure they
stay the same because the driver can't affect them).
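For example, a couple of the driver-reported values could be queried like
this (attribute choice is illustrative, error handling trimmed):

  #include <cstdio>
  #include <cuda.h>

  void report_cuda_attributes(CUdevice device)
  {
    int max_threads = 0, clock_khz = 0;
    cuDeviceGetAttribute(&max_threads, CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, device);
    cuDeviceGetAttribute(&clock_khz, CU_DEVICE_ATTRIBUTE_CLOCK_RATE, device);
    printf("Max threads per block: %d\n", max_threads);
    printf("Clock rate: %d kHz\n", clock_khz);
  }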
This is what came in handy when troubleshooting issues in the studio,
plus it is exactly the same thing which would be helpful
when solving issues with paths to compiled shaders and cubins
for the standalone repository.
The single precision exponent on 64bit Linux tends to be an order of magnitude
slower than the double precision version, even with the single<->double
precision conversion.
Some feedback on the mailing lists suggests that logf() is also slow, but
I didn't confirm this here in the studio yet.
Depending on the shader setup it gives ~3% with the secret agent shot and up to
around 15% with the bmw scene here.
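The change essentially boils down to a wrapper along these lines (function
name is illustrative):

  #include <cmath>

  /* Route single precision exp() through the double precision implementation,
   * which measured faster here despite the float<->double conversions. */
  inline float fast_expf(float x)
  {
    return (float)exp((double)x);
  }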
Quite a straightforward change; the only annoying thing is that we can't use
indentation for include directives because of the way header inlining
works for OpenCL.
Might do a smarter job in path_source_replace_includes() but don't want to
spend time on this yet.
This adds an AABB collision check for objects with volumes; if a collision is
detected then the object gets the SD_OBJECT_INTERSECTS_VOLUME flag.
This solves a speed regression introduced by the fix for T39823 by skipping
the volume stack update in cases where no volume intersects the current SSS
object.
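The collision test itself is a standard per-axis AABB overlap check, roughly
as follows (the BoundBox type here is a stand-in for the real one):

  struct BoundBox {
    float min[3];
    float max[3];
  };

  /* Objects whose bounds overlap a volume's bounds would get the
   * SD_OBJECT_INTERSECTS_VOLUME flag; others skip the volume stack update. */
  inline bool bounds_intersect(const BoundBox& a, const BoundBox& b)
  {
    for (int axis = 0; axis < 3; axis++) {
      /* Separation along any axis means no collision. */
      if (a.max[axis] < b.min[axis] || b.max[axis] < a.min[axis])
        return false;
    }
    return true;
  }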
This is a rather legit case which happens e.g. when persistent images are
enabled and the session is updating the lookup tables.
Now device_memory keeps track of the amount of memory allocated on the device,
so freeing uses the proper allocated size, not the CPU-side buffer
size.
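In other words, the bookkeeping is roughly the following (member names are
illustrative):

  #include <cstddef>

  /* Sketch: remember how much was actually allocated on the device so the
   * free path uses that size instead of the CPU-side buffer size. */
  struct device_memory_sketch {
    size_t memory_size;  /* CPU-side buffer size */
    size_t device_size;  /* size actually allocated on the device */
  };

  /* On alloc: device_size = actual (possibly padded) device allocation size.
   * On free:  subtract device_size, not memory_size, from the usage stats. */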
Now we build 2 .cubins per architecture (e.g. kernel_sm_21.cubin, kernel_experimental_sm_21.cubin).
The experimental kernel can be used by switching to the Experimental Feature Set: http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Experimental_Features
This enables Subsurface Scattering and Correlated Multi Jitter Sampling on GPU, while keeping the stability and performance of the regular kernel.
Differential Revision: https://developer.blender.org/D762
Patch by Sergey and myself.
Developer / Builder Note:
CUDA Toolkit 6.5 is highly recommended for this, also note that building the experimental kernel requires a lot of system memory (~7-8GB).
This problem was introduced in 983cbafd1877f8dbaae60b064a14e27b5b640f18
Basically the issue is that we were not getting a unique index in the
baking routine for the RNG (random number generator).
Reviewers: sergey
Differential Revision: https://developer.blender.org/D749
In collaboration with Sergey Sharybin.
Also thanks to Wolfgang Faehnle (mib2berlin) for help testing the
solutions.
Reviewers: sergey
Differential Revision: https://developer.blender.org/D690
For now it was mainly about the OpenCL wrangler being duplicated
between Cycles and the Compositor, but with the OpenSubdiv work those
wranglers were going to be duplicated once again.
This commit makes Cycles and the Compositor use wranglers
from these repositories:
- https://github.com/CudaWrangler/cuew
- https://github.com/OpenCLWrangler/clew
These repositories are based on the wranglers we used before,
and they will likely continue to be maintained by us plus some
more players in the market.
A pretty much straightforward change, with some tricks in
CMake/SCons to make these libs be passed to the linker
after all other libraries, so that OpenSubdiv can be linked
against those wranglers in the future.
For those who're worrying about Cycles becoming less standalone:
that's not true, it's rather more flexible now, and in the future
different wranglers might be used in Cycles. For now it just
means those libs would need to be put into the Cycles repository
together with some other libs from Blender, such as mikktspace.
This is mainly a platform maintenance commit; there should not be any
changes to user space.
Reviewers: juicyfruit, dingto, campbellbarton
Reviewed By: juicyfruit, dingto, campbellbarton
Differential Revision: https://developer.blender.org/D707
Baking progress preview is not possible, in part due to the way the API
was designed. But at least you get to see the progress bar while baking.
Reviewers: sergey
Differential Revision: https://developer.blender.org/D656
This kernel is compiled with AVX2, FMA3, and BMI compiler flags. At the moment only Intel Haswell benefits from this, but future AMD CPUs will have these instructions as well.
Makes rendering on Haswell CPUs a few percent faster, only benchmarked with clang on OS X though.
Part of my GSoC 2014.
Now baking does one AA sample at a time, just like final render. There is
also some code for shader antialiasing that solves T40369 but it is disabled
for now because there may be unpredictable side effects.
The kernel for baking the world texture was the same as the one used for
baking. Now that's separate which allows the kernel to reserve much less
memory.
Fixes T40027. This means we get more CPU usage again when using multiple CUDA
devices, but the impact on performance is too big a problem with the current
code.
Instead of 95, we can use 145 images now. This only affects Kepler and above (sm_30, sm_35 and sm_50).
This can be increased further if needed, but let's first test if this does not come with a performance impact.
Originally developed during my GSoC 2013.
Expand Cycles to use the new baking API in Blender.
It works on the selected object, and the panel can be accessed in the Render panel (similar to where it is for the Blender Internal).
It bakes for the active texture of each material of the object. The active texture is currently defined as the active Image Texture node present in the material nodetree. If you don't want the baking to override an existing material, make sure the active Image Texture node is not connected to the nodetree. The active texture is also the texture shown in the viewport in the rendered mode.
Remember to save your images after the baking is complete.
Note: Bake currently only works on the CPU
Note: This is not supported by Cycles standalone because a lot of the work is done in Blender as part of the operator only, not the engine (Cycles).
Documentation:
http://wiki.blender.org/index.php/Doc:2.6/Manual/Render/Cycles/Bake
Supported Passes:
-----------------
Data Passes
* Normal
* UV
* Diffuse/Glossy/Transmission/Subsurface/Emit Color
Light Passes
* AO
* Combined
* Shadow
* Diffuse/Glossy/Transmission/Subsurface/Emit Direct/Indirect
* Environment
Review: D421
Reviewed by: Campbell Barton, Brecht van Lommel, Sergey Sharybin, Thomas Dinges
Original design by Brecht van Lommel.
The entire commit history can be found on the branch: bake-cycles
This also updates the configurations to build kernels for compute capability
5.0 cards; when using an older CUDA toolkit version this will be skipped.
Also includes tweaks to improve performance with this version:
* Increase max registers on sm_30, sm_35 and sm_50
* No longer use texture storage on sm_30
Otherwise devices used for display will lock up the UI too much. This means
you might still get 100% CPU for the display device, but for other devices CPU
usage should still be low.
The check to see if a device is used for display may not be entirely reliable:
it checks whether there is a watchdog timeout on the device, but I'm not
entirely sure that always exists for display devices or is disabled for
non-display devices, though some tools like cuda-gdb seem to make the same
assumption.
Ref T39559
This makes it easier to have per kernel number of registers. Also, all the
tunable parameters for this are now in kernel.cu, rather than spread over cmake,
scons and device_cuda.cpp.
This fixes the ptxas "ACCESS_VIOLATION" error and should allow our Linux and Windows build bots to compile again.
Unfortunately this comes with a performance penalty on sm_2x cards, so this is only a workaround for now. Branched Path is still globally disabled on GPU.
The issue was caused by wrong usage of the OCIO GLSL binding API. To make it
work properly on pre-GLSL-1.3 drivers, the shader is to be enabled after the
texture is bound to the OpenGL context. Otherwise it wouldn't know the
proper texture size.
This is actually a regression in 2.70 and is to be ported to 'a'.
All textures are currently sampled bi-linearly, with the exception of OSL, where texture sampling is fixed and set to smart bi-cubic.
This patch adds user control for this setting.
Added (a rough sketch follows this list):
- bits to DNA / RNA in the form of an enum to support multiple interpolation types
- changes to the image texture node drawing code (add enum)
- to ImageManager (this needs to know to allocate a second texture when the interpolation type is different)
- to the node compiler (pass on the interpolation type)
- to device tex_alloc, which also needs to get the concept of multiple interpolation types
- implementation of non-interpolated lookup for CUDA and CPU
- implementation where we pass this along to OSL (this makes OSL also do linear until I add smart cubic to the interface / DNA / RNA)
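A rough sketch of the idea (enum values and the tex_alloc signature below are
assumptions for illustration, not the exact Blender/Cycles definitions):

  /* Interpolation mode carried from the Image Texture node down to the device. */
  enum InterpolationType {
    INTERPOLATION_NONE,    /* closest / non-interpolated lookup */
    INTERPOLATION_LINEAR,  /* current default bi-linear sampling */
    INTERPOLATION_CUBIC,   /* smart bi-cubic, as OSL uses */
  };

  /* A device would then receive the requested mode along with the pixels:
   *   void tex_alloc(const char* name,
   *                  device_memory& mem,
   *                  InterpolationType interpolation,
   *                  bool periodic);
   */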
Reviewers: brecht, dingto
Reviewed By: brecht
CC: dingto, venomgfx
Differential Revision: https://developer.blender.org/D317
This switches API usage for CUDA towards using more of the async calls.
Updating only once every second is sufficiently cheap that I don't think it is worth doing it less often.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D262
Cycles GPU Performance Regression
From my testing this (what I should have done in the first place) reduces the regression a lot.
Let's hope it is enough, or we'll have to go back to busy waiting.
This patch adds a network_error() function, more in line with how other devices handle errors.
- It adds a check for errors in load_kernels to make sure we do not crash when rendering without a server.
- It uses the non-throwing variant of boost::asio::read.
Reviewers: brecht
Reviewed By: brecht
CC: brecht
Differential Revision: https://developer.blender.org/D86
This is my first stab at this and is based on this IRC conversation:
<mib2berlin> brecht: this is meaning as reminder only, I know you have other things to do > http://openvidia.sourceforge.net/index.php/Optimization_Notes#avoiding_busy_waits
<brecht> mib2berlin: thanks, bookmarked
Only tested on Ubuntu 14.04 / CUDA 5.0, but I'll do some more testing tomorrow.
Also unsure about the placement and lifetime of the stream and the event, but creating / deleting these seems to incur a non-trivial cost.
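A sketch of the polling pattern under discussion (the driver API calls are
real, but this is not the actual patch; as noted above, creating/destroying
the event repeatedly is not free, so real code would keep it alive longer):

  #include <chrono>
  #include <thread>
  #include <cuda.h>

  void wait_for_kernel(CUstream stream)
  {
    CUevent event;
    cuEventCreate(&event, CU_EVENT_DISABLE_TIMING);
    cuEventRecord(event, stream);
    /* CUDA_ERROR_NOT_READY means work is still running on the device,
     * so sleep briefly instead of busy waiting on a synchronous call. */
    while (cuEventQuery(event) == CUDA_ERROR_NOT_READY) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    cuEventDestroy(event);
  }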
Reviewers: brecht
Reviewed By: brecht
CC: mib2berlin, dingto
Differential Revision: https://developer.blender.org/D262
* AVX is available on Intel Sandy Bridge and newer and AMD Bulldozer and newer.
* We don't use dedicated AVX intrinsics yet, but gcc auto vectorization gives a 3% performance improvement for Caminandes. Tested on an i5-3570, Linux x64.
* No change for Windows yet, MSVC 2008 does not support AVX.
Reviewed by: brecht
Differential Revision: https://developer.blender.org/D216
After updating to Mac OS X 10.9.1, OpenCL now works on my Intel CPU in the 2013 MacBook Pro (even the entire kernel).
The Intel Iris Pro GPU still segfaults here though, even when all flags are disabled (building a "clay like" kernel only).
Maybe we still need -no-missing-prototypes for AMD hardware, but I couldn't find a way to distinguish here.
This actually works somewhat now, although viewport rendering is broken and any
kind of network error or connection failure will kill Blender.
* Experimental WITH_CYCLES_NETWORK cmake option
* Networked Device is shown as an option next to CPU and GPU Compute
* Various updates to work with the latest Cycles code
* Locks and thread safety for RPC calls and tiles
* Refactored pointer mapping code
* Fix error in CPU brand string retrieval code
This includes work by Doug Gale, Martijn Berger and Brecht Van Lommel.
Reviewers: brecht
Differential Revision: http://developer.blender.org/D36
This is mostly work towards enabling the __KERNEL_SSE__ option to start using
SIMD operations for vector math operations. This SSE 4.1 kernel performs about
8% faster with that option, but overall is still slower than without the option.
WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the cmake flag for testing this kernel.
Alignment of int3, int4, float3 and float4 to 16 bytes seems to give a slight
1-2% speedup on tested systems with the current kernel already, so it is
enabled now.
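The alignment itself amounts to something like this simplified stand-in for
the kernel's vector types:

  /* Sketch: 16 byte alignment so float4 maps cleanly onto an SSE register. */
  struct alignas(16) float4_sketch {
    float x, y, z, w;
  };

  static_assert(sizeof(float4_sketch) == 16, "float4 expected to be 16 bytes");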
* Remove support for CUDA Toolkit 4.x, only Toolkit 5.0 and above are supported now.
* Remove support for sm_1x cards (< Fermi) for good. We didn't officially support those cards for a few releases already, now remove some special code that was still there.
Use arrays instead of textures for general storage on this card (image textures
are still stored as textures). Textures were found to be faster on older cards,
but the limits on 1D texture size have not increased along with the memory
size, which meant that the full 6 GB could not be used.
The performance actually seems to be slightly better with arrays in some tests
on Titan. For older cards there seems to be a bit of a mix, some being better
and others not. We may change those to use arrays too, but more testing is
needed; only Titan and Tesla K20 (sm_35) are changed for now.
The fact that arrays are faster is a bit surprising, as others found textures
to be faster on Kepler. However even if they were, the memory limitation is
more important to solve anyway.
https://research.nvidia.com/publication/understanding-efficiency-ray-traversal-gpus-kepler-and-fermi-addendum
except for curves, that's still missing from the OpenColorIO GLSL shader.
The pixels are stored in a half float texture, converted from full float with
native GPU instructions and SIMD on the CPU, so it should be pretty quick.
Using a GLSL shader is useful for GPU render because it avoids a copy through
CPU memory.
* GPU kernel can now be compiled without __NON_PROGRESSIVE__ again, was broken after my last commit. Also add a check for have_error(), in case the GPU kernel comes without Non-Progressive, to avoid a crash.
* Don't compile progressive kernel twice on CPU, if __NON_PROGRESSIVE__ would be disabled there.
* Non-Progressive integrator is now available on the GPU (CUDA, sm_20 and above).
Implementation details:
* kernel_path_trace() has been split up into two functions:
kernel_path_trace_non_progressive() and kernel_path_trace_progressive().
* We compile two CUDA kernel entry functions (in kernel.cu) for the two integrators, they are still inside one .cubin file but due to the kernel separation there should be no performance problem. I tested with the BMW file on my Geforce 540M and the render times were the same for 100 samples (1.57 min in my case).
This is part of my GSoC project, SVN merge of r59032 + manual merge of UI changes for this from my branch.
* "Auto Detect" now again uses the umber of cores, instead number of cores + 1.
This was added before we had Tile rendering and benchmarks on several systems showed that there is no gain with this now. There might be some slight difference (0.5% or so) slower/faster depending on the scene, but this is negligible.
* Reshuffle SSE #ifdefs to try to avoid compilation errors enabling SSE on 32 bit.
* Remove CUDA kernel launch size exception on Mac, is not needed.
* Make OSL file compilation quiet like c/cpp files.
* Add CUDA compiler version detection to cmake/scons/runtime
* Remove noinline in kernel_shader.h and reenable --use_fast_math if CUDA 5.x
is used, these were workarounds for CUDA 4.2 bugs
* Change max number of registers to 32 for sm 2.x (based on performance tests
from Martijn Berger and confirmed here), and also for NVidia OpenCL.
Overall it seems that with these changes and the latest CUDA 5.0 download,
performance is as good as or better than the 2.67b release with the scenes and
graphics cards I tested.
the second time, as for example Intel CPU startup time is 9 seconds.
* Adds a cache for contexts and programs for each platform and device pair,
which also ensures that no two threads try to compile and write the binary
cache file at the same time (see the sketch after this list).
* Change clFinish to clFlush so we don't block until the result is done, instead
it will block at the moment we copy back memory.
* Fix error in Cycles time_sleep implementation, does not affect any active code
though.
* Adds some (disabled) debugging code in the task scheduler.
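The cache mentioned in the first point could be structured roughly like this
(illustrative only, not the actual implementation):

  #include <map>
  #include <mutex>
  #include <utility>
  #include <CL/cl.h>

  /* One slot per (platform, device) pair, guarded by a mutex so two threads
   * never compile and write the binary cache file at the same time. */
  struct OpenCLCacheSketch {
    struct Slot {
      cl_context context;
      cl_program program;
      Slot() : context(NULL), program(NULL) {}
    };

    std::mutex mutex;
    std::map<std::pair<cl_platform_id, cl_device_id>, Slot> slots;

    Slot& get_slot(cl_platform_id platform, cl_device_id device)
    {
      std::lock_guard<std::mutex> lock(mutex);
      return slots[std::make_pair(platform, device)];
    }
  };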
Patch #35559 by Doug Gale.
* Support using devices from all OpenCL platforms, so that you can use e.g. both
Intel and NVidia OpenCL implementations if you have them installed.
* Fix compile error due to missing fmodf after recent math node change.
* Enable advanced shading for Intel OpenCL.
* CYCLES_OPENCL_DEBUG environment variable for generating debug symbols so you
can debug with gdb. This crashes the compiler with Intel OpenCL on Linux though.
To make this work the preprocessed kernel source code is written out, as gdb
needs this.
* Show OpenCL compiler warnings even if the build succeeded.
* Some small fixes to initialize cdDevice to NULL, add missing NULL check when
creating buffer and add missing space at end of build options for Apple OpenCL.
* Fix crash with multi device + opencl, now e.g. CPU + GPU render should work.
I did a few tweaks to the code and also:
* Fix viewport render failing sometimes with Apple CPU OpenCL, was not taking
workgroup size limits into account properly.
* Add compile error when advanced shading in the Blender binary and OpenCL kernel
are not in sync.
for Apple OpenCL on OS X 10.8 and simple AO render.
Also, the environment variable CYCLES_OPENCL_TEST can now be set to CPU, GPU,
ACCELERATOR, DEFAULT or ALL to test particular devices.
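How those values map onto OpenCL device type flags could look roughly like
this (the exact mapping in the code is an assumption here):

  #include <cstdlib>
  #include <cstring>
  #include <CL/cl.h>

  cl_device_type opencl_test_device_type()
  {
    const char* value = getenv("CYCLES_OPENCL_TEST");
    if (value == NULL)
      return 0;  /* variable not set, keep the default restrictions */
    if (strcmp(value, "CPU") == 0)
      return CL_DEVICE_TYPE_CPU;
    if (strcmp(value, "GPU") == 0)
      return CL_DEVICE_TYPE_GPU;
    if (strcmp(value, "ACCELERATOR") == 0)
      return CL_DEVICE_TYPE_ACCELERATOR;
    if (strcmp(value, "DEFAULT") == 0)
      return CL_DEVICE_TYPE_DEFAULT;
    if (strcmp(value, "ALL") == 0)
      return CL_DEVICE_TYPE_ALL;
    return 0;
  }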
* Deprecate computing capability 1.3 (sm_13)
This commit disables auto build of sm_13 CUDA platform, which means that starting with Blender 2.67, we don't support sm_13 devices anymore. It has become difficult to support that and it was already feature incomplete (no render-passes, AO, Multi Closure etc).
It's still possible to manually enable sm_13 for own tests, but building might break in the future.
* CUDA: Make it more clear that sm_12 and below is not supported.
* OpenCL: __KERNEL_SHADING__ was declared twice for nvidia opencl device.
* Some reshuffle of defines in kernel_types.h. No functional changes.
precompiled cubins instead,
The logic here is now as follows (sketched below):
- If there are precompiled cubins, assume CUDA compute is available;
  otherwise
- If a CUDA toolkit is found, assume CUDA compute is available
- In all other cases CUDA compute is not available
For Windows there is still a check for precompiled binaries only;
no runtime compilation is allowed.
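Condensed into code, the decision is roughly the following (the helper flags
are hypothetical inputs):

  bool cuda_compute_available(bool have_precompiled_cubins,
                              bool have_cuda_toolkit,
                              bool is_windows)
  {
    /* Precompiled cubins always count as available. */
    if (have_precompiled_cubins)
      return true;
    /* Runtime compilation via the toolkit is only allowed off Windows. */
    if (have_cuda_toolkit && !is_windows)
      return true;
    return false;
  }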
Ended up with this decision after discussion with Brecht. The thing
is, if we support runtime compilation on Windows we'll end up
having lots of reports about different aspects of something not
working (you need a particular toolkit version, MSVC installed, environment
variables set properly and so on), and giving feedback on such reports
will waste time.
Also some simple OSL optimization, passing thread data pointer directly instead
of via thread local storage, and creating ustrings for attribute lookup.
This commit adds memory usage information while rendering.
It reports memory used by the device, meaning:
- For CPU it'll report real memory consumption
- For GPU rendering it'll report GPU memory consumption, which also
  means the same amount of memory is used on the host side.
This displays information about memory requested by Cycles,
not memory actually allocated on a device. Real memory usage might be
higher because of memory fragmentation or an optimistic memory allocator.
There's really nothing we can do about this.
Also, in contrast with Blender Internal's render, Cycles memory usage
does not include memory used by the scene; only memory needed by Cycles
itself is displayed. So don't freak out if the memory usage reported
by Cycles is much lower than Blender Internal's.
This commit also adds a RenderEngine.update_memory_stats callback which
is used to report memory consumption from an external engine to Blender.
This information is used to generate the information line after rendering
is finished.
- move object_iterators.c --> view3d_iterators. (ED_object.h had to include ED_view3d.h which isn't so nice)
- move projection functions from view3d_view.c --> view3d_project.c (view3d_view was becoming a mishmash of utility functions and operators).
- mark some cmake includes as system-includes.
Just makes progressive refine :)
This means the whole image is refined gradually, using as many
threads as set in the performance settings. Having enough tiles is
required for this option to work as expected.
Technically it's implemented by repeatedly computing the next sample for
all the tiles before switching to the next sample.
This is around 7-12% slower than regular tile-based rendering, so
use this option only if you really need it.
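The scheduling is essentially the following loop (types and helpers are
placeholders for the real session/tile code):

  /* All tiles advance by one sample before any tile moves on to the next,
   * so the whole image refines gradually instead of finishing tile by tile. */
  void render_progressive_refine(int num_samples, int num_tiles)
  {
    for (int sample = 0; sample < num_samples; sample++) {
      for (int tile = 0; tile < num_tiles; tile++) {
        /* render_tile_sample(tile, sample); -- placeholder for the real work */
      }
      /* update_display(); -- whole image updated once per sample */
    }
  }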
This commit also fixes progressive update of the image when the Save Buffers
option is enabled.
One more thing this commit fixes is handling of the display buffer with the
Save Buffers option enabled. With this option enabled, the image buffer
would have neither a byte nor a float buffer until the image is fully
rendered, which could backfire as a missing image while rendering in
cases where the color management cache became full.
This issue is solved by allocating a byte buffer for the image buffer from
the tile update callback.
The patch was reviewed by Brecht. He also made some minor edits to the
original version of the patch. Thanks, man!