[Bf-committers] CUDA backend implementation for GSoC?
tbaldridge at gmail.com
Mon Dec 15 18:05:37 CET 2008
There are several issues involved in getting CUDA/OpenCL working with
Blender. The biggest is memory bandwidth. Let me explain: a Core i7
processor can pull about 20GB/sec from system memory. The most a
PCIe bus can push through is 4GB/sec. Internally, the high-end GF8
cards have about 100GB/sec. However, there are a ton of other issues.
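To put those three numbers side by side, here is a rough sketch in Python. It only uses the ballpark bandwidths quoted above; the 1GB scene size and the helper name transfer_seconds are made up for illustration, and latency is ignored entirely.

```python
# Ballpark bandwidths from the paragraph above, in GB/sec (not measured).
BANDWIDTH_GB_PER_SEC = {
    "system memory (Core i7)": 20.0,
    "PCIe bus": 4.0,
    "on-card memory (high-end GF8)": 100.0,
}


def transfer_seconds(size_gb, gb_per_sec):
    """Best-case time to move size_gb at the given rate, ignoring latency."""
    return size_gb / gb_per_sec


scene_gb = 1.0  # hypothetical amount of scene data, for illustration only

for name, rate in BANDWIDTH_GB_PER_SEC.items():
    millis = transfer_seconds(scene_gb, rate) * 1000
    print(f"{name}: about {millis:.0f} ms to move {scene_gb:.0f}GB")
```

With these figures the PCIe hop is the slow part: moving the data across the bus takes roughly 5x longer than reading it from system memory, and 25x longer than the card needs to read it internally.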
Inside a GF8 there are three types of memory: global, static and
local (I forget the name for that). Then you have blocks of processors;
each block can run a single program at a time. IIRC a block contains 8
stream processors. So if you have 128 stream processors, you really
only have 16 SIMD processors. Each stream processor has a vector and a
scalar unit. So on a good day, it is possible to do something like 4 *
8 floating point multiplications at once.
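The arithmetic in that paragraph, spelled out as a small Python sketch. The constants are the IIRC figures from above; reading the "4" in 4 * 8 as the width of the vector unit is my assumption, and the names are just for illustration.

```python
STREAM_PROCESSORS = 128   # total on the card, as quoted above
PROCESSORS_PER_BLOCK = 8  # IIRC figure from above
VECTOR_WIDTH = 4          # assumption: the "4" in "4 * 8" is the vector unit width

# Each block runs one program at a time, so the card behaves like this many
# independent SIMD processors.
simd_blocks = STREAM_PROCESSORS // PROCESSORS_PER_BLOCK

# Best case for one block: every stream processor does a full-width multiply.
mults_per_block = VECTOR_WIDTH * PROCESSORS_PER_BLOCK

print(simd_blocks)      # 16
print(mults_per_block)  # 32
```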
Back to memory: only one block can access the memory at a given
position at once (I think they work on banks of memory). But each
block can have its own static memory (what CUDA calls shared
memory)...but that static memory is only 16KB in size.
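To get a feel for how small 16KB is, a quick Python sketch. It assumes 4-byte single-precision floats and RGBA pixels of four floats each; the helper name items_that_fit is hypothetical.

```python
STATIC_MEMORY_BYTES = 16 * 1024  # per-block static memory, as quoted above
FLOAT_BYTES = 4                  # assumption: single-precision floats


def items_that_fit(bytes_per_item, budget=STATIC_MEMORY_BYTES):
    """How many fixed-size items fit in the per-block static memory."""
    return budget // bytes_per_item


floats = items_that_fit(FLOAT_BYTES)           # plain floats
rgba_pixels = items_that_fit(4 * FLOAT_BYTES)  # 4 floats per pixel

print(floats)       # 4096
print(rgba_pixels)  # 1024, i.e. a 32x32 tile of float RGBA pixels
```

So a block working on float RGBA image data can keep at most a 32x32 tile close at hand, and that is before anything else needs room in the same 16KB.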
So yeah, it's possible to do some Blender stuff in CUDA/OpenCL, but
it's hard, with tons of limitations. It works great for situations
where you only need to pull a small amount of data from the card, but
where there's a lot of processing being done on parts of that data;
otherwise...the coding takes a bit more time. What remains to be seen
is whether OpenCL will have a good CPU implementation. If so, then it might
be possible to start coding parts of Blender in OpenCL, with an
option to use the GPU version of OpenCL if available.
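The "small transfer, lots of processing" rule of thumb can be sketched as a break-even check in Python. This is a simplified model, not a benchmark: it assumes the same amount of data goes to the card and comes back, uses the 4GB/sec PCIe figure from earlier, and the function name gpu_wins, the gpu_speedup parameter and the example numbers are all hypothetical.

```python
PCIE_GB_PER_SEC = 4.0  # ballpark PCIe figure quoted earlier in this mail


def gpu_wins(data_gb, cpu_seconds, gpu_speedup):
    """True if transfer time plus GPU compute time beats the CPU alone.

    Assumes the data crosses the bus twice (upload and download) and that
    the GPU runs the compute part gpu_speedup times faster than the CPU.
    """
    transfer = 2 * data_gb / PCIE_GB_PER_SEC
    gpu_total = transfer + cpu_seconds / gpu_speedup
    return gpu_total < cpu_seconds


# Little data, lots of work: the bus cost disappears into the noise.
print(gpu_wins(data_gb=0.01, cpu_seconds=1.0, gpu_speedup=10.0))   # True

# Lots of data, little work: the transfer alone costs more than the CPU run.
print(gpu_wins(data_gb=1.0, cpu_seconds=0.1, gpu_speedup=100.0))   # False
```

Note that in the second case even a 100x faster GPU loses, because the half second spent on the bus is already longer than the whole CPU run.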
>> You can find some CUDA examples on nVidia's site:
>> They claim that some of those solutions reach 100x, or even much more,
>> speedup in comparison to the CPU... I'm not sure what that means (which
>> GPU? which CPU? what input? I would like to see some more detailed
>> test results ;-)), but generally those numbers are really
>> attractive and maybe it would be a good idea to use CUDA in some parts
>> of Blender... :-)
Two wrights don't make a rong, they make an airplane. Or bicycles.