[Bf-committers] CUDA backend implementation for GSoC?

Timothy Baldridge tbaldridge at gmail.com
Tue Dec 16 17:31:56 CET 2008

> Indeed with PCIe 2.0 you have doubled the bandwidth, and have a 0.5GB/s
> per lane, thus
> allowing 16GB/s (consider also you have configuration with SLI or quad-SLI).

Right, but the comment still stands... in many (perhaps most) cases,
going from memory->PCIe->GPU->Stream Processor->GPU->PCIe->memory is
going to be slower, or at least have more overhead, than doing it all
in memory.

Perhaps that's the best starting point. Can we get some solid
benchmarks that show the overhead (latency and bandwidth) of transferring
data to and from the GPU (and setting up a simple program on the GPU)
vs. doing it all in memory? Don't forget, in Blender you will have to
grab data from and insert data back into the Blender structures,
unless you plan on handing data to CUDA/OpenCL in the format Blender
uses it in.
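
Until someone has real numbers, here is a rough sketch of the cost
model such a benchmark would fill in. It is not a measurement. The
per-lane figure is the one quoted above (0.5GB/s per lane, which for 16
lanes gives 8GB/s in each direction; 16GB/s counts both directions
together). The latency, data size and compute times are made-up
placeholders, and all the function names are hypothetical.

```python
# Back-of-the-envelope model of a GPU round trip over PCIe.
# Every number here is an assumption to be replaced by measured values.

PCIE2_GB_PER_S_PER_LANE = 0.5   # per lane, per direction (figure quoted above)
LANES = 16
BANDWIDTH = PCIE2_GB_PER_S_PER_LANE * LANES * 1e9   # bytes/s, one direction

ASSUMED_LATENCY = 10e-6   # seconds per transfer; placeholder, not measured


def transfer_seconds(n_bytes, latency=ASSUMED_LATENCY, bandwidth=BANDWIDTH):
    """One bus transfer: fixed latency plus size divided by bandwidth."""
    return latency + n_bytes / bandwidth


def gpu_round_trip_seconds(n_bytes, gpu_compute_s):
    """memory -> PCIe -> GPU -> compute -> PCIe -> memory."""
    upload = transfer_seconds(n_bytes)
    download = transfer_seconds(n_bytes)
    return upload + gpu_compute_s + download


def gpu_is_faster(n_bytes, cpu_compute_s, gpu_compute_s):
    """True only if the whole round trip beats doing it all in memory."""
    return gpu_round_trip_seconds(n_bytes, gpu_compute_s) < cpu_compute_s


if __name__ == "__main__":
    size = 80e6  # 80 MB of data, an arbitrary example size
    # Heavy job: the transfer cost is small next to the CPU time saved.
    print(gpu_is_faster(size, cpu_compute_s=0.050, gpu_compute_s=0.005))
    # Light job: a 20x faster kernel still loses to the bus overhead.
    print(gpu_is_faster(size, cpu_compute_s=0.020, gpu_compute_s=0.001))
```

With these placeholder numbers the two transfers alone cost about 20ms,
so the first case comes out in favour of the GPU and the second does
not. The time spent converting to and from the Blender structures is
not modelled and would be added to the GPU side.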

From what I last heard, there is no good way to get data from CUDA
directly into OpenGL without taking it out of the GPU and inserting it
back in. I think OpenCL allows inserting data into textures directly.
So if we were going to use CUDA for subdivision surfaces,
you'd have to upload the data to the GPU, then stream the vertices out
of the GPU and back into the GPU, whereas the current method only
streams them to the CPU.

