[Bf-committers] "Official" CUDA Benchmark/Implementation Thread

Marko Radojcic sambucuself at gmail.com
Sun Dec 21 09:48:01 CET 2008

Some time ago, I read a paper on GPU ray tracing that was implemented
on previous-generation graphics cards (separate pixel and vertex
shaders). A trick was used to harvest the data: a single quad mesh was
stretched over the entire viewport, and the input data was mapped into
textures (individual textures and regions of them served as the data
structures, i.e. classes and arrays). So the algorithm is stored in a
pixel shader, and the result ends up in the output viewport and then
gets read back into main memory, which is one of the main flaws and
bottlenecks that keeps the data stream and the processing kernels from
working more efficiently.
As I understand it, what really makes a difference for acceleration
with CUDA is that we need to create a set of data (preferably a large
one) that can be processed at the same time, and write a CUDA kernel
that runs in parallel on many GPU stream processors. We then easily
harvest the results by copying the result data structure back to main
memory.
So, as I understand it, it's a trade-off: to get a fast
number-crunching process you have to write ugly code.
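
To make the "one kernel over a large data set" idea concrete, here is a
minimal sketch in plain C++ (not CUDA, so it runs on the CPU). The names
Ray, hit_sphere_kernel and launch_hit_sphere are hypothetical and exist
only for this illustration. The loop in launch_hit_sphere stands in for
a kernel launch: on a GPU each index would be handled by its own thread,
and the input and output arrays would be copied to and from device
memory.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical flat ray record: plain data, no virtual methods.
struct Ray { float ox, oy, oz, dx, dy, dz; };

// "Kernel": processes exactly one element, selected by its index i.
// Writes to out[i] the distance along the ray to a sphere, or -1 on a miss.
// Assumes a normalized direction and a ray origin outside the sphere.
void hit_sphere_kernel(std::size_t i, const Ray* rays,
                       float cx, float cy, float cz, float r, float* out) {
    const Ray& ray = rays[i];
    float lx = cx - ray.ox, ly = cy - ray.oy, lz = cz - ray.oz;
    float tca = lx * ray.dx + ly * ray.dy + lz * ray.dz;  // projection on the ray
    float d2 = lx * lx + ly * ly + lz * lz - tca * tca;   // squared distance to the ray
    float r2 = r * r;
    if (tca < 0.0f || d2 > r2) { out[i] = -1.0f; return; }
    out[i] = tca - std::sqrt(r2 - d2);
}

// Stand-in for a kernel launch: every iteration is independent of the
// others, so a GPU could run them all concurrently.
std::vector<float> launch_hit_sphere(const std::vector<Ray>& rays,
                                     float cx, float cy, float cz, float r) {
    std::vector<float> out(rays.size());
    for (std::size_t i = 0; i < rays.size(); ++i)
        hit_sphere_kernel(i, rays.data(), cx, cy, cz, r, out.data());
    return out;
}
```

The point of the shape is that the kernel touches only rays[i] and
out[i], so no element depends on another and the whole result array can
be copied back in one transfer.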

My idea was to try implementing something like that in yaf(a)ray for
easier maintenance, and I spent a couple of days studying the
yaf(a)ray code. What strikes me is that the actual processing is done
in regular object-oriented structures (they live in main memory, so
accessing structures of different object types is not that costly in
terms of performance)...
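
One possible way to bridge object-oriented scene data and a GPU-style
kernel is to flatten the objects into plain arrays before processing.
The sketch below is plain C++ and purely illustrative: Primitive,
Sphere, FlatSpheres and flatten are hypothetical names and are NOT the
real yaf(a)ray classes, whose layout is not described here.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical object-oriented scene element (not the real yaf(a)ray API).
struct Primitive {
    virtual ~Primitive() = default;
    virtual void center(float& x, float& y, float& z) const = 0;
    virtual float radius() const = 0;
};

class Sphere : public Primitive {
public:
    Sphere(float x, float y, float z, float r) : x_(x), y_(y), z_(z), r_(r) {}
    void center(float& x, float& y, float& z) const override { x = x_; y = y_; z = z_; }
    float radius() const override { return r_; }
private:
    float x_, y_, z_, r_;
};

// Flat structure-of-arrays copy: no pointers and no virtual calls, so each
// array is one contiguous buffer that could be uploaded in a single copy.
struct FlatSpheres {
    std::vector<float> cx, cy, cz, r;
    std::size_t size() const { return r.size(); }
};

// Walk the object graph once on the CPU and pack it into the flat arrays.
FlatSpheres flatten(const std::vector<std::unique_ptr<Primitive>>& scene) {
    FlatSpheres flat;
    for (const auto& p : scene) {
        float x, y, z;
        p->center(x, y, z);
        flat.cx.push_back(x);
        flat.cy.push_back(y);
        flat.cz.push_back(z);
        flat.r.push_back(p->radius());
    }
    return flat;
}
```

The cost is that the flat copy has to be rebuilt whenever the scene
changes, which is part of the "ugly code" trade-off.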

I think that the best GPU acceleration performance is achieved if we
do the work in layers...

First we generate all (or most) rays, then we do all the traversal,
then all the intersections, and then all the ray collection - this is
just a rough sketch...
Basically we execute a sequence of kernels on a sequence of input and
output streams, without much algorithmic branching on any individual
piece of data...
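
The layered approach can be sketched as a chain of passes, each one
consuming a whole stream and producing the next. This is a plain C++
illustration under simplifying assumptions: the camera sits at the
origin looking along +z, the scene is a single sphere (so the traversal
stage is omitted, since there is no acceleration structure to walk), and
all names (RayStream, generate_rays, intersect_all, collect, render) are
hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Stream of normalized ray directions; all origins are assumed to be (0,0,0).
struct RayStream { std::vector<float> dx, dy, dz; };

// Stage 1: generate one primary ray per pixel for the whole image.
RayStream generate_rays(int width, int height) {
    RayStream rays;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float px = (x + 0.5f) / width * 2.0f - 1.0f;
            float py = 1.0f - (y + 0.5f) / height * 2.0f;
            float len = std::sqrt(px * px + py * py + 1.0f);
            rays.dx.push_back(px / len);
            rays.dy.push_back(py / len);
            rays.dz.push_back(1.0f / len);
        }
    }
    return rays;
}

// Stage 2: intersect every ray with the sphere; -1 marks a miss.
std::vector<float> intersect_all(const RayStream& rays,
                                 float cx, float cy, float cz, float r) {
    std::vector<float> t(rays.dx.size());
    for (std::size_t i = 0; i < t.size(); ++i) {
        float tca = cx * rays.dx[i] + cy * rays.dy[i] + cz * rays.dz[i];
        float d2 = cx * cx + cy * cy + cz * cz - tca * tca;
        bool hit = tca >= 0.0f && d2 <= r * r;
        t[i] = hit ? tca - std::sqrt(r * r - d2) : -1.0f;
    }
    return t;
}

// Stage 3: collect the hit distances into pixel values (1 = hit, 0 = miss).
std::vector<unsigned char> collect(const std::vector<float>& t) {
    std::vector<unsigned char> image(t.size());
    for (std::size_t i = 0; i < t.size(); ++i)
        image[i] = t[i] >= 0.0f ? 1 : 0;
    return image;
}

// The pipeline: each stage finishes over the full stream before the next starts.
std::vector<unsigned char> render(int width, int height,
                                  float cx, float cy, float cz, float r) {
    RayStream rays = generate_rays(width, height);
    std::vector<float> t = intersect_all(rays, cx, cy, cz, r);
    return collect(t);
}
```

Each stage here would map to one kernel, with the intermediate vectors
playing the role of the input and output streams kept in GPU memory.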

I think this needs a big rewrite of the code, which is why I had the
idea of doing it separately (in yaf(a)ray)...

Any idea, suggestion, or offer of cooperation is welcome, and I am
also interested in getting involved in any code currently being
developed.

Thank you for your time, sincerely,
                        Marko Radojcic
