[Bf-committers] Volumetric question

Lars Krueger lars_e_krueger at gmx.de
Tue Jul 29 21:43:30 CEST 2008


>  Regarding volumetrics from datasets, I need a way to cache in main
> memory files of any size (in theory there should be no restriction)
> and process one chunk at a time, particularly for unstructured
> datasets. So I'd appreciate any idea, suggestion, or papers that
> directly talk about file cache systems or data structures for
> handling big raw datasets.

How about datadraw? I have not used it myself, but the description is interesting.
http://datadraw.sourceforge.net/

If your chunks all have the same size, how about a classical least-recently-used or least-frequently-used algo? There is a crossover called "least recently/frequently used", which you can tune to behave anywhere between the two (http://citeseer.ist.psu.edu/15889.html). I used this in a commercial software product a few years ago and it worked quite well. Either it was simple to implement or a free implementation could be found on the net; I don't remember which.
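To make the idea concrete, here is a minimal sketch of an LRU cache for fixed-size chunks. Everything in it (the chunk size, slot count and the load_chunk loader) is an illustrative assumption, not anything from Blender; a real implementation would hash chunk ids instead of scanning a list.

/* Minimal LRU cache sketch for fixed-size chunks. */
#include <stdlib.h>

#define CHUNK_SIZE (1 << 20)   /* 1 MiB chunks, for example */
#define MAX_SLOTS  64          /* cache capacity */

typedef struct Slot {
    long chunk_id;             /* which chunk of the file this holds */
    char *data;                /* CHUNK_SIZE bytes */
    struct Slot *prev, *next;  /* doubly linked, most recent first */
} Slot;

static Slot *head = NULL, *tail = NULL;
static int nslots = 0;

/* Hypothetical loader: reads chunk 'id' from the dataset file. */
extern void load_chunk(long id, char *dst);

static void unlink_slot(Slot *s) {
    if (s->prev) s->prev->next = s->next; else head = s->next;
    if (s->next) s->next->prev = s->prev; else tail = s->prev;
}

static void push_front(Slot *s) {
    s->prev = NULL; s->next = head;
    if (head) head->prev = s; else tail = s;
    head = s;
}

char *get_chunk(long id) {
    Slot *s;
    /* Hit: move the slot to the front (most recently used). */
    for (s = head; s; s = s->next) {
        if (s->chunk_id == id) {
            unlink_slot(s);
            push_front(s);
            return s->data;
        }
    }
    /* Miss: allocate a new slot, or evict the least recently used. */
    if (nslots < MAX_SLOTS) {
        s = malloc(sizeof(*s));
        s->data = malloc(CHUNK_SIZE);
        nslots++;
    } else {
        s = tail;
        unlink_slot(s);
    }
    s->chunk_id = id;
    load_chunk(id, s->data);
    push_front(s);
    return s->data;
}

Turning this into LFU only changes the eviction choice: keep a use counter per slot and evict the slot with the smallest count instead of the tail.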

The operating system community should be a good source for such algorithms. Wikipedia gives a good starting point (http://en.wikipedia.org/wiki/Page_replacement_algorithm). 

In any case, keep in mind that caching to files is expensive in terms of wall-clock time, even when it costs little CPU time. So avoid it if you can.

Memory mapping is an alternative to reading the file. Sadly it is not really portable (Windows and Linux have different APIs).
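Here is a rough sketch of how the two APIs line up for mapping a whole file read-only. All error handling is omitted; treat it as an outline, not a drop-in implementation.

/* Map a whole file read-only on either platform. */
#ifdef _WIN32
#include <windows.h>

void *map_file(const char *path, size_t *len)
{
    HANDLE f = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    LARGE_INTEGER size;
    GetFileSizeEx(f, &size);
    HANDLE m = CreateFileMappingA(f, NULL, PAGE_READONLY, 0, 0, NULL);
    *len = (size_t)size.QuadPart;
    return MapViewOfFile(m, FILE_MAP_READ, 0, 0, 0);
}
#else
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

void *map_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    struct stat st;
    fstat(fd, &st);
    *len = (size_t)st.st_size;
    void *p = mmap(NULL, *len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    return p;
}
#endif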

When caching stuff, watch your access pattern. Going through the data linearly is bad for both LRU and LFU, because the algorithm cannot know beforehand which buffer you will need next. A simple read-ahead buffer is far better, because it does know which buffer comes next.
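For strictly linear access, a double-buffered read-ahead could look like the sketch below. It uses POSIX AIO (on Windows you would use overlapped I/O instead); the chunk size and helper names are assumptions for illustration.

/* Double-buffered read-ahead for strictly linear access:
 * while the caller works on one chunk, the next one is
 * already being read in the background. */
#include <aio.h>
#include <string.h>

#define CHUNK_SIZE (1 << 20)

static char bufs[2][CHUNK_SIZE];
static struct aiocb cb;   /* only one request in flight at a time */

/* Start an asynchronous read of chunk 'id' into buffer 'slot'. */
static void prefetch(int fd, long id, int slot)
{
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = bufs[slot];
    cb.aio_nbytes = CHUNK_SIZE;
    cb.aio_offset = (off_t)id * CHUNK_SIZE;
    aio_read(&cb);
}

/* Block until the outstanding read is done, return its buffer. */
static char *wait_chunk(int slot)
{
    const struct aiocb *list[1] = { &cb };
    aio_suspend(list, 1, NULL);
    aio_return(&cb);
    return bufs[slot];
}

void process_linearly(int fd, long nchunks)
{
    long i;
    prefetch(fd, 0, 0);
    for (i = 0; i < nchunks; i++) {
        char *cur = wait_chunk(i & 1);
        if (i + 1 < nchunks)
            prefetch(fd, i + 1, (i + 1) & 1);
        /* ... work on 'cur' while the next chunk loads ... */
        (void)cur;
    }
}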

One large file is better than 1000 small files. It saves the per-file overhead of finding the data that belongs to each file name.
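With all chunks stored back to back in one container file (an assumed layout), locating a chunk is a single seek instead of a path lookup plus open/close per chunk:

/* Read chunk 'id' from a single container file. */
#include <stdio.h>
#include <sys/types.h>

#define CHUNK_SIZE (1 << 20)

int read_chunk(FILE *container, long id, char *dst)
{
    /* fseeko takes a 64-bit off_t, needed for files over 2 GB
     * (the Windows equivalent is _fseeki64). */
    if (fseeko(container, (off_t)id * CHUNK_SIZE, SEEK_SET) != 0)
        return -1;
    return fread(dst, 1, CHUNK_SIZE, container) == CHUNK_SIZE ? 0 : -1;
}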

Hope these tips help.
-- 
Dr. Lars Krueger



