It'd be interesting to modify my bucketing system to store actual faces/strands per tile, not just references to them, and see if z-buffering goes faster.<div><br></div><div>BTW, what about reading one stream and using it to write another? That's basically what DSM does in a lot of its code. This is really interesting; I may need to do this sort of optimization (I'm nearing, or at, the limits of what algorithmic improvements can give me). If you can find that Google Tech Talk, that'd be really great.</div>
<div><br></div><div>Joe<br><br><div class="gmail_quote">On Fri, Dec 19, 2008 at 1:09 PM, Timothy Baldridge <span dir="ltr"><<a href="mailto:tbaldridge@gmail.com">tbaldridge@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="Ih2E3d">> I'm not sure how you'd avoid cache misses, though... we simply have to deal<br>
> with too much data. About the only thing I can think of is sorting<br>
> faces/strands per tile (I actually do this in my DSM branch) and using a<br>
> more optimal render order than simply going over the scanlines. The ray<br>
> tracing traversal could be made more efficient, but optimizing what the<br>
> renderer does in between could be more difficult.<br>
> You know, I think the CodeAnalyst profiling tool from AMD can measure cache<br>
> misses; I'll have to try to figure out how it works.<br>
<br>
</div>You cannot avoid all cache misses, but it is possible to avoid many<br>
of them. Modern CPUs load memory in 64-byte cache lines. This<br>
means that if you read one byte from memory, the CPU really loads<br>
64 bytes. Thus, if you can arrange data in such a way that it can be<br>
read and processed as sequential data, performance will be greatly<br>
improved.<br>
<br>
I wish I could find it, but there is an excellent video on YouTube<br>
from a Google Tech Talk. In the talk the speaker explains these<br>
caches, and shows that reading items from a linked list can be<br>
(IIRC) up to an order of magnitude slower than reading items from an<br>
array or vector. That is, if the entire set does not fit in the cache.<br>
This is because linked lists require a lot of jumping around in<br>
memory, which makes the cache much less useful.<br>
<font color="#888888"><br>
Timothy<br>
</font><div><div></div><div class="Wj3C7c">_______________________________________________<br>
Bf-committers mailing list<br>
<a href="mailto:Bf-committers@blender.org">Bf-committers@blender.org</a><br>
<a href="http://lists.blender.org/mailman/listinfo/bf-committers" target="_blank">http://lists.blender.org/mailman/listinfo/bf-committers</a><br>
</div></div></blockquote></div><br></div>