[Bf-committers] BGE optimisation tweaks.

Benoit Bolsee benoit.bolsee at online.be
Wed Sep 2 16:30:26 CEST 2009


Fast math has little influence on the game engine because the MT library
uses double as its scalar type. Fast math mostly speeds things up when
you work with float and the code is optimized for SSE: a 128-bit SSE
register holds four floats but only two doubles, so you can double the
speed straight away compared to using double as the scalar type.
The SSE option also has little influence on the game engine, since the
compiler cannot do much with it on general C++ code. You need to write
SSE code yourself to take full advantage of it. A good example is the
Eigen2 library, which implements SSE optimizations and cache-friendly
techniques for matrix algebra. When you combine float, fast math and SSE
you get an enormous speedup, also because the library avoids C++
overhead.
A nice performance boost could be expected in the GE if we replaced the
MT library with Eigen2.
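
To make the float versus double point concrete, here is a minimal
sketch of the lane arithmetic. It assumes the usual 4-byte float and
8-byte double, and the helper names (sse_lanes, packed_ops) are
hypothetical, not part of MT or Eigen2. It only counts packed
operations in the ideal case; it measures nothing.

```cpp
#include <cassert>
#include <cstddef>

// Width of one SSE (XMM) register in bytes: 128 bits.
const std::size_t kSseRegisterBytes = 16;

// Number of scalars of type T that fit in one SSE register.
// Assumes sizeof(float) == 4 and sizeof(double) == 8.
template <typename T>
std::size_t sse_lanes() {
    return kSseRegisterBytes / sizeof(T);
}

// Number of packed SSE operations needed to touch n scalars of
// type T once, rounding up when n is not a multiple of the lanes.
template <typename T>
std::size_t packed_ops(std::size_t n) {
    assert(n > 0);
    const std::size_t lanes = sse_lanes<T>();
    return (n + lanes - 1) / lanes;
}
```

For the 16 elements of a 4x4 matrix, packed_ops<float>(16) gives 4 and
packed_ops<double>(16) gives 8, which is the factor of two mentioned
above. Real gains depend on the code actually being vectorized.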

/benoit

On Wed, 2 Sep 2009 02:23:38 -0700, Mitchell Stokes
<mogurijin at gmail.com> wrote:
> 
> This is the BGE, not Blender itself. When it comes to a game 
> engine, precision isn't as important as speed. However, doing 
> some benchmarking, it seems that the two optimizations did 
> little for frame rates. Here are the results of the default 
> cube scene:
> 
> Base line (BGE_CXXFLAGS = ['/O2', '/EHsc', '/GR'])
> Min: 93.9956808054
> Max: 1032.34567385
> Average: 923.693778411
> 
> With "/fp:fast"
> Min: 88.6841658558
> Max: 1030.30926569
> Average: 926.200376381
> 
> With "/arch:SSE2"
> Min: 81.7896209204
> Max: 1027.67823461
> Average: 925.338981321
> 
> With "/fp:fast" and "/arch:SSE"
> Min: 66.4432752384
> Max: 1040.1976284
> Average: 920.833553036
> 
> Operating System: Windows 7 RC
> Graphics card: NVIDIA Geforce 8600m GT
> Processor: Intel Core 2 Duo T8100 (2.1GHz)
> 
> As one can see, the differences are negligible. When testing 
> I noticed that the average framerate would vary about 5~10fps 
> per run with the same flags/settings.
> 
> I also did a quick test with the YoFrankie! svn to test some 
> logic, etc. I wasn't as exact when recording results from 
> this, but just from watching the framerate, it looked like 
> there was little difference between the baseline and the 
> build with fp:fast and arch:SSE.
> 
> I encourage people to do their own benchmarks and testing as well!
> 
> Cheers,
> Mitchell Stokes (Moguri)
> 
> 
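
As a rough check on the quoted averages, the sketch below compares each
run with the baseline and with the run-to-run variation reported above.
The Run struct and helper names are hypothetical; the numbers are copied
from the quoted results, and the 5 fps threshold is an assumption taken
from the low end of the quoted 5~10fps variation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// One benchmark configuration and its reported average frame rate.
struct Run {
    const char* flags;
    double average_fps;
};

// Average of the baseline build (/O2 /EHsc /GR), from the quoted mail.
const double kBaselineAverage = 923.693778411;

// Averages of the three other builds, labelled as in the quoted mail.
const Run kRuns[] = {
    {"/fp:fast", 926.200376381},
    {"/arch:SSE2", 925.338981321},
    {"/fp:fast + /arch:SSE", 920.833553036},
};
const std::size_t kRunCount = sizeof(kRuns) / sizeof(kRuns[0]);

// Difference from the baseline average, in frames per second.
double delta_fps(const Run& run) {
    return run.average_fps - kBaselineAverage;
}

// Same difference as a percentage of the baseline average.
double relative_change_percent(const Run& run) {
    assert(kBaselineAverage > 0.0);
    return 100.0 * delta_fps(run) / kBaselineAverage;
}

// True when the difference is smaller than the given noise level.
bool within_noise(const Run& run, double noise_fps) {
    return std::fabs(delta_fps(run)) < noise_fps;
}
```

The deltas come out at about +2.5, +1.6 and -2.9 fps, so within_noise
with a 5 fps threshold holds for all three runs. That is consistent with
the conclusion that the flags made no measurable difference on this
scene.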


