[Bf-committers] optimising the code

Jean-Luc Peurière bf-committers@blender.org
Sat, 29 May 2004 01:00:24 +0200


Out of curiosity, I ran a profiler and code analyser (Shark on OS X) 
on Blender.

I was surprised to see the analyser pick out some serious bottlenecks 
that are fairly easy to fix.

By changing a few lines in *ONE* function I was able to get an 11% 
speed-up in rendering time!

1'44" against 1'56" on one particular blend (which uses a lot of noise 
in textures).

The time spent in the function orgBlenderNoise has also dropped from 
28% to 13% (which means that the whole function executes about 2 times 
faster).
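
As a rough cross-check of these figures, here is a minimal sketch in C that turns the quoted times and profiler shares into absolute seconds. The helper names (to_seconds, time_in_function, speedup) are made up for this illustration and are not part of Blender.

```c
#include <assert.h>

/* Convert a time given as minutes and seconds into seconds. */
static double to_seconds(int minutes, int seconds)
{
    return minutes * 60.0 + seconds;
}

/* Approximate absolute time spent in one function, given the total run
 * time and the share of the run the profiler attributed to it. */
static double time_in_function(double total_seconds, double share)
{
    return total_seconds * share;
}

/* Ratio old/new: values above 1.0 mean the new version is faster. */
static double speedup(double old_seconds, double new_seconds)
{
    assert(new_seconds > 0.0);
    return old_seconds / new_seconds;
}
```

With the numbers above, speedup(to_seconds(1, 56), to_seconds(1, 44)) is about 1.115, which matches the 11% quoted. time_in_function gives about 32.5 s (28% of 116 s) before and about 13.5 s (13% of 104 s) after, a ratio of about 2.4. The implied saving inside the function (about 19 s) is larger than the 12 s saved overall, so the profiler shares are best read as approximate.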

The changes were simply getting rid of various casts:

* floor --> floorf
* constants of the correct type (float)
* splitting 3 variables into int and float versions

Now I would be interested to know whether these simple changes have 
the same drastic effect on other platforms?

CVS diff:

Index: source/blender/blenlib/intern/noise.c
===================================================================
RCS file: /cvsroot/bf-blender/blender/source/blender/blenlib/intern/noise.c,v
retrieving revision 1.6
diff -r1.6 noise.c
260,261c260,261
<       float ox, oy, oz, jx, jy, jz;
<       float n= 0.5;
---
>       float ox, oy, oz, jx, jy, jz, ixf, iyf, izf;
>       float n= 0.5f;
264,266c264,272
<       ox= (x- (ix= (int)floor(x)) );
<       oy= (y- (iy= (int)floor(y)) );
<       oz= (z- (iz= (int)floor(z)) );
---
>       ixf= floorf(x);
>       iyf= floorf(y);
>       izf= floorf(z);
>       ox= x-ixf;
>       oy= y-iyf;
>       oz= z-izf;
>       ix= (int)ixf;
>       iy= (int)iyf;
>       iz= (int)izf;
268,270c274,276
<       jx= ox-1;
<       jy= oy-1;
<       jz= oz-1;
---
>       jx= ox-1.0f;
>       jy= oy-1.0f;
>       jz= oz-1.0f;
275,280c281,286
<       cn1= 1.0-3.0*cn1+2.0*cn1*ox;
<       cn2= 1.0-3.0*cn2+2.0*cn2*oy;
<       cn3= 1.0-3.0*cn3+2.0*cn3*oz;
<       cn4= 1.0-3.0*cn4-2.0*cn4*jx;
<       cn5= 1.0-3.0*cn5-2.0*cn5*jy;
<       cn6= 1.0-3.0*cn6-2.0*cn6*jz;
---
>       cn1= 1.0f-3.0f*cn1+2.0f*cn1*ox;
>       cn2= 1.0f-3.0f*cn2+2.0f*cn2*oy;
>       cn3= 1.0f-3.0f*cn3+2.0f*cn3*oz;
>       cn4= 1.0f-3.0f*cn4-2.0f*cn4*jx;
>       cn5= 1.0f-3.0f*cn5-2.0f*cn5*jy;
>       cn6= 1.0f-3.0f*cn6-2.0f*cn6*jz;

-- 
Jean-luc