Transcoding and codec optimization: tips & tricks
Use the combined instruction PMADDWD
A complex instruction such as PMADDWD (Multiply and Add Packed Integers) can save many clock cycles by replacing a sequence of simpler instructions.

Before: (24 clock cycles)


pmullw xmm1, xmm6	(8)
punpcklwd xmm2, xmm1	(2)
punpckhwd xmm3, xmm1	(2)
psrad xmm3, 16		(2)
psrad xmm2, 16		(2)
paddd xmm3, xmm2	(2)
movq xmm2, xmm3		(2)
psrlq xmm2, 32		(2)
paddd xmm3, xmm2	(2)

After: (18 clock cycles)


pmaddwd xmm1, xmm6	(8)
movq xmm2, xmm1		(2)
psrlq xmm2, 32		(2)
paddd xmm2, xmm1	(2)
psrldq xmm1, 8		(2)
paddd xmm1, xmm2	(2)

By using the PMADDWD instruction, we combine two operations, multiplication and addition, into one, which saves six clock cycles in this case (24 down to 18).
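For readers working in C rather than assembly, the same idea can be sketched with the SSE2 intrinsic _mm_madd_epi16, which maps to PMADDWD. This is only a minimal sketch under stated assumptions, not the article's code: the helper name dot6 is hypothetical, both inputs are assumed to hold six 16-bit values padded to eight words with zeros, and a scalar fallback is included for builds without SSE2. The horizontal add mirrors the "After" listing, which sums the three low doublewords.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>          /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Hypothetical helper: dot product of six 16-bit samples and six 16-bit
   coefficients. Both arrays hold 8 words; words 6 and 7 must be zero. */
static int32_t dot6(const int16_t pix[8], const int16_t coef[8])
{
#ifdef HAVE_SSE2
    __m128i p = _mm_loadu_si128((const __m128i *)pix);
    __m128i c = _mm_loadu_si128((const __m128i *)coef);

    /* pmaddwd: d[k] = p[2k]*c[2k] + p[2k+1]*c[2k+1], for k = 0..3 */
    __m128i d = _mm_madd_epi16(p, c);

    __m128i t = _mm_move_epi64(d);      /* movq:   t = [d0, d1, 0, 0]  */
    t = _mm_srli_epi64(t, 32);          /* psrlq:  t = [d1, 0, 0, 0]   */
    t = _mm_add_epi32(t, d);            /* paddd:  t[0] = d0 + d1      */
    d = _mm_srli_si128(d, 8);           /* psrldq: d = [d2, d3, 0, 0]  */
    d = _mm_add_epi32(d, t);            /* paddd:  d[0] = d0 + d1 + d2 */

    return _mm_cvtsi128_si32(d);        /* low doubleword holds the sum */
#else
    /* Scalar fallback: what the SIMD path computes for six taps */
    int32_t sum = 0;
    for (int k = 0; k < 6; k++)
        sum += (int32_t)pix[k] * coef[k];
    return sum;
#endif
}
```

Calling dot6 with the samples {1, 2, 3, 4, 5, 6, 0, 0} and the assumed coefficients {1, -5, 20, 20, -5, 1, 0, 0} returns 112. The zero padding matters: the fourth doubleword produced by PMADDWD is never added, so it must not carry data.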

Eliminate Branching conditions
Original


for (j = 0; j < 4; j++) {
	for (i = 0; i < 4; i++) {
		/* six-tap filter; every coordinate is clamped to the frame */
		for (result = 0, z = -2; z < 4; z++)
			result += list[fr]->test[max(0,min(max_y,y+j))]
					[max(0,min(max_x,x+i+z))]*COEF[z+2];
		block[i][j] = max(0, min(255, (result+16)));
	}
}
		

Optimized #1: Eliminate Branches


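/* Fast path: the 4x4 block and all of its filter taps lie inside the frame.
   Bitwise & is used so the four tests need not short-circuit. */
if ((x+6 < max_x) & (y+3 < max_y) & (0 < y) & (0 < x-2)) {
	for (j = 0; j < 4; j++) {
		for (i = 0; i < 4; i++) {
			for (result = 0, z = -2; z < 4; z++)
				result += list[fr]->test[y+j][x+i+z]*COEF[z+2];
			block[i][j] = max(0, min(255, (result+16)));
		}
	}
} else {
	/* Slow path: clamp every coordinate, as in the original */
	for (j = 0; j < 4; j++) {
		for (i = 0; i < 4; i++) {
			for (result = 0, z = -2; z < 4; z++)
				result += list[fr]->test[max(0,min(max_y,y+j))]
						[max(0,min(max_x,x+i+z))]*COEF[z+2];
			block[i][j] = max(0, min(255, (result+16)));
		}
	}
}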
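To try the fast path and the clamped path side by side, here is a minimal compilable sketch. The frame size, the function names filter_clamped and filter_block, the helper clampi, and the coefficient values are all hypothetical stand-ins for list[fr]->test, max_x, max_y and COEF. The interior test uses the furthest offsets the loops actually reach (x-2, x+6 and y+3).

```c
#include <stdint.h>

#define WIDTH  16                 /* hypothetical frame size */
#define HEIGHT 16

static const int COEF[6] = {1, -5, 20, 20, -5, 1};   /* assumed six-tap filter */

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Slow path: every coordinate is clamped, as in the original loop. */
static void filter_clamped(uint8_t img[HEIGHT][WIDTH], int x, int y,
                           int block[4][4])
{
    const int max_x = WIDTH - 1, max_y = HEIGHT - 1;
    for (int j = 0; j < 4; j++)
        for (int i = 0; i < 4; i++) {
            int result = 0;
            for (int z = -2; z < 4; z++)
                result += img[clampi(y + j, 0, max_y)]
                             [clampi(x + i + z, 0, max_x)] * COEF[z + 2];
            block[i][j] = clampi(result + 16, 0, 255);
        }
}

/* One test per block selects the clamp-free loop for interior blocks. */
static void filter_block(uint8_t img[HEIGHT][WIDTH], int x, int y,
                         int block[4][4])
{
    const int max_x = WIDTH - 1, max_y = HEIGHT - 1;
    if ((x + 6 < max_x) & (y + 3 < max_y) & (0 < y) & (0 < x - 2)) {
        for (int j = 0; j < 4; j++)
            for (int i = 0; i < 4; i++) {
                int result = 0;
                for (int z = -2; z < 4; z++)
                    result += img[y + j][x + i + z] * COEF[z + 2];
                block[i][j] = clampi(result + 16, 0, 255);
            }
    } else {
        filter_clamped(img, x, y, block);   /* near or outside an edge */
    }
}
```

For any block position, filter_block should produce the same 4x4 output as filter_clamped. The difference is that interior blocks skip the 2 x 96 min/max evaluations inside the inner loop.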
		       
       

Optimized #2: Eliminate Branching and Unroll Inner Loops


if ((x+6 < max_x) & (y+3 < max_y) & (0 < y) & (0 < x-2)) {
	for (j = 0; j < 4; j++) {
		/* inner z loop unrolled for column 0 */
		result  = list[fr]->test[y+j][x-2]*COEF[0];
		result += list[fr]->test[y+j][x-1]*COEF[1];
		result += list[fr]->test[y+j][x]*COEF[2];
		result += list[fr]->test[y+j][x+1]*COEF[3];
		result += list[fr]->test[y+j][x+2]*COEF[4];
		result += list[fr]->test[y+j][x+3]*COEF[5];
		block[0][j] = max(0, min(255, (result+16)));
		...
		block[1][j] = max(0, min(255, (result+16)));
		...
		block[2][j] = max(0, min(255, (result+16)));
		...
	}
} else {
	for (j = 0; j < 4; j++) {
		for (i = 0; i < 4; i++) {
			for (result = 0, z = -2; z < 4; z++)
				result += list[fr]->test[max(0,min(max_y,y+j))]
						[max(0,min(max_x,x+i+z))]*COEF[z+2];
			block[i][j] = max(0, min(255, (result+16)));
		}
	}
}
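The elided steps of the unrolled version presumably repeat the same six taps, shifted one column at a time. The sketch below shows one plausible way to write a fully unrolled row and check it against the loop form. It is an illustration under assumptions, not the article's code: the names filter_row_loop, filter_row_unrolled, TAP6 and clamp255 are hypothetical, the COEF values are assumed, and row is assumed to point at an interior pixel so that row[-2] through row[6] are valid.

```c
#include <stdint.h>

static const int COEF[6] = {1, -5, 20, 20, -5, 1};   /* assumed values */

static int clamp255(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Reference: loop form for one row of four outputs. */
static void filter_row_loop(const uint8_t *row, int out[4])
{
    for (int i = 0; i < 4; i++) {
        int result = 0;
        for (int z = -2; z < 4; z++)
            result += row[i + z] * COEF[z + 2];
        out[i] = clamp255(result + 16);
    }
}

/* Six taps centred on p[0], written out with no loop counter or test. */
#define TAP6(p) ((p)[-2] * COEF[0] + (p)[-1] * COEF[1] + (p)[0] * COEF[2] + \
                 (p)[1]  * COEF[3] + (p)[2]  * COEF[4] + (p)[3] * COEF[5])

/* Unrolled form: both the z loop and the i loop are expanded. */
static void filter_row_unrolled(const uint8_t *row, int out[4])
{
    out[0] = clamp255(TAP6(row)     + 16);
    out[1] = clamp255(TAP6(row + 1) + 16);
    out[2] = clamp255(TAP6(row + 2) + 16);
    out[3] = clamp255(TAP6(row + 3) + 16);
}
```

With the row samples 0, 1, 2, 3, 4, 5 around the first output pixel, both functions give 96 for out[0]. Unrolling removes the loop counter updates and the loop-exit comparisons, at the cost of larger code.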
		 
     
