Transcoding and codec optimization: tips & tricks
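Use the combined instruction PMADDWD
A complex instruction such as PMADDWD (Multiply and Add) can save many clock cycles: it multiplies packed signed 16-bit words and adds adjacent pairs of the 32-bit products in a single instruction.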
Before: (24 clock cycles)
After: (18 clock cycles)
By using the PMADDWD instruction, we combine two operations, multiplication and addition, into one, which saves six clock cycles in this case (18 instead of 24).
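As an illustration of the idea (a sketch, not the listing the cycle counts above were measured on), the following C code computes a 16-bit dot product twice: once with a separate multiply and add per element, and once with the SSE2 intrinsic `_mm_madd_epi16`, which compiles to PMADDWD. The function names `dot_scalar` and `dot_pmaddwd` are hypothetical, the use of SSE2 rather than MMX registers is an assumption, and the code falls back to the scalar loop when SSE2 is not available. It also assumes the accumulated sum fits in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics, including _mm_madd_epi16 (PMADDWD) */
#endif

/* Reference version: one multiply and one add per element. */
int32_t dot_scalar(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
}

/* PMADDWD version (hypothetical helper): each _mm_madd_epi16 multiplies
   eight 16-bit pairs and adds adjacent products, giving four 32-bit
   partial sums per instruction. */
int32_t dot_pmaddwd(const int16_t *a, const int16_t *b, size_t n)
{
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        acc = _mm_add_epi32(acc, _mm_madd_epi16(va, vb));
    }

    /* Horizontal sum of the four 32-bit lanes. */
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    int32_t sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Scalar tail for the elements that do not fill a full vector. */
    for (; i < n; i++)
        sum += (int32_t)a[i] * b[i];
    return sum;
#else
    /* No SSE2 on this target: fall back to the scalar loop. */
    return dot_scalar(a, b, n);
#endif
}
```

Both functions should return the same value for the same input; any actual cycle savings depend on the processor and the compiler, so they are worth measuring rather than assuming.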
Eliminate branching conditions
Original
Optimized #1: Eliminate Branches
Optimized #2: Eliminate Branching and Unroll Inner Loops
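The three stages can be sketched in C on a small, assumed example: a sum of absolute differences over two pixel rows, a common kernel in video codecs. This is not the original listing. The function names (`sad_branchy`, `sad_branchless`, `sad_unrolled`, `abs_diff`) are hypothetical, and the unroll factor of four is an arbitrary choice for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Original: a data-dependent branch for every pixel. */
uint32_t sad_branchy(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > b[i])
            sum += (uint32_t)(a[i] - b[i]);
        else
            sum += (uint32_t)(b[i] - a[i]);
    }
    return sum;
}

/* Branch-free absolute difference: mask is 0 when d >= 0 and all ones
   when d < 0, so (d ^ mask) - mask negates d only when it is negative. */
static inline uint32_t abs_diff(uint8_t x, uint8_t y)
{
    int32_t d = (int32_t)x - (int32_t)y;
    int32_t mask = -(int32_t)(d < 0);
    return (uint32_t)((d ^ mask) - mask);
}

/* Optimized #1: same loop, with the if/else replaced by arithmetic. */
uint32_t sad_branchless(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += abs_diff(a[i], b[i]);
    return sum;
}

/* Optimized #2: branch-free body, loop unrolled by four to cut the
   loop-counter compare and jump overhead per pixel. */
uint32_t sad_unrolled(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        sum += abs_diff(a[i],     b[i]);
        sum += abs_diff(a[i + 1], b[i + 1]);
        sum += abs_diff(a[i + 2], b[i + 2]);
        sum += abs_diff(a[i + 3], b[i + 3]);
    }

    /* Remainder loop for lengths that are not a multiple of four. */
    for (; i < n; i++)
        sum += abs_diff(a[i], b[i]);
    return sum;
}
```

All three functions return the same result. Modern compilers may already remove the branch or unroll the loop on their own, so it is worth checking the generated code and timing each variant before committing to the manual version.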