Loop Optimization--Part 3 of Chapter 10 from Speed Up Your Site (4/6)--WebReference.com
[previous] [next] |
Speed Up Your Site, Chapter 10: Optimizing JavaScript for Execution Speed
Fast Duff's Device
You can avoid the complex do/switch
logic by unrolling Duff's
Device into two loops. So instead of the original, do this (see Listing
10.17):
Listing 10.17 Fast Duff's Device
function duffFastLoop8(iterations) {
// from an anonymous donor to Jeff Greenberg's site
var testVal=0;
var n = iterations % 8;
while (n--)
{
testVal++;
}
n = parseInt(iterations / 8);
while (n--)
{
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
}
}
This technique is about 36 percent faster than the original Duff's
Device on IE5 Mac. Even better, optimize the loop constructs by converting
the while
decrement to a do while
pre-decrement like
this (see Listing 10.18):
Listing 10.18 Faster Duff's Device
function duffFasterLoop8(iterations) {
var testVal=0;
var n = iterations % 8;
if (n>0) {
do
{
testVal++;
}
while (--n); // n must be greater than 0 here
}
n = parseInt(iterations / 8);
do
{
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
testVal++;
}
while (--n);
}
This optimized Duff's Device is 39 percent faster than the original
and 67 percent faster than a normal for
loop (see Table 10.3).
Table 10.3 Duff's Device Improved
500,125 Iterations |
Normal |
Duff's Device |
Duff's Fast |
Duff's Faster |
Total time (ms) |
1437 |
775 |
493 |
469 |
Cycle time (ms) |
0.00287 |
0.00155 |
0.00099 |
0.00094 |
How Much to Unroll?
To test the effect of different degrees of loop unrolling, I tested large iteration loops with between 1 and 15 identical statements for the Faster Duff's Device. Table 10.4 shows the results.
Table 10.4 Faster Duff's Device Unrolled
Duff's Faster |
1 Degree |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
Total time (ms) |
925 |
661 |
576 |
533 |
509 |
490 |
482 |
469 |
467 |
457 |
453 |
439 |
437 |
433 |
433 |
Cycle time (ms) |
0.00184 |
0.00132 |
0.00115 |
0.00106 |
0.00101 |
0.00097 |
0.00096 |
0.00093 |
0.00093 |
0.00091 |
0.00090 |
0.00087 |
0.00087 |
0.00086 |
0.00086 |
As you can see in Table 10.4, the effect diminishes as the degree of
loop unrolling increases. Even after two statements, the time to loop
through many iterations is less than 50 percent of a normal for
loop. Around seven statements, the time is cut by two-thirds. Anything
over eight reaches a point of diminishing returns. Depending on your requirements,
I recommend that you choose to unroll critical loops by between four and
eight statements for Duff's Device.
Fuse Loops
If you have two loops in close proximity that use the same number of iterations (and don't affect each other), you can combine them into one loop. So instead of this:
for (i=0; i<j; i++) {
sumserv += serv(i);
}
for (i=0; i<j; i++) {
prodfoo *= foo(i);
}
Do this:
for (i=0; i<j; i++) {
sumserv += serv(i);
prodfoo *= foo(i);
}
Fusing loops avoids the additional overhead of another loop control structure and is more compact.
[previous] [next] |
Created: January 27, 2003
Revised: January 27, 2003
URL: https://webreference.com/programming/optimize/speedup/chap10/3/4.html