C provides several looping constructs---namely. do-while
,while
,and for
.No corresponding instructions exist in machine code. Instead, combinations of conditional tests and jumps are used to implement the effect of loops.GCC and other compilers generate loop code based on the two basic loop patterns. We will study the translation of loops as a progression(发展;前进;进程),starting with do-while
and then working toward ones with more complex implementations,convering both patterns.
1. Do-While Loops
The general form of a do-while
statement is as follows:
do
body-statement
while (test-expr);
The effect of the loop is to repeatedly execute body-statement
,evaluatetest-expr
,and continue the loop if the evaluation result is nonzero. Observe that body-statement
is executed at least once.
This general form can be translated into conditionals and goto
statements as follows:
loop:
body-statement
t = test-expr;
if (t)
goto loop;
This is, on each iteration the program evaluates the body statement and then the test expression. If the test succeeds, the program goes back for another iteration
(a)C code
long fact_do(long n)
{
long result = 1;
do {
result *= n;
n = n - 1;
} while (n > 1)
return result;
}
(b)Equivalent goto version
long fact_do_goto(long n)
{
long result = 1;
loop:
result *= n;
n = n - 1;
if (n > 1)
goto loop;
return result;
}
(c) Corresponding assembly-language code
// long fact_do(long n) n in %rdi
fact_do:
movl $1, %eax
.L2:
imulq %rdi, %rax
subq $1, %rdi
cmpq $1, %rdi
jg .L2
rep; ret
Reverse engineering assembly code, such as that of (c),requires determining which registers are used for which program values.(,
).In this case the mapping is fairly simple to determine: We know that
n
will be passed to the function in register%rdi
. We can see register%rax
getting initialized to 1(Recall that, although the instruction has %eax
as its destination, it will also set the upper 4 bytes of %rax
to 0.)We can see that this register is also updated by multiplication on line 4. Furthermore, since %rax
is used to return the function value, it is often chosen to hold program values that are returned. We therefore conclude that %rax
corresponds to program value result
.
2. While Loops
The general form of a while statement is as follows:
while (test-expr)
body-statement
It differs from do-while
in that test-expr
is evaluated and the loop is potentially terminated before the first execution of body-statement
.There are a number of a ways to translate a while
loop into machine code, two of which are used in code generated by GCC.Both use the same loop structure as we saw for do-while
loops but differ in how to implement the initial test.(,
,
,
)
The first translation method,which we refer to asjump to middle
,performs the initial test by performing an unconditional jump to the test at the end of the loop.It can be expressed by the following template for translating from the general while
loop form to goto code:
goto test;
loop:
body-statement
test:
t = test_expr;
if (t)
goto loop;
The second translation method,which we refer to as guarded do
,first transforms the code into a do-while
loop by using a conditional branch to skip over the loop if the initial test fails.(首先使用条件分支,如果条件分支不成立就跳过循环,把代码变换为do-while
循环)GCC follows this strategy when compiling with higher levels of optimization, for example, with command-line opition -O1
.This method can be expressed by the following template for translating form the general while
loop from to ado-while
loop:
t = test-expr;
if (!t)
goto done;
do
body-statement
while (test-expr);
done:
This, in turn, can be transformed into goto code
as
t = test-expr;
if (!t)
goto done;
loop:
body-statement
t = test-expr;
if (t)
goto loop;
done:
Using this implementation strategy,the compiler can often optimize the initial test, for example, determining that the test condition will always hold.
3.For Loops
The general form of a for
loop is as follows:
for (init-expr; test-expr; update-expr)
body-statement
The C language standard states(with one exception, highlighted in Problem3.29)that the behavior of such a loop is identical to the following code using a while
loop:
init-expr;
while (test-expr) {
body-statement;
update-expr;
}
The program first evaluates the initialization expressioninit-expr
.It enters a loop where it first evaluates the test condition test-expr
,exiting if the test fails,then executes the body of the loop body-statement
,and finally evaluates the update expression update-expr
.
The code generated by GCC for a for
loop the follows one of our two translation strategies for while
loops, depending on the optimization level.That is , the jump-to-middle strategy yields the goto code
init-expr;
goto test;
loop:
body-statement;
update-expr;
test:
t = test-expr;
if (t)
goto loop;
while the guarded-do
strategy yields
init-expr;
t = test-expr;
if (!t)
goto done;
loop:
body-statement;
update-expr;
t = test-expr;
if (t)
goto loop;
done:
As examples,consider a factorial function written with a for
loop:
long fact_for (long n)
{
long i;
long result = 1;
for (i = 2; i <= n; i++)
result *= i;
return result;
}
As shown,the natural way of writing a factorial function with a for
loop is to multiply factors from 2
to n
, and so this function is quite different from the code we showed using either a while
or a do-while
loop.
We can identify the different components of the for
loop in this code as follows:
init-expr i = 2
test-expr i <= n
update-expr i ++
body-statement result *= i;
Substituting these components into the template we have shown to transform a for
loop into a while
loop yields the following:
long fact_for_while(long n)
{
long i = 2;
long result = 1;
while (i <= n) {
result *= i;
i++;
}
return result;
}
Applying the jump-to-middle transformation to the while
loop then yields the following version in goto code:
long fact_for_jm_goto(long n)
{
long i = 2;
long result = 1;
goto test;
loop:
result *= i;
i++;
test:
if (i <= n)
goto loop;
return result;
}
Indeed, a close examination of the assembly code produced by GCC with command-line option Og
closely follows this template:
long fact_for(long n)
n in %rdi
fact_for:
movl $1, %eax
movl $2, %edx
jmp .L8
.L9:
imulq %rdx, %rax
addq $1, %rdx
.L8:
cmpq %rdi, %rdx
jle .L9
rep; ret
We see from this presentation that all three forms of loops in C-do-while
,while
,and for
-can be translated by a simple strategy,generating code that contains one or more conditional branches.Conditional transfer of control provides the basic mechanism for translating loops into machine code.
网友评论