Programmer14's Blog: ARM7TDMI program that computes polynomial

Let's write a program in ARM assembly language that computes the value of a polynomial. A polynomial of degree 2 is an expression of the form a*x**2 + b*x + c, where a, b, and c are constants and x is an input parameter.

Example: factorial calculation

Before writing the polynomial program, let's look at another program that involves multiplication. n factorial, written n!, is defined for an integer n as

n! = n * (n-1) * (n-2) * ... * 1

The following is a modified version of the factorial program from Chapter 3 of ARM Assembly Language Fundamentals and Techniques, Second Edition:

Listing: factorial.s

; Compute n!
; The input parameter n is in r6.
; The output, n!, will be written in r7.

        .text                       ; AREA Prog2,CODE,READONLY
        .global _start              ; ENTRY
_start:
        mov     r6, #10             ; n = r6 = 10;
        mov     r7, #1              ; fact = r7 = 1;
loop:   cmp     r6, #0
        mulgt   r7, r6, r7          ; if (n > 0) fact = n * fact;
        subgt   r6, r6, #1          ; if (n > 0) n = n - 1;
        bgt     loop                ; if (n > 0) goto loop;

stop:   B       stop
        .end                        ; END

Problem description: compute polynomial

Write a program for ARM7TDMI that computes 6x**2 - 9x + 2. Assume x is initially in r3. Assume x is nonnegative. Write the result in r2.

Solution

First, write a version in Python:

Listing: poly.py

I32_MIN = -2**31
I32_MAX = 2**31-1

def poly(x, f=lambda x: 6*x**2 - 9*x + 2):
    r = f(x)
    if (r < I32_MIN):
        print('*** Result is too small: {}'.format(r))
        return None
    elif (r > I32_MAX):
        print('*** Result is too large: {}'.format(r))
        return None
    else:
        return r

For this version I added a check for 32-bit signed integer overflow. For the real program, no check will be done.

According to Brian Kernighan and P. J. Plauger, we should test programs at boundaries -- low values and high values in this case. If our program works at the boundaires, it will probably work for the values in between.

>>> poly(0)
2
>>> poly(8)
314
>>> poly(10)
512

For the largest input, I want the largest x value whose result fits in a signed 32-bit integer. Mathematically speaking, I want to solve 6*x**2 - 9*x + 2 = I32_MAX for x. Set the right-hand side to zero to make it easily solvable:

6*x**2 - 9*x + 2 - I32_MAX = 0

Now we can use the quadratic formula:

Listing: poly.py

def find_maxx(eq=[6, -9, 2-I32_MAX, 0]):
    """ Solve linear equation given by eq = [a,b,c,rhs]. """
    a = eq[0]    
    b = eq[1]
    c = eq[2]
    assert eq[3]==0, "rhs must be 0."
    # Use quadratic formula.
    r1 = (-b + sqrt(b**2 - 4*a*c)) / (2*a)
    r2 = (-b - sqrt(b**2 - 4*a*c)) / (2*a)
    return int(max(r1,r2))

>>> find_maxx()
18919
>>> poly(18919)
2147401097L
>>> poly(18920)
*** Result is too large: 2147628122

To compute the polynomial using ARM instructions, we need to convert the high-level expression 6*x**2 - 9*x + 2 into more basic operations:

Listing: poly.py

def f2(x):
    """ 6*x**2 - 9*x + 2
    """
    term1 = x
    term2 = term1
    term3 = 2
    #
    term1 *= term1
    term1 *= 6
    term2 *= -9
    #
    res = term1
    res += term2
    res += term3
    return res

Verify that the same answers are produced when using the low-level version f2:

>>> x=0; (poly(x), poly(x,f2))
(2, 2)
>>> x=8; (poly(x), poly(x,f2))
(314, 314)
>>> x=10; (poly(x), poly(x,f2))
(512, 512)
>>> x=18919; (poly(x), poly(x,f2))
(2147401097L, 2147401097L)

To translate the f2 procedure to assembly language instructions, write each statement as a comment and then come up with suitable register assignments:

Listing: poly.s

        ; term1 = x         // x: r3, term1: r3
        ; term2 = term1     // term2: r2
        ; term3 = 2         // term3: immediate

I have used the notation x: r3, term1: r3 to mean that I want to store x in register r3 and term1 in r3. For the first statement (term1 = x), this means that no instruction is needed; the source and destination registers are the same. For the multiplications I expand *= to more closely reflect the assembly language equivalent:

Listing: poly.s

        ; term1 = term1 * term1 
        ; term1 = term1 * 6
        ; term2 = term2 * -9

For the final paragraph, the following needs adjustment:

Listing: poly.s

        ; res = term1       // !!! res: r2
        ; res = res + term2
        ; res = res + term3

I want to assign res to r2, as required by the problem statement, but on the line with !!!, term2 is already assigned to r2; performing this assignment would destroy term2's contents. I thought about using another register, but in this case, the following rearrangement suffices:

Listing: poly.s

        ; res = term2       // res: r2
        ; res = res + term1       
        ; res = res + term3

In this form, translation to an assembly language program is straightforward:

Listing: poly.s

; Compute the value for the polynomial 6*x**2 - 9*x + 2
; Input parameter x is in r3.
; The result will be written to r2.

        .text
        .global _start
_start:

        ;ldr        r3, =0      ; Example input: x = 0. Expected result: 2
        ;ldr        r3, =8      ; Example input: x = 8. Expected result: 314 (0x13a)
        ;ldr        r3, =10     ; Example input: x = 10. Expected result: 512 (0x200)
        ;ldr        r3, =18919  ; Example input: x = 18919. Expected result: 2'147'401'097 (0x7ffebd89)

                            ; term1 = x     // x: r3, term1: r3
        mov     r2, r3      ; term2 = term1 // term2: r2
                            ; term3 = 2     // term3: immediate

        mul     r3, r2, r3  ; term1 = term1 * term1   // Note: term2 = term1
        mov     r1, #6
        mul     r3, r1, r3  ; term1 = 6 * term1
        mov     r1, #-9
        mul     r2, r1, r2  ; term2 = -9 * term2

                            ; res = term2   // res: r2, term2: r2
        add     r2, r2, r3  ; res = res + term1
        add     r2, r2, #2  ; res = res + term3

stop:   B       stop
        .end                        ; END

Note 1: The instruction

mul    r2, r2, r1   ; term1 = term1 * 6

is not permitted by the ARM ISA and must be reversed to

mul     r2, r1, r2   ; term1 = 6 * term1

For the same reason, squaring term1 is not possible using the original instruction (mul r3, r3, r3), but since the initial values of term1 and term2 are equal, the square can be computed with this functional equivalent:

mul     r3, r2, r3     ; term1 = term1 * term

Note 2: To multiply each term by a constant, I employed a scratch register. For example, the following two instructions perform the multiplication of term1 by 6:

mov     r1, #6
mul     r3, r1, r3 ; term1 = 6 * term1

Running in ARMSim#

We can run the program in the simulator (ARMSim#) to verify correct operation:

Save the assembly source file as poly.s.
Start ARMSim#.
Uncomment one of the LDR pseudoinstructions near the top of the file to select a test input value.
Click File -> Load and select poly.s.
The file should load without errors.

If there are errors, make appropriate changes to the source file and then click File -> Reload in ARMSim#.

Set a breakpoint on the line with instruction B stop.
Click Run.
Examine the contents of r2 at the end of the program to verify the result is correct.

References

ARM Assembly Language Fundamentals and Techniques, Second Edition. Hohl, William et al. CRC Press. 2014.
Software Tools in Pascal. Kernighan, Brian et al. Addison-Wesley. 1976.
Learning about ARM architecture using ARMSim#.
ARMSim# - A Simulator for the ARM Architecture.

Programmer14's Blog

03 November 2017

ARM7TDMI program that computes polynomial

Example: factorial calculation

Problem description: compute polynomial

Solution

Running in ARMSim#

References

Files

No comments:

Post a Comment