Machine SUIF Back-end for the ARM Architecture

Grégory Théoduloz
Darío Suárez Gracia


The ARM architecture is the most widespread architecture in the embedded market. Machine SUIF, a flexible and retargetable compiler infrastructure, lacked a back-end for it, so we developed a first implementation of such a back-end that supports ARMv5 and the FPA coprocessor.

In addition to the target library (i.e., the library that describes the architecture and makes it possible to generate and manipulate code for it), some optimisation passes have been developed to improve performance by taking advantage of advanced architectural features. The provided passes are:

  • Complex addressing modes generation (cplxldst)
  • Conversion of if to conditional instructions (armif2cond)
  • Constant and copy propagation on ARM code (armmovp)

Download

  • ARM back-end with optimisation passes (.tar.gz)
  • Project report (.pdf)

Installation

  1. Download the archive from the link above
  2. Uncompress it
  3. Define the following environment variable to match your installation:
    • LOCAL_BASE: Root of the installation. Machine SUIF will create bin, solib and include subdirectories there. The compiled back-end and passes will be installed in that directory.
  4. You may have to modify the following environment variables as well:
    • LD_LIBRARY_PATH: $LOCAL_BASE/solib should be added to it to make the libraries available
    • PATH: $LOCAL_BASE/bin should be added to it
  5. Compile and install the back-end and its accompanying passes by running the following command in the distribution's src directory: gmake all

Use of the Back-end

Generating ARM code is simply done by passing the argument -target_lib arm to the do_gen pass. The optimisation passes require special care, since they have to be run at specific points in the compilation flow. A typical compilation flow is the following:

c2s foo.c
do_lower foo.suif foo.lsf
do_s2m foo.lsf foo.svm
do_il2cfg foo.svm foo.svmcfg
do_cplxldst -setbrp 1 foo.svmcfg foo.cplx

do_dce foo.cplx foo.cplxdce
do_cfg2il foo.cplxdce foo.svmopt
do_gen -target_lib arm foo.svmopt foo.avr
do_il2cfg foo.avr foo.afg
do_armmovp -maxiter 10 foo.afg foo.afgp
do_armdce foo.afgp foo.afgopt

do_raga foo.afgopt foo.ara
do_cfg2il foo.ara foo.ail
do_armif2cond -max-block-size 8 foo.ail foo.ailcond
do_fin foo.ailcond foo.asa
do_m2a foo.asa foo.s
arm-linux-gcc -c foo.s -o foo.o

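The steps specific to ARM are the do_cplxldst, do_armmovp, do_armdce and do_armif2cond passes, together with the do_gen invocation using -target_lib arm. More details on the limitations and optimisation passes are provided in the following sections.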

Limitations

Variadic Functions

The provided back-end should be able to compile any valid C code unless it contains the implementation of a variadic function, i.e., a function that takes a varying number of parameters. Such functions (e.g. fprintf) can be called but cannot be implemented. For instance, the first two examples below generate a compile-time error, while the third one does not.

Example 1 (non-working)

#include <varargs.h>

void f(va_alist)
    va_dcl
{
    va_list ap;
    va_start(ap);
    while (...) {
        int s = va_arg(ap, int);
        /* some more code */
    }
    va_end(ap);
}

Example 2 (non-working)

#include <stdarg.h>

void f(char *s, ...)
{
    va_list ap;
    va_start(ap, s);
    while (...) {
        int n = va_arg(ap, int);
        /* some more code */
    }
    va_end(ap);
}

Example 3 (working)

#include <stdio.h>

/* Variadic function declaration (normally provided by <stdio.h>) */
int fprintf(FILE *fp, const char *fmt, ...);

/* Variadic function call */
void f(const char *name) {
    fprintf(stdout, "Hello %s!", name);
}
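
When a function like the one in Example 2 has to be compiled with this back-end, one possible workaround is to pass the values through an array with an explicit count. This is not part of the distribution and assumes the call sites can be changed. The sketch below is only an illustration, and the names sum_ints, values and count are hypothetical.

```c
#include <stddef.h>

/* Hypothetical non-variadic replacement for a function that would
   otherwise read its arguments with va_arg: the caller passes the
   values in an array together with their count. */
int sum_ints(const int *values, size_t count)
{
    int total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

A caller would fill a local array, e.g. int v[] = {1, 2, 3}, and call sum_ints(v, 3). Since the function contains no va_start or va_arg, it should fall within the code the back-end accepts.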

Optimisation Passes

Complex Addressing Modes Generation

Description

The ARM architecture specifies several addressing modes. Some of them can be used to significantly improve performance. The current implementation supports one typically useful addressing mode: scaled register offset. The pass do_cplxldst will replace loads and stores with an ANY instruction whenever the scaled register offset addressing mode can be used. The code generation pass later translates ANY instructions back to an appropriate ARM load or store. Consequently do_cplxldst has to be run on suifVM code, before the do_gen pass.
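
As an illustration of where this addressing mode applies, the C sketch below indexes an array of int. The function names are hypothetical and the ARM instructions in the comments are hand-written to show the scaled register offset form. They are not output captured from do_cplxldst, and whether the pass matches a given access depends on how the earlier passes lowered it.

```c
#include <stddef.h>

/* Reads a[i] from an array of int (4 bytes on ARM).
   The address is a + (i << 2), which fits the scaled register offset
   form, e.g. (register names are illustrative):
       ldr r0, [r1, r2, lsl #2]
   instead of a separate shift/add followed by a plain load. */
int load_word(const int *a, size_t i)
{
    return a[i];
}

/* Same pattern for a store:  str r0, [r1, r2, lsl #2] */
void store_word(int *a, size_t i, int value)
{
    a[i] = value;
}
```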

In addition to complex addressing mode generation, this pass can perform another optimisation when the setbrp argument has value 1: whenever possible, it replaces an SL/SLT/SEQ/SNE instruction followed by a BTRUE/BFALSE with a single branch instruction in the suifVM code.
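
The kind of source code that can give rise to such a set/branch pair is sketched below. The function name is hypothetical and the pseudo suifVM in the comment is an assumption for illustration: its operand syntax is invented, and it is not output from the tools.

```c
/* Source pattern: a comparison whose result is only used by a branch.
   A naive lowering might produce something like (pseudo suifVM, the
   operand syntax is illustrative only):
       sl     t, a, b        (t = (a < b))
       bfalse t, else_label  (branch if t == 0)
   With -setbrp 1 the pair could be replaced by one compare-and-branch
   instruction that jumps to else_label when a >= b. */
int min_int(int a, int b)
{
    if (a < b) {
        return a;
    }
    return b;
}
```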

Usage

do_cplxldst -debug <debug level> -setbrp <0|1> <file1> <file2>

Conversion of if to Conditional Instructions

Description

Almost every ARM instruction can be conditionally executed, and branches can be saved by taking advantage of that feature. The provided implementation looks for if-like constructs in an ARM instruction list. Whenever the number of instructions is below a user-provided limit, the branches are removed and a condition code is added to the instructions within the then- and else-blocks.
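
The transformation can be pictured on a small if/else. In the sketch below, the function name is hypothetical and the assembly in the comment is hand-written to show the idea. It is not output from do_armif2cond, and the register assignment is an assumption.

```c
/* With branches (register names are illustrative):
       cmp   r0, r1
       bge   .Lelse
       mov   r2, r0        @ then-block
       b     .Lend
   .Lelse:
       mov   r2, r1        @ else-block
   .Lend:

   After if-conversion, both blocks are predicated on the flags set by
   cmp and the branches disappear:
       cmp   r0, r1
       movlt r2, r0        @ executed only if r0 <  r1
       movge r2, r1        @ executed only if r0 >= r1 */
int select_min(int a, int b)
{
    int x;

    if (a < b) {
        x = a;
    } else {
        x = b;
    }
    return x;
}
```

Each block here holds a single instruction, so it stays well under a limit such as -max-block-size 8.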

Usage

do_armif2cond -debug <debug level> -max-block-size <max #instr in a transformed block> <file1> <file2>

Constant and Copy Propagation on ARM Code

Description

This very simple pass uses the reaching definition analysis provided in the BVD library to propagate constants and copies as far as possible in the code. The algorithm is run several times, until either a fixpoint is reached or the user-defined maximum number of iterations has been reached.
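
The iterate-until-fixpoint scheme can be sketched on a toy example. The code below is not the implementation of do_armmovp: it assumes a made-up straight-line IR with only two kinds of move, and it replaces the BVD reaching definition analysis with a simple backward scan. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy straight-line IR invented for this sketch; it is not the
   Machine SUIF representation. */
enum kind { MOV_IMM, MOV_REG };

struct instr {
    enum kind kind;
    int dst;  /* destination register */
    int src;  /* source register, used when kind == MOV_REG */
    int imm;  /* constant, used when kind == MOV_IMM */
};

/* Index of the last instruction before `use` that writes `reg`, or -1
   if there is none. In straight-line code this is the only reaching
   definition of `reg` at `use`. */
static int reaching_def(const struct instr *code, int use, int reg)
{
    int i;

    for (i = use - 1; i >= 0; i--) {
        if (code[i].dst == reg) {
            return i;
        }
    }
    return -1;
}

/* One sweep: rewrites register moves whose source is defined by a
   constant move or by a copy. Returns true if anything changed. */
static bool propagate_once(struct instr *code, int n)
{
    bool changed = false;
    int i;

    for (i = 0; i < n; i++) {
        int d;

        if (code[i].kind != MOV_REG) {
            continue;
        }
        d = reaching_def(code, i, code[i].src);
        if (d < 0) {
            continue;
        }
        if (code[d].kind == MOV_IMM) {
            /* Constant propagation: mov dst, src -> mov dst, #imm */
            code[i].kind = MOV_IMM;
            code[i].imm = code[d].imm;
            changed = true;
        } else if (reaching_def(code, i, code[d].src) < d) {
            /* Copy propagation: safe because the copied register is
               not overwritten between the copy and this use. */
            code[i].src = code[d].src;
            changed = true;
        }
    }
    return changed;
}

/* Repeats the sweep until a fixpoint is reached or maxiter sweeps
   have run; returns the number of sweeps executed. */
int propagate(struct instr *code, int n, int maxiter)
{
    int iter = 0;

    while (iter < maxiter) {
        iter++;
        if (!propagate_once(code, n)) {
            break;
        }
    }
    return iter;
}
```

On the sequence r0 = 5, r1 = r0, r2 = r1, the first sweep turns both copies into constant moves and the second sweep finds nothing left to change, which is the fixpoint. The maxiter bound plays the role of the -maxiter argument.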

Usage

do_armmovp -debug <debug level> -maxiter <max #iterations to be run> <file1> <file2>

Dead Code Elimination on ARM Code

Description

The CFG simplification part of the "normal" dead code elimination pass causes problems on ARM code, because the ARM back-end mixes immediate-value data into the optimisation unit. We have therefore taken the code of the existing DCE pass from the Machine SUIF distribution and kept it unchanged except for the CFG simplifications, which have been slightly modified. Always use this pass instead of dce on ARM code!

Usage

do_armdce -debug <debug level> <file1> <file2>

About the Authors

Darío Suárez Gracia is the author of the original back-end written at LAP, EPFL in 2003, during a semester project. Grégory Théoduloz extended it, corrected some parts and provided the optimisation passes specific to the ARM architecture. He did it as part of his Winter 2004/2005 semester project at LAP, EPFL.

Acknowledgments

Both projects were supervised by Laura Pozzi from LAP, EPFL, while Glenn Holloway from Harvard University provided much help.


Modified 13-Dec-2007 9:39
(C) EPFL 2003-2005