Function Implementation
Compiler Implementation
COP-3402 COP-3402
What are functions?
How do functions implement local state?
Application Binary Interface (ABI)
Binary (machine code) conventions for implementing functions
Conventions
- Stack frame layout
- Parameters
- Return address
- Base pointer
- Locals
- Parameter passing
- Return values
System/architecture specific ABIs
System V AMD64 ABI
Eli Bendersky, Stack frame layout on x86 64
- Specific to System V (many *nix systems)
- Specific to AMD64 (x86 64) instruction set architecture (ISA)
Calling convention example
# assembly file metadata
.file "stdin"
.section .note.GNU-stack,"",@progbits
.text
# function label
.globl main
.type main, @function
main:
# prologue (update base pointer, save rbx)
pushq %rbp
movq %rsp, %rbp
push %rbx
call func
# set return value
mov $10, %rax
# epilogue (restore stack pointer, restore base pointer)
pop %rbx
mov %rbp, %rsp
pop %rbp
ret
Stack frame management
(Diagram)
Show what happens to the stack and register states changes after each instruction.
main:
# prologue
pushq %rbp # save old base ponter
movq %rsp, %rbp # set new base pointer
push %rbx # %rbx is callee-saved
# call
call func
# return value
mov $10, %rax
# epilogue
pop %rbx # restore rbx for the caller
mov %rbp, %rsp # restore old stack pointer
pop %rbp # restore old base pointer
ret
func:
# prologue
pushq %rbp # save old base ponter
movq %rsp, %rbp # set new base pointer
push %rbx # %rbx is callee-saved
# return value
mov $5, %rax
# epilogue
pop %rbx # restore rbx for the caller
mov %rbp, %rsp # restore old stack pointer
pop %rbp # restore old base pointer
ret
Language processing review
Syntax trees
- Explicit names for syntactic constructs, e.g.,
- Arithmetic expression
- For-loop
- etc.
Language implementation
- Tree traversal
- Apply rules to translate/interpret
- One for each syntactic construct
Infix-to-postfix translator
Assume we have string concat function to join any number of strings.
E -> E + E { E.value = concat(E1.value, E2.value, "+") }
E -> E - E { E.value = concat(E1.value, E2.value, "-") }
E -> E * E { E.value = concat(E1.value, E2.value, "*") }
E -> E / E { E.value = concat(E1.value, E2.value, "/") }
E -> N { E.value = N.value }
N -> 0 { N.value = "0" }
N -> 1 { N.value = "1" }
N -> 2 { N.value = "2" }
...
N -> 9 { N.value = "9" }
Note that we no longer need to convert ascii to a machine integer, since we can just print the output program in ascii as well.
CodeGen with an ANTLR Listener
Syntax tree for SimpleIR
(Diagram)
Draw tree for
function main phonyvar := call func return 10
# assembly file metadata (enterUnit)
.file "stdin"
.section .note.GNU-stack,"",@progbits
.text
# function label (enterFunction)
.globl main
.type main, @function
main:
# prologue (enterFunction)
pushq %rbp
movq %rsp, %rbp
push %rbx
# enterCall
call func
# set return value (enterReturn)
mov $10, %rax
# epilogue (exitFunction)
pop %rbx
mov %rbp, %rsp
pop %rbp
ret
Note that the return statement doesn't call return. That's done in/after the epilogue. Return sets the value of the return register, i.e., %rax.
enterUnit
- Generate file pseudo-ops
- (Given for the project)
enterFunction
- Generate function label
- Generate prologue
enterCall
- Generate function call
exitFunction
- Generate epilogue (and return instruction)
Code generation
(Diagram)
- Use syntax tree for main.ir
- Apply
Example: enterUnit
- Function writes out assembly code
Tips
- Remember: we are writing a translator not trying to run the IR
- Code generator prints out strings (
self.outfile.write("...")) - Use the text from the input program to fill in assembly language templates (
ctx.NAME()) - The grammar gives names to the parts of the syntax tree