
Assembly emitter

x86-64 has a lot of instructions. They are described in Volume 2 of the 5 Volume "Intel® 64 and IA-32 Architectures Software Developer’s Manual". This volume alone is over 2000 pages, and implementing all of it would take forever. As a result, we will use only a subset of these instructions. This is the rough plan:

  • Most instructions, like add, will only be implemented in their r64, r64 form.
  • To accomplish something like add rax, 1, we will use a temporary register X:
    • mov X, 1
    • add rax, X
    • The constant propagation system will be able to provide enough information that we could eventually use add r64 immX and similar if needed.
    • Register allocation should handle the case (set! x (+ 3 y)) as:
      • mov x, 3
      • add x, y
    • But (set! x (+ y 3)), in cases where y is still needed afterwards and x can't take its place, will become the less efficient:
      • mov x, y
      • mov rtemp, 3
      • add x, rtemp
  • Loading constants into registers will be done efficiently, using the same strategy used by modern versions of gcc and clang.
  • Memory access will be done in the form mov rdest, [roff + raddr], where roff is the offset register. Doing memory access in this form was found to be much faster in a simple benchmark test.
  • Memory access to the stack will have an extra sub and more complicated dereference. GOAL code seems to avoid using the stack in most places, and I suspect the programmers attempted to avoid stack spills.
    • mov rdest, rsp : coloring move for upcoming subtract
    • sub rdest, roff : convert real pointer to GOAL pointer
    • mov rdest, [rdest + roff + variable_offset] : access memory through normal GOAL deref.
    • Note: we should check that the register allocator always gets this right, eliminating the moves and avoiding the use of a temporary register.
    • Again, the constant propagation should give us enough information if we ever want or need to implement a more efficient instruction of the form mov rdest, [rsp + variable_offset].
  • Memory access to static data should use rip-relative addressing, like mov rdest, [rip + offset]. Creating pointers to static data could be done with lea rdest, [rip - roff + offset].
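
The add examples in the plan above can be sketched as a small lowering function. This is only an illustration written in Python: the function name lower_add, the rtemp default, and the string output are hypothetical, and in the real compiler the temporary and the moves would presumably be chosen by the register allocator rather than hard-coded.

```python
def lower_add(dest, lhs, rhs, temp="rtemp"):
    """Lower (set! dest (+ lhs rhs)) using only the r64, r64 form of add.

    Operands are register names (str) or integer constants (int).
    Hypothetical helper for illustration, not the emitter's real API.
    """
    out = []
    # add is commutative, so if dest already holds the right operand,
    # swap the operands instead of clobbering dest with a mov.
    if rhs == dest and lhs != dest:
        lhs, rhs = rhs, lhs
    # Get the left operand into dest (a constant load or a register move).
    if lhs != dest:
        out.append(f"mov {dest}, {lhs}")
    # There is no add r64, imm in this subset: go through the temporary.
    if isinstance(rhs, int):
        out.append(f"mov {temp}, {rhs}")
        rhs = temp
    out.append(f"add {dest}, {rhs}")
    return out


print(lower_add("x", 3, "y"))   # constant on the left: no temporary needed
print(lower_add("x", "y", 3))   # constant on the right: needs rtemp
```

The first call produces the two-instruction sequence from the plan, while the second needs the extra mov into the temporary, which is the inefficiency the constant propagation system could later remove.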
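
The plan does not spell out the constant-loading strategy, so the following is a sketch of what gcc and clang commonly emit, not a description of this emitter. The function name load_constant and its parameters are hypothetical. The idea is to pick the shortest encoding that produces the desired 64-bit value, relying on the fact that writing a 32-bit register zero-extends into the upper 32 bits.

```python
def load_constant(reg64, reg32, value):
    """Pick a short instruction that loads a 64-bit constant.

    reg64/reg32 are the 64-bit and 32-bit names of the same register
    (for example "rax" and "eax"). Hypothetical helper for illustration.
    """
    # Work with the 64-bit two's-complement bit pattern.
    value &= 0xFFFFFFFFFFFFFFFF
    if value == 0:
        # 2 bytes for the legacy registers. Note that xor modifies flags.
        return f"xor {reg32}, {reg32}"
    if value <= 0xFFFFFFFF:
        # 5 bytes: a 32-bit write zero-extends into the upper half.
        return f"mov {reg32}, {value:#x}"
    if value >= 0xFFFFFFFF80000000:
        # 7 bytes: mov r64, imm32 sign-extends the immediate.
        return f"mov {reg64}, {value - (1 << 64)}"
    # 10 bytes: full 64-bit immediate.
    return f"movabs {reg64}, {value:#x}"


for v in (0, 1, -1, 0x123456789):
    print(load_constant("rax", "eax", v))
```

The byte counts in the comments are for registers that do not need a REX prefix in the 32-bit forms; r8 through r15 cost one extra byte there.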