I want to write a JIT compiler.
Prologue
I want to write a JIT compiler.
I've been saying this for the past two years...
At this rate, I feel like I'll never actually do it.
I need to take action at some point.
What I Want to Build
- JIT Compiler (Interpreter)
- Implemented in Rust
- Do not use inline assembly
- Do not use LLVM or related wrappers
- Preferably from scratch
- Target AArch64 initially
- Consider abstracting over the architecture and creating an IR
- For now, start small and simple rather than packing in too much
- Aim for a dynamically typed language with JavaScript syntax
- As for the approach, first write it as if implementing a VM-type interpreter; the evaluator can be kept rough
- Compile the VM's instructions to native (AArch64) code as a Vec<u8> as needed
- Cast it to a function and call it with std::mem::transmute (Reference: How can I execute hex opcodes directly in Rust?)
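As a rough illustration of the "write it as a VM-type interpreter first" step, here is a minimal sketch of a stack-based VM in Rust. The instruction set (`Instr`) and the `run` function are hypothetical names chosen for this example, not a settled design, and a dynamically typed language would need tagged values rather than a plain `i64`.

```rust
/// Hypothetical instruction set for a tiny stack-based VM.
/// The names are illustrative only, not a final design.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Instr {
    /// Push a constant onto the operand stack.
    Push(i64),
    /// Pop two values, push their sum.
    Add,
    /// Pop two values, push (first pushed - last pushed).
    Sub,
    /// Pop two values, push their product.
    Mul,
    /// Pop the top of the stack and return it as the program result.
    Ret,
}

/// Interpret `code` one instruction at a time.
/// Returns an error on stack underflow or if the program never hits `Ret`.
pub fn run(code: &[Instr]) -> Result<i64, String> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;

    while pc < code.len() {
        let instr = code[pc];
        pc += 1;

        match instr {
            Instr::Push(v) => stack.push(v),
            Instr::Add | Instr::Sub | Instr::Mul => {
                // The right-hand operand was pushed last, so it is popped first.
                let rhs = stack.pop().ok_or("stack underflow")?;
                let lhs = stack.pop().ok_or("stack underflow")?;
                let result = match instr {
                    Instr::Add => lhs.wrapping_add(rhs),
                    Instr::Sub => lhs.wrapping_sub(rhs),
                    _ => lhs.wrapping_mul(rhs),
                };
                stack.push(result);
            }
            Instr::Ret => return stack.pop().ok_or_else(|| "stack underflow".to_string()),
        }
    }

    Err("program ended without Ret".to_string())
}
```

For example, running `[Push(2), Push(3), Add, Push(4), Mul, Ret]` evaluates (2 + 3) * 4. A JIT would later replace the `match` dispatch for hot code with native instructions, while this loop stays as the fallback.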
Things I Don't Understand Yet (Todo?)
- Review of VM-type interpreters
- Review of Instruction Set and implementation/design?
- What about GC?
- Understanding AArch64 machine language
- Think about which assembly instructions are generally needed, recalling the time I wrote a C compiler
- Identify language features and required instructions
- Write some manually
- Understand machine language corresponding to that assembly
- Write some manually
- JIT strategy
- What to JIT and what not to JIT
- Design of implementation
- Researching reference materials
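To make the "understand the machine language corresponding to that assembly" item concrete, here is a small sketch that hand-encodes a few AArch64 instructions into a `Vec<u8>`. The helper names (`movz_w`, `add_w`, `emit`, `compile_const_add`) are made up for illustration; the bit patterns follow the A64 encodings for MOVZ (32-bit), ADD (shifted register, 32-bit), and RET. It only produces bytes. Actually running them would also require placing them in executable memory, which is not shown here.

```rust
/// Encode `movz wd, #imm16` (32-bit variant, no shift).
/// Layout: sf=0, opc=10, 100101, hw=00, imm16 in bits 20..5, Rd in bits 4..0.
pub fn movz_w(rd: u8, imm16: u16) -> u32 {
    assert!(rd < 32, "register index out of range");
    0x5280_0000 | (u32::from(imm16) << 5) | u32::from(rd)
}

/// Encode `add wd, wn, wm` (shifted-register form with shift amount 0).
/// Layout: Rm in bits 20..16, Rn in bits 9..5, Rd in bits 4..0.
pub fn add_w(rd: u8, rn: u8, rm: u8) -> u32 {
    assert!(rd < 32 && rn < 32 && rm < 32, "register index out of range");
    0x0B00_0000 | (u32::from(rm) << 16) | (u32::from(rn) << 5) | u32::from(rd)
}

/// Encoding of `ret` (return to the address in x30).
pub const RET: u32 = 0xD65F_03C0;

/// Append one 32-bit instruction word; A64 instructions are stored little-endian.
pub fn emit(buf: &mut Vec<u8>, word: u32) {
    buf.extend_from_slice(&word.to_le_bytes());
}

/// Emit machine code equivalent to:
///   mov w0, #a
///   mov w1, #b
///   add w0, w0, w1
///   ret
pub fn compile_const_add(a: u16, b: u16) -> Vec<u8> {
    let mut buf = Vec::new();
    emit(&mut buf, movz_w(0, a));
    emit(&mut buf, movz_w(1, b));
    emit(&mut buf, add_w(0, 0, 1));
    emit(&mut buf, RET);
    buf
}
```

Comparing the bytes from `compile_const_add` with the objdump output of an equivalent C function is a quick way to check that each encoder matches what the assembler produces.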
AArch64 Reintroduction
Let's start with something.
I created a repository called compilepedia before, but at that time I was still using Intel chips, so I only made an x86 index. Now, let's try producing binaries on the Apple Silicon I'm currently using.
As a refresher, compilepedia is a repository I created about two years ago while studying binaries; it lists the objdump, .ll (LLVM IR), and assembly output for basic C syntax.
At that time, I was using an Intel MacBook, so this time I will create an Apple Silicon version of it.
Although this repository contains a bit of Rust along with C, this time it will focus only on C.
There are directories for each syntax under /c/, and generally every directory contains a.out, hexdumped, main.c, main.ll, main.o, obj dump, main.s, and objdumped.
For now, let's change the directory structure a bit so that both x86 and Arm can be handled.
One slightly annoying thing is that compile.sh doesn't have any target-specific settings, so if we proceed as is, everything will be overwritten with the Arm output.
Hmm, but it's too much trouble to set up an x86 compiler just for this, so for now, let's output the Arm version under new file names like aarch64_xxx.
By doing this, we won't be able to output new x86 files for now, but oh well, it can't be helped.
Ok, it's done.
Now, let's continue to understand the machine code while referring to the Arm documentation.