| My bytecode VM Lox interpreter
2024-11-06
Last edit: 2024-11-06
---------------------
The aim of this post is to describe the general operation of the program and some of the mechanisms that we consider to be of interest. For full details, a link to the source code is available at the bottom of the page.
So I continued with “Crafting Interpreters” by Robert Nystrom after making [My Tree-Walker Lox interpreter](/posts/jlox). In this part I tried to do as many challenges as possible and really understand how a VM bytecode works.
This version is written in C, which means we have to write a lot of code ourselves, but we don't use any external libraries.
## Compiler
The primary purpose of our compiler is to generate a chunk of code in bytecode form for interpretation by our bytecode virtual machine. Here are a few interesting features of the front end.
### Scanner
The token scanner is very classic, with only one thing to say: the function responsible for identifying the language's native keywords is very dirty. The author has chosen to use a large switch statement instead of implementing a sorting function, which is certainly powerful but not very elegant.
### Parser
An interesting point to note is that the author chose not to use a syntax tree for the front end. We therefore implemented a single-pass compiler (directly converts compile units into bytecode).
We also implemented a Vaughan Pratt's parser, in our case a “top-down operator precedence parser”. This means we have to define operator precedence in advance. Here's what it looks like in code.
```c
typedef enum {
PREC_NONE,
PREC_ASSIGNMENT, // =
PREC_OR, // or
PREC_AND, // and
PREC_EQUALITY, // == !=
PREC_COMPARISON, // < > <= >=
PREC_TERM, // + -
PREC_FACTOR, // * /
PREC_UNARY, // ! -
PREC_CALL, // . ()
PREC_PRIMARY
} Precedence;
```
This precedence is simply used to control the parsing of expressions. A rule with a lower precedence than the last parsed expression is not allowed.
## Bytecode
To manage conditions, we emit `OP_JUMP` operation code for conditions. If a condition expression is evaluated to false, it jumps to the end of the conditionnal block / expression. To do this, we use the concept of backpatching: we overwrite the immediate value of the instruction in the chunk during compilation.
In my implementation, all immediate values are encoded on 8 bits, with the exception of constants, which have a size of 24 bits.
## Virtual Machine
The VM is centered on a stack where we push operands, local variables, etc..
Everything at runtime is managed by callframes, even the top-level code is embed within a function object.
## Example
Here is a simple Lox example that can be evaluated by my interpreter.
```text
fun fib(n) {
if (n < 2) {
return n;
}
return fib(n - 2) + fib(n - 1);
}
print fib(10);
```
## Links
|