Building an Interpreter in Rust
I built an interpreter for the Monkey programming language in Rust. This is, hands down, my favorite project so far. Not because it’s the most practical thing I’ve ever made, but because it made me understand how programming languages actually work – not just use them.
I started with Writing an Interpreter in Go and Crafting Interpreters. Both are excellent. But here’s the thing: the first book uses Go, and I was learning Rust. So I decided to translate the concepts myself rather than copy-paste Go code.
This was the best decision I made. You don’t truly understand something until you’ve had to implement it in a language with different constraints. Go’s garbage collector lets you be relaxed about memory. Rust’s borrow checker makes you think about ownership at every step. Every concept from the book had to be rethought in Rust’s terms.
Here’s what Monkey code looks like:
```
let fibonacci = fn(x) {
  if (x == 0) {
    return 0;
  } else {
    if (x == 1) {
      return 1;
    } else {
      fibonacci(x - 1) + fibonacci(x - 2);
    }
  }
};

fibonacci(10); // 55
```
Nothing groundbreaking, but enough complexity to make things interesting.
The Architecture
Building an interpreter means building four main pieces:
- Lexer – turns source code into tokens
- Parser – turns tokens into an Abstract Syntax Tree (AST)
- Evaluator – walks the AST and executes it
- Environment – stores variables and their values
The flow is straightforward: Source Code → Lexer → Parser → AST → Evaluator → Result
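To make the first stage concrete, here’s a minimal, self-contained lexer sketch. The token names and the `lex` function are illustrative only, not the project’s actual API; a real Monkey lexer handles many more token types:

```rust
// Illustrative mini-lexer: turns "let x = 5;" into tokens.
#[derive(Debug, PartialEq)]
enum Token {
    Let,
    Ident(String),
    Assign,
    Int(i64),
    Semicolon,
}

fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            c if c.is_whitespace() => { chars.next(); }
            '=' => { chars.next(); tokens.push(Token::Assign); }
            ';' => { chars.next(); tokens.push(Token::Semicolon); }
            c if c.is_ascii_digit() => {
                // consume a run of digits into one Int token
                let mut n = String::new();
                while let Some(&d) = chars.peek() {
                    if d.is_ascii_digit() { n.push(d); chars.next(); } else { break; }
                }
                tokens.push(Token::Int(n.parse().unwrap()));
            }
            c if c.is_alphabetic() => {
                // consume a word; keywords get their own token variant
                let mut w = String::new();
                while let Some(&a) = chars.peek() {
                    if a.is_alphanumeric() { w.push(a); chars.next(); } else { break; }
                }
                tokens.push(if w == "let" { Token::Let } else { Token::Ident(w) });
            }
            _ => { chars.next(); } // this sketch skips anything it doesn't know
        }
    }
    tokens
}

fn main() {
    println!("{:?}", lex("let x = 5;"));
    // [Let, Ident("x"), Assign, Int(5), Semicolon]
}
```

The parser then consumes this token stream, never the raw source text.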
The Parts That Made Me Think
The Parser is Smarter Than I Expected
I thought parsing would be simple pattern matching. It’s not. The challenge is handling operator precedence without writing a giant nested if-statement.
The solution? Pratt parsing. Each operator has a precedence level, and the parser uses that to decide how to group expressions:
```
fn parse_expression(&mut self, precedence: Precedence) -> Result<ast::Expression> {
    let mut left_expression = self.parse_prefix(&self.cur_token.clone());
    let mut peek_precedence = self.peek_token.precedence();

    while !self.peek_token.is_of_type(&token::Token::SEMICOLON)
        && precedence < peek_precedence
    {
        self.next_token();
        left_expression = self.parse_infix(left_expression?);
        peek_precedence = self.peek_token.precedence();
    }

    left_expression
}
```
This tiny loop handles parsing 1 + 2 * 3 correctly (as 1 + (2 * 3)) without any special-casing. The precedence levels do all the work.
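To see the mechanism in isolation, here’s a stripped-down, self-contained Pratt parser over single-digit integers with `+` and `*`. It’s purely illustrative (the real parser works on a proper token stream), but the loop-plus-recursion shape and the strict precedence comparison are the same:

```rust
// Mini Pratt parser: parses single digits joined by '+' (precedence 1)
// and '*' (precedence 2), returning the grouping as an s-expression string.
fn parse(tokens: &[char], pos: &mut usize, min_prec: u8) -> String {
    // Prefix position: in this sketch, always a digit.
    let mut left = tokens[*pos].to_string();
    *pos += 1;
    while *pos < tokens.len() {
        let op = tokens[*pos];
        let prec = match op { '+' => 1, '*' => 2, _ => break };
        // Strict comparison: only keep going while the next operator
        // binds tighter than the current context. This single check
        // yields both correct precedence and left-associativity.
        if prec <= min_prec { break; }
        *pos += 1;
        let right = parse(tokens, pos, prec);
        left = format!("({} {} {})", left, op, right);
    }
    left
}

fn main() {
    let tokens: Vec<char> = "1+2*3".chars().collect();
    println!("{}", parse(&tokens, &mut 0, 0));
    // prints (1 + (2 * 3))
}
```

Changing `*`’s precedence to 1 in the table would flip the grouping to `((1 + 2) * 3)`, with no other code changes, which is the whole appeal of the technique.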
Closures Are Just Environments
When you define a function in Monkey, it captures its environment. This is how closures work:
```
let newAdder = fn(x) {
  fn(y) { x + y };
};

let addTwo = newAdder(2);
addTwo(3); // 5
```
The inner function remembers x even after newAdder returns. In my implementation, functions are just:
```
Object::Function {
    parameters: Vec<ast::Identifier>,
    body: Rc<RefCell<ast::BlockStatement>>,
    env: Rc<RefCell<Environment>>, // ← The magic
}
```
That env field holds the captured environment. When you call the function, it creates a new environment that extends the captured one. That’s it.
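The environment chain can be sketched in a few lines. The names here (`store`, `outer`, `extend`) are hypothetical stand-ins for the real implementation, and values are plain `i64` instead of Monkey objects, but the lookup logic is the essential part: a miss in the local scope falls back to the captured outer scope.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

#[derive(Default)]
struct Environment {
    store: HashMap<String, i64>,
    outer: Option<Rc<RefCell<Environment>>>, // the captured enclosing scope
}

impl Environment {
    // A call creates a fresh environment that extends the captured one.
    fn extend(outer: Rc<RefCell<Environment>>) -> Self {
        Environment { store: HashMap::new(), outer: Some(outer) }
    }

    // Look locally first, then walk the outer chain.
    fn get(&self, name: &str) -> Option<i64> {
        self.store.get(name).copied().or_else(|| {
            self.outer.as_ref().and_then(|o| o.borrow().get(name))
        })
    }
}

fn main() {
    // newAdder(2): 'x' lives in the environment the closure captures...
    let captured = Rc::new(RefCell::new(Environment::default()));
    captured.borrow_mut().store.insert("x".into(), 2);

    // ...addTwo(3): calling the closure extends that environment with 'y'.
    let mut call_env = Environment::extend(captured);
    call_env.store.insert("y".into(), 3);

    let result = call_env.get("x").unwrap() + call_env.get("y").unwrap();
    println!("{}", result); // 5
}
```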
Hash Maps Need Hashable Keys
Implementing hash maps in Monkey was interesting because not everything should be a valid key:
```
impl HashKey for Object {
    fn hash_key(&self) -> Result<u64> {
        match self {
            Object::Integer(value) => Ok(*value as u64),
            Object::Boolean(value) => Ok(if *value { 1 } else { 0 }),
            Object::String(value) => {
                let mut hasher = DefaultHasher::new();
                (*value).hash(&mut hasher);
                Ok(hasher.finish())
            }
            _ => anyhow::bail!(Error::KeyError(format!(
                "unusable as hash key: {}",
                self.object_type()
            ))),
        }
    }
}
```
Functions and arrays can’t be keys. Why? Because you can’t meaningfully compare them. Two functions with the same code but different closures aren’t “equal” in any useful sense.
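One detail worth noting about the string case: `DefaultHasher::new()` always starts from the same initial state, so within a single run of the interpreter, equal strings reliably produce equal keys, which is all a hash literal needs. A minimal standalone sketch of that property (the `hash_key` helper here is illustrative):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash a string the same way the interpreter's String branch does.
fn hash_key(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Two fresh hashers, same input, same key: lookups find what inserts stored.
    assert_eq!(hash_key("monkey"), hash_key("monkey"));
    println!("equal strings hash to equal keys");
}
```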
The REPL is Just a Loop
The Read-Eval-Print Loop is exactly what it sounds like:
```
pub fn start<R: Read, W: Write>(reader: R, writer: &mut W) -> io::Result<()> {
    let mut reader = BufReader::new(reader);
    let env = Rc::new(RefCell::new(Environment::new()));

    loop {
        write!(writer, "{}", PROMPT)?;
        writer.flush()?;

        let mut line = String::new();
        if reader.read_line(&mut line)? == 0 { break; }

        let mut lexer = lexer::Lexer::new(line.trim());
        let mut parser = parser::Parser::new(&mut lexer);

        if let Ok(program) = parser.parse_program() {
            match evaluator::eval(program.into(), env.clone()) {
                Ok(evaluated) => writeln!(writer, "{}", evaluated)?,
                Err(e) => writeln!(writer, "{}", e)?,
            }
        }
    }
    Ok(())
}
```
Read a line, parse it, evaluate it, print the result. The environment persists across iterations so variables stick around. That’s how a REPL works.
What Rust Taught Me
Reference Counting Everywhere
Because the AST needs to share ownership of nodes, I used Rc everywhere. The environment is Rc<RefCell<Environment>> so it can be mutated while shared. Function bodies are Rc<RefCell<BlockStatement>> for the same reason.
This felt clunky at first, but it’s actually explicit about what Go’s garbage collector hides: shared ownership requires runtime bookkeeping.
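The pattern in miniature, with a bare `HashMap` standing in for the real `Environment`: two `Rc` handles share one value, and `RefCell` moves the borrow check to runtime so either handle can mutate it.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

fn main() {
    let env = Rc::new(RefCell::new(HashMap::<String, i64>::new()));
    let alias = Rc::clone(&env); // shared ownership: the refcount is now 2

    alias.borrow_mut().insert("x".into(), 42); // mutate through one handle...
    println!("{:?}", env.borrow().get("x"));   // ...and observe through the other
    // prints Some(42)
}
```

This is exactly what lets a closure’s captured environment and the REPL’s top-level environment be the same object, so `let` bindings made inside a call remain visible afterward.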
Enums Are Perfect for ASTs
Rust’s enums make AST nodes beautiful:
```
pub enum Expression {
    Identifier(Identifier),
    Integer(IntegerLiteral),
    String(StringLiteral),
    Boolean(Boolean),
    Prefix(PrefixExpression),
    Infix(InfixExpression),
    If(IfExpression),
    Function(FunctionLiteral),
    Call(CallExpression),
    Array(ArrayLiteral),
    Index(IndexExpression),
    Hash(HashLiteral),
}
```
Each variant holds its own data. Pattern matching makes the evaluator clean:
```
match expr {
    ast::Expression::Integer(literal) => Ok(Rc::new(Object::Integer(literal.value))),
    ast::Expression::String(literal) => Ok(Rc::new(Object::String(literal.value))),
    ast::Expression::Boolean(literal) => eval_boolean(literal),
    ast::Expression::Prefix(literal) => eval_prefix_expression(literal, env),
    ast::Expression::Infix(literal) => eval_infix_expression(literal, env),
    // ... and so on
}
```
No null checks, no inheritance hierarchies, just data and functions.
What I’d Do Differently
If I were starting over, I’d add:
- Better error messages with line numbers and column positions
- A type system (even a simple one would catch bugs)
- Bytecode compilation for performance (this is what Crafting Interpreters does in part 3)
Try It
The interpreter is on GitHub. You can run the REPL:
```
cargo run
```
Or execute a file:
```
cargo run -- examples/fibonacci.monkey
```
If you’ve ever wondered how programming languages work, build an interpreter. It’s one of those projects that changes how you think about code. You go from “I use functions” to “I understand why functions work.”