My own Programming Language: Part 1

By Vladimir Rubin 6 min read

The idea

So I was sitting there, thinking about Lispy languages. Then I realized that not many Lisps support ADTs (algebraic data types), which is really sad, considering that I love ADTs. They make it easy to represent basically any data (it was all enums all along): ASTs, game state, etc. For example, this Rust code

#[derive(Debug)]
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(u16, u16, u16, u16, u16, u16, u16, u16),
}

fn main() {
    let v4 = IpAddr::V4(127, 0, 0, 1);
    println!("{v4:?}");

    // Then you can pattern-match on it: literals match exactly, `_` ignores a field.
    match v4 {
        IpAddr::V4(127, _, _, _) => println!("local"),
        IpAddr::V4(..) => println!("generic v4"),
        _ => println!("something else"),
    }
}

prints

V4(127, 0, 0, 1)
local
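The same trick is what makes ADTs so handy for ASTs. Below is a minimal sketch of a Lisp-like expression type with a tiny evaluator. The names `Expr` and `eval` are hypothetical and chosen only for illustration; this is not Blip's actual code, and it handles just integers, `+` and `*`.

```rust
// Hypothetical sketch: illustrative names, not Blip's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Sym(String),
    List(Vec<Expr>),
}

/// Evaluates a tiny subset of a Lisp: integers and `(+ ...)` / `(* ...)` forms.
fn eval(expr: &Expr) -> Result<i64, String> {
    match expr {
        Expr::Num(n) => Ok(*n),
        Expr::Sym(s) => Err(format!("unbound symbol: {s}")),
        Expr::List(items) => match items.as_slice() {
            // A call is a list whose head is a symbol; the rest are arguments.
            [Expr::Sym(op), args @ ..] => {
                // Evaluate the arguments first, stopping at the first error.
                let vals = args.iter().map(eval).collect::<Result<Vec<i64>, String>>()?;
                match op.as_str() {
                    "+" => Ok(vals.iter().sum::<i64>()),
                    "*" => Ok(vals.iter().product::<i64>()),
                    other => Err(format!("unknown operator: {other}")),
                }
            }
            _ => Err("expected (operator args...)".to_string()),
        },
    }
}
```

Building `(+ 1 (* 2 3))` by hand as nested `Expr::List` values and passing it to `eval` returns `Ok(7)`, while a bare symbol returns an error.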

So obviously, like any sane human being, I decided to write my own toy Lisp with ADTs, as it would be a really fun exercise in an area of programming I have no experience in. Oh, and because I presumably love suffering, I decided to restrict myself to the standard library only, all for the sake of minimalism (no, I am NOT a suckless hippie). And I called it Blip because, eh… I dunno, I just wanted to call it that.

The start

But this time I didn’t want to write my project in Rust, because I already had 3 active projects in it at the time (this website, hinoirisetr and shanti (unreleased)). I wanted to try something new. So my choice landed on Odin, as it seemed suitable for this task: it has manual memory management and a nice syntax. Overall, I would describe it as what you’d get if Jai, Go and C had a child together. Yeah, it’s still not as mature as C, Rust or Zig, but does maturity even matter for a toy personal project? (It doesn’t.) So I wrote a tiny base implementation in Odin, and that worked fairly well, until… until I ran it under Valgrind and saw that it leaked memory, and my oxidized brain went “Oh no, this is unacceptable!!” And of course, instead of just patching up the Odin version, I decided: “Why not rewrite Blip in a whole other language?” Some jumping around and thinking later, I chose OCaml for the rewrite. I thought, “What a good idea, OCaml is exactly what I want, something in between Rust and Haskell” (my attempt to learn Haskell fried my brain), and I wanted to dabble in FP some more, so an OCaml rewrite it is.

OCaml

OCaml seemed nice… But my LSP server refused to work without dune, which really sucks, because I wanted to use a plain Makefile (again, for minimalism). Hm, okay, I prefer dev experience over minimalism, so I migrated to dune. I saw that it has a folder for tests and decided to write some… only to realize that I need a library just to write TESTS?!? Your build system created a directory for tests, so why doesn’t it provide a way to write them? I can’t think of any excuse for having to write this:

let assert_equal name printer expected actual : bool =
  let green = "\027[32m" in
  let red = "\027[31m" in
  let reset = "\027[0m" in
  if expected = actual then (
    Printf.printf "%s[PASS]%s %s\n" green reset name;
    true)
  else (
    Printf.printf "%s[FAIL]%s %s: expected %s but got %s\n" red reset name
      (printer expected) (printer actual);
    false)

Again, it might be my fault for not using any deps, but you kinda expect a modern language to have built-in testing utils… or, if it doesn’t, at least not to have a tests/ dir in the build system. Ugh. Oh, and have I told you about the cryptic compiler errors? Well, OCaml has those too. The compiler often reports an error on the line before or after the one it is actually on, and says something about invalid syntax. It could be that I am just a shitty FP coder who writes bad code, but why do I have to wrap half of the match cases in a () block?.. At this point, Blip wasn’t a Lisp anymore; it was a rewrite circus. Every language I looked at was another possible contender: Zig, D, C, Vale, Hare, fixing the Odin version (lol, no), C3, Nim.

Zig

Zig felt like the natural next step (or clown, if you will): I had been hearing about it for a while, and it certainly suited this project with its manual memory management and such. Oh, and of course, the Zig people wouldn’t shut up about how powerful comptime is. But ironically, I ended up using it only 4 times across the ~860-line codebase, and all of those were for implementing std.fmt.Format. So yeah, not exactly the mega-macro system I was expecting. But still, I really liked Zig. Its allocator model made it really easy to use an arena allocator, and it has nice pattern matching on tagged unions. Overall, I was happy. But midway through rewriting the interpreter in Zig, I came up with the idea of benchmarking all of the implementations. And benchmarking just 2-3 impls is not much fun. So I decided to turn Blip from a programming language project into an experiment: “implement the same tiny interpreter in a load of languages and then compare the ergonomics and performance”. So, with the new frontiers in mind, I merged the Zig and OCaml impls into one branch (commit), finished up the Zig impl and started working on the benchmarking utility (ironically, written in Rust).

C

Old (I used C99), bare-bones, portable. That’s how I would describe C. And because of how small it is, I lifted the “stdlib-only” restriction for it and went with nob.h; after all, I am a Tsoding fan lol. And I am too lazy to write my own dynamic arrays, arenas and all of that (yes, I am lazy, I know that). I am still not done with this implementation as of the posting date, but I have already hit some pain points:

Dealing with arenas

C doesn’t have any built-in utilities for arena allocators, so using one to store strings in the token tree was annoying. Since I use a single arena for storing all strings and identifiers, I also have to pass the whole TokenArray into the token_to_str function, even though it formats only a single Token.
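To show why the whole array has to come along, here is a rough sketch of that layout, written in Rust rather than C to keep it short. Only `TokenArray`, `Token` and `token_to_str` come from the text above; the fields, `TokenKind` and `push` are assumptions made up for illustration, and a plain `String` stands in for the arena.

```rust
// Hypothetical sketch of the layout described above, not the actual C code.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    LParen,
    RParen,
    Ident,
}

/// A token stores only a span into the shared buffer, never its own text.
#[derive(Debug, Clone, Copy)]
struct Token {
    kind: TokenKind,
    start: usize,
    len: usize,
}

/// Owns every token plus the single buffer that holds all string data.
#[derive(Default)]
struct TokenArray {
    tokens: Vec<Token>,
    text: String, // stand-in for the string arena
}

impl TokenArray {
    /// Appends the lexeme to the shared buffer and records where it landed.
    fn push(&mut self, kind: TokenKind, lexeme: &str) {
        let start = self.text.len();
        self.text.push_str(lexeme);
        self.tokens.push(Token { kind, start, len: lexeme.len() });
    }
}

/// Formats one token, but still needs the array because the text lives there.
fn token_to_str(array: &TokenArray, token: &Token) -> String {
    let lexeme = &array.text[token.start..token.start + token.len];
    format!("{:?}({})", token.kind, lexeme)
}
```

With tokens pushed for `(` and `add`, calling `token_to_str(&array, &array.tokens[1])` gives `Ident(add)`. The token alone carries no text, which is exactly the annoyance: every formatting call has to drag the owner of the buffer along.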

Lack of namespacing

This was the issue I noticed first. Since C has no proper namespacing, I had to name all enum variants TOKEN_LPAREN instead of simply LPAREN, because otherwise I would get errors about name collisions.
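For contrast, this is roughly what "proper namespacing" looks like in a language where each enum scopes its own variants. The sketch is in Rust, and both enums and both helper functions are hypothetical, made up only to show two types sharing a variant name without any prefix.

```rust
// Hypothetical enums for illustration; both declare a variant named `LParen`.
#[derive(Debug, PartialEq)]
enum Token {
    LParen,
    RParen,
}

#[derive(Debug, PartialEq)]
enum Delimiter {
    LParen,
    LBracket,
}

/// Variants are always reached through their enum, so the names never clash.
fn describe(token: &Token) -> &'static str {
    match token {
        Token::LParen => "open paren token",
        Token::RParen => "close paren token",
    }
}

/// The same variant name on a different enum is a completely separate value.
fn is_paren(delimiter: &Delimiter) -> bool {
    matches!(delimiter, Delimiter::LParen)
}
```

In C, enumerators land in the enclosing scope, so two enums cannot both declare `LPAREN` in the same scope, hence the `TOKEN_` prefix.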

Having to use .h

Yeah this is a minor issue, but it still makes the experience worse.

Compiler defaults

To get actually useful errors and warnings, I needed to pass the following compiler flags: -g -fsanitize=address,undefined -fno-omit-frame-pointer -Wshadow -Wall -Wextra. What is this??? I get that C is an OLD language, but I really don’t like this.

Planned next steps

Write the rest of the implementations, after which I’ll evaluate each one based on developer experience, using 5 criteria (1-5 scale, total ≥15/25 to pass):

Of course, I’ll be the judge, since this is my toy project :). Personally, I think that those are the most important things for a language to have, and I want to be dead sure, since the final winner will be the language I use for implementing the rest of my dreams for Blip. After filtering the languages, I will add more features to each of the interpreters:

And THEN I’ll benchmark it all using the bench util I wrote (I’ll also add spec compliance tests to it later), since of course I’d prefer my language to have good performance. The languages I plan to use are all listed in the README.md, but I’ll provide more reasoning in the following parts.

That’s all for this post, but stay tuned for more parts, where I’ll talk about some more of the languages (Nim, D and more C) and all of that.
