Swizec Teller - a geek with a hatswizec.com

Senior Mindset Book

Get promoted, earn a bigger salary, work for top companies

Senior Engineer Mindset cover
Learn more

    LOLCODE-to-JavaScript compiler babel macro

    A fully functioning LOLCODE-to-JavaScript compiler implemented as a Babel macro. You never thought you wanted this and now here it is. You're welcome. 🐱

    Let me start by proving that this crazy contraption works 👇

    Here you have a CodeSandbox with the legendary FizzBuzz implemented in LOLCODE. A Babel macro compiles it to a JavaScript function at build-time and you use it as any ole JavaScript at runtime.

    LOLCODE goes in 🐱

    HAI
        I HAS A count ITS 1
        IM IN YR fizzbuzz UPPIN YR count TIL BOTH SAEM count AN 30
            I HAS A div ITS MOD OF count AN 3
            IT R BOTH SAEM 0 AN div
            O RLY?
                YA RLY
                    VISIBLE "Fizz"
                MEBBE BOTH SAEM 0 AN MOD OF count AN 5
                    VISIBLE "Buzz"
                NO WAI
                    VISIBLE count
            OIC
        IM OUTTA YR fizzbuzz
    KTHXBYE
    

    JavaScript comes out ✌️

    var fizzbuzz = (function (stdlib) {
      return function () {
        var IT
        var count = 1
    
        for (; !stdlib["BOTH SAEM"](count, 30); count++) {
          var _IT = void 0
    
          var div = stdlib["MOD OF"](count, 3)
          _IT = stdlib["BOTH SAEM"](0, div)
    
          if (_IT) {
            var _IT2 = void 0
    
            console.log("Fizz")
          } else if (stdlib["BOTH SAEM"](0, stdlib["MOD OF"](count, 5))) {
            var _IT3 = void 0
    
            console.log("Buzz")
          } else {
            var _IT4 = void 0
    
            console.log(count)
          }
        }
      }
    })(lolcode_macro_dist_lolstdlib__WEBPACK_IMPORTED_MODULE_2__["default"])
    

    Taken from Chrome DevTools source maps. That's after Webpack and Babel do their thing. Intermediate output from lolcode.macro is modern JavaScript with lets and consts.

    Ok so how does this work?

    You can see the full source code on GitHub.

    You can also watch me build lolcode-babel-macro from scratch in a series of livecode videos. 👇

    part 2, part 3, part 4, part 5

    What is LOLCODE

    LOLCODE is an esoteric programming language from 2007. The peak of the lolcat meme when the internet was for cats.

    You might remember i can haz cheezburger?

    full 875511040 h8EB4D6E9

    Yeah that was 12 years ago my friends. We're getting old. Some of you might not even remember. A coworker recently said he's "happy that the internet has moved on from such silly nonsense".

    Kids these days have TikTok and Snapchat filters so has it really? 🤔

    Anyway, Adam Lindsay asked the important question: "What if you could write code like cats speak?". LOLCODE was the answer.

    Despite a never quite finished spec, several interpreters exist and maybe a compiler or two. I wasn't able to find a JavaScript compiler even though interpreters do exist.

    One such interpreter is loljs by Mark Watkinson. I used it as the basis for my compiler.

    _Aside: an interpreter executes your code as it's read, a compiler translates it to a different language (or machine code) to be executed later. Important distinction

    What is a Babel macro

    A Babel macro is a sort of language extension for JavaScript. Since modern JavaScript is a compiled language, usually from modern JS to ES5, we can add fun features to the compilation step.

    The most common type of Babel macro are prefixed ES6 strings. You may have seen them as GraphQL queries or CSS-in-JS. JSX is a Babel plugin. Similar to a macro but different mechanics.

    For example:

    const query = graphql`
      query {
        posts {
          author
          body
          time
        }
      }
    `
    

    That is a Babel macro.

    At compile-time Babel looks for the graphql function. graphql compiles the query into a JavaScript function, which gets inserted in place of that string.

    When your browser executes the resulting JavaScript it has no idea that a string used to live there.

    Zero run-time overhead 🤘

    Macros are extremely powerful and many programming communities have decided they're too powerful. More trouble than they're worth.

    JavaScript so far has been sensible about sticking with obvious macros and not, say, overwriting how + works.

    How to build a Babel macro

    The easiest way to build a Babel macro is with Kent C Dodds's babel-plugin-macros library. The author docs are pretty good.

    You follow a 3 step process:

    1. Create a macro.js file (naming convention matters)
    2. Get the wrapper function
    import { createMacro } from "babel-plugin-macros"
    
    1. Write your function
    function myMacro({ references, state, babel }) {
      // ...
    }
    
    1. Default export your function wrapped in createMacro
    export default createMacro(myMacro)
    

    Somehow this creates a named export. I'm not sure why or how, but that's how it is. Best not to poke.

    You also cannot export anything alongside your macro. Which is a shame and caused me many grief. You can work around it by letting people import other files in your library.

    Your macro function

    The macro function itself can get tricky. You're dealing with Abstract Syntax Trees (AST). It's up to you to modify as you see fit.

    In my case that was finding all lolcode nodes and replacing them with compiled code.

    function myMacro({ references, state, babel }) {
      references.lolcode.forEach((referencePath) => {
        const compiled = compileLolcode(referencePath)
        referencePath.parentPath.replaceWithSourceString(compiled)
      })
    }
    

    references is a reference to the JavaScript AST. state and babel are the same as they are in a normal Babel plugin. Which isn't helpful, if you've never built a Babel plugin 😅

    My macro ends up replacing each lolcode's parent node with the compiled string. That worked well.

    How to build a compiler

    Most compilers are split into 3 parts:

    1. The front-end, which uses a Lexer and a Parser to turn your code into an AST
    2. The middle-end, which performs optimizations and other transformations on your AST
    3. The back-end, which turns your AST into the final output code

    In theory you can swap these parts around.

    Once you have a back-end that generates JavaScript from an AST, you can attach different front-ends to compile other languages to JavaScript. Once you have a front-end, you can use different target outputs. Etc.

    Best explained with a LOLCODE example 👇

    Creating a LOLCODE-to-JavaScript compiler

    The swappability of compiler parts is what helped me build lolcode-babel-macro.

    I found a working lexer-parser-AST online, loljs by Mark Watkinson, updated it to work in modern JavaScript creating @swizec/loljs, and replaced the interpreting back-end with a compiler.

    The tokenizer, lexer, and parser

    The first step in compiling code is a tokenizer. Tokenizers take plain strings or files and turn them into lists of tokens. Usually a combination of split-by-space and regex.

    The LOLCODE tokenizer turns strings like O RLY? into O_RLY tokens, I HAS A into VAR_DEC, BIGGR THAN into BIN_OP etc. You can see the full list here, lines 6 to 86.

    The lexer

    The next step is a lexer, which turns combinations of tokens into recognizable sequences. This is your grammar definition and is best used with a parser generator like bison, or its JavaScript counterpart jison.

    You don't want to write your own parser from scratch, trust me.

    The full grammar definition for LOLCODE lives here 👉 loljs.jison.

    It's a series of lexical definitions 👇

    function_call
        : IDENTIFIER arg_list arg_end
            { $$ = new ast.FunctionCall(@$, $1, $2); }
        ;
    

    A function call has an IDENTIFIER, followed by an arg_list node, and an arg_end node.

    arg_end
        : MKAY {$$ = $1;}
        | eol
        ;
    
    arg_list
        : exp
            { $$ = new ast.ArgList(@$, [$1]); }
        | arg_list SEP exp
            {
                $1.push($3);
                $$ = $1
            }
        ;
    

    arg_end and arg_list in turn are built out of more tokens and nodes. The rabbit hole goes very deep and I'm happy that somebody else wrote all that for me :)

    I remember doing this in my compilers class in college and it was not fun. Easy to make mistakes, hard to verify.

    The parser

    Jison takes your LOLCODE grammar and turns it into a parser.

    A parser looks like normal JavaScript code except it's a little soupy and 3455 lines long. You really don't want to write parsers by hand.

    Here's what the parser code looks like

    case 62:
        this.$ = new ast.Assignment(this._$, $$[$0 - 2], $$[$0]);
        break;
    case 63:
        this.$ = new ast.Assignment(this._$, $$[$0 - 2], $$[$0]);
        break;
    case 64:
        this.$ = new ast.Assignment(this._$, $$[$0 - 2], $$[$0]);
        break;
    

    A convoluted series of hundreds of new ast.X calls to create an abstract syntax tree based on your grammar and your AST definition.

    no_idea giphy

    Best stick to defining the grammar and let parser generators do the rest.

    The AST

    As you can see above, your parser needs an AST definition. What does your code look like as a JavaScript object tree?

    Again thanks to Mark Watkinson, I didn't have to write my own 👉 the LOLCODE AST

    Defining your AST can be tedious, but it's not very complex. A function call node looks like this

    lol.ast.FunctionCall = function (location, name, args) {
      lol.ast.Node.call(this, location, "FunctionCall")
      this.name = name
      this.args = args
    }
    

    To generate a FunctionCall node, you need a location, a name, and some args. All coming from your parser.

    You return a Node called FunctionCall, define its name (identifier), and args node. All very recursive and following the visitor pattern

    The resulting AST is an object a little like this (writing from memory)

    {
    	location: { ... },
    	type: `FunctionCall`
    	name: 'myFunction',
    	args: {
    		location: { ... },
    		type: `ArgsList`,
    		argsList: [
    			Node, Node, Node ...
    		]
    	},
    	body: {
    		location: { ... }
    		type: `Body`,
    		lines: [ Node, Node, Node ]
    	}
    }
    

    An AST is a recursive JavaScript data structure. Once you get used to a node or two, everything follows familiar patterns.

    The JS output

    It's these recursive AST patterns that make compiling to JavaScript so accessible. Define a method for each node type and recursively call the compiler when needed.

    Like this 👇

    // The macro part
    function compileLolcode(referencePath) {
      const source = referencePath.parentPath.node.quasi.quasis
        .map((node) => node.value.raw)
        .join("")
    
      const ast = parser.parse(source)
      const jsify = new JSify()
    
      return `(function (stdlib) {
            return function() {
                ${jsify.compile(ast)} 
            }
        })(lolstdlib)`
    }
    

    Kent's babel macros plugin gives us the JavaScript AST. We use it to find our LOLCODE source. Hiding in parentPath.node.quasi.quasis in this case. We walk through all quasis, get their raw values, and join them into a string.

    That's how prefixed ES6 strings work, don't know why.

    Take the resulting source code, parse it, instantiate the JSify compiler backend, and return the output as a string. We wrap compiled code in a closure with the LOLCODE stdlib, which defines some basic functions.

    All your LOLCODE instances share the same stdlib. Assumed to exist in scope via an import.

    JSify – translate an AST to JavaScript

    Compiling an AST to JavaScript is now a matter of recursively calling node-specific methods on the JSify object and return strings. Each method on its own is pretty small, but when they work together, the result is magical.

    You can see the full JSify file here.

    Keeping with our FunctionCall example from earlier ...

    FunctionCall = (node, newIT) => {
      if (stdlib[node.name]) {
        return `stdlib["${node.name}"](${this.compile(node.args)})`
      } else {
        return `${node.name}(${this.compile(node.args)})`
      }
    }
    

    The FunctionCall method gets a node and a flag whether to instantiate a new IT context. This keeps the same function signature throughout JSify.

    IT in LOLCODE is the implicit variable, by the way. Supposed to hold the value of the last expression ... but I had to take some liberties because this is a compiler not an interpreter. You have to explicitly assign values to IT, but the variable is always there for you.

    FunctionCall then checks if this function is in stdlib and returns the appropriate code. Either a stdlib call or a normal function call.

    We call this.compile for the arguments node.

    ArgList = (node, newIT) => {
      return node.values.map(this.compile).join(", ")
    }
    

    Compiling the ArgList node is similar 👉 walk through list of values and recursively call this.compile for each node. Who knows what expression might lie in there :)

    this.compile itself is pretty simple, a switch statement 👇

    compile = (node, newIT = true) => {
      if (this[node._name]) {
        node = this[node._name](node, newIT)
      } else {
        throw new Error(`Not implemented: ${node._name}`)
      }
    
      return node
    }
    

    Checks if it recognizes the node type and calls the appropriate method. If not, it throws an error.

    And that's how a bunch of small functions work together to produce complex JavaScript code based on your LOLCODE.

    But why?

    but_why giphy

    Why not?

    LOLCODE is amazing, JavaScript is awesome, and putting them together is ... completely useless let's be honest. Intellectually gratifying but pointless.

    HOWEVER, this opens the door for future hacking 👇

    1. Great excuse to learn about Babel macros
    2. Superb way to practice building a small compiler
    3. Get some plug-and-play pieces of tech to build interesting DSLs for JavaScript
    4. Unlocks where I got stuck last time I tried to build an AI system that writes JavaScript based on specs

    That last part 😏

    Published on May 16th, 2019 in Computer Science, JavaScript, Technical

    Did you enjoy this article?

    Continue reading about LOLCODE-to-JavaScript compiler babel macro

    Semantically similar articles hand-picked by GPT-4

    Senior Mindset Book

    Get promoted, earn a bigger salary, work for top companies

    Learn more

    Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.

    Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.

    Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.

    Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev

    Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization

    Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections

    Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog

    Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com

    By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️

    Created by Swizec with ❤️