Over the weekend I started working on llamaduck- a simple tool that aims to figure out whether your code will run on the newly released node 0.6.0. Eventually it might be able to perform other compatibility assessment tasks as well, but I’m focusing on simple stuff first.

southridge-56gold-111105-891.JPG

Image by Scott Trotter via Flickr

Or at least I thought it was simple.

The list of API changes since 0.4.x doesn’t seem that long and it should be easy enough to digest. But as it turns out, I spent almost all of Sunday just figuring out how to turn javascript into a beautiful analyzable AST.

If you don’t know what an AST is – it’s a so called abstract syntax tree, which means it should look identical regardless of what the actual syntax is. Although it will differ for actually different languages. So a coffeescript AST should look the same as JavaScript, but Python’s will differ.

My research came up with three options:

  1. Take a parser generator and a JavaScript grammar, hope for the best
  2. JSLint has a parser … somewhere around line 2000
  3. Uglify-JS supposedly has a parser too

The only viable option was uglify-js. It’s a neatly packaged node.js module that does a bit more than I need, but at least it’s got an easy to use parser with an exposed api interface.

Score!

Here’s an example of a file that outputs its own AST to give you a feel for what I’m talking about:

var parser = require('uglify-js').parser;
var util = require('util');
 
(function get_ast (path, callback) {
    require('fs').readFile(path, 'utf-8', function (err, data) {
        if (err) throw err;
 
        callback(parser.parse(data));
    });
})('./example.js', function (data) {
    console.log(util.inspect(data, true, null));
});

The file parses itself and outputs a tree encoded as a javascript array (scroll past the insanity, there’s a bit more text there):

[ 'toplevel',
  [ [ 'var',
      [ [ 'parser',
          [ 'dot',
            [ 'call',
              [ 'name', 'require', [length]: 2 ],
              [ [ 'string', 'uglify-js', [length]: 2 ],
                [length]: 1 ],
              [length]: 3 ],
            'parser',
            [length]: 3 ],
          [length]: 2 ],
        [length]: 1 ],
      [length]: 2 ],
    [ 'var',
      [ [ 'util',
          [ 'call',
            [ 'name', 'require', [length]: 2 ],
            [ [ 'string', 'util', [length]: 2 ], [length]: 1 ],
            [length]: 3 ],
          [length]: 2 ],
        [length]: 1 ],
      [length]: 2 ],
    [ 'stat',
      [ 'call',
        [ 'function',
          'get_ast',
          [ 'path', 'callback', [length]: 2 ],
          [ [ 'stat',
              [ 'call',
                [ 'dot',
                  [ 'call',
                    [ 'name', 'require', [length]: 2 ],
                    [ [ 'string', 'fs', [length]: 2 ], [length]: 1 ],
                    [length]: 3 ],
                  'readFile',
                  [length]: 3 ],
                [ [ 'name', 'path', [length]: 2 ],
                  [ 'string', 'utf-8', [length]: 2 ],
                  [ 'function',
                    null,
                    [ 'err', 'data', [length]: 2 ],
                    [ [ 'if',
                        [ 'name', 'err', [length]: 2 ],
                        [ 'throw',
                          [ 'name', 'err', [length]: 2 ],
                          [length]: 2 ],
                        undefined,
                        [length]: 4 ],
                      [ 'stat',
                        [ 'call',
                          [ 'name', 'callback', [length]: 2 ],
                          [ [ 'call',
                              [ 'dot',
                                [ 'name', 'parser', [length]: 2 ],
                                'parse',
                                [length]: 3 ],
                              [ [ 'name', 'data', [length]: 2 ], [length]: 1 ],
                              [length]: 3 ],
                            [length]: 1 ],
                          [length]: 3 ],
                        [length]: 2 ],
                      [length]: 2 ],
                    [length]: 4 ],
                  [length]: 3 ],
                [length]: 3 ],
              [length]: 2 ],
            [length]: 1 ],
          [length]: 4 ],
        [ [ 'string', './example.js', [length]: 2 ],
          [ 'function',
            null,
            [ 'data', [length]: 1 ],
            [ [ 'stat',
                [ 'call',
                  [ 'dot',
                    [ 'name', 'console', [length]: 2 ],
                    'log',
                    [length]: 3 ],
                  [ [ 'call',
                      [ 'dot',
                        [ 'name', 'util', [length]: 2 ],
                        'inspect',
                        [length]: 3 ],
                      [ [ 'name', 'data', [length]: 2 ],
                        [ 'name', 'true', [length]: 2 ],
                        [ 'name', 'null', [length]: 2 ],
                        [length]: 3 ],
                      [length]: 3 ],
                    [length]: 1 ],
                  [length]: 3 ],
                [length]: 2 ],
              [length]: 1 ],
            [length]: 4 ],
          [length]: 2 ],
        [length]: 3 ],
      [length]: 2 ],
    [length]: 3 ],
  [length]: 2 ]

Conclusion

Now we have a simple tree we can recursively analyze and look for incompatibilities. But before anything really practical can be done I need to figure out how to track variable scope. That’s really the hard bit because the code needs to check when variables become a critical section and then confirm that they do in fact eventually get used in a critical way.

But once that nut is cracked llamaduck will be a neat little tool useful for many things.

If you’ve got some coding inclination, I’d love a helping hand over at the llamaduck github repo.

Enhanced by Zemanta
  • http://www.invoicefox.com Janko Itm

    what’s with all the lenght stuff?

  • http://twitter.com/Swizec Swizec

    I think that’s just how util.inspect works. Shows all the properties of an object and arrays have length as a property.

  • http://www.usrjoy.com Janko Itm

    well thats just duplication of information, and if you remove it ast actually doesn’t look that bad

  • Klemen

    You might wanna check out Codesurgeon ( https://github.com/hij1nx/codesurgeon ), it seems to do exactly what you’d like to, and offers code splicing and manipulation tools.

  • Pingback: Node.js: Links, News and Resources (4) « Angel “Java” Lopez on Blog