Learning Ruby, Step Two: Lisp

After my first exercises with Roman numerals I started on something slightly more ambitious: a small Lisp interpreter.

The Wikipedia definition of minimal Lisp has seven elements:

  • car -- given a pair, return the first element;
  • cdr -- given a pair, return the second element;
  • cons -- construct a new pair with given first and second elements;
  • quote -- denote an expression as literal, not to be interpreted;
  • eq -- compare two objects for equality, returning true or false values;
  • if or cond -- a single- or multiple-condition branch operation; and
  • some mechanism to define functions, such as Common Lisp's defun, or else lambda and Scheme's define.

By reputation, Scheme is the cleanest of the Lisp dialects. It's standard fodder for computer science courses thanks to the Abelson, Sussman and Sussman book. On the web I found another text, Introduction to Scheme and Its Implementation, that looked more attractive for my purposes. I skimmed most of it to the implementation section.

A Scheme applet was useful for testing purposes. My first test case was as simple as you could get.

  def test_simple
    assert_equal(2, @s.eval_scheme_str('(+ 1 1)'))
  end

Or was it? Of course, a function that reads in that input (and no other) and returns the right answer is trivial. However, languages and their interpreters (or compilers, the difference is fuzzy) are well understood and can be specified properly. I wrote down a quick, informal grammar for a minimal subset of Scheme:

expr => \d
operator => [+-*/]
expr => '(' operator expr more_exprs ')'
more_exprs => expr
more_exprs => expr more_exprs

I could not resist the temptation is to whip up some regular expressions, but of course this was a mistake. The code started to get more complicated and the unit tests showed I was regressing. I started again and wrote a tokenizing function that scans character by character. Much simpler and robust. My tokenizer returned an array of tokens.

  assert_equal(['(', '+', '1', '1', ')'], @s.tokenize('(+ 1 1)'))

Incidentally, the tests expose the implementation. Is this a violation of encapsulation? Not in spirit, because the tests can be considered documentation of the implementation. They need to change if it changes.

In general I suspect that objects cannot have many private methods when you take the TDD approach. A private method is hard to test and therefore hard to refactor.

I wrote a parser that would generate a tree from the tokens and an evaluator that walked the tree.

Some tests later, I was able to run my first non-trivial Scheme expression.

  def test_nested
    assert_equal(30, @s.eval_scheme_str('(* 3 (/ 100 10))'))
  end

I could make a really primitive interpreter with a simple loop:

  while str = gets
    print @s.eval_scheme_str(str), "\n"
  end

To my mind, lists map naturally to Ruby arrays. This made cons, car and cdr easy. They just became additional operators within the same grammar I already had working.

  def test_lists
    assert_equal([ 2, 3 ], @s.eval_scheme_str('(cons (+ 1 1)(+ 2 1))'))
    assert_equal(30, @s.eval_scheme_str('(car (cons (* 3 (/ 100 10)) 10))'))
    assert_equal('a', @s.eval_scheme_str('(car (cons a (cons b c)))'))
    assert_equal([ 'b', 'c' ], @s.eval_scheme_str('(cdr (cons a (cons b c)))'))
  end

That's as far as I've come up to now. I've seen enough to get a feel for Ruby. Frankly, compared to many other computer languages, it's addictive. It works mostly the way I expect it to work, but there is enough expressive power so I don't feel the need to write code to do the obvious. Using Ruby with TDD, you get short cycles with quick feedback.

I might go on and implement the missing parts of the scheme interpreter or I might try something closer to what I might need to do in the real world some time. For example there are Pragmatic Dave Thomas's Kata.

[Updated the link the scheme applet.]