12 Jun 2013

I've been a huge fan of LISP and related programming languages (scheme, racket, etc.) for a very long time. I've been evangelizing the utility of learning how the language works and understanding functional concepts among my friends for almost as long, and recently a friend of mine finally caved and read McCarthy's original paper. Once he realized how important it was, he decided to learn while solving the L-99, a list of 99 problems for LISP, (based on Haskell's list of 99 problems), except we use Scheme/Racket for our implementations, because of the excellent Dr. Racket environment on both Windows and Linux.

The purpose of the L-99 was to provide a series of challenges that would force someone to learn about the language, its syntax and structure, standard paradigms, etc., which something I think it accomplishes very well. But I think it also provides a great backdrop for discussing algorithms regardless of the language they are written in. L-99 therefore provides two opportunities: the first is learning how to do something in LISP, and the second is learning how to do it "the best way"—the one with the lowest time and/or memory complexities. An example I'd like to mention is the list reversal problem, #5, because it is easy to understand and has two easy solutions with very different characteristics. My friend's solution (in Racket), which works correctly, is given below:

(define (list-reverse x)
  (cond
    [(null? (cdr x)) x]
    [else (append (list-reverse (cdr x)) (list (car x)))]))

As I mentioned above, his solution works correctly (aside from a little bit of parameter validation that is missing), but a close reading reveals that its time complexity is O(n^2). This is because the append function is O(n), and it is called once per item in the list to reverse. In another language, with another library (say, C++ with STL containers), appending to a list is a constant time operation, which is likely why he didn't think much of using append here. That said, an opaque list structure can be made (in LISP) that offers constant time append operations, but this isn't how a chain of cons cells works. For comparison, here is my solution to the problem, which uses a helper function to reverse the list in O(n) time:

(define (list-reverse-linear x)
  (define (lr-inner xa xr)
    (cond
      [(empty? (cdr xa)) (cons (car xa) xr)]
      [else (lr-inner (cdr xa) (cons (car xa) xr))]))
  (cond
    [(not (pair? x)) x]
    [(empty? (cdr x)) x]
    [else (lr-inner (cdr x) (cons (car x) null))]))

I suppose it is obvious to most people that faster asymptotic complexity is always better, but just for the sake of argument, let's look at the time required for reversing a list with 10,000 elements in it:

; Input
(define (gen-list n)
  (cond
    [(= n 0) null]
    [else (cons n (gen-list (- n 1)))]))
(define tl (gen-list 10000))
(time-apply list-reverse (cons tl null))
(time-apply list-reverse-linear (cons tl null))

; Result is, in milliseconds (cpu real gc)
(3432 3427 2293)
(0 1 0)

A stark contrast—based on a measurement with time-apply, it takes nearly 3.5 seconds to reverse the list using the O(n^2) algorithm, but less than one millisecond to do so with the O(n) algorithm. A big part of this, of course, also comes from the garbage collection time, which accounts for nearly 2/3 of the run time in the slower algorithm, due to how many intermediate lists are generated and discarded, while the linear algorithm does not allocate any unused items. Another important difference is that my solution is tail-recursive, which enables significant optimizations (ones that Dr. Racket is capable of doing, too), compared to my friend's solution.

I think that the L-99 has a wonderful secondary purpose. Not only can it be used to introduce someone to the standard idioms, tools, and techniques of functional programming, but it can also be used to introduce and discuss analysis of algorithms, as the problems can generally be solved in multiple ways, with different time and space complexity requirements. It even provides a platform to discuss different implementation strategies (such tail-recursion vs. not) and experiment with their effects on performance, all with problems and solutions that are very simple to state and understand. This is a great aspect of LISP as a tool for learning in general—it frees you as a programmer from worrying about necessary but conceptually irrelevant implementation details, to focus on understanding what you're doing, why you're doing it, and how you're doing it, on an algorithmic level. Mastering the relevant skills translates readily into using other languages, where it helps separate the problems—algorithms, memory management, program structure, inheritance organization, the list goes on—out in your mind, allowing you to solve each individually.

Normally I'd write another 3+ pages, at minimum, trying to beat this to death, but I think this is a pretty self-defending position. Instead, I'd like to ask you a few questions to think about and maybe take with you as you do or learn about programming. Have you ever looked at the L-99? Can you solve all the problems, even if you don't take the time to write out complete solutions? Are your solutions optimal in their asymptotic complexities? I'd also love to hear about any particularly clever solutions people have created.