So, for the last few weeks I’ve been learning the Haskell programming language, and thought I’d share my thoughts on it. Haskell is a pure functional programming language that supports lazy evaluation (I’ll explain all that in a moment), and it has been gaining popularity in some circles (particularly in academia and among computer linguists). It has (semi-)famously been used to write an implementation of the Perl 6 programming language, and also a distributed revision control system (Darcs).
I’d heard of the language a couple of years ago, and have been wanting to learn it since then, but the freely available resources I’d been able to find for learning about the language were frustratingly poor, and printed books were in the range of $80 and up, which is a bit steep for learning a language as a hobby. I finally found an upcoming O’Reilly book, Real World Haskell, whose full content is available for browsing online. Amazon says the printed version’s going to be $50, which is a marked improvement over other books I’d been considering buying. The book’s not perfect (there are several minor beefs I have with it), but the fact remains that it is the highest-quality resource for learning Haskell that is freely available. It also has the major advantage that it focuses heavily on real-world applications (as the name implies), including coverage of network programming, concurrency (multi-threaded or clustered computing), and writing language parsers; it even guides the reader through writing software to retrieve ISBN numbers from cell-phone photos of book backs! This probably makes it the most practical resource for learning Haskell, too, pricey or free.
I’m still working through the book, but I’ve supplemented my understanding so far by reading through the official language specification, The Haskell 98 Report, so at this point I’ve gained a pretty solid understanding of the core language and minimum libraries.
Intro to Functional Programming
(If you’re fairly familiar with functional programming, you might as well skip this article and wait for part 2.)
I’ve had previous experience with functional programming languages, mainly Lisp and XSLT, which helped in trying to learn Haskell.
Lisp is one of the more popular functional programming languages, and probably the oldest. My experience with it is primarily through the Emacs-Lisp dialect. Emacs is a very powerful text editor that is popular in the Unix world (and particularly on GNU/Linux). It is mostly implemented in Emacs-Lisp, and you can alter the program’s behavior while it’s running by editing the Lisp code. It’s the first Unix editor I learned to use, and though I mostly prefer Vim these days, I’m very comfortable with Emacs (these days I run it in a vi-emulating mode called viper), and have written code in Emacs-Lisp. I’ve also dabbled in Scheme, a Lisp dialect.
XSLT is a language for performing transformations on XML documents and, like Haskell but unlike Lisp, is a “pure” functional programming language (more on that in a moment). Unfortunately, it doesn’t really provide a complete set of useful tools (version 1.0, anyway: I’m not very familiar with the XSLT 2.0 spec), so my experience was that writing XSLT can be an exercise in frustration.
In imperative programming languages like C, C++, BASIC, Java, and many others, the focus is on performing a series of actions: do A, then do B, etc. In imperative programming, one often deals with data in memory that can be modified through a series of actions (mutable data), and the behavior of bits of code in the program may depend on the current value of some particular piece of data in memory, which may be changed by other bits of code.
Here’s a simple example of imperative programming in C code:
    int i, a[8], v = 1;
    for (i = 0; i < 8; i = i + 1)
    {
        a[i] = v;
        v = v * 2;
    }
This snippet produces an eight-element array a whose elements hold successive powers of 2 (1,2,4,8,16,32,64,128). The code amounts to a series of instructions like:
1. set the value of v to 1 and i to 0.
2. set the ith element of the array a to the current value of v.
3. set the value of v to twice its previous value.
4. set the value of i to one more than its previous value.
5. if the value of i is less than 8, repeat from step 2.
Notice the heavy focus on performing a series of steps, and a dependency on modifying data.
In contrast to imperative programming, functional programming focuses on transforming input data into output data, by defining the function that performs the transformation. In a “pure” functional programming language, it is not possible to change the value of some object of data from one thing to another; in fact, there are no objects, only values (immutable data) and functions. This tends to place a higher focus on what the desired result is, rather than how we want to arrive at a given result.
Here’s an example of some code in Haskell that produces a result similar to the imperative snippet above:
    powers i v | i >= 0 && i < 8 = v : powers (i+1) (v*2)
               | otherwise       = []
If you invoke the function above as powers 0 1, its result will be a list of successive powers-of-two from 1 to 128, just like for our C code. However, you may notice that the method by which we’ve arrived at that list is somewhat different. Translated to English, this code says:
- Define the function powers(i,v) (for any v, and where 0 ≤ i < 8) to be a list whose first element has the value v, and whose remaining elements are the result of powers(i+1, v*2), or, for any i not matching the qualification just given, the empty list.
…and that’s it. When you invoke powers 0 1, it constructs the element with value 1, then invokes itself again as powers 1 2, which constructs an element whose value is 2 and invokes powers 2 4 to construct the third element with a value of 4. When it finally reaches powers 8 256, it sees that i=8 doesn’t meet the criterion that 0 ≤ i < 8, and so it caps it off with an empty list result (which, since it doesn’t invoke itself recursively again, terminates the list, and the evaluation of powers).
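To make that unfolding a bit more concrete, here’s a rough sketch that repeats the definition and writes out the expansion by hand in comments. The type signature is my own addition for clarity (the definition works without one), and the trace is only meant to show the shape of the result, not the exact order in which a Haskell implementation evaluates things.

```haskell
-- The same definition as above. The type signature is an assumption added
-- for readability; Haskell would infer a more general type without it.
powers :: Int -> Int -> [Int]
powers i v
  | i >= 0 && i < 8 = v : powers (i + 1) (v * 2)
  | otherwise       = []

-- Unfolding the call by hand, one step at a time:
--
--   powers 0 1
-- = 1 : powers 1 2
-- = 1 : 2 : powers 2 4
-- = 1 : 2 : 4 : powers 3 8
--   ...
-- = 1 : 2 : 4 : 8 : 16 : 32 : 64 : 128 : powers 8 256
-- = 1 : 2 : 4 : 8 : 16 : 32 : 64 : 128 : []
```

If you save this in a file and load it in GHCi, evaluating powers 0 1 should give you the eight-element list from the last line of the trace.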
Functional code like the above can look cryptic to imperative programmers who aren’t used to seeing it, but it’s learned fairly quickly. It happens to correspond fairly closely to how a mathematician would formulate a function definition, so folks who are comfortable with math will tend to feel right at home.
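For comparison, here’s roughly how the same definition might look in mathematical notation. This is just my own sketch: the colon for “put this element in front of that list” and the brackets for the empty list are borrowed from Haskell rather than from standard math notation, and the cases environment assumes the amsmath package.

```latex
\[
\mathrm{powers}(i, v) =
\begin{cases}
  v : \mathrm{powers}(i + 1,\, 2v) & \text{if } 0 \le i < 8, \\
  [\,]                             & \text{otherwise.}
\end{cases}
\]
```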
Note that it’s just as easy to do recursion like this in C:
    int *powers(int *a, int i, int v)
    {
        if (i != 8) {
            a[i] = v;
            powers(a, i+1, v*2);  /* pass the array along to the next call */
        }
        return a;
    }
However, in real-world C implementations, each invocation of powers will consume additional memory. It works fine for our example, where we’re limited to a total of nine calls to powers, but for longer recursions you could run out of stack space, which is not a good thing.
Note that our recursive C example still isn’t pure functional programming, as we’re modifying the values of an existing array, and not producing new values.
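For contrast, here’s a small sketch of what “producing new values” looks like in Haskell. The names doubleAll, original and doubled are hypothetical ones I made up for this illustration; they aren’t part of the examples above. The function can’t overwrite the list it’s given, so it builds a fresh one and the input stays exactly as it was.

```haskell
-- doubleAll is a hypothetical helper for illustration only.
-- It builds a brand-new list whose elements are twice those of the input;
-- the input list itself is never modified.
doubleAll :: [Int] -> [Int]
doubleAll []       = []
doubleAll (x : xs) = (x * 2) : doubleAll xs

-- A value to start from.
original :: [Int]
original = [1, 2, 4, 8]

-- A new value derived from it; original still means [1, 2, 4, 8].
doubled :: [Int]
doubled = doubleAll original
```

Loaded in GHCi, doubled is [2,4,8,16], and original is still [1,2,4,8] afterwards, which is exactly the property our recursive C version lacks.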
Disclaimer: none of the C or Haskell examples are particularly idiomatic; they’re for rough example purposes only. I’d do them differently in real life; but I felt that these examples, as written, may be a little easier to discuss, and more accessible for folks that might not be terribly familiar with either language.
To be continued (very soon)… Part 2 will discuss some of the things in Haskell that I think are really cool. 🙂
well, to me what makes haskell’s awesomeness is lazy evaluation … I hope you will explain something like
> powers = 1 : map (*2) powers
in your second part …
Of course! 🙂
if I remember correctly…
take 10 $ map (2^) [1,2..]
should do what you need perfectly. Lazy evaluation is awesome.