[This post is part of an ongoing challenge to understand 52 papers in 52 weeks. You can read previous entries, here, or subscribe to be notified of new posts by email]

The Hindley-Milner type system is one of the more impressive things in computer science. Global type inference that can figure out the general type of a whole program without a single type annotation anywhere.

I’ve only ever used it in Haskell and let me tell ya, when I get confused, I delete my type annotations and let the compiler tell me what the hell I’m doing. It’s usually right.

But while Hindley-Milner can do parametric polymorphism natively, it needed some work to support ad-hoc polymorphism and become what Haskell’s got today. In their 1988 paper, How to make ad-hoc polymorphism less ad hoc Wadler and Blott of the Haskell committee explain how to do just that by introducing type classes.

Type classes are the biggest extension Haskell adds to Hindley-Milner, which makes it a more practical language than its predecessors Miranda and ML. But no more powerful, of course.

Polymorphism

Polymorphism lets us define functions that can act on arguments of different types. Most obvious with operators where writing 1+4 works just as well as writing 1.3+3.14, you don’t have to use addInt or addFloat. The compiler handles that for you.

Strachey defined two types of polymorphism – ad-hoc and parametric.

Ad-hoc polymorphism occurs when a function behaves differently for different types, sometimes with completely heterogeneous implementations. Operator overloading is a common example of ad-hoc polymorphism

Parametric polymorphism occurs when a function behaves the same for different data types. length is a good example, because it doesn’t care what type of list it’s counting. You can implement a general length function to behave the same for any list type.

This paper expands Hindley-Milner’s parametric polymorphism, with type classes to introduce ad-hoc polymorphism. Because the paper shows how to translate between type classes and pure HM, the authors claim any language using HM typing could potentially be retrofitted with type classes via a preprocessor.

Limitations of ad-hoc polymorphism

The easiest places to look at issues arising from ad-hoc polymorphism are arithmetic operator overloading and equality.

Standard ML takes the simplest approach to operator overloading – arithmetic operators are overloaded, but functions that use them are not. This means while you can write 3*3 or 3.14*3.14, you cannot define a square function as square x = x*x and later use terms like square 3 or square 3.14.

You could solve this with an overloaded square function, using implementations of type Int -> Int and Float -> Float. This becomes unwieldy when you want to have a function squares that returns a tuple of three squared numbers. You’d need eight different implementations!

Generally speaking, overloaded functions grow exponentially with the number of arguments. Not good.

Equality doesn’t fare much better. If you treat it as overloaded, like Standard ML used to, you can use terms such as 3*4 == 12, but you cannot define functions based on equality. For instance, a function member that tells you whether something is in a list or not won’t have a defined type.

Miranda takes a slightly better approach in that it treats equality as fully polymorphic. Its type is then (==) :: a -> a -> Bool, but this forces the environment to perform run-time checks on the representation of abstract types.

Some might consider this a bug. Having to look inside an abstraction to decide its type definitely smells funny.

More recent versions of Standard ML take the approach of making equality polymorphic in a limited fashion using something called eqtype variables. This means that type clashes are correctly returned as type errors, but still poses some limitations on the run-time implementation.

Finally, object-oriented programming introduces the idea that users can define their own types. Getting these to support equality means having to force each object to carry with it a pointer to an equality function for that specific type. A dictionary of appropriate equality functions (to compare with different types) is even better.

But a lot of those dictionaries will look exactly the same so we might as well pass them around separately from objects. This is the intuition behind type classes.

Type classes

Let’s say we want to overload (+), (*), and negate on Int and Float. We can do this by introducing a type class called Num that says “a type a belongs to Num if (+), `(), andnegate` in appropriate types are defined on it”*.

Now we can define type instances such as Num Int and specify which functions to translate the overloaded symbols into. We assume things like addInt and mulInt are defined by default.

class Num a where
  (+), (*) :: a -> a -> a
  negate   :: a -> a

instance Num Int where
  (+)    = addInt
  (*)    = mulInt
  negate = negInt

instance Num Float where
  (+)    = addFloat
  (*)    = mulFloat
  negate = negFloat

This lets us define both the square and squares functions from before, but with a well-defined type at compile time.

square   :: Num a => a -> a
square x = x*x
 
squares           :: Num a, Num b, Num c => (a,b,c) -> (a,b,c)
squares (x, y, z) = (square x, square y, square z)

square is of type a -> a and the compiler will be able to resolve both square 3 and square 3.14 into their appropriate types. Similarly, squares no longer needs eight types, just one – (a,b,c) -> (a,b,c).

As expected, a call such as square 'c' will produce a type error because there is no Char instance of the Num type class.

Translating to Hindley-Milner

A compiler can use our class and instance definitions to create dictionaries holding pointers to correct methods. For Num we introduce NumD as a type constructor for a new type whose values are created using NumDict. Functions add, mul, and neg take a value of type NumD and return its first, second, or third component.

data NumD a = NumDict (a -> a -> a) (a -> a -> a) (a -> a)
 
add (NumDict a m n) = a
mul (NumDict a m n) = m
neg (NumDict a m n) = n
 
numDInt :: NumD Int
numDInt =  NumDict addInt muLInt negInt
 
numDFLoat :: NumD Float
numDFloat =  NumDict addFloat mulFloat negFloat

To use NumD a compiler would simply replace all instances of Num with their respective dictionary values, as identified by the type. For instance, x+y translates into add numD x y.

add numD returns the correct addInt or addFloat function as identified by the type of x and y, then applies said function on the arguments. It’s pretty nifty.

Our square example becomes square':

square'        :: NumD a -> a -> a
square' numD x =  mul numD x x

Which means that a call such as square 3 will translate into square' numDInt 3 and square 3.2 into square' numDFloat 3.

A similar conversion works for squares, just with more characters involved.

Type classes and equality

When applied to equality, type classes don’t differ much from Standard ML’s eqtype variables. But they allow the compiler to decide types at compile-time rather than run-time and a user can easily extend new classes to support abstract types.

The definition is similar to how we defined Num earlier – we’ll make a type class called Eq and define instances for Int and Char. We’ll also define a member function, which was giving us trouble earlier.

class  Eq a where
  (==) :: a -> a -> bool
 
instance Eq Int where
  (==) = eqInt
 
instance Eq Char where
  (==) = eqChar
 
member          :: Eq a => [a] -> a -> Bool
member [] y     =  False
member (x:xs) y =  (x == y) \/ member xs y

As you can imagine we can now write terms such as 5 == 4, 'a' == 'b', and member "Haskell" 'k' or member [1,2,3] 2. The compiler can infer the correct type each time and using member on a type that doesn’t have an Eq instance will produce a type error.

But what’s really cool is that we can define equality between lists and tuples. Even crazier things – sets, random data types we define ourselves, anything really.

instance Eq a, Eq b => Eq (a,b) where
 (u,v) == (x,y)     = (u == x) & (v == y)
 
instance Eq a => Eq [a] where
  [] == []         = True
  [] == y:ys       = False
 x:xs == []        = False
 x:xs == y:ys      = (x == y) & (xs == ys)

Essentially “two tuples are equal if their members are equal” and “lists are equal if they are both empty, or their heads and tails are equal”.

Now we can write terms such as "Haskell" == "Curry" and even member ["Haskell", "Alonzo"] "Moses".

The compiler figures this out in much the same way as before – using dictionaries. I’m not going to type it all out but, for instance, integers will have a corresponding eqDInt function, characters will have an eqDChar function and so on.

A term such as 3*4 == 12 will translate into eq eqDInt (mul numDInt 3 4) 12.

Subclasses

So far we’ve treated Num and Eq as completely different classes. But it makes sense that all numerical types should also be comparable, while all comparable types might not be numerical.

We can make Num a subclass of Eq:

class Eq a => Num a where
  (+)    :: a -> a -> a
  (*)    :: a -> a -> a
  negate :: a -> a

This asserts that a may belong to class Num only if it also belongs to Eq, making Num a subclass of Eq. All other class and instance declarations remain the same. Things magically just work.

Now we can write functions like this:

memsq :: Num a => [a] -> a -> Bool
memsq xs x = member xs (square x)

Because Eq is implied by Num, we didn’t have to mention it in the type. Neat.

A nice consequence of dictionary-based translation is also that we can define as many super- and subclasses as we want and it doesn’t confuse the compiler in the least. This is a great advantage from object-oriented languages where having many superclasses usually poses implementation problems.

Conclusion

Now you know how type classes work in Haskell. They introduce a lot of neat things that help us write more expressive code while, naturally, not increasing the power of the language.

The only issue with type classes is that they introduce extra parameters to be passed around at run-time (the dictionaries), but that’s not too bad.

The rest of the paper deals with formalising this intuitive definition of type classes using lambda calculus. But I’m not going to include that in my summary, it’s too mathsy and doesn’t add much to understanding what’s going on. At least it didn’t for me.

That said, I finally understand how Haskell’s type system works. Now if only I could find more excuses to actually use Haskell.

Enhanced by Zemanta