The Hindley-Milner type system is one of the more impressive things in computer science. Global type inference that can figure out the general type of a whole program without a single type annotation anywhere.
I’ve only ever used it in Haskell and let me tell ya, when I get confused, I delete my type annotations and let the compiler tell me what the hell I’m doing. It’s usually right.
But while Hindley-Milner can do parametric polymorphism natively, it needed some work to support ad-hoc polymorphism and become what Haskell’s got today. In their 1988 paper, How to make ad-hoc polymorphism less ad hoc, Wadler and Blott of the Haskell committee explain how to do just that by introducing type classes.
Type classes are the biggest extension Haskell adds to Hindley-Milner, which makes it a more practical language than its predecessors Miranda and ML. But no more powerful, of course.
Polymorphism lets us define functions that can act on arguments of different types. This is most obvious with operators, where writing 1+4 works just as well as writing 1.3+3.14; you don’t have to use addFloat. The compiler handles that for you.
Strachey defined two types of polymorphism – ad-hoc and parametric.
Ad-hoc polymorphism occurs when a function behaves differently for different types, sometimes with completely heterogeneous implementations. Operator overloading is a common example of ad-hoc polymorphism.
Parametric polymorphism occurs when a function behaves the same for different data types.
length is a good example, because it doesn’t care what type of list it’s counting. You can implement a general length function to behave the same for any list type.
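As a minimal sketch of the parametric case, here is a hand-rolled length. The names len and lenExamples are mine, picked to avoid clashing with the Prelude; they are not from the paper.

```haskell
-- Parametric polymorphism: one definition, identical behaviour for every
-- element type. `len` is a hypothetical name chosen to avoid the Prelude's
-- `length`.
len :: [a] -> Int
len []     = 0
len (_:xs) = 1 + len xs

-- The same code counts numbers, characters, or nested lists.
lenExamples :: (Int, Int, Int)
lenExamples = (len [1, 2, 3 :: Int], len "Haskell", len [[True], []])
```

Nothing in len ever inspects an element, which is exactly why a single implementation is enough.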
This paper extends Hindley-Milner’s parametric polymorphism with type classes to introduce ad-hoc polymorphism. Because the paper shows how to translate between type classes and pure HM, the authors claim any language using HM typing could potentially be retrofitted with type classes via a preprocessor.
Limitations of ad-hoc polymorphism
The easiest places to look at issues arising from ad-hoc polymorphism are arithmetic operator overloading and equality.
Standard ML takes the simplest approach to operator overloading: arithmetic operators are overloaded, but functions that use them are not. This means that while you can write 3.14*3.14, you cannot define a square function as square x = x*x and later use terms like square 3 or square 3.14.
You could solve this with an overloaded square function, using implementations of type Int -> Int and Float -> Float. This becomes unwieldy when you want a function squares that returns a tuple of three squared numbers. You’d need eight different implementations!
Generally speaking, the number of implementations an overloaded function needs grows exponentially with the number of arguments. Not good.
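To see where the eight comes from, here is a small sketch that enumerates the signatures a monomorphic squares would need. All names in it (baseTypes, combinations, signature, squaresSignatures) are made up for this illustration.

```haskell
import Data.List (intercalate)

-- Every slot of the triple can independently be Int or Float, so a
-- monomorphic `squares` needs one implementation per combination.
baseTypes :: [String]
baseTypes = ["Int", "Float"]

-- All ways to pick one base type for each of n tuple components.
combinations :: Int -> [[String]]
combinations n = sequence (replicate n baseTypes)

-- Render one combination as the signature of a monomorphic `squares`.
signature :: [String] -> String
signature ts = tuple ++ " -> " ++ tuple
  where
    tuple = "(" ++ intercalate "," ts ++ ")"

-- The signatures needed for a three-component tuple.
squaresSignatures :: [String]
squaresSignatures = map signature (combinations 3)
```

With two base types and three components that is 2^3 = 8 signatures, and every extra component doubles the count.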
Equality doesn’t fare much better. If you treat it as overloaded, like Standard ML used to, you can use terms such as 3*4 == 12, but you cannot define functions based on equality. For instance, a function member that tells you whether something is in a list or not won’t have a defined type.
Miranda takes a slightly better approach in that it treats equality as fully polymorphic. Its type is then (==) :: a -> a -> Bool, but this forces the environment to perform run-time checks on the representation of abstract types.
Some might consider this a bug. Having to look inside an abstraction to compare its values definitely smells funny.
More recent versions of Standard ML take the approach of making equality polymorphic in a limited fashion using something called eqtype variables. This means that type clashes are correctly reported as type errors, but it still imposes some limitations on the run-time implementation.
Finally, object-oriented programming introduces the idea that users can define their own types. Getting these to support equality means having to force each object to carry with it a pointer to an equality function for that specific type. A dictionary of appropriate equality functions (to compare with different types) is even better.
But a lot of those dictionaries will look exactly the same so we might as well pass them around separately from objects. This is the intuition behind type classes.
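Here is a rough sketch of that intuition, contrasting values that carry their own equality function with an equality function passed separately. Obj, payload, eqFn, sameObj, mkInt, and memberWith are hypothetical names for this illustration only, and the Prelude’s (==) stands in for a primitive integer comparison.

```haskell
-- Object style: each value drags along the equality function for its type.
-- `Obj`, `payload`, and `eqFn` are hypothetical names for this sketch.
data Obj a = Obj { payload :: a, eqFn :: a -> a -> Bool }

-- Comparing two objects uses the function stored in the left one.
sameObj :: Obj a -> Obj a -> Bool
sameObj l r = eqFn l (payload l) (payload r)

-- Every Int object repeats the very same function pointer.
mkInt :: Int -> Obj Int
mkInt n = Obj n (==)

-- Separated style: the equality function travels once, next to the data.
memberWith :: (a -> a -> Bool) -> [a] -> a -> Bool
memberWith _  []     _ = False
memberWith eq (x:xs) y = eq x y || memberWith eq xs y
```

In the second style a list of a million integers needs one equality function rather than a million copies of the same pointer.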
Let’s say we want to overload (+), (*), and negate on the types Int and Float. We can do this by introducing a type class called Num that says “a type a belongs to Num if functions named (+), (*), and negate, of the appropriate types, are defined on it”.
Now we can define type instances such as Num Int and specify which functions to translate the overloaded symbols into. We assume things like mulInt are defined by default.
    class Num a where
      (+), (*) :: a -> a -> a
      negate   :: a -> a

    instance Num Int where
      (+)    = addInt
      (*)    = mulInt
      negate = negInt

    instance Num Float where
      (+)    = addFloat
      (*)    = mulFloat
      negate = negFloat
This lets us define both the square and squares functions from before, but with a well-defined type at compile time.
    square :: Num a => a -> a
    square x = x*x

    squares :: (Num a, Num b, Num c) => (a,b,c) -> (a,b,c)
    squares (x, y, z) = (square x, square y, square z)
square is of type Num a => a -> a, and the compiler will be able to resolve both square 3 and square 3.14 into their appropriate types. Similarly, squares no longer needs eight types, just one: (a,b,c) -> (a,b,c), with a Num constraint on each of a, b, and c.
As expected, a call such as square 'c' will produce a type error because there is no Char instance of the Num type class.
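The definitions above reuse names that GHC’s Prelude already owns, so they won’t load as written. Here is a sketch of the same idea that should compile on its own, assuming a stand-in class Num' with methods add', mul', and neg' (all hypothetical names) and using the Prelude’s arithmetic in place of primitives like addInt.

```haskell
-- A stand-in for the post's Num class. The primed names are hypothetical,
-- chosen so the sketch does not clash with the Prelude's Num, (+), and (*).
class Num' a where
  add', mul' :: a -> a -> a
  neg'       :: a -> a

-- The post assumes primitives such as addInt; here the Prelude's
-- arithmetic plays that role.
instance Num' Int where
  add' = (+)
  mul' = (*)
  neg' = negate

instance Num' Float where
  add' = (+)
  mul' = (*)
  neg' = negate

-- One definition, usable at every type that has a Num' instance.
square :: Num' a => a -> a
square x = mul' x x

squares :: (Num' a, Num' b, Num' c) => (a, b, c) -> (a, b, c)
squares (x, y, z) = (square x, square y, square z)
```

Calls need a type annotation on literals, for example square (3 :: Int) or squares (2 :: Int, 1.5 :: Float, 4 :: Int), because GHC’s literal defaulting does not apply to user-defined classes. As in the post, square 'c' is rejected at compile time since there is no Num' Char instance.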
Translating to Hindley-Milner
A compiler can use our class and instance definitions to create dictionaries holding pointers to the correct methods. For Num we introduce NumD as a type constructor for a new type whose values are created using NumDict. The functions add, mul, and neg take a value of type NumD and return its first, second, or third component.
    data NumD a = NumDict (a -> a -> a) (a -> a -> a) (a -> a)

    add (NumDict a m n) = a
    mul (NumDict a m n) = m
    neg (NumDict a m n) = n

    numDInt :: NumD Int
    numDInt = NumDict addInt mulInt negInt

    numDFloat :: NumD Float
    numDFloat = NumDict addFloat mulFloat negFloat
Given NumD, a compiler would simply replace every use of a Num method with a lookup in the corresponding dictionary value, as identified by the type. For instance, x+y translates into add numD x y. Here add numD returns the correct addInt or addFloat function as identified by the type of x and y, and that function is then applied to the arguments. It’s pretty nifty.
Our square example becomes

    square' :: NumD a -> a -> a
    square' numD x = mul numD x x
Which means that a call such as square 3 will translate into square' numDInt 3 and square 3.2 into square' numDFloat 3.2.
A similar conversion works for squares, just with more characters involved.
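For the curious, here is a self-contained sketch of the whole translation, including the squares case. It assumes the Prelude’s (+), (*), and negate stand in for primitives such as addInt, and it passes the three dictionaries for squares' as a tuple, which is how I read the paper’s version.

```haskell
-- Dictionary type from the post: one field per method of Num.
data NumD a = NumDict (a -> a -> a) (a -> a -> a) (a -> a)

-- Selectors pick a method out of a dictionary.
add, mul :: NumD a -> a -> a -> a
add (NumDict a _ _) = a
mul (NumDict _ m _) = m

neg :: NumD a -> a -> a
neg (NumDict _ _ n) = n

-- The post assumes primitives such as addInt; the Prelude's arithmetic
-- stands in for them here.
numDInt :: NumD Int
numDInt = NumDict (+) (*) negate

numDFloat :: NumD Float
numDFloat = NumDict (+) (*) negate

-- square :: Num a => a -> a becomes a function with one dictionary argument.
square' :: NumD a -> a -> a
square' numD x = mul numD x x

-- squares has three constraints, so its translation takes three dictionaries.
squares' :: (NumD a, NumD b, NumD c) -> (a, b, c) -> (a, b, c)
squares' (da, db, dc) (x, y, z) = (square' da x, square' db y, square' dc z)
```

A call like squares (2, 1.5, 4) at type (Int, Float, Int) would then become squares' (numDInt, numDFloat, numDInt) (2, 1.5, 4), with the compiler choosing each dictionary from the inferred types.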
Type classes and equality
When applied to equality, type classes don’t differ much from Standard ML’s eqtype variables. But they allow the compiler to decide types at compile-time rather than run-time and a user can easily extend new classes to support abstract types.
The definition is similar to how we defined Num earlier: we’ll make a type class called Eq and define instances for Int and Char. We’ll also define a member function, which was giving us trouble earlier.
    class Eq a where
      (==) :: a -> a -> Bool

    instance Eq Int where
      (==) = eqInt

    instance Eq Char where
      (==) = eqChar

    member :: Eq a => [a] -> a -> Bool
    member []     y = False
    member (x:xs) y = (x == y) || member xs y
As you can imagine, we can now write terms such as 5 == 4, 'a' == 'b', and member "Haskell" 'k' or member [1,2,3] 2. The compiler can infer the correct type each time, and using member on a type that doesn’t have an Eq instance will produce a type error.
But what’s really cool is that we can define equality between lists and tuples. Even crazier things – sets, random data types we define ourselves, anything really.
    instance (Eq a, Eq b) => Eq (a,b) where
      (u,v) == (x,y) = (u == x) && (v == y)

    instance Eq a => Eq [a] where
      []     == []     = True
      []     == (y:ys) = False
      (x:xs) == []     = False
      (x:xs) == (y:ys) = (x == y) && (xs == ys)
Essentially “two tuples are equal if their members are equal” and “lists are equal if they are both empty, or their heads and tails are equal”.
Now we can write terms such as "Haskell" == "Curry" and even member ["Haskell", "Alonzo"] "Moses".
The compiler figures this out in much the same way as before, using dictionaries. I’m not going to type it all out but, for instance, integers will have a corresponding eqDInt dictionary, characters will have an eqDChar dictionary, and so on.
A term such as 3*4 == 12 will translate into eq eqDInt (mul numDInt 3 4) 12.
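Here is roughly what typing it all out could look like. This is a sketch modelled on the paper’s translation: the Prelude’s (==) stands in for primitives such as eqInt, and the local helper go is my own name.

```haskell
-- Dictionary for the Eq class: a single equality method.
data EqD a = EqDict (a -> a -> Bool)

-- Selector that pulls the equality method out of a dictionary.
eq :: EqD a -> a -> a -> Bool
eq (EqDict e) = e

-- The Prelude's (==) stands in for primitives such as eqInt and eqChar.
eqDInt :: EqD Int
eqDInt = EqDict (==)

eqDChar :: EqD Char
eqDChar = EqDict (==)

-- instance (Eq a, Eq b) => Eq (a,b): build a pair dictionary from two others.
eqDPair :: (EqD a, EqD b) -> EqD (a, b)
eqDPair (da, db) = EqDict (\(u, v) (x, y) -> eq da u x && eq db v y)

-- instance Eq a => Eq [a]: a function from dictionaries to dictionaries.
eqDList :: EqD a -> EqD [a]
eqDList d = EqDict go
  where
    go []     []     = True
    go (x:xs) (y:ys) = eq d x y && go xs ys
    go _      _      = False

-- member :: Eq a => [a] -> a -> Bool after translation.
member' :: EqD a -> [a] -> a -> Bool
member' _ []     _ = False
member' d (x:xs) y = eq d x y || member' d xs y
```

Instances with a context, like lists and pairs, turn into functions that build a new dictionary out of existing ones. So member ["Haskell", "Alonzo"] "Moses" would become member' (eqDList eqDChar) ["Haskell", "Alonzo"] "Moses".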
So far we’ve treated Num and Eq as completely different classes. But it makes sense that all numerical types should also be comparable, while all comparable types might not be numerical.
We can make Num a subclass of Eq:

    class Eq a => Num a where
      (+)    :: a -> a -> a
      (*)    :: a -> a -> a
      negate :: a -> a
This asserts that a may belong to class Num only if it also belongs to Eq, which makes Num a subclass of Eq. All other class and instance declarations remain the same. Things magically just work.
Now we can write functions like this:
    memsq :: Num a => [a] -> a -> Bool
    memsq xs x = member xs (square x)
Because Eq is implied by Num, we didn’t have to mention it in the type. Neat.
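In dictionary terms, one way to picture the subclass relationship is a Num dictionary that carries its Eq dictionary inside it. The sketch below follows that idea as I understand it from the paper. eqFromNum is the selector that recovers the superclass dictionary, and the Prelude’s operators again stand in for the primitives.

```haskell
data EqD a = EqDict (a -> a -> Bool)

eq :: EqD a -> a -> a -> Bool
eq (EqDict e) = e

-- The Num dictionary now carries its superclass dictionary as a first field.
data NumD a = NumDict (EqD a) (a -> a -> a) (a -> a -> a) (a -> a)

-- Recover the Eq dictionary that every Num dictionary contains.
eqFromNum :: NumD a -> EqD a
eqFromNum (NumDict e _ _ _) = e

mul :: NumD a -> a -> a -> a
mul (NumDict _ _ m _) = m

-- The Prelude's operators stand in for eqInt, addInt, mulInt, and negInt.
numDInt :: NumD Int
numDInt = NumDict (EqDict (==)) (+) (*) negate

member' :: EqD a -> [a] -> a -> Bool
member' _ []     _ = False
member' d (x:xs) y = eq d x y || member' d xs y

square' :: NumD a -> a -> a
square' numD x = mul numD x x

-- memsq needs only the Num dictionary; the Eq one is extracted from it.
memsq' :: NumD a -> [a] -> a -> Bool
memsq' numD xs x = member' (eqFromNum numD) xs (square' numD x)
```

This mirrors the type of memsq: a single Num constraint becomes a single dictionary argument, and equality comes along for free.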
A nice consequence of dictionary-based translation is also that we can define as many super- and subclasses as we want and it doesn’t confuse the compiler in the least. This is a great advantage over object-oriented languages, where having many superclasses usually poses implementation problems.
Now you know how type classes work in Haskell. They introduce a lot of neat things that help us write more expressive code while, naturally, not increasing the power of the language.
The only issue with type classes is that they introduce extra parameters to be passed around at run-time (the dictionaries), but that’s not too bad.
The rest of the paper deals with formalising this intuitive definition of type classes using lambda calculus. But I’m not going to include that in my summary, it’s too mathsy and doesn’t add much to understanding what’s going on. At least it didn’t for me.
That said, I finally understand how Haskell’s type system works. Now if only I could find more excuses to actually use Haskell.