[This post is part of an ongoing challenge to understand 52 papers in 52 weeks. You can read previous entries, here, or subscribe to be notified of new posts by email]
The Hindley-Milner type system is one of the more impressive things in computer science. Global type inference that can figure out the general type of a whole program without a single type annotation anywhere.
I've only ever used it in Haskell and let me tell ya, when I get confused, I delete my type annotations and let the compiler tell me what the hell I'm doing. It's usually right.
But while Hindley-Milner can do parametric polymorphism natively, it needed some work to support ad-hoc polymorphism and become what Haskell's got today. In their 1988 paper, How to make ad-hoc polymorphism less ad hoc Wadler and Blott of the Haskell committee explain how to do just that by introducing type classes.
Type classes are the biggest extension Haskell adds to Hindley-Milner, which makes it a more practical language than its predecessors Miranda and ML. But no more powerful, of course.
Polymorphism
Polymorphism lets us define functions that can act on arguments of different types. Most obvious with operators where writing 1+4
works just as well as writing 1.3+3.14
, you don't have to use addInt
or addFloat
. The compiler handles that for you.
Strachey defined two types of polymorphism - ad-hoc and parametric.
Ad-hoc polymorphism occurs when a function behaves differently for different types, sometimes with completely heterogeneous implementations. Operator overloading is a common example of ad-hoc polymorphism
Parametric polymorphism occurs when a function behaves the same for different data types. length
is a good example, because it doesn't care what type of list it's counting. You can implement a general length
function to behave the same for any list type.
This paper expands Hindley-Milner's parametric polymorphism, with type classes to introduce ad-hoc polymorphism. Because the paper shows how to translate between type classes and pure HM, the authors claim any language using HM typing could potentially be retrofitted with type classes via a preprocessor.
Limitations of ad-hoc polymorphism
The easiest places to look at issues arising from ad-hoc polymorphism are arithmetic operator overloading and equality.
Standard ML takes the simplest approach to operator overloading - arithmetic operators are overloaded, but functions that use them are not. This means while you can write 3*3
or 3.14*3.14
, you cannot define a square function as square x = x*x
and later use terms like square 3
or square 3.14
.
You could solve this with an overloaded square
function, using implementations of type Int -> Int
and Float -> Float
. This becomes unwieldy when you want to have a function squares
that returns a tuple of three squared numbers. You'd need eight different implementations!
Generally speaking, overloaded functions grow exponentially with the number of arguments. Not good.
Equality doesn't fare much better. If you treat it as overloaded, like Standard ML used to, you can use terms such as 3*4 == 12
, but you cannot define functions based on equality. For instance, a function member
that tells you whether something is in a list or not won't have a defined type.
Miranda takes a slightly better approach in that it treats equality as fully polymorphic. Its type is then (==) :: a -> a -> Bool
, but this forces the environment to perform run-time checks on the representation of abstract types.
Some might consider this a bug. Having to look inside an abstraction to decide its type definitely smells funny.
More recent versions of Standard ML take the approach of making equality polymorphic in a limited fashion using something called eqtype variables. This means that type clashes are correctly returned as type errors, but still poses some limitations on the run-time implementation.
Finally, object-oriented programming introduces the idea that users can define their own types. Getting these to support equality means having to force each object to carry with it a pointer to an equality function for that specific type. A dictionary of appropriate equality functions (to compare with different types) is even better.
But a lot of those dictionaries will look exactly the same so we might as well pass them around separately from objects. This is the intuition behind type classes.
Type classes
Let's say we want to overload (+)
, (*)
, and negate
on Int
and Float
. We can do this by introducing a type class called Num
that says "a type a
belongs to Num
if (+)
, `(), and
negate` in appropriate types are defined on it"*.
Now we can define type instances such as Num Int
and specify which functions to translate the overloaded symbols into. We assume things like addInt
and mulInt
are defined by default.
class Num a where (+), (*) :: a -> a -> a negate :: a -> a
instance Num Int where (+) = addInt (*) = mulInt negate = negInt
instance Num Float where (+) = addFloat (*) = mulFloat negate = negFloat
This lets us define both the square
and squares
functions from before, but with a well-defined type at compile time.
square :: Num a => a -> a
square x = x*x
squares :: Num a, Num b, Num c => (a,b,c) -> (a,b,c)
squares (x, y, z) = (square x, square y, square z)
square
is of type a -> a
and the compiler will be able to resolve both square 3
and square 3.14
into their appropriate types. Similarly, squares
no longer needs eight types, just one - (a,b,c) -> (a,b,c)
.
As expected, a call such as square 'c'
will produce a type error because there is no Char
instance of the Num
type class.
Translating to Hindley-Milner
A compiler can use our class and instance definitions to create dictionaries holding pointers to correct methods. For Num
we introduce NumD
as a type constructor for a new type whose values are created using NumDict
. Functions add
, mul
, and neg
take a value of type NumD
and return its first, second, or third component.
data NumD a = NumDict (a -> a -> a) (a -> a -> a) (a -> a)
add (NumDict a m n) = a
mul (NumDict a m n) = m
neg (NumDict a m n) = n
numDInt :: NumD Int
numDInt = NumDict addInt muLInt negInt
numDFLoat :: NumD Float
numDFloat = NumDict addFloat mulFloat negFloat
To use NumD
a compiler would simply replace all instances of Num
with their respective dictionary values, as identified by the type. For instance, x+y
translates into add numD x y
.
add numD
returns the correct addInt
or addFloat
function as identified by the type of x
and y
, then applies said function on the arguments. It's pretty nifty.
Our square
example becomes square'
:
square' :: NumD a -> a -> a
square' numD x = mul numD x x
Which means that a call such as square 3
will translate into square' numDInt 3
and square 3.2
into square' numDFloat 3
.
A similar conversion works for squares
, just with more characters involved.
Type classes and equality
When applied to equality, type classes don't differ much from Standard ML's eqtype variables. But they allow the compiler to decide types at compile-time rather than run-time and a user can easily extend new classes to support abstract types.
The definition is similar to how we defined Num
earlier - we'll make a type class called Eq
and define instances for Int
and Char
. We'll also define a member
function, which was giving us trouble earlier.
class Eq a where
(==) :: a -> a -> bool
instance Eq Int where
(==) = eqInt
instance Eq Char where
(==) = eqChar
member :: Eq a => [a] -> a -> Bool
member [] y = False
member (x:xs) y = (x == y) \/ member xs y
As you can imagine we can now write terms such as 5 == 4
, 'a' == 'b'
, and member "Haskell" 'k'
or member [1,2,3] 2
. The compiler can infer the correct type each time and using member
on a type that doesn't have an Eq
instance will produce a type error.
But what's really cool is that we can define equality between lists and tuples. Even crazier things - sets, random data types we define ourselves, anything really.
instance Eq a, Eq b => Eq (a,b) where
(u,v) == (x,y) = (u == x) & (v == y)
instance Eq a => Eq [a] where
[] == [] = True
[] == y:ys = False
x:xs == [] = False
x:xs == y:ys = (x == y) & (xs == ys)
Essentially "two tuples are equal if their members are equal" and "lists are equal if they are both empty, or their heads and tails are equal".
Now we can write terms such as "Haskell" == "Curry"
and even member ["Haskell", "Alonzo"] "Moses"
.
The compiler figures this out in much the same way as before - using dictionaries. I'm not going to type it all out but, for instance, integers will have a corresponding eqDInt
function, characters will have an eqDChar
function and so on.
A term such as 3*4 == 12
will translate into eq eqDInt (mul numDInt 3 4) 12
.
Subclasses
So far we've treated Num
and Eq
as completely different classes. But it makes sense that all numerical types should also be comparable, while all comparable types might not be numerical.
We can make Num
a subclass of Eq
:
class Eq a => Num a where
(+) :: a -> a -> a
(*) :: a -> a -> a
negate :: a -> a
This asserts that a
may belong to class Num
only if it also belongs to Eq
, making Num
a subclass of Eq
. All other class and instance declarations remain the same. Things magically just work.
Now we can write functions like this:
memsq :: Num a => [a] -> a -> Bool
memsq xs x = member xs (square x)
Because Eq
is implied by Num
, we didn't have to mention it in the type. Neat.
A nice consequence of dictionary-based translation is also that we can define as many super- and subclasses as we want and it doesn't confuse the compiler in the least. This is a great advantage from object-oriented languages where having many superclasses usually poses implementation problems.
Conclusion
Now you know how type classes work in Haskell. They introduce a lot of neat things that help us write more expressive code while, naturally, not increasing the power of the language.
The only issue with type classes is that they introduce extra parameters to be passed around at run-time (the dictionaries), but that's not too bad.
The rest of the paper deals with formalising this intuitive definition of type classes using lambda calculus. But I'm not going to include that in my summary, it's too mathsy and doesn't add much to understanding what's going on. At least it didn't for me.
That said, I finally understand how Haskell's type system works. Now if only I could find more excuses to actually use Haskell.
Continue reading about Week 20: Making ad-hoc polymorphism less ad hoc
Semantically similar articles hand-picked by GPT-4
- Learning me a Haskell
- Deca - a cool systems programming language
- The birth of LISP - a summary of John McCarthy's original paper
- Cool thing Thursday: Ceylon
- Week 1: Turing's On computable numbers
Learned something new?
Read more Software Engineering Lessons from Production
I write articles with real insight into the career and skills of a modern software engineer. "Raw and honest from the heart!" as one reader described them. Fueled by lessons learned over 20 years of building production code for side-projects, small businesses, and hyper growth startups. Both successful and not.
Subscribe below 👇
Software Engineering Lessons from Production
Join Swizec's Newsletter and get insightful emails 💌 on mindsets, tactics, and technical skills for your career. Real lessons from building production software. No bullshit.
"Man, love your simple writing! Yours is the only newsletter I open and only blog that I give a fuck to read & scroll till the end. And wow always take away lessons with me. Inspiring! And very relatable. 👌"
Have a burning question that you think I can answer? Hit me up on twitter and I'll do my best.
Who am I and who do I help? I'm Swizec Teller and I turn coders into engineers with "Raw and honest from the heart!" writing. No bullshit. Real insights into the career and skills of a modern software engineer.
Want to become a true senior engineer? Take ownership, have autonomy, and be a force multiplier on your team. The Senior Engineer Mindset ebook can help 👉 swizec.com/senior-mindset. These are the shifts in mindset that unlocked my career.
Curious about Serverless and the modern backend? Check out Serverless Handbook, for frontend engineers 👉 ServerlessHandbook.dev
Want to Stop copy pasting D3 examples and create data visualizations of your own? Learn how to build scalable dataviz React components your whole team can understand with React for Data Visualization
Want to get my best emails on JavaScript, React, Serverless, Fullstack Web, or Indie Hacking? Check out swizec.com/collections
Did someone amazing share this letter with you? Wonderful! You can sign up for my weekly letters for software engineers on their path to greatness, here: swizec.com/blog
Want to brush up on your modern JavaScript syntax? Check out my interactive cheatsheet: es6cheatsheet.com
By the way, just in case no one has told you it yet today: I love and appreciate you for who you are ❤️