Welcome to Counterexamples in Type Systems, a compendium of horrible
programs that crash, segfault or otherwise explode.
The “counterexamples” here are programs that go wrong in ways that
should be impossible: corrupt memory in Rust, produce a
ClassCastException
in cast-free Java, segfault in Haskell, and so
on. This book is a collection of such counterexamples, each with some
explanation of what went wrong and references to the languages or
systems in which the problem occurred.
It’s intended as a resource for researchers, designers and
implementors of static type systems, as well as programmers interested
in how type systems fit together (or don’t).
Type systems
For the purposes of this book, a “type system” is a system that does
local checks of a program’s source code, in order to establish some
global property about the program’s behaviour. The local checks
usually assign a type to each variable and verify that they are used
only in contexts appropriate to their type, while the global property
varies but is generally some sort of safety property: the absence of
memory corruption, use-after-free errors, and the like.
This is intentionally a fairly narrow definition. Some examples of
things that are not “type systems”:
-
Linters: Linters do local checks of a programs’s source code,
but don’t try to establish any global property. Just because a
program passes a linter’s checks does not imply anything definite
about its behaviour. -
Python’s type system: Dynamic languages like Python do local
type checks to establish global safety properties. However, the
type checks are not done purely on the program’s source code
before running it, but examine the actual runtime inputs as well. -
C++’s type system: C++ (among other unsafe languages) has a
type system that does many local checks of a programs’s source
code. However, passing these checks does not establish anything
definite about a program’s behaviour: many C++ programs pass all
of the compiler’s type checking, and yet exhibit use-after-free
errors, memory corruption, and arbitrary behaviour when run. -
Some program analysis tools: Some static program analysis
tools do only global checks (requiring the whole program’s source
code to be available at once), while others detect certain issues
but do not attempt to establish any global property when no issues
are found. Still others do fit the definition of “type system”,
though: the line here is blurry.
This is not to say that there’s something wrong with the above tools,
but they’re not what this book is about. Some examples of things that
do fit the definition are the type systems of languages like Java,
Haskell and Rust.
Soundness and counterexamples
Type systems make the claim that any program that passes the local
type checks will definitely have the global safety properties. A type
system is sound if this claim is true, and to say that a system is
unsound is to say that there exists some program which passes all
type checks, and yet whose behaviour violates the global safety
properties (corrupts memory, crashes unexpectedly, etc.). A
counterexample is such a program, exhibiting a soundness bug.
Different languages enforce different properties, so exactly what
“soundness” means varies somewhat. For instance, a global property
established by the Coq type system is that all programs halt. So, a
program that looped infinitely would be a counterexample in Coq, but
not in Java, which makes no such claim.
The goal of this book is to collect such counterexamples, especially
those that crop up in similar forms across multiple languages. These
fall into a couple of broad categories:
-
Missing checks: The simplest sort of soundness bug is when some
important type check is missing from the type checker. Usually,
these are straightforward bugs, easily fixed once discovered,
highly specific to a particular compiler, and not very interesting.
However, certain missing checks are common mistakes that keep
coming up across many different type checkers and languages, and a
few of these are included here: for instance, missing variance
checks or scope checks. -
Feature interactions: Often, two type system features which are
perfectly sound when used in isolation have some tricky
interaction when used together. Examples of this include mixtures
of polymorphism and mutability or
polymorphism and overloading or
intersections and mutability. -
Subtle differences: If two parts of a type system use different
but similar versions of the same concept, the gap between them can
often cause soundness bugs. For instance, does “private” mean
private to an object or private to a class?
Does “unequal” mean “not provably equal” or “provably
distinct”?
Organisation (or lack thereof)
Since so many of the counterexamples here depend on interactions
between disparate parts of a complex type system, it would be tricky
to separate them all into distinct coherent topics, and no attempt to
do so has been made.
Instead, each entry is tagged with the type system features involved,
and in lieu of a table of contents there is an index and
glossary listing the features and the counterexamples in
which they appear.
The entries themselves are mostly independent, and it should be
possible to read them in any order. However, a few of them do refer to
previous entries, and some attempt was made to put the simpler entries
earlier, so if you do intend to read the whole thing then going start
to finish is best.
A note on sources
While much type system research appears in formal academic
publications, it is relatively rare (but not unheard of!) to publish a
paper about an unsound system.
So, as well as academic papers, many of the counterexamples here are
drawn from unpublished notes, the archives of the
TYPES
forum, blogs by programming language researchers, designers and
implementors, and language-specific bugtrackers, forums and mailing
lists. A major reason for the existence of this book is to collect
these various sources together.
In particular, compiler bugtrackers are a major source of the
counterexamples reproduced here. Try not to read too much into how
often different bugtrackers show up: while it is tempting to assume
that the more often a compiler appears, the buggier it is, this would
be wrong for several reasons:
-
the author works professionally on OCaml, so it is somewhat
over-represented due to familiarity, -
the same applies to other languages that the author likes or at
least vaguely understands, and -
languages whose community does a good job of publicly tracking,
explaining, and writing up issues that arise are also
over-represented.