Understanding Existential Types

Posted on March 31, 2017

Existential types are a less well-known extension to Haskell’s type system and Damas-Hindley-Milner type theory, although they are a logical consequence of the identification of type theory with predicate logic: If type variables can be universally quantified then it seems logical they can also be existentially quantified. I have used existential types in Haskell on several occasions but my brain always struggled to really understand what was going on and how to fix the dreaded error messages I got back from the compiler.

While working on my talk on type-driven development I used existentials to model one of the evolution step of the quizz program I was working on and had some sort of epiphany. This short post is an attempt to capture the insights I gathered in the hope it might be useful for other people.

The Problem

The problem is pretty simple and quite common. A quizz is made up from different types of questions which are part of a quizz, say open questions, multiple choice questions, rating questions. Each question expects a different type of answer: A free text for open questions, selected option for MCQ, a number for rating questions, but the answer is always initially given by the user as a text which must be converted to the right form.

So we have the following (abridged) types of questions:

data OpenQuestion 
data MCQuestion
data RatingQuestion

We expose the API of questions as a typeclass in order to be future-proof (we would like to make it easy to introduce new types of questions):

class Questionable q where
    type Answer q :: *
    -- other methods omitted....
    isCorrectAnswer :: Answer q -> q -> Bool

and then we need to define a Quizz as a sequence of questions and some functions to compute the answers to the questions and assess the user’s result. The User is simply modelled as a function that provides an answer (or no answer) as a string, given any question.

type User = Text -> Maybe Text

data Quizz = Quizz { questions :: [ Question ] }

answerQuestion :: User -> Question -> Quizz -> Quizz
answerQuestion user question quizz = ...

answers :: User -> Quizz -> Quizz
answers user quizz =
 foldr (answerQuestion user) quizz (questions quizz)

The key issue is then to define Question in such a way that it allows us to work with current (and future) question’s types in a uniform way.

Using Existential type

A very simple solution would be to wrap each possible type in a specialized constructor, which in essence amounts to tag each possible Question with its type:

data Question = 
    MCQ MCQuestion
  | Open OpenQuestion
  | Rating RatingQuestion

This solution, while simple to use as it is amenable to direct pattern-matching, suffers from an obvious drawback: It closes the range of available types of questions, or rather makes it difficult to add a new one.

Given that each type of question is assumed to be an instance of the Questionnable interface, we can wrap them using existential quantification in the type Question, using one of the two available syntaxes. The classic data type declaration syntax:

data Question =
  forall q . Questionable q => Question q

or the GADT-style syntax which explicitly exposes the constructor Question as a function

data Question where
  Question :: Questionable q => q -> Question

The effect of an existential type construction is to limit the scope of the type variable q appearing in the constructor, thus in effect ensuring the question itself, whatever its type, must stay within the scope of its appearance. For example, while it is possible to pattern match on Question to get whatever q is packed inside it, this q cannot be returned. Writing

getQ (Question question _) = question

yields the following, somewhat dreadful, compiler error:

error:
    • Couldn't match expected type ‘t’ with actual type ‘q’
        because type variable ‘q’ would escape its scope
      This (rigid, skolem) type variable is bound by
        a pattern with constructor:
          Question :: forall q.
                      Questionable q =>
                      q -> Question,
        in an equation for ‘getQ’
        at /Users/arnaud/projects/xxi-typed/haskell/.stack-work/intero/intero2342CZJ.hs:11:7-18
    • In the expression: question
      In an equation for ‘getQ’: getQ (Question question) = question
    • Relevant bindings include
        question :: q
          (bound at /Users/arnaud/projects/xxi-typed/haskell/.stack-work/intero/intero2342CZJ.hs:11:16)
        getQ :: Question -> t
          (bound at /Users/arnaud/projects/xxi-typed/haskell/.stack-work/intero/intero2342CZJ.hs:11:1)

This is so because the type of the result question being question :: q implies that the type variable q becomes free in any context where getQ is used: It escapes the scope of the constructor.

Interestingly, the ScopedTypeVariables GHC extension gives us the capability to use that q. We can write the following (somewhat contrived) function:

slug (Question (quest :: q) _) = slugify quest
  where
    slugify :: q -> Text
    slugify = T.take 3 . question

In the type of slugify we are not forced to expose the constraint Questionable q because the type variable q at this point is the one brought in scope with the expression q :: q.

Understanding Existential Types

What’s somewhat confusing for the average programmer who has not had a PhD in type theory is: Why are those types called existential? Especially given the fact they are introduced by forall keyword? This StackOverflow does a great job at explaining how (intuitionistic) logic rules relate \(\forall\) and \(\exists\) quantifiers in the case of type constructors.

In short, it comes from the fact that \[ \forall x. Q(x) \implies P = (\exists x. Q(x)) \implies P. \]

References

There are already a number of resources on the topic: