phantom types tutorial - Haskell

Welcome to the Functional Programming Zulip Chat Archive. You can join the chat here.

James King

If anyone has time/expertise to review a tutorial on phantom types I'm working on, it would be greatly appreciated:

{-
module: Phantom.hs
description: A phantom types tutorial
author: James King <[email protected]>

Here's a common scenario:

There's a web service you need to integrate with.  It uses different
credentials for the "staging" sandbox environment and the production
one.

It would be bad to use the wrong set of credentials for the wrong
environment.

This module is a tutorial on how to use "phantom types."  This pattern
can ensure that users of our module cannot make the obvious mistake of
mixing up different kinds of credentials meant for different
environments.
-}
module Phantom where

-- | This is the type for staging credentials
data Staging
-- | And this one is for production.
data Production

-- | These two kinds of environments can then be used as values for
-- the type parameter, @kind@.
data Credential kind
  = Credential
  { name  :: String
  , token :: String
  }
  deriving (Eq, Show)

{-
The reason why it's called the "phantom type" pattern is because the
@kind@ type parameter only appears on the left-hand side of the
equals-sign.  It's like a ghost which we'll see if we write a function
that takes @Credential kind@ as a parameter.
-}

type StagingCredential = Credential Staging

-- | This pattern works well with smart-constructors for ensuring that
-- your users create only valid credentials.  Normally this might have
-- a return type like, @Either String StagingCredential@.  For brevity
-- we omit it here.  The important thing is that by using this
-- function to construct values of StagingCredential Haskell will
-- ensure that we can't mix these up with ProductionCredential.
stagingCredential :: String -> String -> StagingCredential
stagingCredential = Credential

-- | It also has the benefit of making our code more descriptive at
-- the type-level.  We can have different functions and behaviours
-- that are specific to StagingCredential without having to also
-- update code relating to the production environment.
stagingRequest :: StagingCredential -> IO ()
stagingRequest _ = putStrLn "Making request to staging server..."

-- | We do the same for our production credentials and have isolated,
-- separated code for the production environment.
type ProductionCredential = Credential Production

productionCredential :: String -> String -> ProductionCredential
productionCredential = Credential

productionRequest :: ProductionCredential -> IO ()
productionRequest _ = putStrLn "Making request to production server..."

-- | Here's a silly example function to demonstrate how our `kind`
-- value disappears like a ghost...
ghostCredential :: Credential kind -> IO ()
ghostCredential creds = do
  -- We can't really talk about the value of `kind` in any of the
  -- terms here... it exists only at the type-level and disappears
  -- here!  Try replacing the body of this function with a type hole
  -- and see what binds you get.  See if you can write something here
  -- that pattern-matches on @kind@ so we can have different
  -- behaviours for the different kinds of credentials.  Once you're
  -- convinced it cannot be done, feel free to move on.
  putStrLn $ "ghostCredential: " ++ show creds

-- | If you were frustrated, that's okay! Maybe we cannot write code
-- that is polymorphic over our credentials... Don't give up!  If you
-- want to be able to write code that is polymorphic over @kind@ you
-- still can!  In order to talk about @kind@ we need to use type
-- classes.
class Request kind where
  executeRequest :: Credential kind -> IO ()

-- | When we give an instance for Request we can fix the type of
-- @kind@.  We have to.  That's what an instance is by definition.  So
-- for this implementation of @executeRequest@ we give the one we
-- defined above for staging credentials.
instance Request Staging where
  executeRequest = stagingRequest

-- | And the same for Production.
instance Request Production where
  executeRequest = productionRequest

example :: IO ()
example = do
  -- And now, using our smart constructors we can create credentials
  -- for the respective environments...
  let staging = stagingCredential "staging" "foo-token"
      production = productionCredential "production" "bar-token"

  -- And we can dispatch to  the correct implementation using the type
  -- class function!
  executeRequest staging
  executeRequest production

{-
When to use this pattern:

  - When you have a type that can be used in many contexts
  - And using the type in the wrong context would be a programming error

It's also useful in situations where you might use a sum type but the
constructors of that sum type would have no semantic meaning other
than expressing choice.  For example:
-}

-- | Instead of having our phantom type parameter here we have a
-- plain, unadorned record.
data SumCredential
  = SumCredential
  { sumName  :: String
  , sumToken :: String
  }
  deriving (Eq, Show)

-- | And we use a sum type to add the distinct contexts to our
-- unadorned type.  The constructors of our sum type here stand in as
-- the phantom type parameter instead.  The value contained in each
-- case is the same.
data SumCredentialKind
  = SumStaging SumCredential
  | SumProduction SumCredential
  deriving (Eq, Show)

-- | This enables us to write our request function using pattern
-- matching.  This seems straight-forward: we only use functions and
-- data structures and don't need to use type classes and type
-- parameters!
sumExecuteRequest :: SumCredentialKind -> IO ()
sumExecuteRequest (SumStaging _) = putStrLn "Making sum request to staging..."
sumExecuteRequest (SumProduction _) = putStrLn "Making sum request to prod..."

sumExample :: IO ()
sumExample = do
  let staging = SumStaging $ SumCredential "staging" "staging-token"
      production = SumProduction $ SumCredential "production" "production-token"
  sumExecuteRequest staging
  sumExecuteRequest production

-- | But we lose the ability to distinguish the kinds of credentials
-- at the type level.  Which has the effect of forcing our code to
-- couple together under pattern matching at the term-level.  Using a
-- smart constructor like this doesn't give us any type hints that
-- this produces a credential for a staging environment. In order to
-- find out what kind of credential our user is holding they'll have
-- to inspect it!
sumStagingCredential :: String -> String -> SumCredentialKind
sumStagingCredential sname stoken = SumStaging $ SumCredential sname stoken

modifySumStagingRequest :: SumCredentialKind -> SumCredentialKind
modifySumStagingRequest (SumStaging creds) =
  if sumName creds == "specialCreds"
  then SumStaging creds { sumToken = "special-token" }
  else SumStaging creds

-- | We have to handle every case even if we're only interested in one
-- unless we ignore exhaustive pattern matching... which is not what
-- we want to do.
modifySumStagingRequest creds = creds

-- | Contrast that here where we have the same behaviour but no need
-- to mention the alternative case.
modifyStagingCredential :: StagingCredential -> StagingCredential
modifyStagingCredential creds =
  if name creds == "specialCreds"
  then creds { token = "special-token" }
  else creds

{-
There are many use-cases for this pattern.  You could have a
data-structure with user-submitted input that uses a phantom type
parameter to check if the data has been validated.  That way code can
be guaranteed by the type system to not handle unrecognized inputs.
I'm sure once you get a little more experience with this pattern you
will see more.

What I like about this pattern is how it can separate concerns and
keep code decoupled.  Using type constructors for control flow is a
code smell in my opinion and an opportunity to enrich your code with
stronger, more flexible types.

This pattern works well for the scenarios listed earlier in this
tutorial.  If you find that your application is requiring data in
order to make a decision about which method to dispatch to then it's
not going to be a good fit.  It only really works well if your
sum-types are merely being used to pattern match to different
implementations.

When you see that code-smell you can turn those constructors into
types and create a phantom type parameter to the type they formerly
contained as constructors:
-}

data Foo = Foo String Int

-- | Lift the constructors up...
data Bar
  = Baz Foo
  | Quz Foo

data Baz'
data Quz'

-- | And add a phantom type parameter to Foo!
data Foo' a = Foo' String Int

-- | Aliases actually make code more clear...
type FooBaz = Foo' Baz'
type FooQuz = Foo' Quz'

{-
Happy hacking!
-}
TheMatten

Try replacing the body of this function with a type hole
and see what binds you get. See if you can write something here
that pattern-matches on @kind@ so we can have different
behaviours for the different kinds of credentials. Once you're
convinced it cannot be done, feel free to move on.

Not sure if the example is going to work well in this case - you could just as well supply some opaque GADT that actually makes use of the type and user wouldn't be able to tell a difference
Maybe what you can show is one can easily "repack" a credential into one with different type without changing it's content:

repack :: forall a b . Credential a -> Credential b
repack (Credential n t) = Credential n t
TheMatten

BTW, if I remember right, record updates can be type-changing - so creds { token = "special-token" } would actually have this repacking behaviour, which probably isn't what you want

James King

Thanks a bunch! That helps a lot.

TheMatten

No problem, otherwise it's a nice resource :smile:

Torsten Schmits

you can also add this:

data Env = Prod | Staging

data Creds (env :: Env)

if you want to be expressive with the type param (and close it)

Gabriel Lebec

Small nit: I object to using the param name kind as it is a type variable, not a kind variable. It happens to be fillable by a type of any kind (like Maybe :: Type -> Type or Int :: Type), but not a kind (like Type -> Type or Type) per se.

James King

Gabriel Lebec said:

Small nit: I object to using the param name kind as it is a type variable, not a kind variable. It happens to be fillable by a type of any kind (like Maybe :: Type -> Type or Int :: Type), but not a kind (like Type -> Type or Type) per se.

A good point, I felt it read a little odd too. Thanks for the suggestion. :)