## Log For Haskell: Week 2

Day 1, April 17, 2020: Typeclasses

Typeclasses are among the most powerful features in Haskell. They allow us to define generic interfaces that provide a common feature set over a wide variety of types; defining  a set of functions that can have different implementations depending on the type of data they are given.

Here is a typeclass is defined:

class BasicEq a where
isEqual :: a -> a -> Bool
isEqual x y = not (isNotEqual3 x y)

isNotEqual :: a -> a -> Bool
isNotEqual x y = not (isEqual x y)

and here is how it is used (instantiated):

instance BasicEq Bool where
isEqual True True = True
isEqual False False = True
isEqual _ _ = False

and also

instance BasicEq Color where
isEqual Red Red = True
isEqual Green Green = True
isEqual Blue Blue = True
isEqual _ _ = False

This is like refl in Martin-Löf type theory.  If BasicEq or ==is not implemented for a type, you could say (in the abstract) that that type has no notion of equality since = is uninhabited.)  You can instantiate an existing typeclass for a new type:

instance Show Color where
show Red = "Red"
show Green = "Green"
show Blue = "Blue"

On the Read and Show typeclasses:

You may often have a data structure in memory that you need to store on disk for later retrieval or to send across the network. The process of converting data in memory to a flat series of bits for storage is called serialization. It turns out that read and show make excellent tools for serialization. show produces output that is both human- and machine-readable. Most show output is also syntactically valid Haskell, though it is up to people that write Show instances to make it so.

Example

First write a data structure to disk:

ghci> let d1 = [Just 5, Nothing, Nothing, Just 8]::[Maybe Int]
ghci> putStrLn (show d1)
[Just 5,Nothing,Nothing,Just 8]
ghci> writeFile "test" (show d1)

ghci> input <- readFile "test"
"[Just 5,Nothing,Nothing,Just 8]"
ghci> let d2 = read input
Ambiguous type variable a' in the constraint:
Read a' arising from a use of read' at <interactive>:1:9-18
Probable fix: add a type signature that fixes these type variable(s)

ghci> let d2 = (read input)::[Maybe Int]
ghci> print d1
[Just 5,Nothing,Nothing,Just 8]
ghci> print d2
[Just 5,Nothing,Nothing,Just 8]
ghci> d1 == d2
True

Automatic derivation

data Color = Red | Green | Blue
deriving (Read, Show, Eq, Ord)

Pragmas

{-# LANGUAGE TypeSynonymInstances #-}

Data versus Newtype

data DataInt = D Int
deriving (Eq, Ord, Show)

newtype NewtypeInt = N Int
deriving (Eq, Ord, Show)

When we declare a newtype, we must choose which of the underlying type’s typeclass instances we want to expose. Here, we’ve elected to make NewtypeInt provide Int’s instances for Eq, Ord, and Show. As a result, we can compare and print values of type

NewtypeInt: ghci> N 1 < N 2
True

Since we are not exposing Int’s Num or Integral instances, values of type NewtypeInt are not numbers. For instance, we can’t add them:

ghci> N 313 + N 37
--> error

Further example (hiding implementation):

newtype UniqueID = UniqueID Int
deriving (Eq)

The compiler treats UniqueID as a different type from Int. As a user of UniqueID, we know only that we have a unique identifier; we cannot see that it is implemented as an Int.

Recap: defining types

1. The datakeyword introduces a truly new algebraic data type.
2. The type keyword gives us a synonym to use for an existing type. We can use the type and its synonym interchangeably.
3. The newtype keyword gives an existing type a distinct identity. The original type and the new type are not interchangeable.

.

## Real World Haskell: Log for Week 1

Yesterday, April 9, 2020, I started reading studying Real World Haskell by Bryan O’Sullivan. It is excellent. Most of all, I like the balanced mix of the theoretical and the practical. I am going to learn a lot. (Some homework on GitHub)

Below is an annotated reading log.  I am keeping it in order to keep motivated. Also to collect a few notes and nuggets along the way. (Note: a “week” in the terminology of this blog does not necessarily consist of seven contiguous days.  That is because stuff happens.)

Day 1, April 9.  Chapters 1 and 2.

Why Care About Types, p. 17

Before we launch into a deeper discussion of Haskell’s type system, let’s talk about why we should care about types at all — what are they even for? At the lowest level, a computer is concerned with bytes, with barely any additional structure. What a type system gives us is abstraction. A type adds meaning to plain bytes: it lets us say “these bytes are text,” “those bytes are an airline reservation,” and so on. Usually, a type system goes beyond this to prevent us from accidentally mixing up types. For example, a type system usually won’t let us treat a hotel reservation as a car rental receipt. The benefit of introducing abstraction is that it lets us forget or ignore low-level details. If I know that a value in my program is a string, I don’t have to know the intimate details of how strings are implemented. I can just assume that my string is going to behave like all the other strings I’ve worked with.

Day 2, April 10.  Chapter 3

Algebraic data types allow us to distinguish between otherwise identical pieces of information. Two tuples with elements of the same type are structurally identical, so they have the same type.

Consider first the tuples a and b:

a = ("Porpoise", "Grey")
b = ("Table", "Oak")

We can pass b to a function expecting an pair consisting of an cetacean’s name and a color. The compiler will not complain, since a and b have the same type.  But there is another way:

data Cetacean = Cetacean String String
data Furniture = Furniture String String

c = Cetacean "Porpoise" "Grey"
d = Furniture "Table" "Oak"

This second way let’s us bring the type system to bear in writing programs with fewer bugs. With the tuples we just defined, we could conceivably pass a description of a whale to a function expecting a chair, and the type system could not help us. With the algebraic data types, there is no such possibility of confusion. (p. 46)

Day 3, April 11, 2020, Chapter 3

Just realized that both of the following are product types:

data Point3Da = Point3Da (Double, Double, Double)
data Point3Db = Point3Db { x :: Double, y:: Double, z:: Double}

But the record syntax gives accessor functions by default, e.g.

Main> b = Point3Db { x = 1, y = 2, z = 3}
Main> x b
1.0

Working through the exercises.  Feels good.

Day 4, April 12, 2020, Chapter 4: Functional Programming

On page 72, an 11-line skeleton of a command-line program for working with files. Nice!

Day 5, April 13, 2020

Having fun working the exercises in Chapter 4, page 84.  Problem 4, scrambling text is especially nice.  View a piece of text as an irregular list of exploded strings, so that “abc” explodes to ["a", "b", "c"]. The core of the problem is to design a function

transpose :: [[String]] -> [[String]]

such that transpose . transpose is the identity, modulo empty lists.  From this one the desired function

transposeString :: String -> String

Here is a sample:

> str = "abc\ndef\ngh"
> transposeString str
> transposeString \$ transposeString str
"abc\ndef\ngh"

Day 6: April 15, 2020: Still in Chapter 4

Missed a day yesterday due to pressing real-world obligations as well as a time-sink of a mini-disaster: my usual computer is ill and possibly approaching its final weeks of life. No way to repair it as we are confined to our apartment due to the pandemic.  Anyway: onward!

In today’s reading: structural recursion, which the O’Donnell introduces in the humble context of writing a function to convert a string representing an integer to an integer. Also a discussion of tail recursion, constant space versus linear space, and tail call optimization.  There is also a good treatment of this X’s book on ML, which, in common with Real World Haskell, has a pleasant mix of the theoretical and the practical.

Following a few more such examples, all done with recursion defined via pattern-matching, the author introduces the map, filter, and foldl functions as a way of capturing common patterns of computation.

Notes

• foldl, which is defined recursively by two pattern-matching equations, yields tail-recursive functions.  That is good for performance: a smart compiler compiles a foldl into a for loop
• Ditto for foldr.  Functions are expressible via foldr if and only if they are primitive recursive.
• See Ravi Chug’s cs223 course for notes on tail recursion.

See also this code in the Elm compiler (List.foldl is stack-safe and. List.foldr is , according to @Robin, stack-safe-ish: it will consume more and more stack space up to a point, then move to a stack-safe but slower algorithm

Day 7, April 16, 2020. On to Chapter 5, Writing a Library (Working with JSON data)

Still have to do a few of the exercises in Chapter 4, but decided to forge ahead today.  Lots of practical advice about designing a library, e.g., this:

Instead of rendering straight to a string, our Prettify module will use an abstract type that we’ll call Doc. By basing our generic rendering library on an abstract type, we can choose an implementation that is flexible and efficient. If we decide to change the underlying code, our users will not be able to tell.

Here is the type signature:

data Doc = Empty
| Char Char
| Text String
| Line
| Concat Doc Doc
| Union Doc Doc
deriving (Show,Eq)

Note that a Doc is really a tree.  There is a concatenation operator:

(<>) :: Doc -> Doc -> Doc
Empty <> y = y
x <> Empty = x
x <> y = x Concat y

so that Empty is the “zero” element.  Actual rendering is done by compact :: Doc -> String`.  I’ll have to come back to this chapter and reread it when I’m writing a package of my own.

Next “Haskell Week:” on to Type classes!

## Compilers and Interpreters

Lately I’ve been thinking about what to do to better understand compilers and interpreters, having done most everything so far by the seat of my pants. Fun, but not necessarily recommended. Some possibilities:

1. Robert Nystrom’s  Crafting Interpreters
2. Stephen Diehl’s Write You a Haskell
3. William Appel’s Modern Compiler Implementation in ML

I looked into the Nystrom’s book, and liked the first chapters, but the code, in Java, puts me off.  I also looked into (3), but that would require learning ML.  I tried this briefly, and it was pleasant.  Most suited to my tastes was Stephen Diehl’s online book (2).  There is, for example, this quote:

In mathematics a function is defined as a correspondence that assigns exactly one element of a set to each element in another set. If a function f(x)=a then the function evaluated at x will always have the value a. Central to the notion of all mathematics is the notion of equational reasoning, where if a=f(x) then for an expression g(f(x),f(x)), this is always equivalent to g(a,a). In other words, the values computed by functions can always be substituted freely at all occurrences.

The implementation may perform effects, but central to this definition is the unobservability of such effects. A function is said to be referentially transparent if replacing a function with its computed value output yields the same observable behavior.

The choice of (2) requires more Haskell than I am comfortable with, but my experience with Elm, another statically typed language of pure functions  should help.  I’m going to take the plunge!