I’ve started working on H-99: Ninety-nine Haskell Problems
Problems 1-10, Lists: April 17, 2020
I’ve started working on H-99: Ninety-nine Haskell Problems
Problems 1-10, Lists: April 17, 2020
Day 1, April 17, 2020: Typeclasses
Typeclasses are among the most powerful features in Haskell. They allow us to define generic interfaces that provide a common feature set over a wide variety of types; defining a set of functions that can have different implementations depending on the type of data they are given.
Here is a typeclass is defined:
class BasicEq a where isEqual :: a -> a -> Bool isEqual x y = not (isNotEqual3 x y) isNotEqual :: a -> a -> Bool isNotEqual x y = not (isEqual x y)
and here is how it is used (instantiated):
instance BasicEq Bool where isEqual True True = True isEqual False False = True isEqual _ _ = False
and also
instance BasicEq Color where isEqual Red Red = True isEqual Green Green = True isEqual Blue Blue = True isEqual _ _ = False
This is like refl
in Martin-Löf type theory. If BasicEq
or ==
is not implemented for a type, you could say (in the abstract) that that type has no notion of equality since =
is uninhabited.) You can instantiate an existing typeclass for a new type:
instance Show Color where show Red = "Red" show Green = "Green" show Blue = "Blue"
On the Read
and Show
typeclasses:
You may often have a data structure in memory that you need to store on disk for later retrieval or to send across the network. The process of converting data in memory to a flat series of bits for storage is called serialization. It turns out that read and show make excellent tools for serialization. show produces output that is both human- and machine-readable. Most show output is also syntactically valid Haskell, though it is up to people that write Show instances to make it so.
Example
First write a data structure to disk:
ghci> let d1 = [Just 5, Nothing, Nothing, Just 8]::[Maybe Int] ghci> putStrLn (show d1) [Just 5,Nothing,Nothing,Just 8] ghci> writeFile "test" (show d1)
Now read it back:
ghci> input <- readFile "test" "[Just 5,Nothing,Nothing,Just 8]" ghci> let d2 = read input Ambiguous type variable `a' in the constraint: `Read a' arising from a use of `read' at <interactive>:1:9-18 Probable fix: add a type signature that fixes these type variable(s) ghci> let d2 = (read input)::[Maybe Int] ghci> print d1 [Just 5,Nothing,Nothing,Just 8] ghci> print d2 [Just 5,Nothing,Nothing,Just 8] ghci> d1 == d2 True
Automatic derivation
data Color = Red | Green | Blue deriving (Read, Show, Eq, Ord)
Pragmas
{-# LANGUAGE TypeSynonymInstances #-}
Data versus Newtype
data DataInt = D Int deriving (Eq, Ord, Show) newtype NewtypeInt = N Int deriving (Eq, Ord, Show)
When we declare a newtype, we must choose which of the underlying type’s typeclass instances we want to expose. Here, we’ve elected to make NewtypeInt provide Int’s instances for Eq, Ord, and Show. As a result, we can compare and print values of type
NewtypeInt: ghci> N 1 < N 2 True
Since we are not exposing Int’s Num or Integral instances, values of type NewtypeInt are not numbers. For instance, we can’t add them:
ghci> N 313 + N 37 --> error
Further example (hiding implementation):
newtype UniqueID = UniqueID Int deriving (Eq)
The compiler treats UniqueID as a different type from Int. As a user of UniqueID, we know only that we have a unique identifier; we cannot see that it is implemented as an Int.
Recap: defining types
data
keyword introduces a truly new algebraic data type.type
keyword gives us a synonym to use for an existing type. We can use the type and its synonym interchangeably.newtype
keyword gives an existing type a distinct identity. The original type and the new type are not interchangeable.
.
Yesterday, April 9, 2020, I started reading studying Real World Haskell by Bryan O’Sullivan. It is excellent. Most of all, I like the balanced mix of the theoretical and the practical. I am going to learn a lot. (Some homework on GitHub)
Below is an annotated reading log. I am keeping it in order to keep motivated. Also to collect a few notes and nuggets along the way. (Note: a “week” in the terminology of this blog does not necessarily consist of seven contiguous days. That is because stuff happens.)
Day 1, April 9. Chapters 1 and 2.
Why Care About Types, p. 17
Before we launch into a deeper discussion of Haskell’s type system, let’s talk about why we should care about types at all — what are they even for? At the lowest level, a computer is concerned with bytes, with barely any additional structure. What a type system gives us is abstraction. A type adds meaning to plain bytes: it lets us say “these bytes are text,” “those bytes are an airline reservation,” and so on. Usually, a type system goes beyond this to prevent us from accidentally mixing up types. For example, a type system usually won’t let us treat a hotel reservation as a car rental receipt. The benefit of introducing abstraction is that it lets us forget or ignore low-level details. If I know that a value in my program is a string, I don’t have to know the intimate details of how strings are implemented. I can just assume that my string is going to behave like all the other strings I’ve worked with.
Day 2, April 10. Chapter 3
Algebraic data types allow us to distinguish between otherwise identical pieces of information. Two tuples with elements of the same type are structurally identical, so they have the same type.
Consider first the tuples a and b:
a = ("Porpoise", "Grey") b = ("Table", "Oak")
We can pass b to a function expecting an pair consisting of an cetacean’s name and a color. The compiler will not complain, since a and b have the same type. But there is another way:
data Cetacean = Cetacean String String data Furniture = Furniture String String c = Cetacean "Porpoise" "Grey" d = Furniture "Table" "Oak"
This second way let’s us bring the type system to bear in writing programs with fewer bugs. With the tuples we just defined, we could conceivably pass a description of a whale to a function expecting a chair, and the type system could not help us. With the algebraic data types, there is no such possibility of confusion. (p. 46)
Day 3, April 11, 2020, Chapter 3
Just realized that both of the following are product types:
data Point3Da = Point3Da (Double, Double, Double)
data Point3Db = Point3Db { x :: Double, y:: Double, z:: Double}
But the record syntax gives accessor functions by default, e.g.
Main> b = Point3Db { x = 1, y = 2, z = 3} Main> x b 1.0
Working through the exercises. Feels good.
Day 4, April 12, 2020, Chapter 4: Functional Programming
On page 72, an 11-line skeleton of a command-line program for working with files. Nice!
Day 5, April 13, 2020
Having fun working the exercises in Chapter 4, page 84. Problem 4, scrambling text is especially nice. View a piece of text as an irregular list of exploded strings, so that “abc” explodes to ["a", "b", "c"]
. The core of the problem is to design a function
transpose :: [[String]] -> [[String]]
such that transpose . transpose
is the identity, modulo empty lists. From this one the desired function
transposeString :: String -> String
Here is a sample:
> str = "abc\ndef\ngh" > transposeString str "adg\nbeh\ncf" > transposeString $ transposeString str "abc\ndef\ngh"
Day 6: April 15, 2020: Still in Chapter 4
Missed a day yesterday due to pressing real-world obligations as well as a time-sink of a mini-disaster: my usual computer is ill and possibly approaching its final weeks of life. No way to repair it as we are confined to our apartment due to the pandemic. Anyway: onward!
In today’s reading: structural recursion, which the O’Donnell introduces in the humble context of writing a function to convert a string representing an integer to an integer. Also a discussion of tail recursion, constant space versus linear space, and tail call optimization. There is also a good treatment of this X’s book on ML, which, in common with Real World Haskell, has a pleasant mix of the theoretical and the practical.
Following a few more such examples, all done with recursion defined via pattern-matching, the author introduces the map, filter, and foldl functions as a way of capturing common patterns of computation.
Notes
See also this code in the Elm compiler (List.foldl
is stack-safe and. List.foldr
is , according to @Robin, stack-safe-ish: it will consume more and more stack space up to a point, then move to a stack-safe but slower algorithm
Day 7, April 16, 2020. On to Chapter 5, Writing a Library (Working with JSON data)
Still have to do a few of the exercises in Chapter 4, but decided to forge ahead today. Lots of practical advice about designing a library, e.g., this:
Instead of rendering straight to a string, our Prettify module will use an abstract type that we’ll call Doc. By basing our generic rendering library on an abstract type, we can choose an implementation that is flexible and efficient. If we decide to change the underlying code, our users will not be able to tell.
Here is the type signature:
data Doc = Empty | Char Char | Text String | Line | Concat Doc Doc | Union Doc Doc deriving (Show,Eq)
Note that a Doc
is really a tree. There is a concatenation operator:
(<>) :: Doc -> Doc -> Doc Empty <> y = y x <> Empty = x x <> y = x `Concat` y
so that Empty
is the “zero” element. Actual rendering is done by compact :: Doc -> String
. I’ll have to come back to this chapter and reread it when I’m writing a package of my own.
Next “Haskell Week:” on to Type classes!
Lately I’ve been thinking about what to do to better understand compilers and interpreters, having done most everything so far by the seat of my pants. Fun, but not necessarily recommended. Some possibilities:
I looked into the Nystrom’s book, and liked the first chapters, but the code, in Java, puts me off. I also looked into (3), but that would require learning ML. I tried this briefly, and it was pleasant. Most suited to my tastes was Stephen Diehl’s online book (2). There is, for example, this quote:
In mathematics a function is defined as a correspondence that assigns exactly one element of a set to each element in another set. If a function f(x)=a then the function evaluated at x will always have the value a. Central to the notion of all mathematics is the notion of equational reasoning, where if a=f(x) then for an expression g(f(x),f(x)), this is always equivalent to g(a,a). In other words, the values computed by functions can always be substituted freely at all occurrences.
The implementation may perform effects, but central to this definition is the unobservability of such effects. A function is said to be referentially transparent if replacing a function with its computed value output yields the same observable behavior.
The choice of (2) requires more Haskell than I am comfortable with, but my experience with Elm, another statically typed language of pure functions should help. I’m going to take the plunge!