Climb That Mountain: 2018

Wednesday, November 14, 2018

Hspec with QuickCheck

This blog post will not describe what property testing is, there are other writings about it. Instead, I'll try to show you how to get started with it, as finding introductory tutorials on the topic is scarce, this is my attempt to fill that void.

I heard about property-based testing in 2015, when I attended Michael Nygaard’s StrangeLoop talk. As a long time TDD-er, I was highly skeptical of generating tests for my own code, but the idea started to grow on me.

Where would I use property testing in a real project? I was presented with a task at my job to solve this: "Write QuickCheck tests for a loan amortization algorithm. It has a starting amount, a term length (in month) and an interest rate. The test should verify:

The loan is paid back
The term length matches the input argument
The principal is shrinking every month"

I was eager to work on this task, however, practical examples were limited. The best information was in the book Haskell Programming from First Principles. I kept coming back to its chapter on testing to pick up what I know.

I wanted to learn with a simple example, I chose the Roman Numeral kata. The converter has 2 functions:

Convert Arabic to Roman (convertToRoman)
Convert the Roman back to Arabic (convertFromRoman)

Following the reverse list example should be trivial: converting an Arabic number to Roman and back to Arabic should give the same number. The Haskell Book is using a similar example with morse code conversions.

I'll go through the good old TDD solution of the Roman Numeral kata in less detail, as the goal of this post is to describe how to use QuickCheck.

Here is how the code starts out:

module RomanNumeralsSpec where

import Test.Hspec

main :: IO ()
main = hspec spec

type Roman = String

convertToRoman :: Int -> Roman
convertToRoman 1 = "I"
convertToRoman 2 = "II"
convertToRoman 3 = "III"
convertToRoman 4 = "IV"

convertFromRoman :: Roman -> Int
convertFromRoman "" = undefined

spec :: Spec
spec =
  describe "Converting to Roman Numerals" $ do
    it "converts 1 to I" $
      convertToRoman 1 `shouldBe` "I"
    it "converts 2 to II" $
      convertToRoman 2 `shouldBe` "II"
    it "converts 3 to III" $
      convertToRoman 3 `shouldBe` "III"
    it "converts 4 to IV" $
      convertToRoman 4 `shouldBe` "IV"

Look how dumb the test cases (and the code itself) are so far. When I introduce custom data types and a rule table for the conversions, the code becomes much simpler. Watch this:

...

type Conversions = [(Int, Roman)]

conversions :: Conversions
conversions =
  [ (4, "IV")
  , (1, "I") ]

convertToRoman :: Int -> Roman
convertToRoman 0 = []
convertToRoman x =
  roman ++ convertToRoman (x - number)
    where
      (number, roman) =
        head . filter (\(a,_) -> a <= x) $ conversions

...

Commit Point

Extending this now is easy, I don't have to modify the logic, just add more values to the conversions table.

...

conversions :: Conversions
conversions =
  [ (90, "XC")
  , (50, "L")
  , (40, "XL")
  , (10, "X")
  , (9, "IX")
  , (5, "V")
  , (4, "IV")
  , (1, "I") ]

...

spec :: Spec
spec =
  describe "Converting to Roman Numerals" $ do
    it "converts 1 to I" $
      convertToRoman 1 `shouldBe` "I"
    it "converts 2 to II" $
      convertToRoman 2 `shouldBe` "II"
    it "converts 3 to III" $
      convertToRoman 3 `shouldBe` "III"
    it "converts 4 to IV" $
      convertToRoman 4 `shouldBe` "IV"
    it "converts 5 to V" $
      convertToRoman 5 `shouldBe` "V"
    it "converts 6 to VI" $
      convertToRoman 6 `shouldBe` "VI"
    it "converts 8 to VIII" $
      convertToRoman 8 `shouldBe` "VIII"
    it "converts 9 to IX" $
      convertToRoman 9 `shouldBe` "IX"
    it "converts 10 to X" $
      convertToRoman 10 `shouldBe` "X"
    it "converts 11 to XI" $
      convertToRoman 11 `shouldBe` "XI"
    it "converts 99 to L" $
      convertToRoman 99 `shouldBe` "XCIX"

Commit point

Oh, these tests are ugly. I'll make them a bit worse before I improve them, sorry about that.

The conversions table can be used to convert Roman numbers to Arabic. The Arabic value is accumulated by adding the matched values together.

Here are the conversion code and the accompanying tests:

...
import Data.List (isPrefixOf, find)
import Data.Maybe (fromJust)
...

convertFromRoman :: Roman -> Int
convertFromRoman "" = 0
convertFromRoman r =
  number + convertFromRoman (drop (length roman) r)
    where
      (number, roman) = fromJust $ find (\(_,r') -> r' `isPrefixOf` r) conversions

  ...

    describe "Roman to Number Conversions" $ do
    it "converts I to 1" $
      convertFromRoman "I" `shouldBe` 1
    it "converts II to 2" $
      convertFromRoman "II" `shouldBe` 2
    it "converts III to 3" $
      convertFromRoman "III" `shouldBe` 3
    it "converts IV to 4" $
      convertFromRoman "IV" `shouldBe` 4
    it "converts V to 5" $
      convertFromRoman "V" `shouldBe` 5
    it "converts VIII to 8" $
      convertFromRoman "VIII" `shouldBe` 8
    it "converts IX to 9" $
      convertFromRoman "IX" `shouldBe` 9
    it "converts X to 10" $
      convertFromRoman "X" `shouldBe` 10
    it "converts XI to 11" $
      convertFromRoman "XI" `shouldBe` 11
    it "converts XCIX to 99" $
      convertFromRoman "XCIX" `shouldBe` 99

Commit Point

I ended up with 47 lines of mindless, verbose and repeated test code.

Let's fire up GHCi, and see how this works in the REPL. Use the Makefile target make repl-test to try this:

λ> convertToRoman 12
"XII"
λ> convertFromRoman . convertToRoman $ 12
12

OK, I can convert an Arabic number to Roman and convert that back to Arabic, I'll use this mechanism to verify the logic with QuickCheck.

Let's explore QuickCheck in the REPL. Please follow along, but notice, that the generated random numbers are going to be different for you.

λ> import Test.QuickCheck
λ> sample (arbitrary :: Gen Int)
0
-1
-3
-4
-2
-9
8
14
11
-18
-16

It works for Strings as well:

λ> sample (arbitrary :: Gen String)
""
"Q\EOT"
"#8"
"^x"
"&}\t\NULU k"
"\DC4\816800<"
"~!oH\763194&e\GSG"
"\STX\869030\194889\1040760\820031\799098w\SIHz"
"\47452\ETX>j\686979\ACK?\1094610\160069\943268\99807+\519462"
""
"\694257s\883691\996507fO_n4m\tF\357900"

Ok, this is great, but what if you just want random numbers between 1 and 3? Use the elements function:

λ> let oneThroughThree = elements [1..3] :: Gen Int
oneThroughThree :: Gen Int
λ> sample' oneThroughThree
[3,1,3,3,3,1,2,1,3,2,2]

Note that I used the sample' function, which returns a list of elements now.

In case you want to have three with more frequencies, just provide a list accordingly. Like this:

λ> let moreThrees = elements [1,2,3,3,3,3] :: Gen Int
moreThrees :: Gen Int
λ> sample' moreThrees
[1,3,3,3,3,2,3,2,1,1,3]

In the sample list the 3s are represented with higher frequencies.

But what if you want a combination of Strings and Integers for your needs? Well, you can build a short function for that:

import Test.QuickCheck
...
genTuple :: (Arbitrary a, Arbitrary b)
         => Gen (a, b)
genTuple = do
  a <- arbitrary
  b <- arbitrary
  return (a, b)

Fire up the REPL with the test code (I used the included Makefile's repl-test target), and sample this new function to get a list of tuples with Integers and Strings:

λ> import Test.QuickCheck
λ> sample' (genTuple :: Gen (Int, String))
[(0,""),(-1,"P!"),(0,"\703651\888426O"),(-5,"\DLEs_\640436>\a"),(-6,""),(10,"T\45432\&5?\STX.j"),(-10,"h\266318\SUB\175378"),(9,"\ESC\978066"),(16,"FQ@w;'I^\EM\NUL"),(7,""),(12,"\ACK\569630\49462")]

Commit Point

After this brief intro, let's replace the manual tests with generated tests by QuickCheck.

This is the function I came up with to test the logic:

prop_convertNumber :: Int -> Bool
prop_convertNumber x = (convertFromRoman . convertToRoman) x == x

I gave it a try in the REPL:

λ> import Test.QuickCheck
λ> quickCheck prop_convertNumber
*** Failed! Exception: 'Prelude.head: empty list' (after 3 tests and 1 shrink):
-1
λ> quickCheck prop_convertNumber
*** Failed! Exception: 'Prelude.head: empty list' (after 6 tests and 1 shrink):
-1

Oh, crap… QuickCheck generated -1, but my logic only works with positive numbers. I have to tell QuickCheck what numbers it can use.

Commit Point

A valid range needs to be passed to QuickCheck to exercise the number conversion logic, the elements function could do just that. I add these two functions to my test:

numbers :: [Int]
numbers = [1..1000]

genNumbers :: Gen Int
genNumbers = elements numbers

I could easily verify in the REPL that the genNumbers function is producing numbers within the defined range:

λ> import Test.QuickCheck
λ> sample' genNumbers
[745,853,321,678,436,711,825,593,441,900,315]
λ> sample' genNumbers
[706,110,263,36,807,589,555,444,60,261,116]

Great, only positive numbers are generated, I can use this set for my tests. I change the property testing function to use the genNumbers generated values in the test:

prop_convertNumber :: Property
prop_convertNumber =
  forAll genNumbers
    (\x ->
      (convertFromRoman . convertToRoman) x == x)

This works as expected when I test it in the REPL:

λ> import Test.QuickCheck
λ> quickCheck prop_convertNumber
+++ OK, passed 100 tests.

I can now replace all my test with this function:

spec :: Spec
spec = do
  describe "Converting to Roman Numerals" $ do
    it "converts number to Roman and back" $ property $
      prop_convertNumber

Commit Point

Hspec's QuickCheck wrapper provides a convenience function for scenarios like this: I can replace the it and property functions with prop like this (I need to add an import to make this work):

import Test.Hspec.QuickCheck (prop)

...

spec :: Spec
spec = do
  prop "converts number to Roman and back" $
    prop_convertNumber

When I run the tests, they should all pass, and QuickCheck reports back that it generated 100 tests for me.

RomanNumerals
  converts number to Roman and back
    +++ OK, passed 100 tests.

Finished in 0.0017 seconds
1 example, 0 failures

Commit Point

It's great that QuickCheck generates the numbers and runs these tests, however, I would like to look under the hood and see how these conversions are working. I'd like to eyeball what the converted Roman numbers are looking like.

This is all pure code, I can't just print numbers in the terminal. I also don't want to make pure code dirty with IO. Debug.Trace is the solution.

Check this out:

import Debug.Trace
...

prop_convertNumber :: Property
prop_convertNumber =
  forAll genNumbers
    (\x ->
      traceShow("number: ", (x, convertToRoman x)) $
        (convertFromRoman . convertToRoman) x == x)

traceShow will print out both the Arabic and the converted Roman number.

Here is a sample of what I received:

...
("number: ",(825,"XCXCXCXCXCXCXCXCXCXV"))
("number: ",(662,"XCXCXCXCXCXCXCXXXII"))
("number: ",(246,"XCXCLXVI"))
("number: ",(921,"XCXCXCXCXCXCXCXCXCXCXXI"))
    converts number to Roman and back
          +++ OK, passed 100 tests.

          Finished in 0.0030 seconds
          1 example, 0 failures

Oh, that 825 does not look like a valid Roman number. Of course: my conversion table only has values from 1 through 90. It can only convert numbers up to 98. I need to make sure QuickCheck will generate numbers within this range.

numbers :: [Int]
numbers = [1..98]

When I run the tests, I got lucky by QuickCheck, it used 98 for one of the conversions:

...
("number: ",(98,"XCVIII"))
("number: ",(62,"LXII"))
("number: ",(54,"LIV"))
  converts number to Roman and back
      +++ OK, passed 100 tests.

      Finished in 0.0034 seconds
      1 example, 0 failure

All good, now!

Commit Point

QuickCheck has a little brother, called SmallCheck, which will generate numbers up-to a defined depth. It might be a better fit for the Roman Numeral Kata, but that would be a different blog post. I encourage you to explore that library and see how you could change the test code to work with it.

Saturday, October 13, 2018

Haskell by Day

I had played around with Clojure for a couple of years by 2015. I wanted to use it for more than just a toy-project, and when we had to build a data pipeline for my employer, I was more than happy to give it a try.
I hired a developer for our team who had more serious functional programming background than I had. He started doing interesting things with our code base like separating out functions based on purity and using currying like no one I had seen before.

I asked him, how he knows so much about FP. His answer was plain and simple: "I've been learning Haskell." Oh. I have always thought Haskell is this far-out, esoteric, academic language that no one uses. I realized he knew more Clojure just by learning Haskell. Then he showed me Elm and I was blown away. (Elm has its roots in Haskell, it's a similar ML language.)

A couple of months later I watched Chad Fowler's talk about his adventures at WunderList in which he talked about a routing application that they wrote in less than fifty (50) lines of Haskell. It never went done and was blazing fast. His message (the one that I picked up anyway) was "Haskell, Haskell, learn Haskell".

I started digging into the language by following the examples from the book Learn You a Haskell for Great Good. It was an easy read, I was cruising through the examples in GHCi (REPL). Then I stopped looking at the language for about 2-3 months as something came up. I went back to it and was shocked to realize that I did not remember anything. Zero. What the heck is this weird syntax (x:xs)?

I realized reading the book and following the examples in a REPL was not enough, I had to keep coming back to them, and most importantly, practice them. The author of the HaskellProgramming book, Chris Allen had a good talk on how to learn Haskell in less than five years. The message is simple: learn by doing. And keep doing it. This is how my little learning project haskell-katas was born. I needed something I could come back to and practice with. It had mostly helped, but I was still confused by its error messages and type classes, which made more sense by more reading and practice.

I liked my employer, we worked on some interesting problems, but I wanted to try myself on the job market, see how marketable my skills are. At one point I had to solve a Tetris game challenge for a company as a take-home exercise. I knew they were interested in folks with functional thinking. I had no intention working on that problem in Ruby/Python/JavaScript, but I thought it would be interesting to come up with a solution in Haskell. They said I could do it in Haskell and I jumped on the task. I began working on it on a Friday afternoon, played with the problem for an hour or two, I tried to see how I could represent the data with Algebraic Data Types and maybe map the different shapes on the board. When that worked, I knew I'll be able to submit a solution. I worked on it over the weekend and by Monday morning I had the solution working. I submitted my code 3 days early and got invited for an in-person interview. This was the moment I realized I might be able to find a job working with Haskell.

Finding a role with Haskell got more important to me, as the one or two hours here and there in the evening or weekends will never be enough to learn the language fully. I found a job posting on StackOverflow that listed Haskell as a requirement, here in Chicago. I applied, and after a few rounds of interviews I received an offer from them which I accepted.

We are working on a student loan solution for the UK market. Haskell is the brain of the application, all the different calculation models and rules are coded in it. I was intrigued to work for the only company in Chicago that uses Haskell. I started my job about a month ago and boy, it's been an adventure. I feel very slow sometimes, but more productive when I submit a change that makes the code a bit easier to test. For example, I had to dive into a couple of days of learning just to interact with JSON. All in all, I am super happy that I can do something by day that I learned at night, and with this new role I can take my functional programming skills to a new level.

You might ask: How about Ruby? Am I still using? Oh yeah, we have an API app in Rails, plus I started mentoring our intern by pairing with him on different challenges in Ruby a couple of times a week. We also have a fairly significant code base in Python that our data scientists put together to test the Haskell component. I enjoy going into there and help the team make the code better.

The future is polyglot. Follow your heart and work with the language you want to learn. That's the only way to get real deep into it!

Monday, June 18, 2018

Haskell Stack vs. Stackage - Explained

I am sitting in the Denver Airport terminal, waiting for my flight to take me home from Lambda Conf 2018 to Chicago. This conference was the best I have attended in recent years: the quality of the talks, the friendliness of the presenters and the attendees is just something I have not seen in a long time.

I set down next to a guy (Dan Burton to be exact) with my lunch a couple of days ago and I started a conversation with him. When we chatted about open source, he mentioned he works on Stackage. I heard about it, but I asked him to tell me more. I learned quite a bit, I figured others would benefit from having this information, hence the idea of this blog post was born.

Stack came along a couple of years ago to save us all from "Cabal Hell". I never had to experience it as I started learning Haskell only two years ago, long after Stack and Stackage was born. But understanding what they provide helps me appreciate it even more.

When you start a new project with Stack, there is one line in stack.yaml that has significant importance:

resolver: lts-11.11

This value makes the connection between Stack and Stackage.

"What is lts-11.11?" - It's the long term support version of Stackage.
"What is Stackage then?" - It's a set of Haskell tools and libraries tested together in a snapshot making sure that the specified versions work together.
"A snapshot? What's that?" - An LTS or Nightly release of packages.
"Isn't this a lot of work? Testing all these libraries together…" - Oh yes it is, but the good thing is that it’s automated for the most part.
"How many people are working on this?" - Eight, but there are also some devops people at FP Complete that occasionally help with the server.
"How often are the libraries tested?" - There is a nightly snapshot released (almost) every night. There is an LTS snapshot minor bump (e.g. lts-11.11 -> lts-11.12) released (almost) every week. LTS major releases (e.g. lts-11 -> lts-12) are approximately every 3 to 5 months.
"Which one should I use?" - the LTS snapshot of course. Unless you are curious and want to see how a library is changing daily.
"But I have GHC installed globally on my computer. Is that used?" - It depends. If the LTS snapshot you specify in your project uses a different GHC version than what you have outside of Stack, that LTS specified GHC version will be installed.
"Give me an example!" - Sure.

First, let's see what is installed globally. When I run which ghc this is what I get: /usr/local/bin/ghc. And when I peek into this file, I see it points to my homebrew installed ghc, with version 8.4.3:

#!/bin/sh
exedir="/usr/local/Cellar/ghc/8.4.3/lib/ghc-8.4.3/bin"
exeprog="ghc-stage2"
executablename="$exedir/$exeprog"
datadir="/usr/local/Cellar/ghc/8.4.3/share"
bindir="/usr/local/Cellar/ghc/8.4.3/bin"
topdir="/usr/local/Cellar/ghc/8.4.3/lib/ghc-8.4.3"
executablename="$exedir/ghc"
exec "$executablename" -B"$topdir" ${1+"$@"}

Now when I run ghc-pkg list, I see 33 packages installed with this system-level GHC version:

% ghc-pkg list
/usr/local/Cellar/ghc/8.4.3/lib/ghc-8.4.3/package.conf.d
    Cabal-2.2.0.1
    array-0.5.2.0
    base-4.11.1.0
    binary-0.8.5.1
    ...

I have not installed any packages myself into this GHC version, all those 33 packages come with GHC.

I have a project where the resolver is lts-11.11. When I run stack exec -- ghc-pkg list in this project (after it was successfully built of course), the following libraries are listed. I left out the bulk of the libraries, as the key point here is the different layers and not what is in those:

% stack exec -- ghc-pkg list
/Users/adomokos/.stack/programs/x86_64-osx/ghc-8.2.2/lib/ghc-8.2.2/package.conf.d
    Cabal-2.0.1.0
    array-0.5.2.0
    base-4.10.1.0
    ...
/Users/adomokos/.stack/snapshots/x86_64-osx/lts-11.11/8.2.2/pkgdb
    Cabal-2.0.1.1
    HUnit-1.6.0.0
    StateVar-1.1.1.0
    aeson-1.2.4.0
    ...
/Users/adomokos/Projects/persistent-test/.stack-work/install/x86_64-osx/lts-11.11/8.2.2/pkgdb
    katip-0.5.5.1
    persistent-test-0.1.0.0

The 3 paths listed above in this shell snippet is where Haskell packages are pulled from:

Global - the system-level GHC packages list, Stack will never install anything into this
Snapshot - a database shared by all projects using the same snapshot
Local - Project specific database

But wait! What is GHC 8.2.2 doing there? I have version 8.4.3 installed at the system level. As it turns out, Stack, based on the LTS information uses a different version of GHC. I have GHC version 8.4.3 at the system level, but LTS-11.11 uses GHC version 8.2.2.

Let’s prove that out further:

% stack exec -- which ghc
/Users/adomokos/.stack/programs/x86_64-osx/ghc-8.2.2/bin/ghc
% stack exec -- ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.2.2

Ha, Stack rolls its own GHC version and ignores the system-level version if it's different than what it needs. How cool is that!

When I went to Stackage's website, I noticed that a newer version of LTS was released recently. I had LTS-11.11 (released on 05/28/2018), but the latest version is (as of this writing) LTS-11.13 (released on 06/09/2018). I updated stack.yaml to use the newer version and rebuilt the project. Ran the app and everything worked properly.

What changed between the two LTS snapshots? Stackage.org has a very good comparison page, this is where you can follow the diffs. It seems not many of the packages changed that I used, however, postgresql-simple went from 0.5.3.0 to 0.5.4.0. Since LTS-11.13 is specified in stack.yaml and that LTS needs postgresql-simple version 0.5.4.0, what happens when I specify version 0.5.3.0 in package.yaml?

I changed package.yaml this way:

dependencies:
  - base >= 4.7 && < 5
  - postgresql-simple == 0.5.3.0
  ...

When I ran stack build, this friendly error message let me know that I'd like to use a version of a package that is not in the provided LTS snapshot:

Error: While constructing the build plan, the following exceptions were encountered:

In the dependencies for persistent-test-0.1.0.0:
    postgresql-simple-0.5.4.0 from stack configuration
    does not match ==0.5.3.0  (latest matching version
                              is 0.5.3.0)
needed since persistent-test is a build target.

Some different approaches to resolving this:

  * Set 'allow-newer: true' to ignore all version
    constraints and build anyway.

  * Consider trying 'stack solver', which uses the cabal-install
    solver to attempt to find some working build
    configuration. This can be convenient when dealing with many
    complicated constraint errors, but results
    may be unpredictable.

  * Recommended action: try adding the following to your extra-deps
    in /Users/adomokos/Projects/persistent-test/stack.yaml:

- postgresql-simple-0.5.3.0

Plan construction failed.

Once I removed the version specification for the postgresql-simple package, it built successfully. But did it pick the correct version since I did not specify it?

% stack exec -- ghc-pkg list | grep postgresql-simple
    postgresql-simple-0.5.4.0

Yep, the correct, Stackage LTS-11.13 version was in fact installed.

I grabbed all the package names from LTS-11.13, I counted 2474 packages that got tested against each other for this particular LTS release. Kudos to the Stackage Curator Team for making sure we will only use packages that are playing nice with each other!

(Thanks to Dan for proofreading my post, this writing is more accurate with his feedback.)

Thursday, April 19, 2018

Path Count

I've bumped into this brain teaser recently:

"Given two integer numbers describing an n by m graph, where n represents the height and m represents the width, calculate the number of ways you can get from the top left to the bottom right if you can only go right and down"

That's a fine challenge, and prior to knowing recursive types in Haskell, I don’t know how I would have approached the problem.

Let's draw what the paths would look like first.

Given 1x1, the case is pretty simple:

{-
  The is what the matrix would look like:

  (0,1) - (1,1)
    |       |
  (0,0) - (1,0)
-}

The only way to get from the top left to the bottom right is to "walk" the perimeters:

{-
  (0,1) - (1,1) - (1,0)
  (0,1) - (0,0) - (1,0)
-}

This case is so easy, I don’t even bother with devising a solution. Let's look at a 1 by 2 graph:

{-
  (0-1) - (1,1) - (2,1)
    |       |       |
  (0-0) - (1,0) - (2,0)
-}

Following the rules laid out, there are 3 ways to get from the top left to the bottom right point:

{-
  (0-1) - (1,1) - (2,1) - (2,0)
  (0-1) - (1,1) - (1,0) - (2,0)
  (0-1) - (0,0) - (1,0) - (2,0)
-}

The rule "you can only go right and down" tells us something: it's a binary tree. How could I draw up a recursive tree structure for this?

I'd like to make sure the logic is correct, I put all this logic into an HSpec file. How 'bout this?

-- test/PathCountSpec.hs

import Test.Hspec

main :: IO ()
main = hspec spec

type Point = (Int, Int)
data Tree a = Leaf
            | Node a (Tree a) (Tree a)
            deriving (Show, Eq)

spec :: Spec
spec =
    describe "Path Count" $ do
        it "can calculate tree branches for 1x2 matrix" $ do
            let tree =
                    Node (0,1)
                        (Node (1,1)
                            (Node (2,1) Leaf
                                        (Node (2,0) Leaf Leaf))
                            (Node (1,0) (Node (2,0) Leaf Leaf)
                                        Leaf))
                        (Node (0,0)
                            (Node (1,0)
                                (Node (2,0) Leaf Leaf)
                                Leaf)
                            Leaf)
            {-
               Possible paths:
                (0,1) - (1,1) - (2,1) - (2,0)
                (0,1) - (1,1) - (1,0) - (2,0)
                (0,1) - (0,0) - (1,0) - (2,0)
            -}
            pending

The key here, that the number of times Node (2,0) Leaf Leaf appears is the number of different ways I can get from the top left to the bottom right. All I have to do is counting the number of times this sub-tree appears in the tree itself.

I wrote a function (that I put into the HSpec file itself) to do just that:

leafCount :: Tree Point -> Int
leafCount Leaf = 0
leafCount (Node _ Leaf Leaf) = 1
leafCount (Node _ left right) = leafCount left + leafCount right

When I call this function with the provided tree I have in the spec, I should receive 3. This assertion passes:

leafCount tree `shouldBe` 3

I manually had to build up this tree, next I looked at how I could generate it myself based on the top left and bottom right points. I had to make sure I won’t add branches outside of the matrix, what I accomplished with two guards.

buildTree :: Point -> Point -> Tree Point
buildTree (a,b) end@(c,d)
    | a > c = Leaf
    | b < 0 = Leaf
    | otherwise = Node (a, b) (buildTree (a+1,b) end) (buildTree (a,b-1) end)

This logic will keep "walking" right and down until the guards will stop it.

I added another assertion to make sure this still passes:

buildTree (0,1) (2,0) `shouldBe` tree

I wanted to make sure this logic still holds when I pivot the two numbers. This is the next assertion I added:

buildTree (0,2) (1,0) `shouldBe` tree

Tests are passing. All right, I need to set up an outer function that takes two numbers and returns the number of paths.

Here it is:

pathCount :: Int -> Int -> Int
pathCount m n = leafCount $ fromTree m n
    where fromTree m n = buildTree (0,m) (n,0)

This assertion will exercise this function:

pathCount 1 2 `shouldBe` 3

Everything is green!

I added one more test to make sure the 1x1 graph works:

pathCount 1 1 `shouldBe` 2

And finally I added a 2x2 test as well. This one has 6 different paths to get from the top left to the bottom right:

{-
    Possible paths:
    (0,2) - (1,2) - (2,2) - (2,1) - (2,0)
    (0,2) - (1,2) - (1,1) - (2,1) - (2,0)
    (0,2) - (1,2) - (1,1) - (1,0) - (2,0)
    (0,2) - (0,1) - (1,1) - (2,1) - (2,0)
    (0,2) - (0,1) - (1,1) - (1,0) - (2,0)
    (0,2) - (0,1) - (0,0) - (1,0) - (2,0)
-}
pathCount 2 2 `shouldBe` 6

And when I run the spec again, it all works! You can find the solution in this gist.

Sunday, January 28, 2018

Haskell to MySQL via YeshQL (Part 3.)

In the previous blog post we built a console app in Haskell that talks to MySQL via YeshQL. It creates a client record and counts them with a SQL query.
In the final part in this series, we will add automated tests to our application, we will save a user record along with its parent client and make sure all the saving happens in one unit of work.

This is the commit point where we left it at the end of Part 2.

Add Tests

Let’s set up the great testing tool HSpec in our project.

First, replace the content of test/Spec.hs file with this:

{-# OPTIONS_GHC -F -pgmF hspec-discover #-}

This will let auto-discover any spec files we have in the test directory.

Let's add the first test to the project in the test/Hashmir/DataSpec.hs file:

module Hashmir.DataSpec where

import Test.Hspec

main :: IO ()
main = hspec spec

spec :: Spec
spec = do
    describe "Hashmir Data" $ do
        it "runs a test" $ do
            True `shouldBe` True

We are not testing anything real here, we just want to make sure all the building blocks are in place.

Add the test-suite directive to package.yml file:

...

tests:
  hashmir-test:
    source-dirs: test/
    main: Spec.hs
    dependencies:
      - hashmir
      - hspec == 2.*
    other-modules:
      Hashmir.DataSpec

make build should recompile the app, and stack test will run the entire test suite.

When all is good, you should see this:

Hashmir.Data
  Hashmir Data
    runs a test

Finished in 0.0010 seconds
1 example, 0 failures

It wouldn't save much typing, but I like navigating the projects I am working on from a Makefile, I added these changes to run the tests with make test:

...
test: ## Run the specs
  @stack test

.PHONY: help test

Commit point

Verify Client Create Logic

Creating a record in the database is easy, we already verified it when we ran the app. However, making this automated and repeatable shows some challenges. We need to make sure that every test cleans after itself in the DB. We could wrap each and every spec in a transaction and just roll it back, but that would be quite complex. Dropping and rebuilding the database is fast as it is. Sure, it's a couple of hundred milliseconds, but that is negligible for now.

HSpec provides before hooks, we will hook into that.

Let's change the test/Hashmir/DataSpec.hs like this:

module Hashmir.DataSpec where

import Test.Hspec
import System.Process
import qualified Hashmir.Data as D

main :: IO ()
main = hspec spec

resetDB :: IO ()
resetDB = callCommand "make build-db"

spec :: Spec
spec = before resetDB $ do
    describe "Hashmir Data" $ do
        it "creates a Client record" $ do
            clientId <- D.insertClient "TestClient" "testclient"
            clientId `shouldBe` 1

We call resetDB with every single spec, that function makes a system call to rebuild the DB.

When you try executing the test, stack tries to recompile the app, but it presents an error:

test/Hashmir/DataSpec.hs:4:1: error:
    Failed to load interface for ‘System.Process’
    It is a member of the hidden package ‘process-1.4.3.0’.
    Perhaps you need to add ‘process’ to the build-depends in your .cabal file.

Oh-oh. We need to add the process package to our test-suite, let's modify the package.yml like this:

tests:
  hashmir-test:
    source-dirs: test/
    main: Spec.hs
    dependencies:
      - process
      - hashmir
      - hspec == 2.*
    other-modules:
      Hashmir.DataSpec

After adding the process package, regenerating the cabal file, we can now run our first test successfully:

Hashmir.Data
  Hashmir Data
Dropping and rebuilding database hashmir_test
    runs a test

Finished in 0.1378 seconds
1 example, 0 failures

The beauty of this solution is that we can run it over and over again, the test will pass as the checked clientId will always be 1, since the database is recreated every time.

Commit point

Add a User Record Along With Client

Let's add a failing spec for this first. Add the following content to the test/Hashmir/DataSpec.hs file:

    it "creates a Client and a User record" $ do
        clientId <- D.insertClient "TestClient" "testclient"
        userId <- D.insertUser clientId "joe" "joe@example.com" "password1"
        userId `shouldBe` 1

There is no insertUser function, let's add it. We also need to add the SQL template to the YeshQL code. It's very similar to the Client insert script, here are all the changes for that:

[yesh|
    -- name:countClientSQL :: (Int)
    SELECT count(id) FROM clients;
    ;;;
    -- name:insertClientSQL
    -- :client_name :: String
    -- :subdomain :: String
    INSERT INTO clients (name, subdomain) VALUES (:client_name, :subdomain);
    ;;;
    -- name:insertUserSQL
    -- :client_id :: Integer
    -- :login :: String
    -- :email :: String
    -- :password :: String
    INSERT INTO users (client_id, login, email, password)
    VALUES (:client_id, :login, :email, :password);
|]

And the insertUser function like this:

insertUser :: Integer -> String -> String -> String -> IO Integer
insertUser clientId login email password =
    withConn $ insertUserSQL clientId login email password

When I run make test, this is the output printed on the screen:

Hashmir.Data
  Hashmir Data
Dropping and rebuilding database hashmir_test
    creates a Client record
Dropping and rebuilding database hashmir_test
    creates a Client and a User record

Finished in 0.2642 seconds
2 examples, 0 failures

The lines Dropping and rebuilding database hashmir_test is too much noise, let's remove it from the Makefile.

Hashmir.Data
  Hashmir Data
    creates a Client record
    creates a Client and a User record

Finished in 0.2354 seconds
2 examples, 0 failures

This looks much cleaner.

Commit point

Roll Back Transactions When Error Occurs

The happy path of our application is working well: the User and Client records are inserted properly. First, the Client is saved, its id is used for the User record to establish the proper references. But we should treat these two inserts as one unit of work: if the second fails, it should roll back the first insert.

Let's write a test for it. I'll make the created Client's id intentionally wrong by incrementing it by one.

    it "rolls back the transaction when failure occurs" $ do
        clientId <- D.insertClient "TestClient" "testclient"
        _ <- D.insertUser (clientId + 1) "joe" "joe@example.com" "password1"
        clientCount <- D.withConn $ D.countClientSQL
        clientCount `shouldBe` Just 0

When I run the tests, this is the error I am getting:

Hashmir.Data
  Hashmir Data
    creates a Client record
    creates a Client and a User record
    rolls back the transaction when failure occurs FAILED [1]

Failures:

  test/Hashmir/DataSpec.hs:23:
  1) Hashmir.Data, Hashmir Data, rolls back the transaction when failure occurs
       uncaught exception:
           SqlError (SqlError {seState = "",
               seNativeError = 1452,
               seErrorMsg = "Cannot add or update a child row: a foreign key constraint
                             fails (`hashmir_test`.`users`, CONSTRAINT `client_id`
                             FOREIGN KEY (`client_id`) REFERENCES `clients` (`id`))"})

Randomized with seed 668337839

Finished in 0.3924 seconds
3 examples, 1 failure

The database is protecting itself from an incorrect state, a User record won't be saved with an id that does not match a record in the clients table. This exception is justified, although, it could be handled better with a Maybe type, that's not the point right now. Let's just expect this exception for now to see a proper test failure.

Change the test like this:

    it "rolls back the transaction when failure occurs" $ do
        clientId <- D.insertClient "TestClient" "testclient"
        (D.insertUser (clientId + 1)
                      "joe"
                      "joe@example.com"
                      "password1")
            `shouldThrow` anyException
        clientCount <- D.withConn $ D.countClientSQL
        clientCount `shouldBe` Just 0

The spec now produces the error I would expect:

Failures:

  test/Hashmir/DataSpec.hs:31:
  1) Hashmir.Data, Hashmir Data, rolls back the transaction when failure occurs
       expected: Just 0
        but got: Just 1

Randomized with seed 1723584293

Finished in 0.3728 seconds
3 examples, 1 failure

Finally, we have a spec that fails correctly, as we are not rolling back the created Client record.

The reason the Client record is not rolled back is that we use two different transactions to persist the records: first, the Client record is saved and the connection is committed, and then the User record is attempted to be saved. It fails, the record is not created, but the Client record has already been committed to the database. This is our problem, we should reuse the same connection for both save operations, and only commit it after the second one.

Let's refactor the code to do that. Both the insertClient and insertUser now accept a connection:

insertClient :: H.IConnection conn =>
                      String -> String -> conn -> IO Integer
insertClient name subdomain =
    insertClientSQL name subdomain

insertUser :: H.IConnection conn =>
                    Integer -> String -> String -> String -> conn -> IO Integer
insertUser clientId login email password =
    insertUserSQL clientId login email password

The specs now has to be modified to pass in the connection:

spec :: Spec
spec = before resetDB $ do
    describe "Hashmir Data" $ do
        it "creates a Client record" $ do
            clientId <- D.withConn $ D.insertClient "TestClient" "testclient"
            clientId `shouldBe` 1
        it "creates a Client and a User record" $ do
            userId <- D.withConn (\conn -> do
                clientId <- D.insertClient "TestClient" "testclient" conn
                D.insertUser clientId "joe" "joe@example.com" "password1" conn)
            userId `shouldBe` 1
        it "rolls back the transaction when failure occurs" $ do
            (D.withConn (\conn -> do
                clientId <- D.insertClient "TestClient" "testclient" conn
                D.insertUser (clientId+1) "joe" "joe@example.com" "password1" conn))
                `shouldThrow` anyException
            clientCount <- D.withConn $ D.countClientSQL
            clientCount `shouldBe` Just 0

And finally, the Main function has to be updated as well:

main :: IO ()
main = do
    clientId <- D.withConn $ D.insertClient "TestClient" "testclient"
    putStrLn $ "New client's id is " ++ show clientId
    Just clientCount <- D.withConn D.countClientSQL
    putStrLn $ "There are " ++ show clientCount ++ " records."

When you run the tests, they should all pass now.

Commit point

Summary

In this blog series we set up YeshQL, added logic to insert Client and its dependent User records, we added tests and made sure all the writes are in one transaction.

Our final solution works, but it requires the connection to be passed in. Using a Reader Monad would be a more elegant solution, but that should be a different blog post.

Wednesday, November 14, 2018

Hspec with QuickCheck

Saturday, October 13, 2018

Haskell by Day

Monday, June 18, 2018

Haskell Stack vs. Stackage - Explained

Thursday, April 19, 2018

Path Count

Sunday, January 28, 2018

Haskell to MySQL via YeshQL (Part 3.)

Add Tests

Verify Client Create Logic

Add a User Record Along With Client

Roll Back Transactions When Error Occurs

Summary

Blog Archive