Thursday, April 19, 2018

Path Count

I've bumped into this brain teaser recently:

"Given two integer numbers describing an n by m graph, where n represents the height and m represents the width, calculate the number of ways you can get from the top left to the bottom right if you can only go right and down"

That's a fine challenge, and prior to knowing recursive types in Haskell, I don’t know how I would have approached the problem.

Let's draw what the paths would look like first.

Given 1x1, the case is pretty simple:

{-
  The is what the matrix would look like:

  (0,1) - (1,1)
    |       |
  (0,0) - (1,0)
-}

The only way to get from the top left to the bottom right is to "walk" the perimeters:

{-
  (0,1) - (1,1) - (1,0)
  (0,1) - (0,0) - (1,0)
-}

This case is so easy, I don’t even bother with devising a solution. Let's look at a 1 by 2 graph:

{-
  (0-1) - (1,1) - (2,1)
    |       |       |
  (0-0) - (1,0) - (2,0)
-}

Following the rules laid out, there are 3 ways to get from the top left to the bottom right point:

{-
  (0-1) - (1,1) - (2,1) - (2,0)
  (0-1) - (1,1) - (1,0) - (2,0)
  (0-1) - (0,0) - (1,0) - (2,0)
-}

The rule "you can only go right and down" tells us something: it's a binary tree. How could I draw up a recursive tree structure for this?

I'd like to make sure the logic is correct, I put all this logic into an HSpec file. How 'bout this?

-- test/PathCountSpec.hs

import Test.Hspec

main :: IO ()
main = hspec spec

type Point = (Int, Int)
data Tree a = Leaf
            | Node a (Tree a) (Tree a)
            deriving (Show, Eq)

spec :: Spec
spec =
    describe "Path Count" $ do
        it "can calculate tree branches for 1x2 matrix" $ do
            let tree =
                    Node (0,1)
                        (Node (1,1)
                            (Node (2,1) Leaf
                                        (Node (2,0) Leaf Leaf))
                            (Node (1,0) (Node (2,0) Leaf Leaf)
                                        Leaf))
                        (Node (0,0)
                            (Node (1,0)
                                (Node (2,0) Leaf Leaf)
                                Leaf)
                            Leaf)
            {-
               Possible paths:
                (0,1) - (1,1) - (2,1) - (2,0)
                (0,1) - (1,1) - (1,0) - (2,0)
                (0,1) - (0,0) - (1,0) - (2,0)
            -}
            pending

The key here, that the number of times Node (2,0) Leaf Leaf appears is the number of different ways I can get from the top left to the bottom right. All I have to do is counting the number of times this sub-tree appears in the tree itself.

I wrote a function (that I put into the HSpec file itself) to do just that:

leafCount :: Tree Point -> Int
leafCount Leaf = 0
leafCount (Node _ Leaf Leaf) = 1
leafCount (Node _ left right) = leafCount left + leafCount right

When I call this function with the provided tree I have in the spec, I should receive 3. This assertion passes:

leafCount tree `shouldBe` 3

I manually had to build up this tree, next I looked at how I could generate it myself based on the top left and bottom right points. I had to make sure I won’t add branches outside of the matrix, what I accomplished with two guards.

buildTree :: Point -> Point -> Tree Point
buildTree (a,b) end@(c,d)
    | a > c = Leaf
    | b < 0 = Leaf
    | otherwise = Node (a, b) (buildTree (a+1,b) end) (buildTree (a,b-1) end)

This logic will keep "walking" right and down until the guards will stop it.

I added another assertion to make sure this still passes:

buildTree (0,1) (2,0) `shouldBe` tree

I wanted to make sure this logic still holds when I pivot the two numbers. This is the next assertion I added:

buildTree (0,2) (1,0) `shouldBe` tree

Tests are passing. All right, I need to set up an outer function that takes two numbers and returns the number of paths.

Here it is:

pathCount :: Int -> Int -> Int
pathCount m n = leafCount $ fromTree m n
    where fromTree m n = buildTree (0,m) (n,0)

This assertion will exercise this function:

pathCount 1 2 `shouldBe` 3

Everything is green!

I added one more test to make sure the 1x1 graph works:

pathCount 1 1 `shouldBe` 2

And finally I added a 2x2 test as well. This one has 6 different paths to get from the top left to the bottom right:

{-
    Possible paths:
    (0,2) - (1,2) - (2,2) - (2,1) - (2,0)
    (0,2) - (1,2) - (1,1) - (2,1) - (2,0)
    (0,2) - (1,2) - (1,1) - (1,0) - (2,0)
    (0,2) - (0,1) - (1,1) - (2,1) - (2,0)
    (0,2) - (0,1) - (1,1) - (1,0) - (2,0)
    (0,2) - (0,1) - (0,0) - (1,0) - (2,0)
-}
pathCount 2 2 `shouldBe` 6

And when I run the spec again, it all works! You can find the solution in this gist.

Sunday, January 28, 2018

Haskell to MySQL via YeshQL (Part 3.)

In the previous blog post we built a console app in Haskell that talks to MySQL via YeshQL. It creates a client record and counts them with a SQL query.
In the final part in this series, we will add automated tests to our application, we will save a user record along with its parent client and make sure all the saving happens in one unit of work.

This is the commit point where we left it at the end of Part 2.

Add Tests

Let’s set up the great testing tool HSpec in our project.

First, replace the content of test/Spec.hs file with this:

{-# OPTIONS_GHC -F -pgmF hspec-discover #-}

This will let auto-discover any spec files we have in the test directory.

Let's add the first test to the project in the test/Hashmir/DataSpec.hs file:

module Hashmir.DataSpec where

import Test.Hspec

main :: IO ()
main = hspec spec

spec :: Spec
spec = do
    describe "Hashmir Data" $ do
        it "runs a test" $ do
            True `shouldBe` True

We are not testing anything real here, we just want to make sure all the building blocks are in place.

Add the test-suite directive to package.yml file:

...

tests:
  hashmir-test:
    source-dirs: test/
    main: Spec.hs
    dependencies:
      - hashmir
      - hspec == 2.*
    other-modules:
      Hashmir.DataSpec

make build should recompile the app, and stack test will run the entire test suite.

When all is good, you should see this:

Hashmir.Data
  Hashmir Data
    runs a test

Finished in 0.0010 seconds
1 example, 0 failures

It wouldn't save much typing, but I like navigating the projects I am working on from a Makefile, I added these changes to run the tests with make test:

...
test: ## Run the specs
  @stack test

.PHONY: help test

Commit point

Verify Client Create Logic

Creating a record in the database is easy, we already verified it when we ran the app. However, making this automated and repeatable shows some challenges. We need to make sure that every test cleans after itself in the DB. We could wrap each and every spec in a transaction and just roll it back, but that would be quite complex. Dropping and rebuilding the database is fast as it is. Sure, it's a couple of hundred milliseconds, but that is negligible for now.

HSpec provides before hooks, we will hook into that.

Let's change the test/Hashmir/DataSpec.hs like this:

module Hashmir.DataSpec where

import Test.Hspec
import System.Process
import qualified Hashmir.Data as D

main :: IO ()
main = hspec spec

resetDB :: IO ()
resetDB = callCommand "make build-db"

spec :: Spec
spec = before resetDB $ do
    describe "Hashmir Data" $ do
        it "creates a Client record" $ do
            clientId <- D.insertClient "TestClient" "testclient"
            clientId `shouldBe` 1

We call resetDB with every single spec, that function makes a system call to rebuild the DB.

When you try executing the test, stack tries to recompile the app, but it presents an error:

test/Hashmir/DataSpec.hs:4:1: error:
    Failed to load interface for ‘System.Process’
    It is a member of the hidden package ‘process-1.4.3.0’.
    Perhaps you need to add ‘process’ to the build-depends in your .cabal file.

Oh-oh. We need to add the process package to our test-suite, let's modify the package.yml like this:

tests:
  hashmir-test:
    source-dirs: test/
    main: Spec.hs
    dependencies:
      - process
      - hashmir
      - hspec == 2.*
    other-modules:
      Hashmir.DataSpec

After adding the process package, regenerating the cabal file, we can now run our first test successfully:

Hashmir.Data
  Hashmir Data
Dropping and rebuilding database hashmir_test
    runs a test

Finished in 0.1378 seconds
1 example, 0 failures

The beauty of this solution is that we can run it over and over again, the test will pass as the checked clientId will always be 1, since the database is recreated every time.

Commit point

Add a User Record Along With Client

Let's add a failing spec for this first. Add the following content to the test/Hashmir/DataSpec.hs file:

    it "creates a Client and a User record" $ do
        clientId <- D.insertClient "TestClient" "testclient"
        userId <- D.insertUser clientId "joe" "joe@example.com" "password1"
        userId `shouldBe` 1

There is no insertUser function, let's add it. We also need to add the SQL template to the YeshQL code. It's very similar to the Client insert script, here are all the changes for that:

[yesh|
    -- name:countClientSQL :: (Int)
    SELECT count(id) FROM clients;
    ;;;
    -- name:insertClientSQL
    -- :client_name :: String
    -- :subdomain :: String
    INSERT INTO clients (name, subdomain) VALUES (:client_name, :subdomain);
    ;;;
    -- name:insertUserSQL
    -- :client_id :: Integer
    -- :login :: String
    -- :email :: String
    -- :password :: String
    INSERT INTO users (client_id, login, email, password)
    VALUES (:client_id, :login, :email, :password);
|]

And the insertUser function like this:

insertUser :: Integer -> String -> String -> String -> IO Integer
insertUser clientId login email password =
    withConn $ insertUserSQL clientId login email password

When I run make test, this is the output printed on the screen:

Hashmir.Data
  Hashmir Data
Dropping and rebuilding database hashmir_test
    creates a Client record
Dropping and rebuilding database hashmir_test
    creates a Client and a User record

Finished in 0.2642 seconds
2 examples, 0 failures

The lines Dropping and rebuilding database hashmir_test is too much noise, let's remove it from the Makefile.

Hashmir.Data
  Hashmir Data
    creates a Client record
    creates a Client and a User record

Finished in 0.2354 seconds
2 examples, 0 failures

This looks much cleaner.

Commit point

Roll Back Transactions When Error Occurs

The happy path of our application is working well: the User and Client records are inserted properly. First, the Client is saved, its id is used for the User record to establish the proper references. But we should treat these two inserts as one unit of work: if the second fails, it should roll back the first insert.

Let's write a test for it. I'll make the created Client's id intentionally wrong by incrementing it by one.

    it "rolls back the transaction when failure occurs" $ do
        clientId <- D.insertClient "TestClient" "testclient"
        _ <- D.insertUser (clientId + 1) "joe" "joe@example.com" "password1"
        clientCount <- D.withConn $ D.countClientSQL
        clientCount `shouldBe` Just 0

When I run the tests, this is the error I am getting:

Hashmir.Data
  Hashmir Data
    creates a Client record
    creates a Client and a User record
    rolls back the transaction when failure occurs FAILED [1]

Failures:

  test/Hashmir/DataSpec.hs:23:
  1) Hashmir.Data, Hashmir Data, rolls back the transaction when failure occurs
       uncaught exception:
           SqlError (SqlError {seState = "",
               seNativeError = 1452,
               seErrorMsg = "Cannot add or update a child row: a foreign key constraint
                             fails (`hashmir_test`.`users`, CONSTRAINT `client_id`
                             FOREIGN KEY (`client_id`) REFERENCES `clients` (`id`))"})

Randomized with seed 668337839

Finished in 0.3924 seconds
3 examples, 1 failure

The database is protecting itself from an incorrect state, a User record won't be saved with an id that does not match a record in the clients table. This exception is justified, although, it could be handled better with a Maybe type, that's not the point right now. Let's just expect this exception for now to see a proper test failure.

Change the test like this:

    it "rolls back the transaction when failure occurs" $ do
        clientId <- D.insertClient "TestClient" "testclient"
        (D.insertUser (clientId + 1)
                      "joe"
                      "joe@example.com"
                      "password1")
            `shouldThrow` anyException
        clientCount <- D.withConn $ D.countClientSQL
        clientCount `shouldBe` Just 0

The spec now produces the error I would expect:

Failures:

  test/Hashmir/DataSpec.hs:31:
  1) Hashmir.Data, Hashmir Data, rolls back the transaction when failure occurs
       expected: Just 0
        but got: Just 1

Randomized with seed 1723584293

Finished in 0.3728 seconds
3 examples, 1 failure

Finally, we have a spec that fails correctly, as we are not rolling back the created Client record.

The reason the Client record is not rolled back is that we use two different transactions to persist the records: first, the Client record is saved and the connection is committed, and then the User record is attempted to be saved. It fails, the record is not created, but the Client record has already been committed to the database. This is our problem, we should reuse the same connection for both save operations, and only commit it after the second one.

Let's refactor the code to do that. Both the insertClient and insertUser now accept a connection:

insertClient :: H.IConnection conn =>
                      String -> String -> conn -> IO Integer
insertClient name subdomain =
    insertClientSQL name subdomain

insertUser :: H.IConnection conn =>
                    Integer -> String -> String -> String -> conn -> IO Integer
insertUser clientId login email password =
    insertUserSQL clientId login email password

The specs now has to be modified to pass in the connection:

spec :: Spec
spec = before resetDB $ do
    describe "Hashmir Data" $ do
        it "creates a Client record" $ do
            clientId <- D.withConn $ D.insertClient "TestClient" "testclient"
            clientId `shouldBe` 1
        it "creates a Client and a User record" $ do
            userId <- D.withConn (\conn -> do
                clientId <- D.insertClient "TestClient" "testclient" conn
                D.insertUser clientId "joe" "joe@example.com" "password1" conn)
            userId `shouldBe` 1
        it "rolls back the transaction when failure occurs" $ do
            (D.withConn (\conn -> do
                clientId <- D.insertClient "TestClient" "testclient" conn
                D.insertUser (clientId+1) "joe" "joe@example.com" "password1" conn))
                `shouldThrow` anyException
            clientCount <- D.withConn $ D.countClientSQL
            clientCount `shouldBe` Just 0

And finally, the Main function has to be updated as well:

main :: IO ()
main = do
    clientId <- D.withConn $ D.insertClient "TestClient" "testclient"
    putStrLn $ "New client's id is " ++ show clientId
    Just clientCount <- D.withConn D.countClientSQL
    putStrLn $ "There are " ++ show clientCount ++ " records."

When you run the tests, they should all pass now.

Commit point

Summary

In this blog series we set up YeshQL, added logic to insert Client and its dependent User records, we added tests and made sure all the writes are in one transaction.

Our final solution works, but it requires the connection to be passed in. Using a Reader Monad would be a more elegant solution, but that should be a different blog post.