Sunday, May 15, 2016

Currying in Haskell, Clojure, Ruby and JavaScript

I worked with a developer about a year ago, who had more experience with functional programming than I had. We worked on a Clojure project, and his deep Haskell background made him an expert on our team. This was especially revealing when we discussed partial function applications and currying. I was vaguely familiar with the concept, but I've never used them in any of the apps I've worked on.

Fast forward a year, after learning and playing with Haskell for a few months, I understand why: in Haskell, everything is curried. I repeat: everything. Even the function invocation is curried. In fact, you have to work hard if you want it differently. No wonder, currying was so obvious for that developer.

haskell-logo

Let's look at a simple example in Haskell:

-- You can type this in your own GHCi or try it in https://ghc.io/
let multiply x y = x * y
let double = multiply 2
let triple = multiply 3

double 3
triple 4

I created a multiply function that takes two arguments and multiplies them. In Haskell everything is curried, it's perfectly fine to invoke this function with only a single argument. What will I get back? Another function. This was the breakthrough for me: partially applying a function yields another function. Then I defined two other functions, one that passes in 2 to double, the other 3 to triple whatever argument I pass to it.

What I was amazed by this was the easiness and the natural nature of Haskell's currying through partial application.

Let's see how this simple example would look in Clojure.

(defn multiply [x y] (* x y))
(def double (partial multiply 2))
(def triple (partial multiply 3))

(double 3) ;; will yield 6
(triple 4) ;; will produce 12

This works, but yuck, I had to use a special language construct partial to signal, that I'll be partially applying the multiply function. Based on the Haskell example, my intuition was to use defn for the double and triple functions, but that tripped me over, it did not work. I had to "StackOverflow" it to realize, that the def binding is needed instead of defn to produce the partially applied function. This is a far cry from Haskell, where everything felt natural.

Although Ruby is a dynamically typed object-oriented language, it has many functional constructs that I enjoy using. I was curious if Ruby supports currying. To my surprise, it does. Look at the same example with partial functions and currying in Ruby.

multiply = -> (x, y) { x * y }
double = multiply.curry.(2)
triple = multiply.curry.(3)

double.(3) # will yield 6
triple.(4) # will produce 12
Well, this works, but it's far from Haskell's obvious nature, where I did not have to use any kind of special keywords to achieve the same result.

Here is how I would write this with "programming by wishful thinking":

# This won't work
multiply = -> (x, y) { x * y }
double = multiply(2)
triple = multiply(3)

I am sure the Ruby language authors had a reason to use curry for partial applications, but it just did not feel natural. I have to learn and remember how to use it properly.

There are currying related npm packages in Node.js, but I have not found anything that's built into the language. Here is how the poor man's currying is done in JavaScript:

var multiply = function(x) {
  return function(y) {
    return x * y;
  }
}

var double = multiply(2);
var triple = multiply(3);

double(3); // will yield 6
triple(4); // will produce 12
I like JavaScript's "functions are first class citizens" nature, I am sure once ES6 or 7 gets widely adopted, it will be a language I'll enjoy using in the future.

Learning about currying in one language and using those concepts in another is an obvious benefit of learning a programming language in every year.

Sunday, March 27, 2016

Why Make?

A new member joined our team a couple of weeks ago, and as we took him out for a cup of coffee on his first week, he asked me a question: "I’ve never seen this before and I wanted to ask you. Why are you using Make in your Ruby project?"

His question was legit coming from someone in the Ruby world, where we have rake for achieving the same goal. I had to think about the history there as I explained my reasoning behind using it.

As I started looking at other programming languages, rake wasn’t available for me. A couple of years ago I got pretty deep into node.js, and there were repetitive tasks I had to do, like dropping and rebuilding a database, running the tests, etc. I created a script directory and put separate scripts in it to accomplish all that.

For example, for the two tasks I mentioned above, I created two scripts:

project_root
   |- scripts
        |- build_db.sh
        |- run_tests.sh

This worked OK for a while, but when I contributed to the great testing framework mocha.js, I realized that the author of that module, T.J., just took these convenience scripts to another level by using Make.

I noticed he is using Make to run simple shell commands in a more elegant manner than I did with my scripts directory and shell scripts in it. I immediately started to adopt this practice.

As I started exploring other languages like Clojure, Erlang, Haskell, using Make was an obvious choice. It did not matter what language I used, dropping and building a database was the same task, regardless.

This practice came with me as I went back to Ruby as well. As I started working on larger, 3-4-year-old Rails apps, running rake tasks was a time-consuming exercise. Bundler had to load the whole world into memory before it could even evaluate what it had to do. This way, simple tasks that had nothing to do with Rails had an 8-10 seconds startup time. I did not think twice about firing up a Makefile to do the same in less than a second.

Of course, some of the religious Rails disciples dismissed this, but productivity over religion has a higher precedence for me.

A few weeks ago I found a blog post on how to add task description to Make targets. I updated my Makefiles, and now, when I run make in the terminal, this is what I see:

As I have not found a good make target generator, I created a gist to get me rolling. Documentation and a sample target are a good way to get started. I even added a shell function to grab it for me:

make-init() {
  curl https://gist.githubusercontent.com/adomokos/2fd95840d59b19bbb3f4/raw/7b548cd3fda0dab958ecb0e0955fbadc1af6ef6e/Makefile > Makefile
}

Now, I only need to type make-init in the terminal and I have a Makefile to work with.

Friday, February 19, 2016

Teaching Clojure to a 7-Year-Old

I see my 7-year-old son as someone, who is deeply interested in anything computer related. I recall how he went through photos on my iPod Touch when he was only two years old. Then video games kicked in, and now he can make an 8-hour flight back to Europe without taking a break from a game on an iPad.

This is all nice and cool, but why don't we do something more useful with this passion?! He has been curious about programming, he practiced basic function calls via the game Code Combat. I figured, let's take this to the next level and try programming.

I wanted to use a language that is easy to understand, but can be very powerful. I considered Python or JavaScript, but I figured his first real programming language should be a functional one, and the simple nature of LiSP made Clojure the obvious choice.

I wanted to find an editor that's easy to use. I was aware of LightTable, but I've never tried it. We downloaded it on my wife's 11" MacBook Air and we jumped in.

What's really cool about LightTable is that you don't need to run a separate REPL, you could just write your Clojure expressions in a clj file, save it and by hitting <Cmd> + <Shift> + <Enter> the expressions are evaluated in line, right next to them. It's really the best tool for beginners.

We worked on a rectangular area calculator in our first session, since that's what he's been learning at school. This was our first expression:

(defn area [x y] (* x y))

(area 2 3) # 6
We tried different numbers, he was pumped when the correct number was printed in LightTable after the evaluation.

Our area function is for rectangles, but what if we wanted to calculate the square's area? We only had to pass one number to our calculator.
Of course we could have done this:

(area 4 4) # 16
But this was not very elegant. I proposed creating a new function for square-area and calling area from that this way:
(defn square-area [x]
  (area x x))

(square-area [4]) # 16
My goal here was to show him how one function can leverage the functionality of another function. He liked this a lot.

In our next session - as he constantly nudged me to continue his journey in Clojure programming land - I wanted to teach him something we could build upon: we learned about vectors, which is similar to Arrays in other programming languages.

We started out with listing his best friends:

(def friends [:andy :james :tommy :ethan :elliot])
I explained to him that these names are in order and they will always remain in order the way he defined it first. There are ways to explore the collection, for example pulling the first item from it. This is what we tried:
(first friends) # :andy
I asked him if he could get the last item of the vector. He thought about a bit and this is what he came up with:
(last friends) # :elliot
Excellent! Then I asked him how we could get the third item from the collection and he intuitively tried the function third, which does not exist. I showed him the nth function to do that. This is what he tried to get the third item:
(nth friends 3) # :ethan
But oh, it returned the fourth and not the third item. So we talked about the 0-based index, which he grasped, but did not make much sense to him.

I told him: "Imagine how great it would be to sort these names in alphabetical order. Do you know what verb would describe that operation?" He said "sort", so we gave it a try:

(sort friends) # (:andy :elliot :ethan :james :tommy)
We both smiled at how easy it was. Then I asked him if we could put the names in descending order. We were looking for the right word, and he came up with backwards. Well, it's close, so I asked him to look up synonyms for that word in Google. We both settled on reverse. He tried this:
(reverse friends) # (:elliot :ethan :tommy :james :andy)
Oh-oh. This only put the original list into reverse order without sorting. It was obvious that we needed to sort it first and then reverse it. I helped him to write this:
(reverse (sort friends)) # (:tommny :james :ethan :elliot :andy)

We wrapped up our session by creating a vector with numbers.

(def numbers [9 12 5 7 1])
Based on what he learned earlier, he put them numbers in descending order:
(reverse (sort numbers)) # (12 9 7 5 1)
I showed him how we could use the filter and the odd? functions to filter out the odd numbers.
(filter odd? numbers) # (9 5 7 1)
He was able to sort the odd numbers by using the sort function:
(sort (filter odd? numbers)) # (1 5 7 9)
Picking the largest odd number was a tricky one, we had to look at the docs, as this did not work:
(max (filter odd? numbers)) # (1 5 7 9)
But this did:
(apply max (filter odd? numbers)) # 9
There was an easier way to get the largest odd number: reverse sorting the odd numbers and getting the first item. This is what we wrote:
(first (reverse (sort (filter odd? numbers)))) # 9
We both smiled when we saw the result.

This is the full code we wrote together:

We are going to spend half of our time going through the same function composition the next time. Then we might look at other collection types like maps and sets. I don't want to push him, but if he asks for it, we'll go as far as we can.

Tuesday, January 26, 2016

Vitamin, Medicine, Drug

I worked with a good friend of mine on a product idea a few months ago. The number of disengaged employees at large enterprises is staggering, we both have witnessed this during our professional career. We tried measuring employee engagement and happiness by frequent, short surveys, providing a real-time engagement thermometer to management.

I talked with two investors about our idea and one of them told me this: "There are two types of products, vitamins and medicines. While vitamins are good to have for a company, it's not absolutely essential. The company can survive and even thrive without it. However, a medicine is a must have, companies can't live without it. I tend to invest in medicine-like product ideas, and I am sorry, yours is a vitamin. It can get big, but selling the idea will be hard."

The Vitamin

A vitamin product might be important for management, but the perceived value for the employee is unclear. Unless we were able to provide some kind of value for the person who fills out our survey, the employee would never be engaged.

I once had to track my hours on projects at a large enterprise just to provide data for the army of project managers to calculate actual project cost. Although this had value to the employer, it had very little value to me or my peers. We were constantly nudged by management to log the hours by the end of each week.
Now, if the company is in consulting and the employee won't get paid unless she provides the hourly breakdown of billable hours, it's a different story. The employee has vested interest in providing the data, otherwise, she will never get paid.

Selling the Vitamin can take an army of sales people for cold calling prospects. The referral rate is low, users are not very engaged.

The Medicine

The medicine product has real benefits for both the employee and the employer.
I have witnessed SalesForce shifting from vitamin to medicine category before. Initially, it wasn't taken seriously by the sales teams, but as soon as it was leveraged for financial reporting, it became essential for the company.
Basecamp is another good example for medicine, which is adopted by the enterprise (mostly) through employee demand. I've read about people using Basecamp for their freelance projects, and when they join larger companies, they suggest this tool.
Github is so good, it's pushing the boundaries of medicine. I have worked with many software engineers, however, I have never met a single sales person from Github trying to talk us into submitting our credit card and signing up for private repositories.

Medicine is easy to sell, users are recommending it to other potential customers. Companies with a medicine-like software have a smaller sales team. The reputation of the product is selling itself.

The Drug

There is a third category this investor did not mention to me, but it exists out there. People are so hooked, they get angry when they don't have access to it. It's Facebook. The company did an exercise just recently to train for the battle against Google, they made Facebook inaccessible for Android users to investigate what they would do. Users tried to restart the app on their mobile device a couple of times. When that did not work, they opened their browsers and logged on through that. No matter what, they did not want to miss anything that was happening with their friends.

Finding the drug is super hard. But medicine-like products can be invented, and vitamins can transition into medicine.

When you're searching for a new gig, or you are thinking about your new idea, skip the vitamins, and start out with the medicine.

Sunday, November 1, 2015

Clojure API with Yesql, Migrations and More (Part 3.)

We created a database with scripts, added migrations and communicated with the database with the help of yesql in the previous posts. Please look at those first to get up to speed with this part.

In the final part of the series, we will serialize the data we pull from the database to JSON, and we will expose that data through an HTTP endpoint. We will also add logging to monitor the JDBC communication with the database.

It was about two years ago when I attended a conference and I sat down with a couple of friends one night for a chat. It was late in the evening, after a couple of beers they asked me what I was up to. I told them I am learning Clojure. They wanted to see it in action, we solved FizzBuzz together. They liked it, but one question was lingering there: "can you build a web app with Clojure?". Of course!
We started out as a console application, but the requirements have changed, we need to expose the data via an HTTP interface through JSON. I like to look at web frameworks as a "delivery mechanism", progressing the idea this way follows that.

Use this commit as a starting for this blog post. Rebuild the database by running make build-db.

Serializing the Data as JSON

We will use the cheshire library to serialize the data to JSON. Let's modify the "project.clj" file this way, see my changes highlighted:

...
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [org.postgresql/postgresql "9.4-1201-jdbc41"]
                 [yesql "0.5.1"]
                 [cheshire "5.5.0"]]
...
The serialization should be taken care of by some kind of logic component. Let's write the test for this, place this content into your "test/kashmir/logic_test.clj" file:
(ns kashmir.logic-test
  (:require [clojure.test :refer :all]
            [kashmir.logic :refer :all]
            [cheshire.core :as json]))

(deftest find-member-by-id-test
  (testing "returns a JSON serialized member record"
      (let [member (first (json/parse-string (find-member 2) true))]
        (is (= "Paul" (:first_name member))))))
Let's add the function skeleton to see test errors and not Java failures. Put this in the "src/kashmir/logic.clj" file:
(ns kashmir.logic)

(defn find-member [id] nil)
Rebuild the database with the make build-db command. Running lein test should provide an output similar to this:
% lein test

lein test kashmir.data-test

lein test kashmir.logic-test

lein test :only kashmir.logic-test/find-member-by-id-test

FAIL in (find-member-by-id-test) (logic_test.clj:9)
returns a JSON serialized member record
expected: (= "Paul" (:first_name member))
  actual: (not (= "Paul" nil))

Ran 4 tests containing 4 assertions.
1 failures, 0 errors.
Tests failed.
Cheshire uses two main functions, generate-string to serialize and parse-string to deserialize data. We will have to serialize the data, please modify the "src/kashmir/logic.clj" file this way:
(ns kashmir.logic
  (:require [kashmir.data :as data]
            [cheshire.core :as json]))

(defn find-member [id]
  (json/generate-string (data/find-member id)))
Run your tests again, all 4 should pass now.
As you think about, the logic namespace is responsible for making sure the data component returned data, handling exceptions and validating user input. This is the part of the app I'd test the most.
(Commit point.)

Exposing the Data with Compojure

Compojure is our go-to tool when it comes to building a web interface without much ceremony. Let's add it to our "project.clj" file:

(defproject kashmir "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [org.postgresql/postgresql "9.4-1201-jdbc41"]
                 [yesql "0.5.1"]
                 [compojure "1.4.0"]
                 [ring/ring-defaults "0.1.5"]
                 [cheshire "5.5.0"]]
  :clj-sql-up {:database "jdbc:postgresql://kashmir_user:password@localhost:5432/kashmir"
               :deps [[org.postgresql/postgresql "9.4-1201-jdbc41"]]}
  :ring {:handler kashmir.handler/app}
  :plugins  [[clj-sql-up "0.3.7"]
             [lein-ring "0.9.7"]]
  :main ^:skip-aot kashmir.core
  :target-path "target/%s"
  :profiles {:uberjar {:aot :all}
             :dev {:dependencies [[javax.servlet/servlet-api "2.5"]
                                  [ring-mock "0.1.5"]]}})
We also need to add a "src/kashmir/handle.clj" file, that will handle the different web requests:
(ns kashmir.handler
  (:require [compojure.core :refer :all]
            [compojure.route :as route]
            [ring.middleware.defaults :refer [wrap-defaults api-defaults]]
            [kashmir.logic :as logic]))

(defroutes api-routes
    (GET "/" [] "Hello World")
    (GET "/members/:id{[0-9]+}" [id]
         {:status 200
          :headers {"Content-Type" "application/json; charset=utf-8"}
          :body (logic/find-member (read-string id))})
    (route/not-found "Not Found"))

(def app
    (wrap-defaults api-routes api-defaults))
Fire up the server with the lein ring server-headless command. Open up a new terminal window, and request the member with ID 2 using the curl command: curl -i http://localhost:3000/members/2. You should see something like this:
% curl -i http://localhost:3000/members/2
HTTP/1.1 200 OK
Date: Thu, 15 Oct 2015 17:31:44 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 123
Server: Jetty(7.6.13.v20130916)

[{"id":2,"first_name":"Paul","last_name":"McCartney",
  "email":"pmccartney@beatles.com","created_at":"2015-10-15T16:50:03Z"}]%
The -i switch for curl will print out both the header and the body of the response.
(Commit point.)

Using Ring Response

The way we are generating the response is too verbose, we are explicitly setting the status, the headers and the body. There are ring helpers we can take advantage of, making this a lot shorter.
Change the "src/kashmir/handler.clj" file content to this (highlighted rows will designate changes):

(ns kashmir.handler
  (:require [compojure.core :refer :all]
            [compojure.route :as route]
            [ring.middleware.defaults :refer [wrap-defaults api-defaults]]
            [ring.util.response :as rr]
            [kashmir.logic :as logic]))

(defroutes api-routes
    (GET "/" [] "Hello World")
    (GET "/members/:id{[0-9]+}" [id]
         (rr/response (logic/find-member (read-string id))))
    (route/not-found "Not Found"))

(def app
    (wrap-defaults api-routes api-defaults))
Fire up the server, run the curl request, everything should still work the same.
(Commit point.)

Stubbing out Data Access in Logic Tests

Hitting the database for the logic function is feasible, but it won't buy you all that much. You can stub out your database call with Clojure's with-redefs function. You need to define a function that returns the value the data access function would return.

Modify the "test/kashmir/logic_test.clj" file this way:
(ns kashmir.logic-test
  (:require [clojure.test :refer :all]
            [kashmir.logic :refer :all]
            [kashmir.data :as data]
            [cheshire.core :as json]))

(deftest find-member-by-id-test
  (testing "returns a JSON serialized member record"
    (with-redefs [data/find-member (fn [id] [{:first_name "Paul"}])]
      (let [member (first (json/parse-string (find-member 2) true))]
        (is (= "Paul" (:first_name member)))))))

Now, stop your Postgres database server and run this test, it should pass as it's not hitting the database, it purely tests the hash serialization.
(Commit point.)

Adding JDBC Logging

Our solution works well as it is, however, we don't see what kind of SQL statements are executed against the database. Turning on logging in Postgres is one option, but monitoring JDBC within our application is prefereable. We will use the log4jdbc library to log jdbc activities. This library is using the Simple Logging Facade For Java library, you need to add that jar file to the project.

Download the slf4j jar file and add it to the project's lib directory. Then modify the "project.clj" file this way:

                  [yesql "0.5.1"]
                  [compojure "1.4.0"]
                  [ring/ring-defaults "0.1.5"]
                  [cheshire "5.5.0"]]
                  [cheshire "5.5.0"]
                  [com.googlecode.log4jdbc/log4jdbc "1.2"]]
   :clj-sql-up {:database "jdbc:postgresql://kashmir_user:password@localhost:5432/kashmir"
                :deps [[org.postgresql/postgresql "9.4-1201-jdbc41"]]}
   :ring {:handler kashmir.handler/app}
   :resource-paths ["lib/slf4j-simple-1.7.12.jar"]
   :plugins  [[clj-sql-up "0.3.7"]
              [lein-ring "0.9.7"]]
   :main ^:skip-aot kashmir.core
You need to configure slf4j, you can do that by adding this content to the "resources/log4j.properties" file:
# the appender used for the JDBC API layer call logging above, sql only
log4j.appender.sql=org.apache.log4j.ConsoleAppender
log4j.appender.sql.Target=System.out
log4j.appender.sql.layout=org.apache.log4j.PatternLayout
log4j.appender.sql.layout.ConversionPattern= \u001b[0;31m (SQL)\u001b[m %d{yyyy-MM-dd HH:mm:ss.SSS} \u001b[0;32m %m \u001b[m %n

# ==============================================================================
# JDBC API layer call logging :
# INFO shows logging, DEBUG also shows where in code the jdbc calls were made,
# setting DEBUG to true might cause minor slow-down in some environments.
# If you experience too much slowness, use INFO instead.

log4jdbc.drivers=org.postgresql.Driver

# Log all JDBC calls except for ResultSet calls
log4j.logger.jdbc.audit=FATAL,sql
log4j.additivity.jdbc.audit=false

# Log only JDBC calls to ResultSet objects
log4j.logger.jdbc.resultset=FATAL,sql
log4j.additivity.jdbc.resultset=false

# Log only the SQL that is executed.
log4j.logger.jdbc.sqlonly=FATAL,sql
log4j.additivity.jdbc.sqlonly=false

# Log timing information about the SQL that is executed.
log4j.logger.jdbc.sqltiming=FATAL,sql
log4j.additivity.jdbc.sqltiming=false

# Log connection open/close events and connection number dump
log4j.logger.jdbc.connection=FATAL,sql
log4j.additivity.jdbc.connection=false
Finally, you need to modify the "src/kashmir/data.clj" file to use the logger Postgres connection:
   (:require [yesql.core :refer [defqueries]]
             [clojure.java.jdbc :as jdbc]))
 
 (def db-spec {:classname "net.sf.log4jdbc.DriverSpy"
               :subprotocol "log4jdbc:postgresql"
               :subname "//localhost:5432/kashmir"
               :user "kashmir_user"
               :password "password1"})
Now when you run the tests or hit the HTTP endpoint with cURL, you should see the JDBC logs in the terminal:
lein test kashmir.data-test
[main] INFO jdbc.connection - 1. Connection opened
[main] INFO jdbc.audit - 1. Connection.new Connection returned
[main] INFO jdbc.audit - 1. PreparedStatement.new PreparedStatement returned
[main] INFO jdbc.audit - 1. Connection.prepareStatement(SELECT *
FROM members
WHERE id = ?) returned net.sf.log4jdbc.PreparedStatementSpy@51dbed72
[main] INFO jdbc.audit - 1. PreparedStatement.setObject(1, 2) returned
[main] INFO jdbc.sqlonly - SELECT * FROM members WHERE id = 2
...
(Commit point.)

As you can see, the log can be verbose. The easiest way I found to turn off logging is changing the log4jdbc:postgresql subprotocol back to the original value: postgresql.
(Commit point.)

This last step concludes the series. We set up a database build process, added migrations and seed data to it. We separated SQL from Clojure by using the yesql library. We added testing with mocking to make sure our code is working properly. We exposed the data as JSON through an HTTP endpoint and we added JDBC logging to the project to monitor the communication with the database.

I hope you will find this exercise helpful. Good luck building your database backed Clojure solution!

Saturday, October 31, 2015

Clojure API with Yesql, Migrations and More (Part 2.)

In the previous article, we started working on kashmir, a Clojure project that interacts with a database, and exposes the data through a JSON HTTP endpoint.
In this post we'll seed the database with some test data, add yesql as our DB communication tool, at the end we will cover testing.

Adding Seed Data

Use this commit as your starting point for this exercise. Rebuild your database by running make build-db to make sure you have no records in the tables. Create a new file in resources/seeds.sql and add the following content to it:

INSERT INTO bands(name) VALUES ('The Beatles');
INSERT INTO bands(name) VALUES ('The Doors');

INSERT INTO members(first_name, last_name, email)
VALUES ('John', 'Lennon', 'jlennon@beatles.com');
INSERT INTO members(first_name, last_name, email)
VALUES ('Paul', 'McCartney', 'pmccartney@beatles.com');
INSERT INTO members(first_name, last_name, email)
VALUES ('George', 'Harrison', 'gharrison@beatles.com');
INSERT INTO members(first_name, last_name, email)
VALUES ('Ringo', 'Starr', 'rstarr@beatles.com');

INSERT INTO bands_members(band_id, member_id) VALUES(1, 1);
INSERT INTO bands_members(band_id, member_id) VALUES(1, 2);
INSERT INTO bands_members(band_id, member_id) VALUES(1, 3);
INSERT INTO bands_members(band_id, member_id) VALUES(1, 4);
This will create 2 band and 4 member records. It will also associate the members of The Beatles with their band record. We will insert these records through Postgres' command line tool. Let's add this to our build-db target in our Makefile:
  ...
build-db:
 dropdb --if-exists --username $(USER) $(DBNAME) -h $(HOST) -p $(PORT)
 createdb --username $(USER) $(DBNAME) -h $(HOST) -p $(PORT)
 lein clj-sql-up migrate
 psql -U $(USER) -d $(DBNAME) --file resources/seeds.sql > /dev/null
We added < /dev/null to this line, we are not interested seeing how many records got inserted into the tables. When you run make build-db you should have the seed data inserted into your database.
(Commit point.)

Talking to the Database with Yesql

The natural way to communicate with a database in Clojure is using java.jdbc. However, the spaghetti SQL is hard to understand, and mixing of Clojure code with SQL could make it a mess very quickly. I found the fantastic tool yesql a few weeks ago and it was just what I needed: an easy way to separate SQL from Clojure. Let's add yesql and the Postgres jdbc driver to the project by modifying the project.clj file this way:

(defproject kashmir "0.1.0-SNAPSHOT"
  ...
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [org.postgresql/postgresql "9.4-1201-jdbc41"]
                 [yesql "0.5.1"]]
  ...)
Create a new directory called "sql" under "src/kashmir". Create a new SQL file in this directory called "data.sql". Add these two queries to it:
-- name: find-member-by-id
-- Find the member with the given ID(s).
SELECT *
FROM members
WHERE id = :id

-- name: count-members
-- Counts the number of members
SELECT count(*) AS count
FROM members
The line in this SQL file that begins with -- name: has special significance. Yesql will create data access functions with the name you define there.
Add a new Clojure file under "src/kashmir" called "data.clj", this file will hold the data access functions. Add the following code to it:
(ns kashmir.data
  (:require [yesql.core :refer [defqueries]]
            [clojure.java.jdbc :as jdbc]))

(def db-spec {:classname "org.postgresql.Driver"
              :subprotocol "postgresql"
              :subname "//localhost:5432/kashmir"
              :user "kashmir_user"
              :password "password1"})

(defqueries "kashmir/sql/data.sql"
            {:connection db-spec})
I am a bit unhappy with duplicating the Postgres connection information here, I'll leave you to set up the DB connection in the project.clj file.
Fire up the REPL to see if this works (you can find my input highlighted below):
% lein repl
nREPL server started on port 55527 on host 127.0.0.1 - nrepl://127.0.0.1:55527
REPL-y 0.3.7, nREPL 0.2.10
Clojure 1.7.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
    Exit: Control+D or (exit) or (quit)
 Results: Stored in vars *1, *2, *3, an exception in *e

kashmir.core=> (require '[kashmir.data :refer :all])
nil
kashmir.core=> (count-members)
({:count 4})
kashmir.core=> (find-member-by-id {:id 2})
({:id 2, :first_name "Paul", :last_name "McCartney", :email "pmccartney@beatles.com", :created_at #inst "2015-10-14T19:59:48.905474000-00:00"})
kashmir.core=>
Fantastic! We can talk to the database through yesql based on the SQL scripts defined in "src/kashmir/sql/data.sql" file.
(Commit point.)

Adding Tests

Although our application does not have much logic just yet, I'd like to show you how you could start writing automated tests. Create a new test file under "test/kashmir/data_test.clj". Add the following code to it:

(ns kashmir.data-test
  (:require [clojure.test :refer :all]
            [kashmir.data :refer :all]))

(deftest count-members-test
  (testing "there are 4 members"
    (is (= 4 (-> (count-members) first :count)))))
Remove the failing test in "test/kashmir/core_test.clj" file:
(ns kashmir.core-test
  (:require [clojure.test :refer :all]
            [kashmir.core :refer :all]))
Run the tests by invoking lein test and you should see the one and only test passing:
% lein test

lein test kashmir.data-test

Ran 1 tests containing 1 assertions.
0 failures, 0 errors.
(Commit point.)

Finding a Member

Yesql needs a hash even when a record is looked up by an ID. This is how you invoke the yesql generated function: (find-member-by-id {:id 2}). We should keep the data access interface unaware of this implementation detail. Let's find a member by an ID this way: (find-member 2). Write the test for this in test/kashmir/data_test.clj:

...

(deftest find-member-by-id-test
  (testing "finds PM with id 2"
    (is (= "Paul" (-> (find-member 2) first :first_name)))))
This is the code implementation of it in "src/kashmir/data.clj":
...

(defn find-member [id]
  (find-member-by-id {:id id}))
Both of the tests should pass now.
(Commit point.)

Adding a Member

Reading data with yesql is simple, but adding records and testing that over and over can be more challenging. The database has to be reset to its original state after each test run. You have two options here:

  • truncate all the tables after each test,
  • roll back the INSERT transactions.
The way you can truncate all the tables is a blog post in itself. Unfortunately the Clojure community has not created a DatabaseCleaner project we love so much in the Ruby world just yet. Let's use the roll-back feature of the INSERT transaction for our tests in the examples.

When you create a member, you need to associate that member with a band. In fact, a member can not be added to the database without a band. A hash with all the member data and the band name will be the arguments to this create function.
Let's write the test first in the "test/kashmir/data_test.clj" file:

...
(deftest create-member-test
  (testing "adds a member to the DB"
    (let [member {:first_name "Jim" :last_name "Morrison" :email "jmorrison@doors.com"}]
      (is (= 1 (create-member! member "The Doors"))))))
Let's write the simplest code that could possibly work. First, we need to add the INSERT SQL statements to "src/kashmir/data.sql". This is what those look like:
...

-- name: find-band-by-name
-- Finds a band record based on the provided name
SELECT *
FROM bands
WHERE name = :name

-- name: create-member-raw!
-- Adds a new member with the bands_members association
WITH inserted AS (
  INSERT INTO members (first_name, last_name, email)
  VALUES (:first_name, :last_name, :email)
  RETURNING id
)
INSERT INTO bands_members (member_id, band_id)
SELECT inserted.id, :band_id FROM inserted
As I was writing this blog post, I researched how I could insert records into different tables with one SQL statement. Using stored procedure or function would be an easy choice, but that's too heavy for what we need. I found this blog post by Rob Conery. He shows how CTEs (Common Table Expressions) can be used to insert and reuse the created record in a subsequent insert. That's what you see in the second SQL command. By using this solution, the Clojure code will be small, as the database write functionality is delegated to PostgreSQL.
This is what the data logic will look like in the "src/kashmir/data.clj" file:
...

(defn create-member!
  ([member band-name]
    (let [band-id (-> (find-band-by-name {:name band-name})
                       first
                       :id)]
        (create-member-raw! (conj member {:band_id band-id})))))
The "-raw" postfix was used for the function that gets generated by yesql. We could have created an alias, but I liked this kind of naming-convention.
When you run the test it won't error out, but one of the tests will fail. It has more than 4 total records in the members table. Absolutely, the database was not restored to its default state. Let's take care of that! We will insert the record, but we will roll back the transaction once the test is complete, leaving the database in it's original, default state.
Add/modify the highlighted lines in your "src/kashmir/data.clj" file:
...

(defn create-member!
  ([member band-name]
    (jdbc/with-db-transaction [tx db-spec]
      (create-member! member band-name tx)))
  ([member band-name tx]
    (let [band-id (-> (find-band-by-name {:name band-name})
                       first
                       :id)]
        (create-member-raw! (conj member {:band_id band-id})
                            {:connection tx}))))
And finally, initialize and roll back the transaction from the test. Change the highlighted lines in "test/kashmir/data_test.clj" this way:
(ns kashmir.data-test
  (:require [clojure.test :refer :all]
            [kashmir.data :refer :all]
            [clojure.java.jdbc :as jdbc]))
  ...

(deftest create-member-test
  (jdbc/with-db-transaction [tx db-spec]
    (jdbc/db-set-rollback-only! tx)
      (testing "adds a member to the DB"
        (let [member {:first_name "Jim" :last_name "Morrison" :email "jmorrison@doors.com"}]
          (is (= 1 (create-member! member "The Doors" tx)))))))
Rebuild your database and run your tests. You should see the 0 failures, 0 errors. message. Do it many times, the tests should always pass.
(Commit point.)

One Last Refactoring

I am unhappy with the create-member! function. The way we are looking up the band by its name is inelegant, I feel we could do better. Since we have one band record by name, when we call find-band-by-name, we should get back one single hash and not a lazy-seq with a hash in it. Let's refactor to that! First, we'll renamed the yesql generated function to find-band-by-name-raw in the "src/kashmir/sql/data.sql" file:

...

-- name: find-band-by-name-raw
-- Finds a band record based on the provided name
SELECT *
FROM bands
WHERE name = :name
Let's refactor the actual code like this in "src/kashmir/data.clj":
...

(defn find-band-by-name [name]
  (first (find-band-by-name-raw {:name name})))

(defn create-member!
  ([member band-name]
    (jdbc/with-db-transaction [tx db-spec]
      (create-member! member band-name tx)))
  ([member band-name tx]
    (let [band-id (:id (find-band-by-name band-name))]
        (create-member-raw! (conj member {:band_id band-id})
                            {:connection tx}))))
I rebuilt the db, ran the tests and everything passed.
(Commit point.)

You could say this is only the "happy path", what if the band name is incorrect and no band will be found. This will blow up somewhere. Absolutely! You need to do exception handling and error checking. I wanted to keep my examples simple, so others coming to Clojure can benefit from the simplified code.

This last refactoring concludes the second part of the series. In the final session we will add logging to jdbc to monitor how yesql communicates with the database. We will also expose the data as JSON through an HTTP endpoint.

Clojure API with Yesql, Migrations and More (Part 1.)

I've found endless books and articles explaining core ideas and the building blocks of the Clojure programming language. They show you how to use the different data structures, they have good examples for little tasks, like reading and parsing a CSV file, but books or articles that walk you through an example of building a comprehensive solution is hard to find.
I always liked writings that showed me how to build an app. I would learn many aspects of a language, get familiar with tools, and most of all, I would build something that could serve as foundation for my future projects.
I am planning to do just that with this series of blog posts. I'd like to show you how to:

  • Set up your database environment with scripts
  • Manage database changes through migrations
  • Test the different components
  • Stub out function calls you don't need for testing
  • Add logging to monitor the database communication
  • And all this in Clojure!
By the end of these blog posts you will be able to expose database through Clojure libraries as JSON data with HTTP endpoints.

I have following presumptions:

  • You have Clojure installed (I have version 1.7.0 at the time of writing)
  • We'll use PostgreSQL (mine is 9.4.4)
  • I am using OSX (10.10.5)
The name of the app is "kashmir", you can find the final solution in this public repo. I will link specific commit points to the blog posts, this way you can join this tutorial at any point you want. Let's dive in!

The Data Model

The data model is simple, it has only 3 tables. The members table lists the various band members, the bands table lists all the bands those members belong to, and the bands_members table is used to map the members to their bands.
This is what it looks like:

Creating the Database User

I use the excellent pgcli tool as my command line interface for Postgres. It has code completion, table name suggestion features, it's your psql tool on steorid. If you don't have it, grab it through homebrew. Create a DB user called "kashmir_user" and allow this user to create DBs. This is how you do it in the command line, all the inputs are highlighted:

% pgcli postgres
Version: 0.19.1
Chat: https://gitter.im/dbcli/pgcli
Mail: https://groups.google.com/forum/#!forum/pgcli
Home: http://pgcli.com
postgres> CREATE USER kashmir_user PASSWORD 'password1';
CREATE ROLE
Command Time: 0.000s
Format Time: 0.000s
postgres> ALTER USER kashmir_user CREATEDB;
ALTER ROLE
Command Time: 0.000s
Format Time: 0.000s
postgres>

Initializing the Project

Generate the new project skeleton by runnig the lein new app kashmir command in the terminal. You should have a skeleton app project that looks like this. When you run lein run, you see "Hello, World!", and when you run the tests you see the 1 failure:

% lein test

lein test kashmir.core-test

lein test :only kashmir.core-test/a-test

FAIL in (a-test) (core_test.clj:7)
FIXME, I fail.
expected: (= 0 1)
  actual: (not (= 0 1))

Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
Tests failed.

Creating the Database

Database drop and create operations should be scripted. You can use rake db:drop and rake db:create in Rails, we should have something similar here. You can use the Postgres command line tools to create and drop databases with the createdb and dropdb commands. The --if-exists switch helps you when you're running it for the first time, the command won't error out if the database does not exist.
The easiest way to create executable tasks is with a Makefile. Create a new file called Makefile in your project root and add this to it:

DBNAME=kashmir
USER=kashmir_user
PORT=5432
HOST=localhost

PGPASSWORD=password1

# Builds the DB by dropping and recreating it
build-db:
 dropdb --if-exists --username $(USER) $(DBNAME) -h $(HOST) -p $(PORT)
 createdb --username $(USER) $(DBNAME) -h $(HOST) -p $(PORT)
We set up variables in the Makefile, it will be easy to change these values later, you are also adhering to good DRY principles.
Run the Make target by typing the command make build-db in the terminal. You can run this as many times as you want, it will drop and recreate the empty database for you.
(Commit point.)

Running Migrations

The best way to implement changes in a database is through reversible migrations. I usually use the great clj-sql-up migration tool for that. Let's add it to the project.clj file:

(defproject kashmir "0.1.0-SNAPSHOT"
  ...
  :clj-sql-up {:database "jdbc:postgresql://kashmir_user:password@localhost:5432/kashmir"
               :deps [[org.postgresql/postgresql "9.4-1201-jdbc4"]]}
  :plugins  [[clj-sql-up "0.3.7"]])
Run the command lein clj-sql-up create create-members to generate your first migration. This should create a new file in the "migrations" directory. Open up that file and add your migration SQL to it:
(defn up []
    ["CREATE TABLE members(id SERIAL PRIMARY KEY,
                           first_name varchar(50) NOT NULL,
                           last_name varchar(50) NOT NULL,
                           email varchar(50) NOT NULL,
                           created_at timestamp NOT NULL default CURRENT_TIMESTAMP)"
     "CREATE INDEX idx_members_id ON members(id)"
     "CREATE UNIQUE INDEX idx_email_unique ON members(email)"])

(defn down []
  ["DROP TABLE members"])
Test your SQL by running lein clj-sql-up migrate in the terminal. I would recommend looking at the database to make sure the first table, "members" got created properly. Open up pgcli and run \dt from the pgcli prompt. You should see two tables listed there:
  • clj_sql_migrations
  • members
The table "clj_sql_migrations" is used to track the actual version of your database, it's the metadata for clj-sql-up to run the migrations. Let's add the "bands" and "bands_members" tables as well, create a new migration file with the clj-sql-up generator: lein clj-sql-up create create-bands. Open up the migrations/*-create-bands.clj file and add this SQL:
(defn up []
    ["CREATE TABLE bands(id SERIAL PRIMARY KEY,
                         name varchar(50) NOT NULL,
                         created_at timestamp NOT NULL default CURRENT_TIMESTAMP)"
     "CREATE INDEX index_bands_id ON bands(id)"
     "CREATE TABLE bands_members(id SERIAL PRIMARY KEY,
                                 band_id INTEGER REFERENCES bands (id),
                                 member_id INTEGER REFERENCES members (id),
                                 created_at timestamp NOT NULL default CURRENT_TIMESTAMP)"])

(defn down []
  ["DROP TABLE bands"]
  ["DROP TABLE bands_members"])
We should be running the migrations when we drop and rebuild the database. Change your Makefile's "build-db" target like this:
  ...
# Builds the DB by dropping and recreating it
build-db:
  dropdb --if-exists --username $(USER) $(DBNAME) -h $(HOST) -p $(PORT)
  createdb --username $(USER) $(DBNAME) -h $(HOST) -p $(PORT)
  lein clj-sql-up migrate
Now when you run make build-db in your terminal, you should see the database recreated by dropping it first, creating it and running the migration at the end.
(Commit point.)

In the next post we'll add seed data to the database for testing purposes, we will also use the excellent Yesql library to communicate with the database.