Tuesday, April 25, 2017

Fireside Chat

When I saw a retweet from Jason Fried about available tickets to a fireside chat with him at Basecamp, I jumped on it. I figured that if I could kill two birds with one stone (meeting him in person and seeing their offices), it was a no-brainer. Company culture was the topic of the conversation, led by Aimee Groth, who visited Chicago to publicize her new book, Kingdom of Happiness, about Zappos' culture.

Basecamp HQ

Basecamp HQ is as cool as you think it is. Very few desks, a couple of meeting rooms. It reminded me more of a train terminal with its large windows and limited furnishing than a real office. The office is centered around an auditorium, which is an effective PR and educational platform for the company.

I enjoyed looking at the walls covered with postcards from employees all over the world, but I especially liked David's H-1B approval notice from the USCIS from 2005. I laughed out loud when I noticed it, as I had to go through a similar hassle myself, although mine is safely guarded with my documents at home.

Basecamp works in six-week cycles. Whatever the team can get done in six weeks, they deliver. The scope can change, but the six-week schedule is fixed. This timeframe helps them deliver functionality, and since the company works remotely, it has worked out well for them.

They don't have managers who only manage people or projects; the teams are led by team leads. These team leads are developers as well, shipping code on a daily basis. Jason and his team realized that managers who do not code grow away from the work. According to him, "professional (full time) managers forget to do the work".
At one point they tried rotating team leads, but that did not work out, as the continuity was lost. I could see that: "I see this problem, but I won't deal with it, I'll leave it for the next person who takes over." Basecamp is looking for people who are self-managed; however, Jason emphasized multiple times that "people like to be led". It's important to "rally the folks by clear goals and purpose".

Jason also talked about Jeff Bezos's investment in the company, which gave him a small ownership stake in Basecamp. They did not need the money to survive, but David and Jason felt that having a person like Mr. Bezos involved would benefit both parties. "Who would not like to have Jeff Bezos as an advisor in his or her company?!" They have not talked to Jeff Bezos for a while, but if they wanted to, they could just reach out to his secretary, set up a meeting, and Jeff would fly to Chicago for a meeting or dinner with them.

The best advice from Bezos (and according to Jason, this was worth the entire dividend they have paid for his investment) was: "invest in the things in your business that won't change". Don't chase the shiny new things, stick to what will not change. For them, it's Basecamp. The company had 4-5 products; a couple of years ago they divested the others to focus on their main product, Basecamp.

Jason went into detail about why the other products (like Highrise, Backpack and Campfire) were divested. Maintaining the web, Android and iOS versions of their products resulted in 15 different projects. With the employees they had at the time, that left insufficient focus for each product on each platform. They could, of course, have hired more developers, but they intentionally wanted to stay small. They did not want to get richer or become the next billionaires; they were happy with what they had. This sounds strange, almost naive, in the era of bloated startups that bleed money while chasing the dream of being the next Facebook.

I enjoyed the Q&A at the very end. Some interesting questions came up about the startup community in Chicago and about VCs in general. Jason kindly offered to stay until everybody's questions were answered. Really a courteous offer, considering it was after 8 pm on a Friday night.

Oh, yes, and one more thing: Basecamp has 130,000 paying customers. It's a remarkable achievement by a company that has never taken VC money, was profitable from the get-go, and created an exciting app in the "not-so-exciting" domain of project management.

Tuesday, March 28, 2017

Containers

As I was exploring how to make Golang even faster on AWS Lambda, I found a project that promised sub-millisecond execution time compared to my (already pretty good) ~60 milliseconds. It used the Python runtime to run the Go code in-process, in contrast to my attempt, where I had to spawn a new process and execute the Lambda code there. Very clever; no wonder that solution did not have the 60-millisecond penalty for running that code. However, in order to build the sample code for this AWS Lambda, I had to use Docker.

I heard about Docker years ago and understood what it's used for at a very high level; however, I had never really given it a serious try. I figured it was time. Boy, I was in for a pleasant surprise!

The project AWS Lambda Go from eawsy used Docker to containerize their build environment on my laptop. What does that mean? Imagine having a build server running on your computer in seconds, where the version of the Go compiler and the Python environment are set by the author of the Dockerfile. I'd use a little build engine that takes in my code, runs its magic, and out comes a zip file that I can run on Lambda. What?!

I wrote all these different tutorials about running MRI Ruby on AWS Lambda or interacting with a Postgres DB with Clojure, and I had to list all the prereqs in plain text: "you have to have Postgres running, and Clojure, and MRI Ruby". I provided all the different Makefile scripts to follow the examples. In the future, with Docker, I'd just provide a Dockerfile that sets up the environment.

I believe containers are big and will be even bigger very soon.

I see more and more applications where the code describes the behavior and the container descriptor describes the environment.


They live side by side, clearly stating what virtual components the software needs to execute. Engineers can run the software with those containers locally, and the software can be deployed to the cloud with those images pre-built, with tight control over its execution context.

There are many resources to learn Docker. I started with reading the Docker in Action book and went further by reading the Docker in Practice book.

I created a Docker templates repository, where I collected ideas for different recipes. Do I need a Ruby worker with Redis and Postgres backend? I'll just run docker compose up with this docker_compose.yml file and I have an environment, where everything from the OS to the version of Redis and Postgres is predefined. If it works on my machine, it will work on yours, too.

There are many things I like about Docker as compared to Vagrant or other virtual machine solutions. The biggest thing for me is how few resources Docker containers need. A Vagrant image would reserve 2 of your 4 cores and 8GB of memory, whereas Docker only takes from the host as much as it needs. If it's 32MB, that's it; if it's 1GB, it will take that much.

Docker is the future, and you will see more and more code repos with a Dockerfile in them.

Wednesday, February 8, 2017

Golang

The first time I heard about Golang was a few years back, when the great guys at Brad's Deals, our next-door office neighbor, organized and hosted the local Go meetup. Then the io.js and Node.js war broke out, and TJ Holowaychuk shifted from Node.js to Golang, announcing the move in an open letter to the community.
I did not think much of the language, as its reputation was far from the beauty of a real functional language.

Fast forward a couple of years, and I am giving Ruby a serious try on AWS Lambda. Ruby works there; however, it needs plenty of memory and about 3000 ms (3 seconds) to do anything. We have to invoke some of our functions millions of times a month, and when we calculated the cost, the bill got fairly large quickly.

I created a simple AWS Lambda with Ruby just to print the words "Hello, World!" with 128 MB memory. It took 5339 ms to execute it.

Ruby Hello World on AWS Lambda

Then one night I wrote a tiny Go program:

package main

import "fmt"

func main() {
  fmt.Println("Hello, World!")
}

Since I am working on OSX, I cross-compiled it for Linux with the command GOOS=linux GOARCH=amd64 go build github.com/adomokos/hello, packaged it up with a Node.js executor and ran it. I couldn't believe my eyes: it took only 68 ms to get the string "Hello, World!" back. 68 ms! And it was on a 128 MB memory instance. It was beautiful!

Go Hello World on AWS Lambda

Ruby would need four times the memory and it would still execute ~10 times slower than Go. That was the moment when I got hooked.

Go is a simple language. I am not saying it's easy to learn; that's subjective and depends on your background and experience. But it's far from the beauty of Haskell or Clojure. However, the team I am working with would have no trouble switching between Go and Ruby multiple times a day.

What kind of a language today does not have map or reduce functions?! Especially when functions are first-class citizens in the language. It turns out, I can write my own map function if I need to:

package collections

import (
  "github.com/stretchr/testify/assert"
  "strconv"
  "testing"
)

func fmap(f func(int) string, numbers []int) []string {
  items := make([]string, len(numbers))

  for i, item := range numbers {
    items[i] = f(item)
  }

  return items
}

func TestFMap(t *testing.T) {
  numbers := []int{1, 2, 3}
  result := fmap(func(item int) string { return strconv.Itoa(item) }, numbers)
  assert.Equal(t, []string{"1", "2", "3"}, result)
}

Writing map with recursion would be more elegant, but it would not be as performant as filling a slice with a predefined length that does not have to grow during the operation.

History

Go was created by some very smart people at Google, and I wanted to understand their decision to keep the language this simple.
Google has a large amount of code in C and C++; however, those languages were not designed with modern concerns in mind, like parallel execution and web programming, to name a few. They were created in the 70s and 80s, well before the era of multi-core processors and the Internet. Compiling a massive C++ codebase can easily take hours, and while its designers were waiting for compilation, the idea of a fast-compiling, simple, modern language was born. Go does not aim to be shiny and nice; its designers wanted it:

  • to be simple and easy to learn
  • to compile fast
  • to run fast
  • to make parallel processing easy

Google hires a massive number of fresh CS graduates each year with some C++ and Java programming experience. These engineers can feel right at home with Go, whose syntax and concepts are similar to those languages.

Tooling

Go comes with many built-in tools, like code formatting and benchmarking, to name a few. In fact, I set up Vim Go, which leverages many of those tools for me. I can run and test code with only a couple of keystrokes.

Let's see how performant the procedure I wrote above is. But before I do that I'll introduce another function where the slice's length is not pre-determined at the beginning of the operation, this way it has to auto-scale internally.

func fmapAutoScale(f func(int) string, numbers []int) []string {
  // Declare a nil slice without a length, append will grow it as needed
  var items []string

  for _, item := range numbers {
    items = append(items, f(item))
  }

  return items
}

The function does the same as fmap, so a similar test should verify its logic.

I added two benchmark tests to cover these functions:

// Run benchmark with this command
// go test -v fmap_test.go -run="none" -benchtime="3s" -bench="BenchmarkFmap"
// -benchmem
func BenchmarkFmap(b *testing.B) {
  numbers := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

  // Reset the timer after the setup, so only the loop is measured
  b.ResetTimer()
  for i := 0; i < b.N; i++ {
    fmap(func(item int) string { return strconv.Itoa(item) }, numbers)
  }
}

func BenchmarkFmapAutoScale(b *testing.B) {
  numbers := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

  // Reset the timer after the setup, so only the loop is measured
  b.ResetTimer()
  for i := 0; i < b.N; i++ {
    fmapAutoScale(func(item int) string { return strconv.Itoa(item) }, numbers)
  }
}

When I ran the benchmark tests, this is the result I received:

 % go test -v fmap_test.go -run="none" -benchtime="3s" -bench="BenchmarkFmap"
    -benchmem
 BenchmarkFmap-4   ‡ 10000000 | 485 ns/op | 172 B/op | 11 allocs/op
 BenchmarkFmapAS-4 ‡ 5000000  | 851 ns/op | 508 B/op | 15 allocs/op
 PASS
 ok   command-line-arguments  10.476s

The first function, where I set the slice to its exact size up front, is more performant than the second one, where I just declare the slice and let it grow. The ns/op column shows the execution time per operation in nanoseconds. The B/op column shows the bytes allocated per operation. The last column shows how many memory allocations happen per operation. The difference is insignificant for a ten-element slice, but you can see how this can become very useful as you try writing performant code.

Popularity

Go is getting popular. In fact, very popular. It was TIOBE's "Language of the Year", gaining 2.16% in one year. I am sure you'll be seeing more and more articles about Go. Check it out if you haven't done so yet, as the chance of finding a project or job that uses Go is increasing every day.

Sunday, November 13, 2016

Recursion Done Right - Haskell Influenced Recursion in Ruby

Learning Haskell has influenced the way I think about code. I wrote about currying before in various languages, but Haskell taught me a bit about how to do recursion properly.

Although the book Learn You a Haskell for Great Good! is fast-paced, I really like its examples. When I got to the material on recursion and higher-order functions, I was amazed by the simplicity of the code that lets you do basic list operations.

Here is how one could find the maximum of a list:

maximum' :: (Ord a) => [a] -> a
maximum' [] = error "maximum of an empty list"
maximum' (x:xs) = max x (maximum' xs)

There is already a maximum function in Haskell's core library; this example just shows you what you need to do to implement it yourself.

I am not going into details about the type declaration, but there are a couple of points I'd like to talk about. The pattern matching in the second line checks for the case where the list is empty. When that happens, an error is raised. The last line does pattern matching as well: it grabs the head and the tail of the list and binds them to the x and xs variables. Then it uses the max function to figure out which number is greater: x or the recurred result of maximum' with the tail of the list. This is a prime example of declarative code. Its simplicity is striking, and the fact that I don’t have to know how max works with the recursion makes it a joy to read.

Let’s look at another example. Here is how you could implement map in Haskell yourself:

map' :: (a -> b) -> [a] -> [b]
map' f [] = []
map' f (x:xs) = f x : map' f xs

Similarly, the edge case is handled first. The last line has the logic: the function f is applied to the head of the list, and the result is prepended (with the : operator) to the recurred result of map' with f and the tail of the list.

All right, let’s see how we could express this logic in Ruby.

Here is my first attempt:

module Collections
  def self.maximum(collection)
    head = collection.first
    tail = collection[1..-1]

    return 0 unless tail
    max head, (maximum tail)
  end

  def self.max(a, b)
    a > b ? a : b
  end
  private_class_method :max
end

RSpec.describe 'Recursion done right' do
  context 'maximum' do
    it 'returns 0 as max of empty list' do
      expect(Collections.maximum([])).to eq(0)
    end

    it 'returns the maximum of a list' do
      expect(Collections.maximum([1,3,2])).to eq(3)
    end
  end
end

I did not find a standalone max function in Ruby, so I added one as a private class method. This is still pretty easy to read, but a bit more verbose than what I'd like it to be. I wanted to find the (x:xs) head-tail (car-cdr for you Lisp folks) equivalent in Ruby, as I knew that would be the key to a more succinct solution. This is it: (head, *tail) = collection. I also had to change the guard that ends the recursion to look for an empty array, as the splat operator will provide that.
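To make the destructuring concrete, here is a minimal sketch of what the splat assignment yields for lists of different lengths. The head_and_tail helper is a hypothetical name used only for this illustration, it is not part of the Collections module.

```ruby
# Hypothetical helper, only for illustration: returns the two parts
# that the splat assignment produces.
def head_and_tail(collection)
  # The first element goes to head, whatever is left goes to tail as an array.
  (head, *tail) = collection
  [head, tail]
end

p head_and_tail([1, 3, 2]) # several elements: head is 1, tail is [3, 2]
p head_and_tail([1])       # single element: tail is an empty array
p head_and_tail([])        # empty list: head is nil, tail is still an array
```

Note that tail is never nil with this form, which is why a guard written for collection[1..-1] has to change.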

Here is my revised solution:

module Collections
  def self.maximum(collection)
    (head, *tail) = collection

    # A one-element list is its own maximum, an empty list falls back to 0
    return head || 0 if tail.empty?
    max head, (maximum tail)
  end
  ...
end

This is better, but the destructuring can take place in the argument:

module Collections
  def self.maximum((head, *tail))
    # A one-element list is its own maximum, an empty list falls back to 0
    return head || 0 if tail.empty?
    max head, (maximum tail)
  end
  ...
end

This is pretty darn close to the solution in Haskell.
Now let’s look at the map function.

These specs describe the behavior:

context 'map' do
  it 'mapping [] with (*3) gives []' do
    expect(Collections.map(->(x){ x*3 }, [])).to be_empty
  end
  it 'mapping [1,2,3] with (*3) gives [3,6,9]' do
    expect(Collections.map(->(x){ x*3 }, [1,2,3])).to eq([3,6,9])
  end
end

My implementation of map takes a lambda with one argument, which multiplies that one argument by three, and the second argument is the collection of items the map function will operate on.

This is my implementation for it:

module Collections
  def self.map(f, (head, *tail))
    return [] unless head

    [f.(head)] + map(f, tail)
  end
  ...
end

The key to making it concise is destructuring the collection argument into head and tail. The guard statement makes sure the recursion quits once there is no item in the head. The bulk of the logic is the last line of the method: the lambda is applied to the head, the result is wrapped in an array, and that array is concatenated with the recurred result of map with the lambda and the rest of the collection.

In our case, the following calculation takes place:

map (*3) [1,2,3]
[(3*1)] + map (*3) [2,3]
[(3*1)] + [(3*2)] + map (*3) [3]
[(3*1)] + [(3*2)] + [(3*3)]
[3] + [6] + [9]
[3,6,9]

Haskell takes pride in how easily it implements the quicksort algorithm. Let’s see how it’s done there:

quicksort :: (Ord a) => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
    let smallerSorted = quicksort [a | a <- xs, a <= x]
        biggerSorted = quicksort [a | a <- xs, a > x]
    in  smallerSorted ++ [x] ++ biggerSorted

I don’t blame you if this seems a bit more cryptic than you'd like it to be. It takes a little practice to read what is really going on here. I'll explain it, as it will help with our own Ruby implementation. The first line is the type declaration, ignore that for now. The second line is the guard: sorting an empty list gives you back an empty list. The meat of the logic begins on the third line. The list argument is destructured into head and tail, just like I've been doing in the examples above. Based on the head value, we filter the remaining elements into a smaller-or-equal part and a bigger part, and sort each of them recursively until the list is exhausted. Right before the result is returned, the three pieces (the smaller sorted elements, the head value and the bigger sorted elements) are combined into one list.

Let’s see how this is done in Ruby. Here are the specs I prepared to prove the logic:

context 'quicksort' do
  it 'returns an empty list for empty list' do
    expect(Collections.quicksort([])).to eq([])
  end
  it 'sorts a list of items' do
    expect(Collections.quicksort([2,5,3])).to eq([2,3,5])
  end
end

Here is how I'd like the code to be:

def self.quicksort((head, *tail))
  return [] unless head

  smaller_sorted = quicksort(Collections.filter(->(x) { x <= head }, tail))
  bigger_sorted = quicksort(Collections.filter(->(x) { x > head }, tail))
  smaller_sorted + [head] + bigger_sorted
end

This logic is very close to the Haskell example, but unfortunately, I don't have the filter function just yet. (Ruby's standard library offers the select method on enumerables, but let's keep these examples free from all that.) filter takes a lambda as its predicate function and the collection it needs to operate on. This spec proves out our logic:

context 'filter' do
  specify 'filter (>2) [] returns an empty list' do
    expect(Collections.filter(->(x){ x > 2 }, [])).to be_empty
  end
  specify 'filter (>2) [1,3,5] returns [3,5]' do
    expect(Collections.filter(->(x){ x > 2 }, [1,3,5])).to eq([3,5])
  end
end

And the implementation is similar to what you've seen before:

def self.filter(f, (head, *tail))
  return [] unless head

  if f.(head)
    [head] + filter(f, tail)
  else
    filter(f, tail)
  end
end

And now, when you run the entire spec, the quicksort implementation just magically works.

specs executed
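For readers who want to try the pieces outside of RSpec, here is a self-contained sketch that puts filter and quicksort into one file and compares the result with Ruby's built-in Array#sort. The sample list is made up for this illustration, and the filter body is condensed slightly compared to the version above.

```ruby
module Collections
  # Keeps the elements for which the predicate lambda returns a truthy value.
  def self.filter(f, (head, *tail))
    return [] unless head

    rest = filter(f, tail)
    f.(head) ? [head] + rest : rest
  end

  # Sorts by partitioning the tail around the head, then recursing.
  def self.quicksort((head, *tail))
    return [] unless head

    smaller_sorted = quicksort(filter(->(x) { x <= head }, tail))
    bigger_sorted = quicksort(filter(->(x) { x > head }, tail))
    smaller_sorted + [head] + bigger_sorted
  end
end

sample = [5, 1, 4, 1, 3, 9, 2] # made-up input that includes a duplicate
p Collections.quicksort(sample)
p Collections.quicksort(sample) == sample.sort
```

One caveat of the "return [] unless head" guard: a nil or false element in the list is treated as the end of the list, so this sketch only suits collections without such values.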

Studying Haskell taught me a few things about recursion. The head and tail concept is essential to make the code simple and neat. Without that it would have been a lot noisier. Whenever I used recursion before, I always felt I needed an accumulator. I wanted something I could jump to and investigate when something went wrong. I would have written the filter function like this before:

def self.filter(f, (head, *tail), accumulator=[])
  return accumulator unless head

  accumulator << head if f.(head)

  filter(f, tail, accumulator)
end

Although this works, adding the accumulator with a default value to the argument list makes this code noisier. I do like not having conditional branches in it, though, as that makes the code easier to reason about.

You can review the examples in this gist.

Based on what you read here, try implementing replicate, take, reverse, repeat and zip functions yourself. In case you need directions, check out this gist to see how I did it.
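As a starting point, here is one possible sketch of replicate, take and reverse that follows the same head and tail pattern. It is not taken from the gist; the Exercises module name and the argument order (count first, as in Haskell) are assumptions made for this illustration.

```ruby
# Hypothetical module name, used only for this sketch.
module Exercises
  # replicate(3, 'a') builds the list one element per call.
  def self.replicate(count, item)
    return [] if count <= 0

    [item] + replicate(count - 1, item)
  end

  # take(2, [1, 2, 3]) keeps the first two elements and stops early
  # when the list runs out.
  def self.take(count, (head, *tail))
    return [] if count <= 0 || head.nil?

    [head] + take(count - 1, tail)
  end

  # reverse moves the head behind the reversed tail.
  def self.reverse((head, *tail))
    return [] if head.nil?

    reverse(tail) + [head]
  end
end

p Exercises.replicate(3, 'a')
p Exercises.take(2, [1, 2, 3])
p Exercises.reverse([1, 2, 3])
```

The head.nil? guards mean these sketches stop at the first nil element, so they assume lists without nil values.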

Friday, October 21, 2016

Vim Katas

It was about 5 years ago when I watched Jim Weirich solve the Roman Numeral kata at an SCNA conference in Chicago. I was amazed by how he had mastered his editor of choice: Emacs. His fingers were flying on the keyboard, and he wrangled the code like I had rarely seen anybody do before.

I started using Vim in 2008 or 2009, but I never really invested the time to be good at it. I read the man pages, I went through Vim tutor, but I never really picked up or started using most of the advanced features.

I remember how great I felt when I reset my Caps Lock key to function as Ctrl key. The power of hitting <Ctrl+c> with my pinky and index finger just to trigger <Esc> without reaching up to the top left on the keyboard made me feel I've just found kryptonite.

I've had the book Practical Vim for some time, but I never really practiced the examples in it. I looked at them here and there, tried them out, but I've never made a habit of practicing those daily. Then one day I got sick and tired of my inabilities, I started a new markdown document where I jotted down the first exercise and the project of Vim Katas was born. Every time I commuted to work, I started with the first one and practiced all of them. Once I got to the end of it, I read the book further and added new exercises to it.

I might have covered 60% of the book by now, but when I bump into a repeatable task and I leverage a macro for it, it always puts a smile on my face.

Using Vim reminds me of learning to play a musical instrument. It takes time and effort to be good at it, and the keystrokes have to be at your fingertips. Once you have to slow down and think about them, the effectiveness of the text-based editor fades away, and you would be more efficient using a visual editor instead (like Sublime Text, IntelliJ or Visual Studio).

Vim (and I am sure many other tools) has this huge wall that people have to climb first to appreciate it. Clone that repo, open the first markdown file in Vim, and start practicing!

Monday, June 6, 2016

Using Ruby with ActiveRecord in AWS Lambda

In my previous blog post I described how you can run MRI Ruby on AWS Lambda. In this article I'll guide you through adding gems to the project: first faker, and then the mysql2 gem with activerecord, and finally we will have the Ruby code talk to an RDS instance, all this through an AWS Lambda.

I recorded all my changes in this project, feel free to jump in where you want, you'll find commit points after each section.

1. Add faker to the project

You can pick up the changes from the previous blog post here. Let's start by editing our Ruby app: add a Gemfile to it in the hello_ruby directory:

source 'https://rubygems.org'

gem 'faker'

Run BUNDLE_IGNORE_CONFIG=1 bundle install --path vendor in that directory. The --path vendor argument is important, as we have to package all the files in the vendor directory. Make sure your Gemfile.lock does not contain a BUNDLED WITH section, as that can cause you some pain when you deploy your code to AWS Lambda.
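If Bundler did write a BUNDLED WITH section, you can delete those lines by hand. As an alternative, here is a minimal sketch of a helper that strips the section from the lockfile text. The strip_bundled_with name and the sample lockfile content are assumptions made for this illustration, and the sketch assumes the section consists of the header line followed by one indented version line.

```ruby
# Hypothetical helper: removes the "BUNDLED WITH" header and the indented
# version line under it, then normalizes the trailing newline.
def strip_bundled_with(lockfile_text)
  lockfile_text.sub(/^BUNDLED WITH\n\s+\S+\n?/, '').rstrip + "\n"
end

# Made-up lockfile tail, only to demonstrate the helper.
sample = [
  'DEPENDENCIES',
  '  faker',
  '',
  'BUNDLED WITH',
  '   1.12.5',
  ''
].join("\n")

puts strip_bundled_with(sample)

# To apply it to a real file, something like this should work:
# File.write('Gemfile.lock', strip_bundled_with(File.read('Gemfile.lock')))
```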

Edit the lib/hello.rb file like this:

#!/usr/bin/env ruby

require 'faker'

puts "Hello - '#{Faker::Name.name}' from Ruby!"

We required the faker gem and used it to generate a fake name. Run the app in the terminal with bundle exec ruby lib/hello.rb command.

Hello - 'Jamar Gutmann II' from Ruby!

You will get a different name between the single quotes, but that's the point, faker generates a random name for us.

Commit point

2. Use faker with Traveling Ruby

The run target in the Makefile will have to copy all vendorized gems, plus it needs to configure the app to run with the correct bundler settings. This step is heavily influenced by how Traveling Ruby packages gems for deployment, please review their tutorial as reference.

Add a bundler-config template file to the resources directory with this content:

BUNDLE_PATH: .
BUNDLE_WITHOUT: development:test
BUNDLE_DISABLE_SHARED_GEMS: '1'

Change the resources/wrapper.sh file to set the Gemfile’s location:

#!/bin/bash
set -e

# Figure out where this script is located.
SELFDIR="`dirname \"$0\"`"
SELFDIR="`cd \"$SELFDIR\" && pwd`"

# Tell Bundler where the Gemfile and gems are.
export BUNDLE_GEMFILE="$SELFDIR/lib/vendor/Gemfile"
unset BUNDLE_IGNORE_CONFIG

# Run the actual app using the bundled Ruby interpreter, with Bundler activated.
exec "$SELFDIR/lib/ruby/bin/ruby" -rbundler/setup "$SELFDIR/lib/app/hello.rb"

Modify the Makefile's run target with the following changes:

...

run: ## Runs the code locally
    @echo 'Run the app locally'
    @echo '-------------------'
    @rm -fr $(OSXDIR)
    @mkdir -p $(OSXDIR)/lib/ruby
    @tar -xzf resources/traveling-ruby-20150715-2.2.2-osx.tar.gz -C $(OSXDIR)/lib/ruby
    @mkdir $(OSXDIR)/lib/app
    @cp hello_ruby/lib/hello.rb $(OSXDIR)/lib/app/hello.rb
    @cp -pR hello_ruby/vendor $(OSXDIR)/lib/
    @rm -f $(OSXDIR)/lib/vendor/*/*/cache/*
    @mkdir -p $(OSXDIR)/lib/vendor/.bundle
    @cp resources/bundler-config $(OSXDIR)/lib/vendor/.bundle/config
    @cp hello_ruby/Gemfile $(OSXDIR)/lib/vendor/
    @cp hello_ruby/Gemfile.lock $(OSXDIR)/lib/vendor/
    @cp resources/wrapper.sh $(OSXDIR)/hello
    @chmod +x $(OSXDIR)/hello
    @cd $(OSXDIR) && ./hello

...

Run the target with make run and you should see something similar to this in the terminal:

$: make run
Run the app locally
-------------------
Hello - 'Kelly Huel' from Ruby!

We've just run the app with Traveling Ruby's Ruby interpreter, and we used the faker gem's functionality as well!

Commit point

3. Deploy the app with faker to AWS Lambda

In order to run your app in AWS Lambda, you only need to change the package target in your Makefile. Everything else (the delete, create and invoke targets) should remain the same. Change the file like this:

...

package: ## Package the code for AWS Lambda
    @echo 'Package the app for deploy'
    @echo '--------------------------'
    @rm -fr $(LAMBDADIR)
    @rm -fr deploy
    @mkdir -p $(LAMBDADIR)/lib/ruby
    @tar -xzf resources/traveling-ruby-20150715-2.2.2-linux-x86_64.tar.gz -C $(LAMBDADIR)/lib/ruby
    @mkdir $(LAMBDADIR)/lib/app
    @cp hello_ruby/lib/hello.rb $(LAMBDADIR)/lib/app/hello.rb
    @cp -pR hello_ruby/vendor $(LAMBDADIR)/lib/
    @rm -f $(LAMBDADIR)/lib/vendor/*/*/cache/*
    @mkdir -p $(LAMBDADIR)/lib/vendor/.bundle
    @cp resources/bundler-config $(LAMBDADIR)/lib/vendor/.bundle/config
    @cp hello_ruby/Gemfile $(LAMBDADIR)/lib/vendor/
    @cp hello_ruby/Gemfile.lock $(LAMBDADIR)/lib/vendor/
    @cp resources/wrapper.sh $(LAMBDADIR)/hello
    @chmod +x $(LAMBDADIR)/hello
    @cp resources/index.js $(LAMBDADIR)/
    @cd $(LAMBDADIR) && zip -r hello_ruby.zip hello index.js lib/
    @mkdir deploy
    @cd $(LAMBDADIR) && mv hello_ruby.zip ../deploy/
    @echo '... Done.'

...

The added rows are very similar to the ones we had to add to run the app locally with Traveling Ruby. Delete the lambda function and recreate it by using the Makefile. When you invoke it, you should see something like this:

START RequestId: 3f6ae8f5-23c1-11e6-9acc-0f50ffa39e9b Version: $LATEST
2016-05-27T04:12:41.473Z
  3f6ae8f5-23c1-11e6-9acc-0f50ffa39e9b
    Hello - 'Mrs. Lelah Bradtke' from Ruby!

END RequestId: 3f6ae8f5-23c1-11e6-9acc-0f50ffa39e9b
REPORT RequestId: 3f6ae8f5-23c1-11e6-9acc-0f50ffa39e9b
       Duration: 3425.01 ms
       Billed Duration: 3500 ms
       Memory Size: 512 MB
       Max Memory Used: 65 MB

The Hello - 'xyz' from Ruby! string contains the Faker gem generated name. You can also invoke the Lambda function through the AWS Management Console, you should see something similar to this in the Log output section:

faker-with-aws-lambda

Commit point

4. Publish a newer version to AWS Lambda

Dropping and recreating the Lambda function works, but it's not the most effective solution. AWS allows you to update your function which we'll do with this new target in the Makefile:

...

publish: package ## Deploys the latest version to AWS
        aws lambda update-function-code \
                --function-name HelloFromRuby \
                --zip-file fileb://./deploy/hello_ruby.zip

...

This target will let you update the function code. It also calls the package target to make sure your latest changes will be deployed to AWS.

Commit point

5. Create a new RDS database with one table

Add these targets to your Makefile. They let you create a minimal RDS instance, drop that instance, connect to the DB, and drop/create the database with some seed data in it.

DBPASSWD=Kew2401Sd
DBNAME=awslambdaruby

...

create-rds-instance: ## Creates an RDS MySQL DB instance
    aws rds create-db-instance \
        --db-instance-identifier MyInstance01 \
        --db-instance-class db.t1.micro \
        --engine mysql \
        --allocated-storage 10 \
        --master-username master \
        --master-user-password $(DBPASSWD)

delete-rds-instance: ## Deletes an RDS MySQL DB instance
    aws rds delete-db-instance \
        --db-instance-identifier MyInstance01 \
        --skip-final-snapshot

db-connect: ## Connects to the RDS instance
    mysql --user=master --password=$(DBPASSWD) --host myinstance01.cgic5q3lz0bb.us-east-1.rds.amazonaws.com

create-db: ## Creates a DB with a table and records
    @echo "Dropping  and creating database"
    @echo "-------------------------------"
    @mysql -u master --password='$(DBPASSWD)' --host myinstance01.cgic5q3lz0bb.us-east-1.rds.amazonaws.com -e "DROP DATABASE IF EXISTS $(DBNAME)" > /dev/null 2>&1
    @mysql -u master --password='$(DBPASSWD)' --host myinstance01.cgic5q3lz0bb.us-east-1.rds.amazonaws.com -e "CREATE DATABASE $(DBNAME)" > /dev/null 2>&1
    @mysql -u master --password='$(DBPASSWD)' --host myinstance01.cgic5q3lz0bb.us-east-1.rds.amazonaws.com $(DBNAME) < resources/schema.sql > /dev/null 2>&1
    @mysql -u master --password='$(DBPASSWD)' --host myinstance01.cgic5q3lz0bb.us-east-1.rds.amazonaws.com $(DBNAME) < resources/seed.sql > /dev/null 2>&1
    @echo "... Done"

...

Create the RDS instance first; AWS will need some time to initialize it. Allow incoming connections to it by adding an "Inbound" rule for your own IP address under your Security Group:

adjust-security-group

You can connect to the RDS instance through the mysql console using the db-connect target. You'll need to adjust the hostname to yours. Once that works out, use the create-db target to create a DB with a table and add two records to it. If all goes well, this is what you should see when you query the users table in the MySQL console:

mysql> SELECT * FROM users;
+----+--------+------------------+-----------+----------+
| id | login  | email            | firstname | lastname |
+----+--------+------------------+-----------+----------+
|  1 | jsmith | jsmith@gmail.com | John      | Smith    |
|  2 | bjones | bjones@gmail.com | Bob       | Jones    |
+----+--------+------------------+-----------+----------+
2 rows in set (0.04 sec)

Commit point

6. Connect to MySQL with Rails' ActiveRecord

Add the mysql2 and activerecord gems to the Ruby app's Gemfile:

gem 'activerecord'
gem 'mysql2', '0.3.18'

We need to use the 0.3.18 version of the mysql2 gem, as that comes packaged with Traveling Ruby. Run bundle install to get the new gems via Bundler.

Modify the lib/hello.rb file to have this:

#!/usr/bin/env ruby

require 'faker'
require 'active_record'

ActiveRecord::Base.establish_connection(
  :adapter  => "mysql2",
  :host     => "myinstance01.cgic5q3lz0bb.us-east-1.rds.amazonaws.com", # use your instance name
  :username => "master",
  :password => "Kew2401Sd",
  :database => "awslambdaruby"
)

class User < ActiveRecord::Base
end

puts "Number of users: #{User.count}"
puts "First user: #{User.first.firstname} #{User.first.lastname}"
puts "Hello - '#{Faker::Name.name}' from Ruby!"

You need to adjust the :host value to your RDS instance's host name, as mine won't work for you. You'll know that everything is set up properly when you see this in the terminal:

$: bundle exec ruby lib/hello.rb
Number of users: 2
First user: John Smith
Hello - 'Miss Darrick Powlowski' from Ruby!
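The connection settings above are hard-coded to keep the tutorial short. A possible alternative is to read them from environment variables. This is only a sketch: the DB_* variable names and the db_config helper are hypothetical, and it assumes the variables actually reach the Ruby process (when running through a wrapper that sets its own environment for the child process, they would have to be forwarded explicitly):

```ruby
# Builds the options hash for ActiveRecord::Base.establish_connection.
# The DB_* names are hypothetical; pick whatever fits your setup.
def db_config(env = ENV)
  {
    adapter:  'mysql2',
    host:     env.fetch('DB_HOST'),                  # raises KeyError when missing
    username: env.fetch('DB_USER', 'master'),        # defaults taken from this tutorial
    password: env.fetch('DB_PASSWORD'),              # raises KeyError when missing
    database: env.fetch('DB_NAME', 'awslambdaruby')
  }
end

# Usage in lib/hello.rb (not executed here, it needs the gems and a live DB):
#   ActiveRecord::Base.establish_connection(db_config)

sample = db_config({ 'DB_HOST' => 'db.example.com', 'DB_PASSWORD' => 'secret' })
puts "Would connect to #{sample[:database]} on #{sample[:host]} as #{sample[:username]}"
```

Using fetch without a default for the host and password makes a missing setting fail loudly at startup instead of producing a confusing connection error later.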

Commit point

7. Use Traveling Ruby’s packaged mysql2 gem

You need to download the Traveling Ruby packaged mysql2 gem from their S3 bucket. Let’s put it into our resources directory.

Modify the package target like this:

...

package: ## Packages the code for AWS Lambda
    @echo 'Package the app for deploy'
    @echo '--------------------------'
    @rm -fr $(LAMBDADIR)
    @rm -fr deploy
    @mkdir -p $(LAMBDADIR)/lib/ruby
    @tar -xzf resources/traveling-ruby-20150715-2.2.2-linux-x86_64.tar.gz -C $(LAMBDADIR)/lib/ruby
    @mkdir $(LAMBDADIR)/lib/app
    @cp hello_ruby/lib/hello.rb $(LAMBDADIR)/lib/app/hello.rb
    @cp -pR hello_ruby/vendor $(LAMBDADIR)/lib/
    @rm -fr $(LAMBDADIR)/lib/vendor/ruby/2.2.0/extensions
    @tar -xzf resources/mysql2-0.3.18-linux.tar.gz -C $(LAMBDADIR)/lib/vendor/ruby/
    @rm -f $(LAMBDADIR)/lib/vendor/*/*/cache/*
    @mkdir -p $(LAMBDADIR)/lib/vendor/.bundle
    @cp resources/bundler-config $(LAMBDADIR)/lib/vendor/.bundle/config
    @cp hello_ruby/Gemfile $(LAMBDADIR)/lib/vendor/
    @cp hello_ruby/Gemfile.lock $(LAMBDADIR)/lib/vendor/
    @cp resources/wrapper.sh $(LAMBDADIR)/hello
    @chmod +x $(LAMBDADIR)/hello
    @cp resources/index.js $(LAMBDADIR)/
    @cd $(LAMBDADIR) && zip -r hello_ruby.zip hello index.js lib/ > /dev/null
    @mkdir deploy
    @cd $(LAMBDADIR) && mv hello_ruby.zip ../deploy/
    @echo '... Done.'

...

We need to replace the content of the 2.2.0/extensions directory with Traveling Ruby's Linux version, as the one copied there is OSX specific.

AWS Lambda connects from an IP address other than yours. To keep things simple for now (and don't do this anywhere else), I'd suggest making your RDS instance available without IP restriction. Do this only temporarily to test things out, and remove this Inbound rule once you've seen your Lambda working. You can specify the VPC your Lambda has access to, but the topic of AWS Lambda security would need a blog post of its own.

This is how I opened up my RDS instance for any IP out there:

connect-anywhere

If everything is configured properly, you should see something like this in your terminal when you call the Lambda function with the make invoke command:

% make invoke
rm -fr tmp && mkdir tmp
aws lambda invoke \
        --invocation-type RequestResponse \
        --function-name HelloFromRuby \
        --log-type Tail \
        --region us-east-1 \
        --payload '{"name":"John Adam Smith"}' \
        tmp/outfile.txt \
        | jq -r '.LogResult' | base64 -D
START RequestId: 8444ede9-26d8-11e6-954c-fbf57aab89fb Version: $LATEST
2016-05-31T02:36:50.587Z        8444ede9-26d8-11e6-954c-fbf57aab89fb
Number of users: 2
First user: John Smith
Hello - 'Jeanne Hansen' from Ruby!

END RequestId: 8444ede9-26d8-11e6-954c-fbf57aab89fb
REPORT RequestId: 8444ede9-26d8-11e6-954c-fbf57aab89fb
Duration: 5072.62 ms
Billed Duration: 5100 ms
Memory Size: 512 MB
Max Memory Used: 53 MB

Sweet! The Ruby code in this AWS Lambda function reports back 2 users and correctly displays the first record.
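Notice the gap between "Duration: 5072.62 ms" and "Billed Duration: 5100 ms" in the report. Judging from this output, the billed time looks like the measured duration rounded up to the next 100 ms. The arithmetic can be sketched like this; the billed_duration_ms helper is hypothetical and the 100 ms increment is inferred from the log, not taken from AWS documentation:

```ruby
# Rounds a measured duration up to the next billing increment.
# The 100 ms default is an assumption based on the REPORT output above.
def billed_duration_ms(duration_ms, increment_ms = 100)
  (duration_ms / increment_ms.to_f).ceil * increment_ms
end

puts billed_duration_ms(5072.62) # the run above reports "Billed Duration: 5100 ms"
```

Most of those five seconds are spent starting the interpreter, loading the gems and connecting to the database, so it is worth watching this number when you add more gems.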

Commit point

Being able to use MRI Ruby with gems opens up a ton of possibilities for us (and I hope for you as well). AWS Lambdas are neat little workers that can scale up and down very well. It's much easier to launch 1,000 AWS Lambdas at the same time than to run Ruby processes with resque or sidekiq on worker boxes.

Friday, June 3, 2016

Using Ruby in AWS Lambda

It was May 2015 at the AWS Summit in Chicago where I first heard about AWS Lambda. The company I worked for used Linode at that time, so I had no chance of using it, but I still found the serverless concept fascinating.

aws-lambda

The bulk of my work at my current gig is about transforming data: we pull it from an API, then transform it and load it into our own data store. Sure, the worker boxes can do the job, but maintaining a series of these instances takes effort. AWS Lambda would be the perfect solution for us, but Amazon does not support Ruby natively, which is what most of our business logic is written in.

AWS, as of this writing, offers Lambda for three main platforms: Java, Node.js, and Python. I played around with running Clojure on it, which worked as the code is compiled into a jar file, but our current app - due to its monolithic nature - can’t support any other languages just yet.

Amazon claims you can run your language of choice on AWS Lambda, Ruby included, but I have not found a comprehensive guide that would describe how. Once you can package up your app to run as an executable, you can run it. I found this blog post that describes how Swift code can be bundled, deployed and invoked on AWS Lambda. It was clear to me that this solution would work; I only had to package Ruby with its own interpreter to accomplish the same. I looked for tools that can do this and found Traveling Ruby. You can package your code and run it as an executable on the user’s computer; no local Ruby installation is needed. I wanted to try it locally first, thinking that if it works there (on OSX), it should work on AWS Lambda as well.

This blog post is a step-by-step tutorial to run MRI Ruby on AWS Lambda. You can follow along with the accompanying project; I listed commit points at the end of each section.

This tutorial assumes you are familiar with AWS, you have access to the AWS Management Console and you have the AWS Command Line Interface configured to interact with your services via the terminal.
You'll need the same version of Ruby as the one Traveling Ruby offers. The latest there is Ruby 2.2.2; I'd recommend installing that through Rbenv or RVM.

1. Setting up the project

I named the project aws-lambda-ruby and created a directory structure like this:

- aws-lambda-ruby
    |- hello_ruby
         |- lib
             |- hello.rb

I put this code in the hello.rb file:

puts 'Hello from Ruby!'

I made sure my Ruby version in the project is 2.2.2 by setting it with Rbenv.

$: cd hello_ruby && ruby lib/hello.rb
Hello from Ruby!

Commit point

2. Execute the Ruby Code with Traveling Ruby

Create a directory under the project root directory with the name resources. Your directory structure should look like this:

- aws-lambda-ruby
    |- hello_ruby
    |- resources

Download the Ruby runtimes from Traveling Ruby's S3 bucket into the resources directory. I only needed the OSX version for local development and the Linux x86_64 version for AWS. My directory had these two files in it:

- aws-lambda-ruby
    |- hello_ruby
    |- resources
         |- traveling-ruby-20150715-2.2.2-linux-x86_64.tar.gz
         |- traveling-ruby-20150715-2.2.2-osx.tar.gz

Commit point

Create two new directories for assembling the project under OSX and Linux X86_64 like these:

- aws-lambda-ruby
    |- hello-1.0.0-linux-x86_64
    |- hello-1.0.0-osx
    |- hello_ruby
    |- resources

Add a Makefile to the project under the root directory, as we want to automate all the different steps as early as possible. Create a Make target to package and run the code on OSX like this:

OSXDIR=hello-1.0.0-osx

run: ## Runs the code locally
    @echo 'Run the app locally'
    @echo '-------------------'
    @rm -fr $(OSXDIR)
    @mkdir -p $(OSXDIR)/lib/ruby
    @tar -xzf resources/traveling-ruby-20150715-2.2.2-osx.tar.gz -C $(OSXDIR)/lib/ruby
    @mkdir $(OSXDIR)/lib/app
    @cp hello_ruby/lib/hello.rb $(OSXDIR)/lib/app/hello.rb
    @cp resources/wrapper.sh $(OSXDIR)/hello
    @chmod +x $(OSXDIR)/hello
    @cd $(OSXDIR) && ./hello

Traveling Ruby suggests running the app through an executable shell script, that's what the resources/wrapper.sh file is:

#!/bin/bash
set -e

# Figure out where this script is located.
SELFDIR="`dirname \"$0\"`"
SELFDIR="`cd \"$SELFDIR\" && pwd`"

# Run the actual app using the bundled Ruby interpreter.
exec "$SELFDIR/lib/ruby/bin/ruby" "$SELFDIR/lib/app/hello.rb"

If you have all the right files in the correct directories and your Makefile has the run target with the code above, this is what you should see in your terminal when you execute make run:

$: make run
Run the app locally
-------------------
Hello from Ruby!

We ran the Ruby code with the Ruby runtime packaged by Traveling Ruby, not with the locally installed Ruby that was set up with a Ruby version manager.
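One way to double-check which interpreter ran the script is to have Ruby report on itself. This is a small sketch, not part of the accompanying project, and the interpreter_info helper is a hypothetical name. If you temporarily put it into lib/hello.rb and execute make run, I would expect the printed path to point inside the project's lib/ruby directory rather than at your Rbenv or RVM installation:

```ruby
require 'rbconfig'

# Collects a few facts about the interpreter executing this file.
def interpreter_info
  {
    version: RUBY_VERSION,    # e.g. the 2.2.2 shipped by Traveling Ruby
    platform: RUBY_PLATFORM,  # OS and CPU the interpreter was built for
    path: RbConfig.ruby       # absolute path of the running ruby binary
  }
end

info = interpreter_info
puts "Ruby #{info[:version]} (#{info[:platform]}) at #{info[:path]}"
```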

Commit point

3. Package the Code for AWS Lambda

Now that the app runs locally on OSX, we need to package the code for AWS Lambda. You can easily check what the Lambda runtime environment looks like by running an AWS Lambda function with Python. Create a new AWS Lambda with the "hello-world-python" template with this Python code in it:

from __future__ import print_function

import json
import commands

print('Loading function')

def lambda_handler(event, context):
    print(commands.getstatusoutput('cat /etc/issue'))
    print(commands.getstatusoutput('uname -a'))
    print(commands.getstatusoutput('pwd'))

There are plenty of tutorials out there to guide you through creating an AWS Lambda; please Google the solution if you don’t know what to do. When you run it, this is the information you should get:

python-system-info
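The same facts can also be gathered from Ruby, which may be handy later for confirming that the packaged interpreter sees the same environment. Here is a minimal sketch using only the standard library. The system_info helper name is my own, Etc.uname needs Ruby 2.2 or newer, and reading /etc/issue assumes a Linux host:

```ruby
require 'etc'

# Gathers roughly what the Python function prints: kernel, CPU architecture,
# working directory and, where present, the contents of /etc/issue.
def system_info
  uname = Etc.uname # hash with :sysname, :nodename, :release, :version, :machine
  {
    kernel: "#{uname[:sysname]} #{uname[:release]}",
    machine: uname[:machine],
    working_dir: Dir.pwd,
    issue: File.readable?('/etc/issue') ? File.read('/etc/issue').strip : nil
  }
end

system_info.each { |key, value| puts "#{key}: #{value}" }
```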

We will use Node.js to execute the code. Place this JavaScript file in your resources directory with the name index.js:

process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT']

var exec = require('child_process').exec;
exports.handler = function(event, context) {
  var command = './hello';
  var child = exec(command, {env: {'LD_LIBRARY_PATH': __dirname + '/lib'}}, function(error) {
    // Resolve with result of process
    context.done(error, 'Process complete!');
  });
  // Log process stdout and stderr
  child.stdout.on('data', console.log);
  child.stderr.on('data', console.error);
};

Lambda will invoke index.handler, which spawns a new child process by executing the hello shell script, which in turn runs the Ruby code with Traveling Ruby.

The package Make target will assemble the directory for AWS Lambda and compress it into a zip file. This is how that code looks:

LAMBDADIR=hello-1.0.0-linux-x86_64

...

package: ## Package the code for AWS Lambda
    @echo 'Package the app for deploy'
    @echo '--------------------------'
    @rm -fr $(LAMBDADIR)
    @rm -fr deploy
    @mkdir -p $(LAMBDADIR)/lib/ruby
    @tar -xzf resources/traveling-ruby-20150715-2.2.2-linux-x86_64.tar.gz -C $(LAMBDADIR)/lib/ruby
    @mkdir $(LAMBDADIR)/lib/app
    @cp hello_ruby/lib/hello.rb $(LAMBDADIR)/lib/app/hello.rb
    @cp resources/wrapper.sh $(LAMBDADIR)/hello
    @chmod +x $(LAMBDADIR)/hello
    @cp resources/index.js $(LAMBDADIR)/
    @cd $(LAMBDADIR) && zip -r hello_ruby.zip hello index.js lib/
    @mkdir deploy
    @cd $(LAMBDADIR) && mv hello_ruby.zip ../deploy/
    @echo '... Done.'

...

I only list the content that I added; the run target is still in the Makefile, but I omitted it here for brevity. When you execute make package, you should see the following output:

$: make package
Package the app for deploy
--------------------------
... Done.

and a hello_ruby.zip file should be created in your deploy directory.

Commit point

4. Deploy the Packaged Ruby Code to AWS Lambda

We created a hello_ruby.zip file in the previous section, so let's deploy this zip file to AWS Lambda. Open the AWS Management Console and select "Lambda" from the options. Your created Lambdas (if you had any) are listed here. Let’s start creating a new one by clicking on the "Create a Lambda function" button. Select the "node-exec" template:

node-exec

Fill out the form as you see it in this screenshot:

create_function

  1. Name it "HelloFromRuby"
  2. Choose the option of "Upload a .ZIP file"
  3. Use the lambda_basic_execution role, if you don’t have it, create it

Confirm it and create the Lambda function.

Test the function by clicking on the blue "Test" button. You can accept the HelloWorld test template, those arguments are going to be ignored for now. You should see the following output:

log_output

The string "Hello from Ruby!" is coming from the Ruby code executed by Traveling Ruby, just like we did locally.

Woohoo! Congrats, you’ve just created an AWS Lambda function with MRI Ruby.

5. Use the AWS Command Line Interface to Publish an AWS Lambda Function

Although creating a Lambda through the GUI works, it's not something I'd do in the long run. The steps of dropping and creating Lambdas can be automated through the AWS Command Line Interface, and those scripts can be easily executed from a Make target. Let's add a new target to drop the already existing Lambda function:

(This blog post assumes you already know how to use the AWS Command Line Interface and have it configured properly. There is good documentation around this, so please look it up and set it up for yourself.)

...

delete: ## Removes the Lambda
    aws lambda delete-function --function-name HelloFromRuby

...

Your 'HelloFromRuby' Lambda function will be deleted when you run make delete in your terminal. Go back to the AWS Management Console to verify that your Lambda function got deleted.

Add your Lambda with the following script in your Makefile:

...

create: ## Creates an AWS lambda function
    aws lambda create-function \
        --function-name HelloFromRuby \
        --handler index.handler \
        --runtime nodejs4.3 \
        --memory-size 512 \
        --timeout 10 \
        --description "Saying hello from MRI Ruby" \
        --role arn:aws:iam::___xyz___:role/lambda_basic_execution \
        --zip-file fileb://./deploy/hello_ruby.zip

...

I masked the role argument, you need to find your correct "Role ARN" value under Security -> IAM -> Roles. You should look for it here:

role-arn

If everything is configured properly, you should be able to create your AWS Lambda function by running make create in the terminal.

We can invoke the Lambda from the command line as well. This Make target will do just that:

...

invoke: ## Invoke the AWS Lambda in the command line
    rm -fr tmp && mkdir tmp
    aws lambda invoke \
    --invocation-type RequestResponse \
    --function-name HelloFromRuby \
    --log-type Tail \
    --region us-east-1 \
    --payload '{"name":"John Adam Smith"}' \
    tmp/outfile.txt \
    | jq -r '.LogResult' | base64 -D

...

Please note that I am using jq, a lightweight JSON parser, to extract information from the response. You should see the following response from AWS Lambda:

START RequestId: e8c24c91-2165-11e6-a0b6-35430628271f Version: $LATEST
2016-05-24T04:13:46.403Z        e8c24c91-2165-11e6-a0b6-35430628271f

Hello from Ruby!

END RequestId: e8c24c91-2165-11e6-a0b6-35430628271f
REPORT RequestId: e8c24c91-2165-11e6-a0b6-35430628271f
       Duration: 214.12 ms
       Billed Duration: 300 ms
       Memory Size: 512 MB
       Max Memory Used: 20 MB
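The jq and base64 part of the invoke target pulls the LogResult field out of the JSON that aws lambda invoke prints and decodes it from Base64. Note that base64 -D is the OSX spelling of the flag; GNU coreutils uses -d. If you would rather do the same step in Ruby, a sketch could look like the following. The decode_log_result helper is hypothetical, and the sample response is built by hand so the snippet runs without AWS:

```ruby
require 'json'

# Extracts the Base64-encoded LogResult field from the CLI's JSON response
# and returns the decoded log text.
def decode_log_result(response_json)
  encoded = JSON.parse(response_json).fetch('LogResult')
  # 'm' is the Base64 directive of String#unpack (what Base64.decode64 uses).
  encoded.unpack('m').first.force_encoding('UTF-8')
end

# Hand-made stand-in for the real response, for illustration only.
sample_log = "Hello from Ruby!\n"
sample_response = JSON.generate({ 'StatusCode' => 200,
                                  'LogResult' => [sample_log].pack('m0') })

puts decode_log_result(sample_response)
```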

Commit point

This blog post guided you through the steps of running MRI Ruby on AWS Lambda. In the upcoming post, I'll show you how you can add gems and talk with an RDS instance from your Ruby code on AWS Lambda.