
Building and implementing a Single Sign-On solution

Most modern web applications start as a monolithic code base and, as complexity increases, the once small app gets split apart into many “modules”. In other cases, engineers opt for a SOA design approach from the beginning. One way or another, we start running multiple separate applications that need to interact seamlessly. My goal will be to describe some of the high-level challenges and solutions found in implementing a Single-Sign-On service.

Authentication vs Authorization

I wish these two words didn’t share the same root because it surely confuses a lot of people. My most frequently-discussed example is OAuth. Every time I start talking about implementing a centralized/unified authentication system, someone jumps in and suggests that we use OAuth. The challenge is that OAuth is an authorization system, not an authentication system.

It’s tricky, because you might actually be “authenticating” yourself to website X using OAuth. What you are really doing is allowing website X to use your information stored by the OAuth provider. It is true that OAuth offers a pseudo-authentication approach via its provider but that is not the main goal of OAuth: the Auth in OAuth stands for Authorization, not Authentication.

Here is how we could briefly describe each role:

  • Authentication: recognizes who you are.
  • Authorization: knows what you are allowed to do, or what you allow others to do.

If you feel stuck in your design and something seems wrong, ask yourself whether you might be confusing the two auth words. This article focuses only on authentication.

A Common Scenario

SSO diagram with 3 top applications connecting to an authorization service.

This is probably the most common structure, though I made it slightly more complex by drawing the three main apps in different programming languages. We have three web applications running on different subdomains and sharing account data via a centralized authentication service.


  • Keep authentication and basic account data isolated.
  • Allow users to stay logged in while browsing different apps.

Implementing such a system should be easy. That said, if you migrate an existing app to an architecture like that, you will spend 80% of your time decoupling your legacy code from authentication and wondering what data should be centralized and what should be distributed. Unfortunately, I can’t tell you what to do there since this is very domain specific. Instead, let’s see how to do the “easy part.”

Centralizing and Isolating Shared Account Data

At this point, you more than likely have each of your apps talking directly to shared database tables that contain user account data. The first step is to migrate away from doing that. We need a single interface that is the only entry point to create or update shared account data. Some of the data we have in the database might be app specific and therefore should stay within each app. Anything that is shared across apps should be moved behind the new interface.

Often your centralized authentication system will store the following information:

  • ID
  • first name
  • last name
  • login/nickname
  • email
  • hashed password
  • salt
  • creation timestamp
  • update timestamp
  • account state (verified, disabled …)

Do not duplicate this data in each app. Instead, have each app rely on the account ID to query data that is specific to a given account in the app. Technically, that means that instead of using SQL joins, you will query your database using the ID as part of the condition.

My suggestion is to do things slowly but surely. Migrate your database schema piece by piece, making sure that everything works fine. Once the other pieces are in place, you can migrate one code API at a time until your entire code base is moved over. You might want to change your DB credentials to only have read access, then no access at all.

Login workflow

Each of our apps already has a way for users to log in. We don’t want to change the user experience. Instead, we want to make a transparent modification so the authentication check is done in a centralized way instead of a local way. The easiest way to do that is to keep your current login forms, but instead of POSTing them to your local apps, we’ll POST them to a centralized authentication API. (SSL is strongly recommended.)

diagram showing the login workflow

As shown above, the login form now submits to an endpoint in the authentication application. The form will more than likely include a login or email and a clear text password as well as a hidden callback/redirect url so that the authentication API can redirect the user’s browser to the original app. For security reasons, you might want to white list the domains you allow your authentication app to redirect to.
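The domain whitelist mentioned above could be sketched as follows. This is only an illustration: the host names and the `allowed_redirect?` helper are assumptions made for the example, not part of any existing app.

```ruby
require "uri"

# Hypothetical list of hosts the authentication app is allowed to redirect to.
ALLOWED_REDIRECT_HOSTS = %w[app1.example.com app2.example.com].freeze

# Returns true only for absolute http(s) URLs whose host is whitelisted.
# URI::HTTPS is a subclass of URI::HTTP, so both schemes pass the type check.
def allowed_redirect?(url)
  uri = URI.parse(url.to_s)
  uri.is_a?(URI::HTTP) && ALLOWED_REDIRECT_HOSTS.include?(uri.host.to_s.downcase)
rescue URI::InvalidURIError
  false
end
```

The authentication endpoint would call a check like this before redirecting, and fall back to a default page when it returns false.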

Internally, the Authentication app will validate the identifier (email or login) by checking a hashed version of the clear password against the matching record in the account data. If the verification is successful, a token will be generated containing some user data (for instance: id, first name, last name, email, created date, authentication timestamp). If the verification fails, the token isn’t generated. Finally, the user’s browser is redirected to the callback/redirect URL provided in the request, with the token being passed.

You will want to protect the token in a way that allows the clients to verify and trust that it comes from a trusted source. A great solution for that would be to use RSA, with the public key available in all your client apps but the private key only available on the auth server(s). Other strong cryptographic solutions would also work. For instance, another appropriate approach would be to add a signature to the params sent back. This way the clients could check the authenticity of the params. HMAC or DSA signatures are great for that, but in some cases you don’t want people to see the content of the data you send back, and a signature alone does not hide it. That’s especially true if you are sending back a ‘mobile’ token, for instance. But that’s a different story. What’s important to consider is that we need a way to ensure that the data sent back to the client can’t be tampered with. You should also make sure you prevent replay attacks.
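To make the signature approach more concrete, here is a minimal sketch of a signed token using HMAC-SHA256 from Ruby’s OpenSSL bindings. The secret, the payload field names, the five-minute validity window and the helper names are all assumptions made for the example. Note that the payload is only encoded, not hidden, and the age check only narrows the window for replay attacks.

```ruby
require "openssl"
require "json"

# Hypothetical shared secret; a real app would load it from configuration.
TOKEN_SECRET = "change-me"

def hmac_for(data, secret)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, data)
end

# Compare two strings without returning early on the first differing byte.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Serialize the payload, encode it as URL-safe base64 and append the signature.
def sign_token(payload, secret = TOKEN_SECRET)
  data = [JSON.generate(payload)].pack("m0").tr("+/", "-_")
  "#{data}.#{hmac_for(data, secret)}"
end

# Returns the payload hash, or nil if the token is malformed, tampered or too old.
def verify_token(token, secret = TOKEN_SECRET, max_age = 300, now = Time.now.to_i)
  data, signature = token.to_s.split(".", 2)
  return nil if data.nil? || signature.nil?
  return nil unless secure_compare(hmac_for(data, secret), signature)
  payload = JSON.parse(data.tr("-_", "+/").unpack1("m0"))
  return nil if now - payload["authenticated_at"].to_i > max_age
  payload
rescue ArgumentError, JSON::ParserError
  nil
end
```

The auth server would call `sign_token` before redirecting, and each client app would call `verify_token` on the received param. With an asymmetric scheme, only the two signing and verification helpers would change.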

On the other side, the application receives a GET request with a token param. If the token is empty or can’t be decrypted, authentication failed. At that point, we need to show the user the login page again and let him/her try again. If on the other hand, the token can be decrypted, the content should be saved in the session so future requests can reuse the data.

We described the authentication workflow, but if a user logs in to application X, (s)he won’t be logged in to application Y or Z. The trick here is to set a top level domain cookie that can be seen by all applications running on subdomains. Certainly, this solution only works for apps on the same domain, but we’ll see later how to handle apps on different domains.

The cookie doesn’t need to contain a lot of data. Its value can contain the account id, a timestamp (to know when authentication happened) and a trusted signature. The signature is critical here since this cookie will allow users to be automatically logged in to other sites. I’d recommend HMAC or DSA to generate the signature. DSA, very much like RSA, is an asymmetric algorithm relying on a public/private key pair. This approach offers more security than something based on a shared secret, like HMAC. But that’s really up to you.

Finally, we need to set a filter in your application. This auto-login filter will check the presence of an auth cookie on the top level domain and the absence of local session. If that’s the case, a session is automatically created using the user id from the cookie value after the cookie integrity is verified. We could also share the session between all our apps, but in most cases, the data stored by each app is very specific and it’s safer/cleaner to keep the sessions isolated. The integration with an app running on a different service will also be easier if the sessions are isolated.
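The auto-login filter described above could look roughly like this. It is a sketch under several assumptions: the cookie is named `auth`, its value has the form `id|timestamp|signature`, the signature is an HMAC-SHA256 over the first two parts, and sessions and cookies are plain hashes rather than a framework’s objects.

```ruby
require "openssl"

# Hypothetical secret shared between the authentication app and the clients.
COOKIE_SECRET = "change-me"

def cookie_signature(data, secret = COOKIE_SECRET)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, data)
end

# Builds the shared cookie value: "account_id|timestamp|signature".
def auth_cookie_value(account_id, timestamp, secret = COOKIE_SECRET)
  data = "#{account_id}|#{timestamp}"
  "#{data}|#{cookie_signature(data, secret)}"
end

# Returns the account id if the cookie signature is valid, nil otherwise.
def account_id_from_cookie(value, secret = COOKIE_SECRET)
  id, timestamp, signature = value.to_s.split("|", 3)
  return nil unless id && timestamp && signature
  expected = cookie_signature("#{id}|#{timestamp}", secret)
  return nil unless expected.bytesize == signature.bytesize
  # Byte-wise comparison that does not stop at the first mismatch.
  diff = 0
  expected.bytes.zip(signature.bytes) { |x, y| diff |= x ^ y }
  diff.zero? ? id : nil
end

# Auto-login filter: only acts when there is no local session yet.
def auto_login(session, cookies)
  return session if session[:account_id]
  id = account_id_from_cookie(cookies["auth"])
  session[:account_id] = id if id
  session
end
```

In a real app this logic would live in a before filter or a Rack middleware, and the timestamp could also be checked to expire old cookies.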



For registration, as for login, we can take one of two approaches: point the user’s browser to the auth API or make S2S (server to server) calls from within our apps to the Authentication app. POSTing a form directly to the API is a great way to reduce duplicated logic and traffic on each client app so I’ll demonstrate this approach.

As you can see, the approach is the same one we used for login. The difference is that instead of returning a token, we just return some params (id, email and potential errors). The redirect/callback url will also obviously be different than for login. You could decide to encrypt the data you send back, but in this scenario, what I would do is set an auth cookie at the top level when the account is created so the “client” application can auto-login the user. The information sent back in the redirect is used to re-display the register form with the error information and the email entered by the user.

At this point, our implementation is almost complete. We can create an account and log in using the defined credentials. Users can switch from one app to another without having to log in again because we are using a shared signed cookie that can only be created by the authentication app and can be verified by all “client” apps. Our code is simple, safe and efficient.

Updating or deleting an account

The next thing we will need is to update or delete an account. In this case, this is something that needs to be done between a “client” app and the authentication/accounts app. We’ll make S2S (server to server) calls. To ensure the security of our apps and to offer a nice way to log requests, API tokens/keys will be used by each client to communicate with the authentication/accounts app. The API key can be passed using an X-header so this concern stays out of the request params, and our code can process the authentication via X-header separately from the actual service implementation. S2S services should have a filter verifying and logging the API requests based on the key sent with the request. The rest is straightforward.
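A filter of that kind could be sketched as below. The header name `X-Api-Key`, the key registry and the in-memory log are assumptions made for the example; a real service would read the header from its web framework and write to a proper logger.

```ruby
# Hypothetical registry mapping API keys to the client apps that own them.
API_CLIENTS = {
  "key-for-app-x" => "app_x",
  "key-for-app-y" => "app_y"
}.freeze

# Returns the client name for a known key, or nil when the request must be rejected.
# Every attempt is appended to the log so S2S calls can be audited per client.
def authenticate_client(headers, log = [])
  client = API_CLIENTS[headers["X-Api-Key"]]
  log << (client ? "S2S request from #{client}" : "rejected S2S request")
  client
end
```

The service implementation then only runs when the filter returns a client, which keeps authentication separate from the service code itself.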

Using different domains

Until now, we assumed all our apps were on the same top domain. In reality, you will often find yourself with apps on different domains. This means that you can’t use the shared signed cookie approach anymore. However, there is a simple trick that will allow you to avoid requiring your users to re-login as they switch apps.


When a local session isn’t present, the trick consists of using an iframe in the application that lives on the different domain. The iframe loads a page from the authentication/accounts app, which verifies that a valid cookie was set on the main top domain. If that is the case, we can tell the application that the user is already globally logged in, and we can tell the iframe host to redirect to an application end point, passing an auth token the same way we did during the authentication. The app would then create a session and redirect the user back to where (s)he started. The next requests will see the local session and this process will be skipped.

If the authentication application doesn’t find a signed cookie, the iframe can display a login form or redirect the iframe host to a login form depending on the required behavior.

Something to keep in mind when using multiple apps and domains is that you need to keep the shared cookies/sessions in sync, meaning that if you log out from an app, you need to also delete the auth cookie to ensure that users are globally logged out. (It also means that you might always want to use an iframe to check the login status and auto-logoff users).


Mobile clients

Another part of implementing an SSO solution is handling mobile clients. Mobile clients need to be able to register/login and update accounts. However, unlike S2S service clients, mobile clients should only be allowed to make calls that modify data on behalf of a given user. To do that, I recommend providing opaque mobile tokens during the login process. This token can then be sent with each request in an X-header so the service can authenticate the user making the request. Again, SSL is strongly recommended.

In this approach, we don’t use a cookie and we actually don’t need an SSO solution, but a unified authentication system.
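An opaque token is simply a random value that means nothing by itself and is looked up on the server. Here is a minimal sketch, assuming an in-memory store; the `MobileTokenStore` class and its method names are made up for the example, and a real service would persist the tokens.

```ruby
require "securerandom"

# Hypothetical in-memory store mapping opaque tokens to account ids.
class MobileTokenStore
  def initialize
    @tokens = {}
  end

  # Called once the login succeeded; returns the token to send to the device.
  def issue(account_id)
    token = SecureRandom.hex(32)
    @tokens[token] = account_id
    token
  end

  # Called on each mobile API request with the token read from the X-header.
  def account_for(token)
    @tokens[token]
  end

  # Called on logout or when a device is lost.
  def revoke(token)
    @tokens.delete(token)
  end
end
```

Because the token carries no data, nothing about the account leaks if it is intercepted, and revoking it is just a matter of deleting the record.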


Writing web services

Our Authentication/Accounts application turns out to be a pure web API app.

We also have 3 sets of APIs:

  • Public APIs: can be accessed from anywhere, no authentication required
  • S2S APIs: authenticated via API keys and only available to trusted clients
  • Mobile APIs: authenticated via a mobile token and limited in scope.

We don’t need dynamic HTML views, just simple web service related code. While this is a little bit off topic, I’d like to take a minute to show you how I personally like writing web service applications.

Something that I care a lot about when I implement web APIs is validating incoming params. This is an opinionated approach that I picked up while at Sony and that I think should be used every time you implement a web API. As a matter of fact, I wrote a Ruby DSL library (Weasel Diesel) allowing you to describe a given service, its incoming params, and the expected output. This DSL is hooked into a web backend so you can implement services using a web engine such as Sinatra or maybe Rails3. Based on the DSL usage, incoming parameters are verified before being processed. The other advantage is that you can generate documentation based on the API description as well as automated tests.

You might be familiar with Grape, another DSL for web services. Besides the obvious style difference, Weasel Diesel offers the following advantages:

  • input validation/sanitization
  • service isolation
  • generated documentation
  • contract based design

Here is a hello world webservice being implemented using Weasel Diesel and Sinatra:

describe_service "hello_world" do |service|
  service.formats :json
  service.http_verb :get
  service.disable_auth # on by default

  # Incoming params
  service.param.string :name, :default => 'World'

  # Expected output
  service.response do |response|
    response.object do |obj|
      obj.string :message, :doc => "The greeting message sent back. Defaults to 'World'"
      obj.datetime :at, :doc => "The timestamp of when the message was dispatched"
    end
  end

  # Documentation
  service.documentation do |doc|
    doc.overall "This service provides a simple hello world implementation example."
    doc.param :name, "The name of the person to greet."
    doc.example "<code>curl -I 'http://localhost:9292/hello_world?name=Matt'</code>"
  end

  # Implementation
  service.implementation do
    {:message => "Hello #{params[:name]}", :at => Time.now}.to_json
  end
end

Basic test validating the contract defined in the DSL and the actual output when the service is called:

class HelloWorldTest < MiniTest::Unit::TestCase
  def test_response
    TestApi.get "/hello_world", :name => 'Matt'
  end
end

Generated documentation:

If the DSL and its features seem appealing to you and you are interested in digging more into it, the easiest way is to fork this demo repo and start writing your own services.

The DSL has been used in production for more than a year, but there certainly are tweaks and small changes that can make the user experience even better. Feel free to fork the DSL repo and send me Pull Requests.



Quick dive into Ruby ORM object initialization

Yesterday I did some quick digging into how ORM objects are initialized and the performance cost associated to that. In other words, I wanted to see what’s going on when you initialize an ActiveRecord object.

Before I show you the benchmark numbers and you jump to conclusions, it’s important to realize that in the grand scheme of things, the performance cost we are talking about is small enough that it is certainly not the main reason why your application is slow. Spoiler alert: ActiveRecord is slow, but the cost of initialization is far from the worst part of ActiveRecord. Also, even though this article doesn’t make ActiveRecord look good, I’m not trying to diss it. It’s a decent ORM that does a great job in most cases.

Let’s start with the benchmark numbers to give us an idea of the damage (using Ruby 1.9.3 p125):


                                                             | Class | Hash  | AR 3.2.1 | AR no protection | Datamapper | Sequel |
.new() x100000                                               | 0.037 | 0.049 | 1.557    | 1.536            | 0.027      | 0.209  |
.new({:id=>1, :title=>"Foo", :text=>"Bar"}) x100000          | 0.327 | 0.038 | 6.784    | 5.972            | 4.226      | 1.986  |


You can see that I am comparing the allocation of a Class instance, a Hash and some ORM models. The benchmark suite tests the allocation of an empty object and one with passed attributes. The benchmark in question is available here.
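For readers who want to try this kind of measurement without installing any ORM, here is a minimal sketch limited to the plain class and Hash columns. It is not the benchmark used for the table above: the `Post` class and the labels are assumptions made for the example, and the numbers will differ from one machine and Ruby version to another.

```ruby
require "benchmark"

# Hypothetical plain Ruby model standing in for the "Class" column.
class Post
  attr_accessor :id, :title, :text

  def initialize(attrs = {})
    attrs.each { |name, value| public_send("#{name}=", value) }
  end
end

ITERATIONS = 100_000
attrs = { :id => 1, :title => "Foo", :text => "Bar" }

# Wall-clock time, in seconds, for each way of allocating an object.
results = {
  "Post.new"        => Benchmark.realtime { ITERATIONS.times { Post.new } },
  "Post.new(attrs)" => Benchmark.realtime { ITERATIONS.times { Post.new(attrs) } },
  "Hash.new"        => Benchmark.realtime { ITERATIONS.times { Hash.new } },
  "attrs.dup"       => Benchmark.realtime { ITERATIONS.times { attrs.dup } }
}

results.each { |label, seconds| puts format("%-16s %.3f", label, seconds) }
```

Adding an ORM model to the `results` hash is then a matter of requiring the library and timing `Model.new` the same way.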

As you can see, there seems to be a huge performance difference between allocating a basic class and an ORM class. With attributes passed, instantiating an ActiveRecord class is about 20x slower than instantiating a normal class (6.784 vs 0.327), and the gap is even wider for an empty object (1.557 vs 0.037). While ActiveRecord offers some extra features, why is it so much slower, especially at initialization time?

The best way to figure it out is to profile the initialization. For that, I used perftools.rb and I generated a graph of the call stack.

Here is what Ruby does (and spends its time) when you initialize a new Model instance (click to download the PDF version):


Profiler diagram of AR model instantiation by Matt Aimonetti


This is quite a scary graph, but it nicely shows the features you are getting and their associated cost. For instance, the option of having before and after initialization callbacks costs you 14% of your CPU time per instantiation, even though you probably almost never use these callbacks. I’m reading that from the node called ActiveSupport::Callbacks#run_callbacks, 3rd level from the top. So 14.1% of the CPU time is spent trying to run callbacks. As a quick note, 90.1% of the CPU time is spent initializing objects; the rest is spent in the loop and in garbage collection (because the profiler runs many loops). You can then follow the code and see how it works: it creates a dynamic class callback method on the fly (the one with the long name) and then recreates the name of this callback to call it each time the object is allocated. That sounds like a good place for some micro-optimizations, which could yield up to a 14% performance increase in some cases.

Another major part of the CPU time is spent in ActiveModel’s sanitization. This is the piece of code that allows you to block some model attributes from being mass assigned. This is useful when you don’t want to sanitize your incoming params but want to create or update a model instance by using all the passed user params. To prevent malicious users from modifying some specific params that might be in your model but not in your form, you can protect these attributes. A good example would be an admin flag on a User object. That said, if you manually initialize an instance, you don’t need this extra protection. That’s why, in the benchmark above, I tested with and without the protection. As you can see, it makes quite a big difference. The profiler graph of the same initialization without the mass assignment protection logically ends up looking quite different:
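To illustrate what mass assignment protection means, here is a toy version of the idea in plain Ruby. It is not how ActiveModel implements it; the `User` class, the `PROTECTED_ATTRIBUTES` constant and the `protect` argument are assumptions made for the example.

```ruby
# Toy model showing the idea of protecting attributes from mass assignment.
class User
  # Attributes that user-supplied params are never allowed to set.
  PROTECTED_ATTRIBUTES = [:admin].freeze

  attr_accessor :name, :email, :admin

  # With protect set to true, protected attributes are silently skipped.
  def initialize(attrs = {}, protect = true)
    attrs.each do |key, value|
      next if protect && PROTECTED_ATTRIBUTES.include?(key.to_sym)
      public_send("#{key}=", value)
    end
  end
end

# Params as they could come from a malicious form submission.
params = { "name" => "Matt", "admin" => true }
from_form = User.new(params)        # admin flag is ignored
by_hand   = User.new(params, false) # trusted code can set everything
```

The check has to run for every attribute of every instance, which gives an intuition for why skipping it shows up in the benchmark.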


Matt Aimonetti shows the stack trace generated by the instantiation of an Active Record model


Update: My colleague Glenn Vanderburg pointed out that some people might assume that the shown code path is called for each record loaded from the database. This isn’t correct: the graph represents instances allocated by calling #new. See the addition at the bottom of the post for more details about what’s going on when you fetch data from the DB.

I then decided to look at the graphs for the two other popular Ruby ORMs, DataMapper and Sequel.



While I didn’t give you much insight into ORM code, I hope that this post will motivate you to sometimes take a look under the covers and profile your code to see what’s going on and why it might be slow. Never assume, always measure. Tools such as perftools are a great way to get visual feedback and a better understanding of how the Ruby interpreter is handling your code.


I heard you liked graphs so I added some more, here is what’s going on when you do Model.first:




And finally this is the code graph for a call to Model.instantiate which is called after a record was retrieved from the database to convert into an Object. (You can see the #instantiate call referenced in the graph above).




My RubyConf 2011 talk is online

I realize I forgot to mention that my RubyConf talk is now online on the confreaks site (wait until the end, Matz actually answers a question from the audience).

Photo of Matt Aimonetti giving a talk at RubyConf 2011 with one of his slides showing how thread scheduling works

I wrote a couple follow up posts you might also be interested in:


Go’s reflection example

The Go programming language is a really cool language by Google. According to the sales pitch, it’s a “fast, statically typed, compiled language that feels like a dynamically typed, interpreted language”. Well, if you are like me, you don’t trust sales pitches because you know that the people writing them don’t care about you, they care about their product. However cynical you are, you still have to check the facts. So here is a quick demonstration showing how to use Go’s reflection feature.

Installing Go is actually really straightforward on a Mac, and slightly harder on Linux. Check this guide to see how to build Go in a few minutes.

Once you are all set up, you might want to read the documentation to see how to code in Go. Go is actually a kind of nice version of C with a simplified syntax, no header files, really fast compilation time, a garbage collector and a simple way to approach object inheritance without turning into the complicated mess C++ is. The language is designed around the concept of goroutines, a very nice way to handle concurrency. It also has some features that Rubyists, Pythonistas and Javascripters wouldn’t want to live without, such as closures, and some they probably wish they had, such as defer. But one of the things we are used to with dynamic languages is the concept of reflection. In a nutshell, at runtime, your code can reflect on the type of a given object and let the developer act accordingly. Depending on your programming background, that might be obvious or you might not see the value. To be honest, that’s not the question here. What I’m interested in showing you is how it works.

For the sake of this demo, let’s pretend we want to have a “Dish” data model, each instance of the “Dish” type will have a few attributes, an id, a name, an origin and a custom query which really is a function that we store as an attribute. Here is how we would represent that model in Go:

// Data Model
type Dish struct {
  Id  int
  Name string
  Origin string
  Query func()
}

This is more or less the equivalent of the following Ruby code:

class Dish
  attr_accessor :id, :name, :origin, :query
end

Ruby works slightly differently in the sense that defining attribute accessors creates getter and setter methods but doesn’t technically create instance variables until they are used. Here is what I mean:

shabushabu = Dish.new
shabushabu.instance_variables # => []
shabushabu.name = "Shabu-Shabu"
shabushabu.instance_variables # => ["@name"]
shabushabu.origin = "Japan"
shabushabu.instance_variables # => ["@name", "@origin"]

Another way of checking on the accessors is to check the methods defined on the object:

shabushabu.methods - Object.instance_methods
# => ["name", "name=", "origin", "origin=", "id=", "query", "query="]

But anyway, this post isn’t about Ruby, it’s about Go and what we would like is to reflect on an object of “Dish” type and see its attributes. The good news is that the Go language ships with a package to do just that. Here is the full implementation:

package main

import (
  "fmt"
  "reflect"
)

func main() {
  // iterate through the attributes of a Data Model instance
  for name, mtype := range attributes(&Dish{}) {
    fmt.Printf("Name: %s, Type %s\n", name, mtype.Name())
  }
}

// Data Model
type Dish struct {
  Id  int
  Name string
  Origin string
  Query func()
}

// Example of how to use Go's reflection
// Return the attributes of a Data Model
func attributes(m interface{}) map[string]reflect.Type {
  typ := reflect.TypeOf(m)
  // if a pointer to a struct is passed, get the type of the dereferenced object
  if typ.Kind() == reflect.Ptr {
    typ = typ.Elem()
  }

  // create an attribute data structure as a map of types keyed by a string.
  attrs := make(map[string]reflect.Type)
  // Only structs are supported so return an empty result if the passed object
  // isn't a struct
  if typ.Kind() != reflect.Struct {
    fmt.Printf("%v type can't have attributes inspected\n", typ.Kind())
    return attrs
  }

  // loop through the struct's fields and set the map
  for i := 0; i < typ.NumField(); i++ {
    p := typ.Field(i)
    if !p.Anonymous {
      attrs[p.Name] = p.Type
    }
  }

  return attrs
}

Unfortunately, my code highlighter doesn’t support the Go syntax, but GitHub does, so here is a pretty version.

There are ways of running Go source code like Ruby or Python scripts, but in this case we’ll use the compiler and linker provided with Go. I named my source file “example.go”, and here is how I compiled, linked and ran it:

$ 6g example.go && 6l example.6 && ./6.out
Name: Origin, Type string
Name: Id, Type int
Name: Query, Type 
Name: Name, Type string

As you can see each attribute is printed out with its name and type. The code might seem a bit odd if you never looked at Go before.
Here is a quick rundown of the code:

In our main function, we create a new instance of type Dish on which we call attributes. The call returns a map which we iterate through, printing the attribute name (key) and type (value).
The attributes function is defined a bit below. It takes any type of object (empty interface) and returns a map, which is like a Hash or a Dictionary. The map has keys of String type and values of “Type” type. The “Type” type is defined in the reflect package. Inside the function, we then use the previously mentioned reflect package to check on the type and the name of each attribute and assign it to a map object. (Note that I’m explicitly returning the map, but I could have done it in a more implicit way.)

So there you go, that’s how you use reflection in Go. Pretty nifty and simple.



MacRuby tips: browse for folder or file dialog

This is yet another pretty simple tip.
Use case: let’s say you want your application’s users to choose one or multiple files or folders on their file system. A good example would be letting the user choose a file to process or a folder in which to save some data.

In the example above, I added a browse button and a text field.

I would like my users to click on the browse button, locate a folder and display it in the text field.

In your MacRuby controller, use a simple action method as well as an accessor to the text field:

attr_accessor :destination_path

def browse(sender)
end

Now, in Interface Builder, bind the destination_path outlet to the text field you want to use to display the path and bind the button to the browse action.

Let’s go back to our action method and let’s create a dialog panel, set some options and handle the user selection:

def browse(sender)
  # Create the File Open Dialog class.
  dialog = NSOpenPanel.openPanel
  # Disable the selection of files in the dialog.
  dialog.canChooseFiles = false
  # Enable the selection of directories in the dialog.
  dialog.canChooseDirectories = true
  # Disable the selection of multiple items in the dialog.
  dialog.allowsMultipleSelection = false
  # Display the dialog and process the selected folder
  if dialog.runModalForDirectory(nil, file:nil) == NSOKButton
    # if we had allowed the selection of multiple items,
    # we would want to loop through the selection
    destination_path.stringValue = dialog.filenames.first
  end
end

That’s it, your user can now browse for a folder and the selection will be displayed in the text field. Look at the NSOpenPanel documentation for more details on the Cocoa API.


MacRuby tips: capturing keyboard events

If you are writing any type of games you might want your users to interact with your application using their keyboards.

This is actually not that hard. The approach is simple and straightforward if you are used to Cocoa.

Everything starts in Interface Builder: add a custom view instance to your window.

Now switch to your project and add a new file with a class called KeyboardControlView and make it inherit from NSView. We are creating a subclass of NSView so we will be able to make our top view “layer” use this subclass.

class KeyboardControlView < NSView
  attr_accessor :game_controller

  def acceptsFirstResponder
    true
  end
end

As you can see in the example above, I added an attribute accessor. The attr_accessor class method creates getters and setters. It’s basically the same as writing:

def game_controller=(value)
  @game_controller = value
end

def game_controller
  @game_controller
end

MacRuby keeps an eye on these accessors and lets you bind outlets to them.
But let’s not get ahead of ourselves, we’ll keep that for another time.

Let’s go back to our newly created class. Notice that we also added a method called `acceptsFirstResponder` that returns true. acceptsFirstResponder returns false by default, but in this case we want it to return true so our new class instance can be first in the responder chain.

Now that our class is ready, let’s go back to Interface Builder, select our new custom view and click on the inspector button.

Click on the (i) icon and in the Class field choose our new KeyboardControlView.
Yep, our new class just shows up by magic, it’s also called the lrz effect, just don’t ask ;)
So now when our application starts, a new instance of our NSView class is created and Cocoa will call different methods based on events triggered.

The two methods we are interested in reimplementing are keyDown and keyUp. They get called when a key gets pressed or released.

def keyDown(event)
  characters = event.characters
  if characters.length == 1 && !event.isARepeat
    character = characters.characterAtIndex(0)
    if character == NSLeftArrowFunctionKey
      puts "LEFT pressed"
    elsif character == NSRightArrowFunctionKey
      puts "RIGHT pressed"
    elsif character == NSUpArrowFunctionKey
      puts "UP pressed"
    elsif character == NSDownArrowFunctionKey
      puts "DOWN pressed"
    end
  end
  # Pass the event back to the original implementation
  super
end

I don’t think the code above needs much explanation. The only things that you might not understand are ‘event.isARepeat’ and the ‘super’ call. ‘event.isARepeat’ returns true if the user left his/her finger on the key. As for the ‘super’ call at the end of the method: we reopened a method that was already defined and we don’t want to overwrite it. We just want to inject our code within the existing method, so once we are done handling the event, we pass it back to the original method.

Final result:

class KeyboardControlView < NSView
  attr_accessor :game_controller

  def acceptsFirstResponder
    true
  end

  # Deals with keyboard keys being pressed
  def keyDown(event)
    characters = event.characters
    if characters.length == 1 && !event.isARepeat
      character = characters.characterAtIndex(0)
      if character == NSLeftArrowFunctionKey
        puts "LEFT pressed"
      elsif character == NSRightArrowFunctionKey
        puts "RIGHT pressed"
      elsif character == NSUpArrowFunctionKey
        puts "UP pressed"
      elsif character == NSDownArrowFunctionKey
        puts "DOWN pressed"
      end
    end
    # Pass the event back to the original implementation
    super
  end

  # Deals with keyboard keys being released
  def keyUp(event)
    characters = event.characters
    if characters.length == 1
      character = characters.characterAtIndex(0)
      if character == NSLeftArrowFunctionKey
        puts "LEFT released"
      elsif character == NSRightArrowFunctionKey
        puts "RIGHT released"
      elsif character == NSUpArrowFunctionKey
        puts "UP released"
      elsif character == NSDownArrowFunctionKey
        puts "DOWN released"
      end
    end
  end
end

Now it’s up to you to handle the other keystrokes and do whatever you want. That’s it for this tip, I hope it helps.


Ruby, Rack and CouchDB = lots of awesomeness

Over the weekend, I spent some time working on a Ruby + Rack + CouchDB project: three technologies that I know quite well but had never put to work together at the same time, at least not directly. Let’s call this Part I.

Before we get started, let me introduce each component:

  • Ruby: if you are reading this blog, you more than likely know at least a little bit about what I consider one of the most enjoyable programming languages out there. It’s also a very flexible language that lets us do some interesting things. I could have chosen Python to do the same project but that’s a whole different topic. For this project we will do something Ruby excels at: reopening existing classes and injecting more code.
  • Rack: a webserver interface written in Ruby and inspired by Python’s WSGI. Basically, it’s a defined API to interact between webservers and web frameworks. It’s used by most common Ruby web frameworks, from Sinatra to Rails (btw, Rails3 is going to be even more Rack-focused than it already is). So, very simply put, the webserver receives a request, passes it to Rack, that converts it, passes it to your web framework and the web framework sends a response in the expected format (more on Rack later).
  • CouchDB: Apache’s document-oriented database. RESTful API, schema-less, written in Erlang with built-in support for map/reduce. For this project, I’m using CouchRest, a Ruby wrapper for Couch.

Goal: Log Couch requests and analyze data

Let’s say we have a Rails, Sinatra or Merb application and we are using CouchRest (maybe we are using CouchRest and ActiveRecord, but let’s ignore that for now).

Everything works fine but we would like to profile our app a little and maybe optimize the DB usage. The default framework loggers don’t support Couch. The easy way would be to tail the Couch logs or look at the logs in CouchDBX. Now, while that works, we can’t really see what DB calls are made per action, so it makes any optimization work a bit tedious. (Note that Rails3 will have some better conventions for logging, making things even easier)

So, let’s see how to fix that. Let’s start by looking at Rack.

Rack Middleware

Instead of hacking a web framework specific solution, let’s use Rack. Rack is dead simple, you just need to write a class that has a call method.
In our case, we don’t care about modifying the response, we just want to instrument our app. We just want our middleware to be transparent and let our webserver deal with it normally.
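
A minimal sketch of such a transparent middleware. The class name PassThrough and the lambda standing in for the real application are made up for illustration; the only contract assumed is Rack's: initialize receives the next app in the chain, and call(env) returns a [status, headers, body] triplet.

```ruby
# Transparent Rack middleware: it keeps a reference to the next
# application in the chain and forwards every request untouched.
class PassThrough # hypothetical name
  def initialize(app)
    @app = app
  end

  def call(env)
    # Call the rest of the chain and return its response as-is
    @app.call(env)
  end
end

# Stand-in for the real application, for demonstration only
app = lambda { |env| [200, { "Content-Type" => "text/plain" }, ["Hello"]] }

status, headers, body = PassThrough.new(app).call("PATH_INFO" => "/")
```

In a rackup file, a middleware like this would typically be added with a use line placed before the run line.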

Here we go … that wasn’t hard, was it? We keep the application reference in the @app variable when a new instance of the middleware is created. Then when the middleware is called, we just call the rest of the chain and pretend nothing happened.
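
Adding logging around the request could look like the sketch below. The class name RequestTimer, the log line format and the injectable output stream are assumptions made for illustration, not necessarily what the original middleware printed.

```ruby
require "stringio"

# Middleware that times each request and prints one line per request.
class RequestTimer # hypothetical name
  def initialize(app, out = $stdout)
    @app = app
    @out = out
  end

  def call(env)
    started_at = Time.now
    response = @app.call(env)
    duration = Time.now - started_at
    @out.puts "#{env['REQUEST_METHOD']} #{env['PATH_INFO']} took #{duration}s"
    # Hand the untouched response back to the webserver
    response
  end
end

# Stand-in application and an in-memory output stream, for demonstration only
app = lambda { |env| [200, {}, ["ok"]] }
out = StringIO.new
response = RequestTimer.new(app, out).call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/posts")
```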

As you can see, we just added some logging info around the request. Let’s do one better and save the logs in CouchDB:
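
Here is a rough sketch of that idea. It assumes the database object responds to save_doc, as a CouchRest database does; the class names, the document keys and the choice to keep only the string values of the environment (so the document converts cleanly to JSON) are illustrative assumptions. MemoryDB is a stand-in so the sketch runs without a CouchDB server.

```ruby
# Middleware that saves one document per request instead of printing it.
class CouchRequestLogger # hypothetical name
  def initialize(app, db)
    @app = app
    @db = db # anything responding to save_doc, e.g. a CouchRest database
  end

  def call(env)
    started_at = Time.now
    response = @app.call(env)
    ended_at = Time.now
    @db.save_doc(
      "started_at" => started_at.to_f,
      "ended_at"   => ended_at.to_f,
      "duration"   => ended_at - started_at,
      "env"        => serializable(env)
    )
    response
  end

  private

  # Assumption: drop IO objects and other non-string values from the env
  def serializable(env)
    env.select { |_key, value| value.is_a?(String) }
  end
end

# In the rackup file, with the couchrest gem loaded (placeholder URL and app):
#   use CouchRequestLogger, CouchRest.database!("http://127.0.0.1:5984/rack_logs")
#   run MyApp

# Stand-in database that only collects the documents in memory
class MemoryDB
  attr_reader :docs

  def initialize
    @docs = []
  end

  def save_doc(doc)
    @docs << doc
    doc
  end
end

db = MemoryDB.new
app = lambda { |env| [200, {}, ["ok"]] }
env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => "/", "rack.input" => $stdin }
response = CouchRequestLogger.new(app, db).call(env)
```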

Again, nothing complicated. In our rackup file we defined which Couch database to use and we passed it to our middleware (we change our initialize method signature to take the DB).
Finally, instead of printing out the logs, we are saving them to the database.

W00t! At this point all our requests have been saved in the DB with all the data there, ready to be manipulated by some map/reduce views we will write. For the record, you might want to use the bulk_save approach, which waits for X records to accumulate and then saves them to the DB all at once. Couch also lets you send new documents as they come but only write them to the DB every X documents or X seconds.

As you can see, our document contains the timestamps and the full environment as a hash.

All of that is nice, but even though we get a lot of information, we could not actually see any of the DB calls made in each request. Let’s fix that and inject our logger in CouchRest (you could apply the same approach to any adapter).

Let’s reopen the HTTP Abstraction layer class used by CouchRest and inject some instrumentation:
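
The wrapping technique can be sketched as follows. FakeHttp is a made-up stand-in, not CouchRest's actual HTTP layer, and Module#prepend (Ruby 2.0+) is one way to guarantee that super reaches the original method; treat it as an illustration of the idea rather than the exact instrumentation.

```ruby
# Stand-in for an adapter's HTTP layer (NOT CouchRest's real module)
module FakeHttp
  def self.get(uri)
    "response for #{uri}"
  end
end

# Instrumentation wrapped around the original method
module HttpInstrumentation
  # In-memory log of the calls made, one hash per call
  def self.log
    @log ||= []
  end

  def get(uri)
    started_at = Time.now
    response = super # calls the original FakeHttp.get with the same argument
    HttpInstrumentation.log << {
      :method   => :get,
      :uri      => uri,
      :duration => Time.now - started_at
    }
    response
  end
end

# Prepend onto the singleton class so module-level calls hit the wrapper first
class << FakeHttp
  prepend HttpInstrumentation
end

result = FakeHttp.get("/mydb/doc-1")
```

The same pattern would be repeated for the other HTTP verbs, and the log entries could be attached to the current request's document instead of an in-memory array.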

Again, nothing fancy, we are just opening the module, reopening the methods and wrapping our code around the super call (for those who don’t know, super calls the original method).

This is all for Part I. In Part II, we’ll see how to process the logs and make all that data useful.

By the way, if you make it to RailsSummit, I will be giving a talk on Rails3 and the new exciting stuff you will be able to do, including Rack-based stuff, CouchDB, MongoDB, the new DataMapper, etc.
