Quick dive into Ruby ORM object initialization

Yesterday I did some quick digging into how ORM objects are initialized and the performance cost associated with that. In other words, I wanted to see what’s going on when you initialize an ActiveRecord object.

Before I show you the benchmark numbers and you jump to conclusions, it’s important to realize that in the grand scheme of things, the performance cost we are talking about is small enough that it is certainly not the main reason why your application is slow. Spoiler alert: ActiveRecord is slow, but the cost of initialization is far from the worst part of ActiveRecord. Also, even though this article doesn’t make ActiveRecord look good, I’m not trying to diss it. It’s a decent ORM that does a great job in most cases.

Let’s start with the benchmark numbers to get an idea of the damage (using Ruby 1.9.3 p125):


                                                             | Class | Hash  | AR 3.2.1 | AR no protection | Datamapper | Sequel |
.new() x100000                                               | 0.037 | 0.049 | 1.557    | 1.536            | 0.027      | 0.209  |
.new({:id=>1, :title=>"Foo", :text=>"Bar"}) x100000          | 0.327 | 0.038 | 6.784    | 5.972            | 4.226      | 1.986  |


You can see that I am comparing the allocation of a Class instance, a Hash and some ORM models. The benchmark suite tests the allocation of an empty object and one with passed attributes. The benchmark in question is available here.
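To give a feel for what such a comparison looks like, here is a minimal sketch using only Ruby’s standard Benchmark module. It is not the benchmark linked above: the PlainPost class, the ATTRS constant and the timings helper are names made up for this illustration, and the ORM columns are left out so the snippet runs without any gems.

```ruby
require 'benchmark'

# Hypothetical plain class standing in for the "Class" column of the table.
class PlainPost
  attr_accessor :id, :title, :text

  def initialize(attrs = {})
    # Assign each passed attribute through its writer method.
    attrs.each { |name, value| send("#{name}=", value) }
  end
end

ITERATIONS = 100_000
ATTRS = { :id => 1, :title => "Foo", :text => "Bar" }

# Returns a Hash of label => elapsed wall-clock seconds.
def timings(iterations = ITERATIONS)
  {
    "Class.new()"      => Benchmark.realtime { iterations.times { PlainPost.new } },
    "Hash.new()"       => Benchmark.realtime { iterations.times { Hash.new } },
    "Class.new(attrs)" => Benchmark.realtime { iterations.times { PlainPost.new(ATTRS) } },
    "Hash from attrs"  => Benchmark.realtime { iterations.times { ATTRS.dup } }
  }
end

results = timings
results.each { |label, seconds| puts format("%-18s %.3f", label, seconds) }
```

The absolute numbers will differ from machine to machine and between Ruby versions, so only the ratios between rows are worth reading.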

As you can see, there seems to be a huge performance difference between allocating a basic class and an ORM class. With attributes passed, instantiating an ActiveRecord model is about 20x slower than instantiating a normal class (6.784 vs. 0.327). ActiveRecord offers some extra features, but why is it so much slower, especially at initialization time?

The best way to figure it out is to profile the initialization. For that, I used perftools.rb and I generated a graph of the call stack.

Here is what Ruby does (and where it spends its time) when you initialize a new Model instance (click to download the PDF version):


Profiler diagram of AR model instantiation by Matt Aimonetti


This is quite a scary graph, but it nicely shows the features you are getting and their associated cost. For instance, the option of having the before and after initialization callbacks costs you 14% of your CPU time per instantiation, even though you probably almost never use these callbacks. I’m reading that by interpreting the node called ActiveSupport::Callbacks#run_callbacks, 3rd level from the top. So 14.1% of the CPU time is spent trying to run callbacks. Note that 90.1% of the CPU time is spent initializing objects; the rest is spent in the loop and in garbage collection (because the profiler runs many loops). You can then follow the code and see how it works: it creates a dynamic class callback method on the fly (the one with the long name) and then recreates the name of this callback to call it each time the object is allocated. It sounds like a good place for some micro-optimizations which could yield up to a 14% performance increase in some cases.

Another major part of the CPU time is spent in ActiveModel’s sanitization. This is the piece of code that allows you to block some model attributes from being mass assigned. This is useful when you don’t want to sanitize your incoming params but want to create or update a model instance by using all the passed user params. To prevent malicious users from modifying specific attributes that are in your model but not in your form, you can protect these attributes. A good example would be an admin flag on a User object. That said, if you manually initialize an instance, you don’t need this extra protection. That’s why, in the benchmark above, I tested with and without the protection. As you can see, it makes quite a big difference. The profiler graph of the same initialization without the mass assignment protection logically ends up looking quite different:
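The idea behind mass assignment protection can be sketched in a few lines of plain Ruby. This is only an illustration of the concept, not ActiveModel’s implementation: the User class, the ACCESSIBLE whitelist, the sanitize helper and the :without_protection option as handled here are all assumptions made for the example.

```ruby
# Hypothetical model illustrating a whitelist-based filter.
class User
  # Attributes that may be set through mass assignment (assumed whitelist).
  ACCESSIBLE = [:name, :email].freeze

  attr_accessor :name, :email, :admin

  def initialize(attrs = {}, options = {})
    # Filtering runs on every instantiation unless explicitly skipped,
    # which is where the extra per-object cost comes from.
    attrs = sanitize(attrs) unless options[:without_protection]
    attrs.each { |key, value| send("#{key}=", value) }
  end

  private

  # Keep only the whitelisted keys.
  def sanitize(attrs)
    attrs.select { |key, _| ACCESSIBLE.include?(key.to_sym) }
  end
end

params = { :name => "Matt", :admin => true }
puts User.new(params).admin.inspect                               # nil
puts User.new(params, :without_protection => true).admin.inspect  # true
```

Even in this toy version the protected path does an extra pass over the attributes for each new object, which hints at why skipping it shows up in the benchmark.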


Matt Aimonetti shows the stack trace generated by the instantiation of an Active Record model


Update: My colleague Glenn Vanderburg pointed out that some people might assume that the code path shown is called for each record loaded from the database. This isn’t correct; the graph represents instances allocated by calling #new. See the addition at the bottom of the post for more details about what’s going on when you fetch data from the DB.

I then decided to look at the graphs for the two other popular Ruby ORMs, DataMapper and Sequel.



While I didn’t give you much insight into the ORM code, I hope that this post will motivate you to sometimes take a look under the covers and profile your code to see what’s going on and why it might be slow. Never assume, always measure. Tools such as perftools are a great way to get visual feedback and a better understanding of how the Ruby interpreter is handling your code.


I heard you liked graphs so I added some more. Here is what’s going on when you do Model.first:




And finally, this is the code graph for a call to Model.instantiate, which is called after a record has been retrieved from the database to convert it into an object. (You can see the #instantiate call referenced in the graph above.)




Books to read in 2012 – recommended to me by Twitter

Today, I asked on Twitter what non-technical books I should read in 2012.

I was nicely surprised to see so many of my followers send recommendations. Here is a list of 25 books that like-minded people suggested I read. Hopefully you will find a book or two to read too. Feel free to send more recommendations via the comments.


1Q84 by Haruki Murakami suggested by @mrb_bk and @chadfowler
The Floating Opera and The End of the Road by John Barth suggested by @chadfowler
Into Thin Air by Jon Krakauer suggested by @bradly
Cutting for Stone by Abraham Verghese suggested by @bradly
Atlas Shrugged by Ayn Rand suggested by @bradly
Cien años de soledad by Gabriel García Márquez (es) suggested by @romanandreg & @jrfernandez & @edgarschmidt
One Hundred Years of Solitude by Gabriel García Márquez suggested by @romanandreg & @jrfernandez & @edgarschmidt
Jitterbug Perfume by Tom Robbins suggested by @supaspoida
The Sisters Brothers by Patrick deWitt suggested by @dennismajor1
The Glass Bead Game by Hermann Hesse suggested by @dj2sincl
The Wind-Up Bird Chronicle by Haruki Murakami suggested by @chadfowler
Mindfire by Scott Berkun suggested by @lucasdicioccio
Les Fourmis by Bernard Werber (fr) suggested by @twitty_tim
Perfume: The Story of a Murderer by Patrick Süskind suggested by @twitty_tim
Les Misérables by Victor Hugo (en, free ebook) suggested by @tutec
A Song of Ice and Fire by George R.R. Martin (Game of Thrones saga) suggested by @eeppa & @jarin
Clockwork Century by Cherie Priest suggested by @eeppa
The Darkness that Comes Before by R. Scott Bakker suggested by @eeppa
Drood by Dan Simmons suggested by @eeppa
This Is Water by David Foster Wallace suggested by @atduskgreg
Anathem by Neal Stephenson suggested by @jarin
Ender’s Game by Orson Scott Card (entire saga) suggested by @jarin & @edgarschmidt
Snow Crash by Neal Stephenson suggested by @jarin
Fixing the Game by Roger L. Martin suggested by @jarkko
The Road by Cormac McCarthy suggested by @mrreynolds


Developing a Curriculum

Recently I asked a friend of mine to give me pointers on how to develop a curriculum (he used to teach an education PhD program). After I discussed his response on Twitter, people asked me to put it somewhere, so here it is:

Process to develop a curriculum:

Purpose. Know why you’re doing what you’re doing.

  • You know how to do this.

Product. Start with the end in mind.

  • What does the student look like when they walk out the door at the end of the training?
  • Usually, we break these down into Knowledge, Skills, or Attitudes.
  • Sometimes it’s helpful to see a photograph or drawing of someone who finished the program and just talk about what they can do that makes them successful.
  • This “product” should be connected to your mission and help you accomplish it.

Practices. Then ask yourself, “How do people become like this?”

  • If you can break down your Product into 3-5 bite-sized chunks, then see how people learn each one of those skills, gain each one of those knowledge points, and how they gain the attitudes you want them to have.
  • This one is much easier the more experience you have in seeing people develop the “Product.”
  • This is also easier to determine when you understand Learning Theory.
  • This section will result in a list of:
    • Activities or experiences
    • Resources. What books, websites, teachers, software, etc. will help them learn more effectively and efficiently?
    • Assessments. How would you know if the activity was helpful?

Plans. Make your plans based on the practices you’ve determined you need.


On a related topic, Chad Fowler posted an interesting blog post about what LivingSocial is doing to change software development education.


My RubyConf 2011 talk is online

I realize I forgot to mention that my RubyConf talk is now online on the confreaks site (wait until the end, Matz actually answers a question from the audience).

Photo of Matt Aimonetti giving a talk at RubyConf 2011 with one of his slides showing how thread scheduling works

I wrote a couple of follow-up posts you might also be interested in:


Data safety and GIL removal

After my recent RubyConf talk and follow-up post addressing Ruby & Python’s Global Interpreter Lock (aka GVL/Global VM Lock), a lot of people asked me to explain what I meant by “data safety”. While my point isn’t to defend one approach or the other, I spent a lot of time explaining why C Ruby and C Python use a GIL, where it matters and where it matters less. As a reminder, and as mentioned by Matz himself, the main reason why C Ruby still has a GIL is data safety. But if this point isn’t clear to you, you might be missing the main argument supporting the use of a GIL.

Showing obvious concrete examples of data corruption due to unsafe threaded code isn’t actually as easy as it sounds. First of all, even with a GIL, developers can write unsafe threaded code. So we need to focus only on the safety problems raised by removing the GIL. To demonstrate what I mean, I will try to create some race conditions and show you the unexpected results you might get. Again, before you go crazy in the comments, remember that threaded code is nondeterministic and the code below might well work on your machine, and that’s exactly why it is hard to demonstrate. Race conditions depend on many things, but in this case I will focus on race conditions affecting basic data structures since they might be the most surprising.


@array, threads = [], []
4.times do
  threads << Thread.new { (1..100_000).each {|n| @array << n} } # unsynchronized append
end
threads.each{|t| t.join } # wait for all threads to finish
puts @array.size

In the above example, I’m creating an instance variable of Array type and I start 4 threads. Each of these threads adds 100,000 items to the array. We then wait for all the threads to be done and check the size of the array.

If you run this code in C Ruby the end result will be as expected:

400000


Now if you switch to JRuby you might be surprised by the output. If you are lucky you will see the following:

ConcurrencyError: Detected invalid array contents due to unsynchronized modifications with concurrent users
        << at org/jruby/RubyArray.java:1147
  __file__ at demo.rb:3
      each at org/jruby/RubyRange.java:407
  __file__ at demo.rb:3
      call at org/jruby/RubyProc.java:274
      call at org/jruby/RubyProc.java:233

This is actually a good thing. JRuby detects that you are unsafely modifying an instance variable across threads and that data corruption will occur. However, the exception doesn’t always get raised and you will potentially see results such as:


This is a sign that the data was corrupted but that JRuby didn’t catch the unsynchronized modification. On the other hand MacRuby and Rubinius 2 (dev) won’t raise any exceptions and will just corrupt the data, outputting something like:


In other words, if not manually synchronized, shared data can easily be corrupted. You might have two threads modifying the value of the same variable and one of the two threads will step on top of the other leaving you with a race condition. You only need 2 threads accessing the same instance variable at the same time to get a race condition. My example uses more threads and more mutations to make the problem more obvious. Note that TDD wouldn’t catch such an issue and even extensive testing will provide very little guarantee that your code is thread safe.
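For comparison, here is a minimal sketch of the same loop with manual synchronization, using the Mutex class from Ruby’s standard library. The variable names are chosen for this example only; the point is that every append happens while holding the lock, so no two threads mutate the array at the same time.

```ruby
require 'thread'

array   = []
lock    = Mutex.new
threads = []

4.times do
  threads << Thread.new do
    (1..100_000).each do |n|
      # Only one thread at a time may run this block.
      lock.synchronize { array << n }
    end
  end
end

threads.each { |t| t.join }
puts array.size  # 400000
```

The price is that each append now pays for acquiring and releasing the lock, which is the kind of cost and responsibility that moves to the developer once the GIL is gone.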


So what? Thread safety isn’t a new problem.

That’s absolutely correct. Ask any decent Java developer out there and he/she will tell you how locks are used to “easily” synchronize objects to make your code thread safe. They might also mention deadlocks and other related issues, but that’s a different story. One might also argue that when you write web apps, there is very little shared data and the chances of corrupting data across concurrent requests are very small, since most of the data is kept in a shared data store outside of the process.

All these arguments are absolutely valid. The challenge is that you have a large community and a large amount of code out there that expects a certain behavior, and removing the GIL does change this behavior. It might not be a big deal for you because you know how to deal with thread safety, but it might be a big deal for others, and C Ruby is by far the most used Ruby implementation. It’s basically like saying that automatic cars shouldn’t be made and sold, and everybody has to switch to stick shifts. They have better gas mileage, I personally enjoy driving them and they are cheaper to build. Removing the GIL is a bit like that. There is a cost associated with this decision and while this cost isn’t insane, the people in charge prefer not to pay it.


Screw that, I’ll switch to Node.js

I heard a lot of people telling me they were looking into using Node.js because it has a better design and no GIL. I like Node.js, and if I were to implement a chat room or an app keeping connections open for a long time, I would certainly compare it closely to EventMachine. But I also think that this argument related to the GIL is absurd. First, you have other Ruby implementations which don’t have a GIL and are really stable (e.g. JRuby). Second, Node basically works the same as Ruby with a GIL. Yes, Node is evented and single threaded, but when you think about it, it behaves the same as Ruby 1.9 with its GIL. Many requests come in and they are handled one after the other, and because IO requests are non-blocking, multiple requests can be processed concurrently but not in parallel. Well folks, that’s exactly how C Ruby works too, and contrary to popular belief, most if not all the popular libraries making IO requests are non-blocking (when using 1.9). So, next time you try to justify wanting to toy with Node, please don’t use the GIL argument.


What should I do?

As always, evaluate your needs and see what makes sense for your project. Start by making sure you are using Ruby 1.9 and your code makes good use of threading. Then look at your app and how it behaves: is it CPU-bound or IO-bound? Most web apps out there are IO-bound (waiting for the DB, redis or API calls), and when doing an IO call, Ruby’s GIL is released, allowing another thread to do its work. In that case, not having a GIL in your Ruby implementation won’t help you. If your app is CPU-bound, on the other hand, switching to JRuby or Rubinius might be beneficial. Still, don’t assume anything until you have proved it, and remember that making such a change will more than likely require some architectural redesign, especially if using JRuby. But, hey, it might totally be worth it, as many have proved in the past.
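A quick way to see the IO-bound case for yourself is the sketch below. It uses sleep as a stand-in for a blocking IO call such as a DB query or an HTTP request; that substitution, the fake_io_call helper and the WAIT constant are assumptions made for this example. Waiting threads do not hold on to the interpreter, so the waits overlap.

```ruby
require 'benchmark'

WAIT = 0.2 # seconds each simulated IO call takes

# Hypothetical stand-in for a blocking IO call.
def fake_io_call(seconds)
  sleep(seconds)
end

# Four calls one after the other: the waits add up.
sequential = Benchmark.realtime do
  4.times { fake_io_call(WAIT) }
end

# Four calls in four threads: the waits overlap.
threaded = Benchmark.realtime do
  4.times.map { Thread.new { fake_io_call(WAIT) } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

If you replace the sleep with a tight computation loop, the threaded version should stop being faster on C Ruby, which is the CPU-bound case where another implementation might help.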


I hope I was able to clarify things a bit further. If you wish to dig further, I would highly recommend you read the many discussions the Python community had in the last few years.







About management

I decided to save myself a session with the shrink and instead just write down my reflections on management. Who knows, some of you might help me and/or challenge my thought process.

I recently read a great management book called The Five Dysfunctions of a Team by Patrick Lencioni. Instead of telling you what to do, the author highlights behavior patterns that are related to each other and that, when aggregated, result in dysfunctional teams. I really liked the book because instead of being a cookbook/playbook, it is more of a fail book. In other words, it illustrates what you don’t want to do and explains why. It highlights very well the relation between various behaviors and nicely illustrates why teams of brilliant people can fail. The Kindle version is less than $5, go get it and read it on your iPhone/iPad/computer/browser…

So this book somewhat changed my perception of management and leadership. Interestingly enough, at Sony, my previous employer, they make a distinction between management and leadership. While they hope managers can be leaders, they don’t require them to be, and to be honest very few are. I’m not sure whether that’s a good or a bad thing, but I, for sure, was under different expectations. Finally, I spent a large amount of my life on the internet working on/with projects where meritocracy, respect and honor were key. The “ranking” is purely based on what your peers think of you and not based on your age/sex/origin/diploma/bank account. I do realize that this model has many pros but also some pretty major cons. My only point is that it did affect my worldview. In my world, seniority, a killer job title or a fancy suit won’t buy you my automatic respect. On the other hand, a job well done, great vision, honesty and overachievement will!

Taking these few trains of thought into consideration, I started thinking about my own expectations for a good manager/leader. I figured that if I were able to do that, I might be able to define a work environment where I could thrive and maybe one day become a good “manager/leader”.

I’ve always questioned my ability to be a good leader. While most of the time I have an opinion and can easily decide what I think should be done, I have a hard time relating to people who can’t see the “big picture”. While I can usually get decent results, I’m aware that it can unfortunately sometimes be at the cost of a few bruised egos. I also know I have high expectations for myself and for others, and I have a hard time understanding how some people can be ok with the “status quo”. I’m a perfectionist who is only happy when he outperforms his previous achievement. I was raised to challenge and always push myself further, focusing on concrete end results and achieved goals. And to be honest, that’s what I enjoy. But I also know for a fact that many people are not like that, and I can’t blame them for looking at things from a different angle and not sharing the same motivations. Furthermore, I know that most people actually don’t have the same driven temperament, and that’s why I’ve questioned my abilities to lead others.

However, different temperaments can work together as long as there is respect. And by respect, I mean that everyone feels that they are being heard and knows that their input was considered and addressed, even though the outcome might not be as hoped for. But for respect to happen, you first need trust. And when people trust each other, Lencioni explains that “people don’t hold back one with another. They are unafraid to air their dirty laundry. They admit their mistakes, their weaknesses, and their concerns without fear of reprisal”. I think that as simple as it seems, it is the key to a successful team. A good leader should be able to create such an atmosphere where people can trust each other. In fact, I think that if a manager/leader/executive can manage to build trust as defined earlier, his or her technical skills or lack of vision don’t matter as much. He or she will be able to rely on trusted people to help make the right decisions. Of course, there is much more to being a good leader, but I think that with this base, great things can be built, and without it, a much greater effort is required to get some good results.

Based on my findings, I think that I need to work on my communication so others don’t feel that they have to hold back, and make sure everyone feels that their opinions were considered and addressed. To do that, a key element is to admit my mistakes and weaknesses and to ask others to help me improve. That’s it, sorry for the boring, non-technical post. I promise the next one will have at least a code sample.


About concurrency and the GIL

During RubyConf 2011, concurrency was a really hot topic. This is not a new issue, and the JRuby team has been talking about true concurrency for quite a while. The Global Interpreter Lock has also been the subject of a lot of discussion in the Python community, and it’s not surprising that the Ruby community experiences the same debates since the evolution of their implementations is somewhat similar. (There might also be some tension between EngineYard hiring the JRuby and Rubinius teams and Heroku, which recently hired Matz (Ruby’s creator) and Nobu, the #1 C Ruby contributor.)

The GIL was probably even more of a hot topic now that Rubinius is about to join JRuby and MacRuby in the realm of GIL-less Ruby implementations.

During my RubyConf talk (slides here), I tried to explain how C Ruby works, why some decisions like having a GIL were made and why the Ruby core team isn’t planning on removing this GIL anytime soon. The GIL is something a lot of Rubyists love to hate, but a lot of people don’t seem to question why it’s here and why Matz doesn’t want to remove it. Defending the C Ruby decision isn’t easy for me since I spend my free time working on an alternative Ruby implementation which doesn’t use a GIL (MacRuby). However, I think it’s important that people understand why the MRI team (C Ruby team) and some Pythonistas feel so strongly about the GIL.

What is the GIL?

Here is a quote from the Python wiki:

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.) [...] The GIL is controversial because it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.

The same basically applies to C Ruby. To illustrate the quote above, here is a diagram representing two threads being executed by C Ruby:

Fair thread scheduling in Ruby by Matt Aimonetti

Such scheduling isn’t a problem at all when you only have 1 CPU, since a CPU can only execute one piece of code at a time and context switching happens all the time to allow the machine to run multiple processes/threads concurrently. The problem is when you have more than 1 CPU because in that case, if you were to only run 1 Ruby process, then you would most of the time only use 1 CPU at a time. If you are running on an 8 CPU box, that’s not cool at all! A lot of people stop at this explanation and imagine that their server can only handle one request at a time, and then they rush to sign Greenpeace petitions asking Matz to make Ruby greener by optimizing Ruby and saving CPU cycles. Well, the reality is slightly different; I’ll get back to that in a minute. Before I explain ways to achieve true concurrency with C Ruby, let me explain why C Ruby uses a GIL and why each implementation has to make an important choice. In this case, both CPython and C Ruby chose to keep their GIL.


Why a GIL in the first place?

  • It makes developers’ lives easier (it’s harder to corrupt data)
  • It avoids race conditions within C extensions
  • It makes C extensions development easier (no write barriers..)
  • Most of the C libraries which are wrapped are not thread safe
  • Parts of Ruby’s implementation aren’t threadsafe (Hash for instance)
As you can see the arguments can be organized in two main categories: data safety and C extensions/implementation. An implementation which doesn’t rely too much on C extensions (because they run a bit slow, or because code written in a different language is preferred) is only faced with one argument: data safety.


Should C Ruby remove its GIL?

  • No: it potentially makes Ruby code unsafe(r)
  • No: it would break existing C extensions
  • No: it would make writing C extensions harder
  • No: it’s a lot of work to make C Ruby threadsafe
  • No: Ruby is fast enough in most cases
  • No: Memory optimization and GC is more important to tackle first
  • No: C Ruby code would run slower
  • Yes: we really need better/real concurrency
  • Yes: Rubber boots analogy (Gustavo Niemeyer)
Don’t count the number of pros/cons to jump to the conclusion that removing the GIL is a bad idea. A lot of the arguments for removing the GIL are related. At the end of the day it boils down to data safety. During the Q&A section of my RubyConf talk, Matz came up on stage and said data safety was the main reason why C Ruby still has a GIL. Again, this is a topic which was discussed at length in the Python community and I’d encourage you to read arguments from the Jython (the equivalent of JRuby for Python) developers, the PyPy (the equivalent of Rubinius in the Python community) developers and the CPython developers. (A good collection of arguments is actually available in the comments related to the rubber boots post mentioned earlier.)


How can true concurrency be achieved using CRuby?

  • Run multiple processes (which you probably do if you use Thin, Unicorn or Passenger)
  • Use event-driven programming with a process per CPU
  • MultiVMs in a process. Koichi presented his plan to run multiple VMs within a process.  Each VM would have its own GIL and inter VM communication would be faster than inter process. This approach would solve most of the concurrency issues but at the cost of memory.
Note:  forking a process only saves memory when using REE since it implements a GC patch that makes the forking process Copy on Write friendly. The Ruby core team worked on a patch for Ruby 1.9 to achieve the same result. Nari & Matz are currently working on improving the implementation to make sure overall performance isn’t affected.
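The first option in the list, running multiple processes, can be sketched with fork and a pipe per worker. This is only an illustration and assumes a Unix-like platform where Process.fork is available (it is not on Windows or JRuby); WORKERS and cpu_bound_work are names invented for the example. Each child has its own interpreter and therefore its own GIL, so the CPU-bound work can run in parallel on separate cores.

```ruby
# Assumes a platform where Process.fork exists (Unix-like, C Ruby).
WORKERS = 2

# Hypothetical CPU-bound task: sum of squares from 1 to n.
def cpu_bound_work(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

readers = WORKERS.times.map do |index|
  reader, writer = IO.pipe
  fork do
    # Child process: compute, send the result to the parent, then leave.
    reader.close
    writer.puts cpu_bound_work(1_000_000 + index)
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  # Parent process: keep only the read end of this worker's pipe.
  writer.close
  reader
end

# Collect one result per worker, then reap the children.
results = readers.map do |reader|
  value = reader.read.to_i
  reader.close
  value
end
Process.waitall

puts results.inspect
```

The trade-off is the one discussed in the note above: each process carries its own copy of the interpreter and the loaded code, so parallelism is bought with memory.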

Finally, when developing web applications, each thread spends quite a lot of time in IO which, as mentioned above, won’t block the thread scheduler. So if you receive two quasi-concurrent requests you might not even be affected by the GIL, as illustrated in this diagram from Yehuda Katz:

This is a simplified diagram but you can see that a good chunk of the request life cycle in a Ruby app doesn’t require the Ruby thread to be active (CPU Idle blocks) and therefore these 2 requests would be processed almost concurrently.

To boil it down to something simplified, when it comes to the GIL, an implementor has to choose between data safety and memory usage. But it is important to note that context switching between threads is faster than context switching between processes, and that data safety can be, and often is, achieved in environments without a GIL, but it requires more knowledge and work on the developer side.



The decision to keep or remove the GIL is a bit less simple than it is often described. I respect Matz’ decision to keep the GIL even though I would personally prefer to push the data safety responsibility to the developers. However, I do know that many Ruby developers would end up shooting themselves in the foot, and I understand that Matz prefers to avoid that and work on other ways to achieve true concurrency without removing the GIL. What is great with our ecosystem is that we have some diversity, and if you think that a GIL-less model is what you need, we have some great alternative implementations that will let you make this choice. I hope that this article will help some Ruby developers understand and appreciate C Ruby’s decision and what this decision means to them on a daily basis.
