During RubyConf 2011, concurrency was a really hot topic. This is not a new issue, and the JRuby team has been talking about true concurrency for quite a while. The Global Interpreter Lock has also been the subject of a lot of discussions in the Python community, and it's not surprising that the Ruby community is having the same debates, since the evolution of the two implementations is somewhat similar. (There might also be some tension between EngineYard, which hired the JRuby and Rubinius teams, and Heroku, which recently hired Matz (Ruby's creator) and Nobu, the #1 C Ruby contributor.)
During my RubyConf talk (slides here), I tried to explain how C Ruby works, why decisions like having a GIL were made, and why the Ruby core team isn't planning on removing the GIL anytime soon. The GIL is something a lot of Rubyists love to hate, but many people don't question why it's here and why Matz doesn't want to remove it. Defending the C Ruby decision isn't easy for me, since I spend my free time working on an alternative Ruby implementation which doesn't use a GIL (MacRuby). However, I think it's important that people understand why the MRI team (the C Ruby team) and some Pythonistas feel so strongly about the GIL.
What is the GIL?
Here is a quote from the Python wiki:
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.) [...] The GIL is controversial because it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend a lot of time inside the GIL, interpreting CPython bytecode, that the GIL becomes a bottleneck.
The same basically applies to C Ruby. To illustrate the quote above, here is a diagram representing two threads being executed by C Ruby:
Such scheduling isn't a problem at all when you only have one CPU, since a CPU can only execute one piece of code at a time and context switching happens constantly to allow the machine to run multiple processes/threads in parallel. The problem arises when you have more than one CPU: in that case, if you were to run only one Ruby process, you would, most of the time, use only one CPU at a time. If you are running on an 8-CPU box, that's not cool at all! A lot of people stop at this explanation, imagine that their server can only handle one request at a time, and rush to sign Greenpeace petitions asking Matz to make Ruby greener by optimizing Ruby and saving CPU cycles. Well, the reality is slightly different; I'll get back to that in a minute. Before I explain ways to achieve true concurrency with C Ruby, let me explain why C Ruby uses a GIL and why each implementation has to make an important choice: in this case, both CPython and C Ruby chose to keep their GIL.
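To make the diagram concrete, here is a small sketch (timings are illustrative and will vary by machine): two CPU-bound computations run in threads take about as long under MRI's GIL as running them back to back, because only one thread can execute Ruby code at a time.

```ruby
require 'benchmark'

# A deliberately CPU-bound method: no IO, nothing that releases the GIL.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

serial = Benchmark.realtime { 2.times { fib(25) } }

threaded = Benchmark.realtime do
  2.times.map { Thread.new { fib(25) } }.each(&:join)
end

# Under MRI's GIL, the threaded version is no faster than the serial one;
# on a GIL-less implementation (JRuby, MacRuby) it could approach half the time.
puts format('serial: %.2fs, threaded: %.2fs', serial, threaded)
```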
Why a GIL in the first place?
- It makes developers' lives easier (it's harder to corrupt data)
- It avoids race conditions within C extensions
- It makes C extension development easier (no write barriers, etc.)
- Most of the C libraries which are wrapped are not thread-safe
- Parts of Ruby's implementation aren't thread-safe (Hash, for instance)
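To illustrate those last points, here's a sketch of what the GIL buys you in MRI: an individual Hash write appears atomic to Ruby code, so many threads hammering one Hash don't corrupt it. A GIL-less implementation would have to protect the Hash internals with its own locking instead.

```ruby
# Ten threads each write 1,000 entries to a shared Hash, using distinct keys.
# In MRI, the GIL ensures each individual `h[key] = value` runs without
# another thread interleaving inside the Hash internals, so no write is lost.
h = {}
threads = 10.times.map do |i|
  Thread.new do
    1_000.times { |j| h["#{i}-#{j}"] = true }
  end
end
threads.each(&:join)

h.size # => 10_000 in MRI: every write survived
```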
Should C Ruby remove its GIL?
- No: it potentially makes Ruby code unsafe(r)
- No: it would break existing C extensions
- No: it would make writing C extensions harder
- No: it's a lot of work to make C Ruby thread-safe
- No: Ruby is fast enough in most cases
- No: Memory optimization and GC is more important to tackle first
- No: C Ruby code would run slower
- Yes: we really need better/real concurrency
- Yes: Rubber boots analogy (Gustavo Niemeyer)
How can true concurrency be achieved using C Ruby?
- Run multiple processes (which you probably do if you use Thin, Unicorn or Passenger)
- Use event-driven programming with a process per CPU
- Multiple VMs in a process. Koichi presented his plan to run multiple VMs within a single process. Each VM would have its own GIL, and inter-VM communication would be faster than inter-process communication. This approach would solve most of the concurrency issues, but at the cost of memory.
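The first option, multiple processes, can be sketched in plain Ruby with fork (Unix-only; the pipe plumbing here is just one way to get results back). Each forked child is a full interpreter with its own GIL, so CPU-bound work really does run on separate cores:

```ruby
# CPU-bound work to spread across processes.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

readers = 2.times.map do |i|
  reader, writer = IO.pipe
  fork do
    # Child process: its own interpreter, its own GIL.
    reader.close
    writer.puts fib(20 + i)
    writer.close
  end
  writer.close # parent keeps only the read end
  reader
end

results = readers.map { |r| r.read.to_i }
Process.waitall

results # => [6765, 10946]
```

This is essentially what Unicorn or Passenger do for you: one Ruby process per worker, coordinated by a master.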
Finally, when developing web applications, each thread spends quite a lot of time waiting on IO which, as mentioned above, won't block the thread scheduler. So if you receive two quasi-concurrent requests, you might not even be affected by the GIL, as illustrated in this diagram from Yehuda Katz:
This is a simplified diagram, but you can see that a good chunk of the request life cycle in a Ruby app doesn't require the Ruby thread to be active (the "CPU Idle" blocks), and therefore these two requests would be processed almost concurrently.
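A minimal sketch of that effect, using sleep as a stand-in for a blocking database or network call: MRI releases the GIL while a thread is blocked on IO (or sleeping), so the waits overlap instead of adding up.

```ruby
require 'benchmark'

# Four threads each "wait on IO" for 0.2s. Because blocking calls release
# the GIL, the waits overlap: total time is roughly 0.2s, not 0.8s.
elapsed = Benchmark.realtime do
  4.times.map { Thread.new { sleep 0.2 } }.each(&:join)
end

puts format('elapsed: %.2fs', elapsed)
```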
To boil it down to something simple: when it comes to the GIL, an implementor has to choose between data safety and memory usage. It is important to note that context switching between threads is faster than context switching between processes, and that data safety can be, and often is, achieved in environments without a GIL, but that requires more knowledge and work on the developer's side.
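One last sketch on that developer-side work: even under MRI's GIL, a compound operation like counter += 1 is a read, an add, and a write, and the scheduler can switch threads in between, so shared mutable state still wants an explicit lock. In a GIL-less implementation this kind of locking simply becomes mandatory rather than merely prudent.

```ruby
counter = 0
lock = Mutex.new

# Four threads each increment the shared counter 10,000 times.
# `counter += 1` alone is not atomic; the Mutex makes the whole
# read-modify-write step exclusive.
threads = 4.times.map do
  Thread.new do
    10_000.times { lock.synchronize { counter += 1 } }
  end
end
threads.each(&:join)

counter # => 40_000, guaranteed by the Mutex
```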