Micro concurrency for Big throughput

Posted on Sun 30 September 2007

I started learning Erlang in my spare time (which lately means a few hours on weekends). Surprisingly, it has brought back to mind many things I did years ago: the Prolog-ish syntax with pattern matching (a simple form of unification), and message passing between processes, which isn't that far from Ada's rendez-vous (except that Ada's rendez-vous is synchronous, while Erlang's message sends are asynchronous).

One of the great things about Erlang, and one that justifies the current hype around it, is that concurrency is cheap both in terms of code and in terms of memory/CPU footprint, and that message passing makes it much easier to develop concurrent programs than threads with shared data.
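To illustrate the message-passing point, here is a minimal sketch in Python rather than Erlang. It assumes that `queue.Queue` can stand in for a process mailbox; the names `counter`, `run`, `mailbox` and `replies` are made up for the example. One thread owns the counter and the others only send it messages, so no lock is needed around the data itself.

```python
import queue
import threading


def counter(mailbox, replies):
    """Owns the count; no other thread ever touches it directly."""
    count = 0
    while True:
        message = mailbox.get()
        if message == "increment":
            count += 1
        elif message == "stop":
            replies.put(count)
            return


def run(workers=4, increments=1000):
    mailbox = queue.Queue()   # stands in for the owner's mailbox
    replies = queue.Queue()   # where the owner posts its final answer
    owner = threading.Thread(target=counter, args=(mailbox, replies))
    owner.start()

    def send_increments():
        for _ in range(increments):
            mailbox.put("increment")

    senders = [threading.Thread(target=send_increments) for _ in range(workers)]
    for sender in senders:
        sender.start()
    for sender in senders:
        sender.join()

    # Queued after every increment, so the reported total is final.
    mailbox.put("stop")
    total = replies.get()
    owner.join()
    return total
```

Python threads are much heavier than Erlang processes, so this only shows the shape of the idea, not its cost.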

So I was asking myself how to make use of this when developing web applications. Essentially, a web application has some background threads preparing stuff (updating data, expiring caches, whatever) and front-end threads managed by the web server to answer client requests. Erlang could be nice to avoid the pesky race conditions between front-end and background threads, but does it justify such a technological jump compared to "traditional" techniques?

And then I had a "doh!" moment while reading this blog entry from the author of ErlyWeb, an Erlang web framework.

With super-cheap processes, a large number of the things involved in request handling can be performed in the background. Answer the user's request as quickly as you can, and spawn small processes for everything that has to be done but doesn't contribute to the response. You can also spawn parallel processes for actions that do contribute to the response but require different kinds of I/O.
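The pattern can be sketched as follows, again in Python rather than Erlang, with a thread pool standing in for spawned processes. Everything here is hypothetical: `fetch_profile`, `fetch_recent_posts` and `record_visit` are invented placeholders for real I/O, and the `time.sleep` calls only simulate latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: a shared thread pool plays the role of Erlang's cheap processes.
pool = ThreadPoolExecutor(max_workers=8)

audit_log = []  # written only by the background task


def fetch_profile(user_id):
    """Hypothetical I/O call whose result is needed for the response."""
    time.sleep(0.05)
    return {"id": user_id, "name": "user-%d" % user_id}


def fetch_recent_posts(user_id):
    """Hypothetical second I/O call, independent of the first."""
    time.sleep(0.05)
    return ["post-a", "post-b"]


def record_visit(user_id):
    """Hypothetical bookkeeping that does not contribute to the response."""
    time.sleep(0.2)
    audit_log.append(user_id)


def handle_request(user_id):
    # Work that feeds the response runs in parallel...
    profile = pool.submit(fetch_profile, user_id)
    posts = pool.submit(fetch_recent_posts, user_id)
    # ...while bookkeeping is started but never waited for here.
    background = pool.submit(record_visit, user_id)
    response = {"profile": profile.result(), "posts": posts.result()}
    return response, background
```

The handler returns as soon as the two fetches finish. The `background` future is returned only so that a caller or a test can check on the bookkeeping later; a real handler would probably just drop it.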

The total work performed by your machine won't change much (because Erlang processes are cheap), but the perceived responsiveness will greatly increase, and this is what counts. So even for what seem to be very standard web apps, Erlang can make a difference.

And even more: since every process has its own heap in Erlang, short-lived processes actually lower the strain on the garbage collector, since most often their heap will be reclaimed as a whole when they end, before the garbage collector has to kick in.

I now have to see how Scala compares to Erlang in terms of performance, because it brings cheap processes to the Java world.


