Friday, October 19, 2012

Talk: Continuations in servers

OK, it happened two years ago, but for the record: I gave a talk, "Continuations in Servers," at the JVM Language Summit 2010.

The talk is about an experiment with JVM-level continuations that lets server code be written in the style of a synchronous server, for better productivity, while keeping the low overhead of an asynchronous server, for better scalability. These two types of servers differ in how threads are mapped to server requests:

1) A synchronous server dedicates a thread to a request for the lifetime of the request. Since the thread is dedicated, the request handling code can block, for example, on network I/O calls, database accesses, etc. This leads to a more natural, easier-to-understand coding style. For example, the following request handler calls the database twice in a blocking way.


Response handleRequest(Request req) {
  SomeData a = calldb1(req.param1);
  OtherData b = calldb2(a, req.param2);
  return new Response(a, b);
}


But, since a thread is a relatively expensive resource, the maximum number of in-flight requests that a server can handle simultaneously is bounded by the maximum number of threads, and the associated overhead, that a system can manage. This type of server is sometimes called a thread-per-request, thread-per-connection, or thread-based server.
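To make the bound concrete, here is a minimal sketch, not taken from the talk, that runs simulated blocking requests on a fixed-size thread pool and records the peak number of requests in flight. The class name, the method `peakInFlight`, and the 50 ms sleep standing in for blocking I/O are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a thread-per-request server modeled as a fixed thread pool.
class ThreadPerRequestSketch {

    // Runs `requests` simulated blocking handlers on `threads` threads and
    // returns the highest number of handlers that were running at once.
    static int peakInFlight(int threads, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(50); // stand-in for a blocking database or network call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // With 2 threads, at most 2 of the 8 requests can be in flight at a time.
        System.out.println("peak in-flight requests: " + peakInFlight(2, 8));
    }
}
```

However many requests are queued, the peak never exceeds the pool size, because every blocked handler keeps its thread until the simulated I/O returns.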

2) An asynchronous server, on the other hand, does not dedicate a thread to a request for the lifetime of the request. Instead, it releases the thread whenever the request handling code would block on I/O. The rest of the request handling logic is typically executed as a callback (after the I/O finishes), most likely by a different thread from a thread pool. This leads to a less intuitive, harder-to-understand coding style. For example, the above request handler code would look like the following in an asynchronous style:


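// Callback<T> is assumed to be a generic, single-method callback interface.
void handleRequest(final Response res, final Request req) {
  calldb1(req.param1,
    new Callback<SomeData>() {
      public void callback(final SomeData a) {  // runs when calldb1 completes
        calldb2(a, req.param2,
          new Callback<OtherData>() {
            public void callback(OtherData b) {  // runs when calldb2 completes
              res.send(a, b);
            }});
      }});
}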


Basically, the flow of the code is interrupted at each I/O point, and you lose your call context (or call stack) because each part may be executed by a different thread.

But because the thread can be released and reused to handle other requests while a request is waiting on I/O (imagine a network call that can take a while to come back), this style is more scalable. Here, a server can handle more in-flight requests than the number of available threads. This type of server is sometimes called a select server or event-based server.
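For readers on Java 8 or later, the same asynchronous flow can also be sketched with `java.util.concurrent.CompletableFuture`, which flattens the nesting but still splits the handler at each I/O point. This is only an illustration: `calldb1` and `calldb2` here are hypothetical stand-ins that return strings, and `supplyAsync` on a small pool merely simulates I/O completing on another thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: the asynchronous handler expressed with CompletableFuture.
class AsyncHandlerSketch {

    // Hypothetical pool standing in for the threads that complete I/O.
    // Daemon threads so the JVM can exit without an explicit shutdown.
    private static final ExecutorService IO_POOL =
        Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });

    // Stand-in for the first database call; completes on an IO_POOL thread.
    static CompletableFuture<String> calldb1(String param1) {
        return CompletableFuture.supplyAsync(() -> "a(" + param1 + ")", IO_POOL);
    }

    // Stand-in for the second database call, which depends on the first result.
    static CompletableFuture<String> calldb2(String a, String param2) {
        return CompletableFuture.supplyAsync(() -> "b(" + a + "," + param2 + ")", IO_POOL);
    }

    // Returns immediately; the lambdas run later, once each result is available,
    // so no thread is held while a call is outstanding.
    static CompletableFuture<String> handleRequest(String param1, String param2) {
        return calldb1(param1)
            .thenCompose(a -> calldb2(a, param2)
                .thenApply(b -> a + "+" + b));
    }

    public static void main(String[] args) {
        // join() blocks only here, in the demo driver, to print the result.
        System.out.println(handleRequest("p1", "p2").join());
    }
}
```

The logic is the same as in the callback version: the code after each call lives in a lambda rather than on the caller's stack, which is exactly the loss of call context described above.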

So, it's about a tradeoff between ease-of-coding and scalability.

Here's where a continuation comes in. A continuation can save a call context, detach it from the thread, and resume it later on a different thread. In theory, if it's applied at each I/O point (inside the I/O library, hidden from the application code), it gives application code the illusion of the synchronous coding style while still releasing the thread at each I/O point, as in the asynchronous style. So, it could offer the best of both styles, that is, productivity and scalability.
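To show what "saving a call context" amounts to, the sketch below hand-writes the saved state of the two-call handler as a plain object: a resume point plus the one local variable that must survive. This is not the JVM-level implementation from the talk, which captures the real stack automatically; `SavedContext`, `resume`, and the worker thread names are all hypothetical, and the I/O results are passed in by hand.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: a hand-written stand-in for what a continuation captures.
class ManualContinuationSketch {

    // The "call context": where to resume, plus the locals still needed.
    static final class SavedContext {
        int resumePoint = 0;  // which I/O point the handler is suspended at
        String a;             // result of the first call, needed after the second
        String result;        // set once the handler has finished
        String firstThread;   // thread that started the handler
        String lastThread;    // thread that finished it
    }

    // Runs the handler from its saved resume point up to the next I/O point.
    static void resume(SavedContext ctx, String ioResult) {
        String thread = Thread.currentThread().getName();
        switch (ctx.resumePoint) {
            case 0:  // start: the first call would be issued here, then suspend
                ctx.firstThread = thread;
                ctx.resumePoint = 1;
                return;
            case 1:  // first result arrived: keep it, issue the second call, suspend
                ctx.a = ioResult;
                ctx.resumePoint = 2;
                return;
            default: // second result arrived: build the response
                ctx.result = ctx.a + "+" + ioResult;
                ctx.lastThread = thread;
        }
    }

    // Starts the handler on one thread and resumes it on another.
    static SavedContext run() {
        ExecutorService t1 = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker-1"));
        ExecutorService t2 = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker-2"));
        SavedContext ctx = new SavedContext();
        try {
            CompletableFuture.runAsync(() -> resume(ctx, null), t1).join();
            CompletableFuture.runAsync(() -> resume(ctx, "a"), t2).join();
            CompletableFuture.runAsync(() -> resume(ctx, "b"), t2).join();
        } finally {
            t1.shutdown();
            t2.shutdown();
        }
        return ctx;
    }

    public static void main(String[] args) {
        SavedContext ctx = run();
        System.out.println(ctx.result + " started on " + ctx.firstThread
            + ", finished on " + ctx.lastThread);
    }
}
```

Between calls to `resume`, no thread is tied to the request; only the small `SavedContext` object is kept. A real continuation would do this bookkeeping inside the I/O library, so the handler itself could stay in the straight-line synchronous form.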

I have done an experiment with this idea. I developed a JVM-level continuation implementation (based on a patch from the MLVM project) and applied it to a Java-based remote procedure call (RPC) library and an application server. The results are encouraging. The server code is written in a synchronous style, just as in a synchronous server, while the memory footprint overhead (of the thread stacks) and the server throughput are almost as good as those of an asynchronous server.

Of course, this is just an experiment, and the open question is whether continuations are suitable for Java and the JVM.
