Add to Technorati Favorites

Asynchronous data structures with Twisted Deferreds

Earlier this week I gave a talk titled Deferred Gratification (slides) at EuroPython in Birmingham. The title was supposed to hint at how much I love Twisted‘s Deferred class, and that it took me some time to really appreciate it.

While thinking about Deferreds in the days before the talk, I think I put my finger on one of the reasons I find Deferreds so attractive and elegant. The thought can be neatly summarized: Deferreds let you replace synchronous data structures with elegant asynchronous ones.

There are various ways that people get exposed to thinking about programming in an asynchronous style. The great majority do so via building user interfaces, usually with widget libraries wherein one specifies widgets and layouts, sets up event handlers, and then surrenders control to a main loop that thereafter routes events to handlers. Exposure to asynchronous programming via building UIs is becoming much more common as Javascript programmers build client-side web apps that operate asynchronously with web servers (and the “A” in AJAX of course stands for asynchronous). Others become aware of asynchronous programming via writing network code (perhaps using Twisted). Relatively few become aware of asynchronous programming via doing async filesystem I/O.

Because Twisted’s Deferreds don’t have anything to do with UIs or networking or filesystems, you can use them to implement other asynchronous things, like an asynchronous data structure. To show you what I mean, here’s a slightly simplified version of Twisted’s DeferredQueue class, taken from twisted/internet/defer.py:

class DeferredQueue(object):

    def __init__(self):
        # It would be better if these both used collections.deque (see comments section below).
        self.waiting = [] # Deferreds that are expecting an object
        self.pending = [] # Objects queued to go out via get.

    def put(self, obj):
        if self.waiting:
            self.waiting.pop(0).callback(obj)
        else:
            self.pending.append(obj)

    def get(self):
        if self.pending:
            return succeed(self.pending.pop(0))
        else:
            d = Deferred()
            self.waiting.append(d)
            return d

I find this extremely elegant, and I’m going to explain why.

But first, think about code for a regular synchronous queue. What happens if you call get on a regular queue that’s empty? Almost certainly one of two things: you’ll get some kind of QueueEmpty error, or else your code will block until some other code puts something into the queue. I.e., you either get a synchronous error or you get a synchronous non-empty response.

If you look at the get method in the code above, you’ll see that if the queue is empty (i.e., the pending list is empty), a new Deferred is made, added to self.waiting, and is immediately returned to the caller. So code calling get on an empty queue doesn’t get an error and doesn’t block, it always gets a result back essentially immediately. How can you get a result from an empty queue? Easy: the result is a Deferred. And because we’re in the asynchronous world, you just attach callbacks (like event handlers in the UI world) to the Deferred, and go on your merry way.

If you can follow that thinking, the rest of the code in the class above should be easy to grok. In put, if there are any outstanding Deferreds (i.e., earlier callers who hit an empty queue and got a Deferred back), the incoming object is given to the first of these by passing it to the callback function of the Deferred (and popping it out of the waiting list). If there are no outstanding Deferreds expecting a result, the incoming object is simply appended to self.pending. On the get side, if the queue (i.e., self.pending) is non-empty, the code creates a Deferred that has already been fired (using the succeed helper function) and returns it.

By now, this kind of code seems routine and straightforward to me. But it certainly wasn’t always that way. So if the above seems cryptic or abstract, I encourage you to think it over, maybe write some code, ask questions below, etc. To my mind, these kinds of constructs – simple, elegant, robust, practical, obviously(?) bug-free, single-threaded, etc. – are extremely instructive. You too can solve potentially gnarly (at least in the threaded world) networking coding challenges in very simple ways just by building code out of simple Twisted Deferreds (there’s nothing special about the DeferredQueue class, it’s just built using primitive Deferreds).

For extra points, think about the concept of asynchronous data structures. I find that concept very attractive, and putting my finger on it helped me to see part of the reason why I think Deferreds are so great. (In fact, as you can see from the use of succeed, Deferreds are not specifically asynchronous or synchronous: they’re so simple they don’t even care, and as a consumer of Deferreds, your code doesn’t have to either). That’s all remarkably elegant.

I sometimes hope O’Reilly will publish a second volume of Beautiful Code and ask me to write a chapter. I know exactly what I’d write about :-)


You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

8 Responses to “Asynchronous data structures with Twisted Deferreds”

  1. rhettinger Says:

    This is a nice recipe and demo of how to build async data structures out of deferreds.

    One nit: Please use collections.deque() objects with popleft() instead of list objects with pop(0).

  2. Hi Raymond

    Yes…. :-) I was thinking of exactly that during your idiomatic Python talk last Monday. In fact about 10 days ago in #twisted, JP Calderone, who I guess you will know, said “it should be a deque” to the agreement of those who were listening. I'm not sure I want to change the code above as I was copying from defer.py. I'll stick in a comment. Thanks a lot!

  3. Nice. It's conceptually similar to node.js (a non-blocking server-side implementation in JavaScript) which caused quite a stir last year.
    Once you get the hang of non-blocking (async) callbacks, it becomes difficult to think in any other way…

  4. hi, i was thinking about a way to add a peek method (i.e. get element without removing from the queue) in a deferred context.

    i couldn’t end up with something as elegant as the rest of the code.

    (yes, it’s a challenge :-) )

  5. […] Here’s a fun class that I can’t think of a good use for :-) But I like its simplicity and it’s another illustration of what I like to call asynchronous data structures. […]

  6. this is the winter of my elegant discontent

  7. Hi!

  8. Hi. Sorry it took me so long to think about this. One initial reaction is to keep a self.peekers list (of Deferreds) and provide a peek method. On peek() if there’s anything in self.pending, you get a defer.succeed back with the first pending thing. If not, you get a Deferred that is also added to self.peekers. On put, if the self.peekers list is not empty, fire each one with the passed object and set self.peekers to [].

    How’s that sound?