Archive for May, 2009

Full tilt at the center of the earth

Sunday, May 31st, 2009

It was cold that morning, the first winter cold-snap; the hedgerows were rimed and stiff with frost and the standing water in the roadside drainage ditches was skimmed with ice and even the edges of the running water in the Nine Mile branch glinted fragile and scintillant like fairy glass and from the first farmyard they passed and then again and again and again came the windless tang of woodsmoke and they could see in the back yards the black iron pots already steaming while women in the sunbonnets still of summer or men’s old felt hats and long men’s overcoats stoked wood under them and the men with crokersack aprons tied with wire over their overalls whetted knives or already moved about the pens where hogs grunted and squealed, not quite startled, not alarmed but just alerted as though sensing already even though only dimly their rich and immanent destiny; by nightfall the whole land would be hung with their spectral intact tallowcolored empty carcasses immobilised by the heels in attitudes of frantic running as though full tilt at the center of the earth.

From William Faulkner’s Intruder in the Dust.

Slides from FluidDB talk at PGCon

Tuesday, May 26th, 2009

Here are the slides from my talk on May 22, 2009 at the Postgres Conference (PGCon) in Ottawa. The video will be available soon.

Talking at Postgres Conference (PGCon) in Ottawa

Tuesday, May 19th, 2009

Here’s just a quick note to mention that I’m talking at the annual Postgres Conference aka PGCon. The talk is titled The design, architecture, and tradeoffs of FluidDB, and is at 3pm on May 22nd. So if you happen to be in Ottawa this week…

I could have added the subtitle “How someone who knows nothing about databases wound up in a project to build a database.”

A mixin class allowing Python __init__ methods to work with Twisted deferreds

Monday, May 11th, 2009

I posted to the Python Twisted list back in Nov 2008 with subject: A Python metaclass for Twisted allowing __init__ to return a Deferred

Briefly, I was trying to find a nice way to allow the __init__ method of a class to work with deferreds in such a way that methods of the class could use work done by __init__ safe in the knowledge that the deferreds had completed. E.g., if you have

class X(object):
    def __init__(self, host, port):
        def final(connection):
            self.db = connection
        d = makeDBConnection(host, port)
        d.addCallback(final)

    def query(self, q):
        return self.db.runQuery(q)
 

Then when you make an X and call query on it, there’s a chance the deferred won’t have fired, and you’ll get an error (an AttributeError, since self.db has not been set yet). This is just a very simple illustrative example. There are many more, and this is a general problem of the synchronous world (in which __init__ is supposed to prepare a fully-fledged class instance and cannot return a deferred) meeting the asynchronous world in which, as Twisted programmers, we would like to (and must) use deferreds.

The earlier thread, with lots of useful followups, can be read here. Although I learned a lot in that thread, I wasn’t completely happy with any of the solutions. Some of the things that still bugged me are in posts towards the end of the thread (here and here).

The various approaches we took back then all boiled down to waiting for a deferred to fire before the class instance was fully ready to use. When that happened, you had your instance and could call its methods.

I had also thought about an alternate approach: having __init__ add a callback to the deferreds it dealt with to set a flag in self, and then having all dependent methods check that flag to see if the class instance was ready for use. But that 1) is ugly (too much extra code); 2) means the caller has to be prepared to deal with errors due to the class instance not being ready; and 3) adds a check to every method call. It would look something like this:

class X(object):
    def __init__(self, host, port):
        self.ready = False
        def final(connection):
            self.db = connection
            self.ready = True
        d = makeDBConnection(host, port)
        d.addCallback(final)

    def query(self, q):
        if not self.ready:
            raise IAmNotReadyException()
        return self.db.runQuery(q)
 

That was too ugly for my taste, for all of the above reasons, most especially for forcing the unfortunate caller of my code to handle IAmNotReadyException.

Anyway… fast forward six months and I’ve hit the same problem again. It’s with existing code, in which I would like an __init__ to call something that (now, due to changes elsewhere) returns a deferred. So I started thinking again, and came up with a much cleaner way to do the alternate approach via a class mixin:

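from twisted.internet import defer

class deferredInitMixin(object):
    def wrap(self, d, *wrappedMethods):
        # Deferreds handed out to callers who arrive before d has fired.
        self.waiting = []
        # The original bound methods, keyed by method name.
        self.stored = {}

        def restore(_):
            # d has fired: put the original methods back, then release
            # every call that was made in the meantime.
            for method in self.stored:
                setattr(self, method, self.stored[method])
            for waiter in self.waiting:
                waiter.callback(None)

        def makeWrapper(method):
            def wrapper(*args, **kw):
                # Hand back a fresh deferred that will run the original
                # method (and pass on its result) once restore fires it.
                waiter = defer.Deferred()
                waiter.addCallback(lambda _: self.stored[method](*args, **kw))
                self.waiting.append(waiter)
                return waiter
            return wrapper

        for method in wrappedMethods:
            self.stored[method] = getattr(self, method)
            setattr(self, method, makeWrapper(method))

        d.addCallback(restore)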
 

You use it as in the class Test below:

from twisted.internet import defer, reactor

# Assumes deferredInitMixin (above) is defined in, or imported into, this module.

def fire(d, value):
    print("I finally fired, with value %s" % (value,))
    d.callback(value)

def late(value):
    # Return a deferred that fires with value one second from now.
    d = defer.Deferred()
    reactor.callLater(1, fire, d, value)
    return d

def called(result, what):
    print('final callback of %s, result = %s' % (what, result))

def stop(_):
    reactor.stop()

class Test(deferredInitMixin):
    def __init__(self):
        d = late('Test')
        deferredInitMixin.wrap(self, d, 'f1', 'f2')

    def f1(self, arg):
        print("f1 called with %s" % (arg,))
        return late(arg)

    def f2(self, arg):
        print("f2 called with %s" % (arg,))
        return late(arg)

if __name__ == '__main__':
    t = Test()
    d1 = t.f1(44)
    d1.addCallback(called, 'f1')
    d2 = t.f1(33)
    d2.addCallback(called, 'f1')
    d3 = t.f2(11)
    d3.addCallback(called, 'f2')
    d = defer.DeferredList([d1, d2, d3])
    d.addBoth(stop)
    reactor.run()
 

Effectively, the __init__ of my Test class asks deferredInitMixin to temporarily wrap some of its methods. deferredInitMixin stores the original methods away and replaces each of them with a function that immediately returns a deferred. So after __init__ finishes, code that calls the now-wrapped methods of the class instance before the deferred has fired will get a deferred back as usual (but see * below). As far as they know, everything is normal. Behind the scenes, deferredInitMixin has arranged for these deferreds to fire only after the deferred passed from __init__ has fired. Once that happens, deferredInitMixin also restores the original functions to the instance. As a result there is no overhead later to check a flag to see if the instance is ready to use. If the deferred from __init__ happens to fire before any of the instance’s methods are called, it will simply restore the original methods. Finally (obviously?) you only pass the method names to deferredInitMixin that depend on the deferred in __init__ being done.

BTW, calling the methods passed to deferredInitMixin “wrapped” isn’t really accurate. They’re just temporarily replaced.
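To make the replace-and-restore mechanism easier to see on its own, here is a stripped-down sketch that doesn’t use Twisted at all. It is only an illustration, not the mixin above: the names ReplaceUntilReady, hold, ready and Greeter are made up for this example, a plain callback argument stands in for the deferred, and ready() has to be called by hand instead of being triggered when a deferred fires.

```python
class ReplaceUntilReady(object):
    """Hypothetical, Twisted-free stand-in for the mixin above."""

    def hold(self, *names):
        self._held = names
        self._pending = []  # calls made too early, kept in arrival order

        def make_placeholder(name):
            original = getattr(self, name)  # bound method from the class
            def placeholder(callback, *args, **kw):
                # Too early: remember the call instead of running it.
                self._pending.append((original, callback, args, kw))
            return placeholder

        for name in names:
            # An instance attribute shadows the class's method of that name.
            setattr(self, name, make_placeholder(name))

    def ready(self):
        for name in self._held:
            # Deleting the instance attribute uncovers the class method again.
            delattr(self, name)
        pending, self._pending = self._pending, []
        for original, callback, args, kw in pending:
            original(callback, *args, **kw)


class Greeter(ReplaceUntilReady):
    def __init__(self):
        self.name = None  # filled in later, as if by a deferred
        self.hold('greet')

    def greet(self, callback, greeting):
        callback('%s, %s' % (greeting, self.name))


seen = []
g = Greeter()
g.greet(seen.append, 'Hello')  # queued: greet has been replaced
early = list(seen)             # nothing has been delivered yet
g.name = 'world'
g.ready()                      # the queued call runs now
g.greet(seen.append, 'Bye')    # runs straight away via the class method
```

The sketch deletes the instance attribute to restore the method, where the mixin re-assigns the stored bound method; for the caller the visible effect is the same.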

I quite like this approach. It’s a second example of something I posted about here, in which a pool of deferreds is accumulated and all fired when another deferred fires. It’s nice because you don’t reply with an error and there’s no need for locking or other form of coordination – the work you need done is already in progress, so you get back a fresh deferred and everything goes swimmingly.

* Minor note: the methods you wrap should probably be ones that already return deferreds. That way you always get a deferred back from them, whether they’re temporarily wrapped or not. The above mixin works just fine if you ask it to wrap non-deferred-returning functions, but you have to deal with the possibility that they will return a deferred (i.e., if you call them while they’re wrapped).

Balcony music

Wednesday, May 6th, 2009

Today I discovered the wonderful Grooveshark and some thoughts occurred to me that I feel like writing down.

I haven’t spent much time thinking about rights over digital media, downloading, etc. I’ve tended to ignore the whole debate. So the following may all be commonplace observations. I have no idea.

It occurred to me that continued increases in the prevalence and bandwidth of internet access might end up solving a problem they helped create, and that we may simply be in a temporary, uncomfortable phase that will soon be over.

The spread of broadband made it possible for people to download large music and video files. People had long been used to the traditional model of owning physical objects that contained their music and video: LPs, 8-tracks, cassettes, VHS tapes, CDs, DVDs, etc. It was all physical property. We typically paid for it. I still have about 1,000 CDs, all paid for, sitting uselessly on my shelf.

The default frame of reference was the physical object that you bought in a store, brought home, physically put in a player, physically stored on a shelf, could lend to (and hopefully get back from) a friend. Broadband extended that, allowing us to download what we still thought of as physical objects. And they are physical objects in a real sense: occupying space on our digital hard disk shelves, needing organizational love and care, needing backups, etc.

Because the frame of reference was still physical objects, the media companies, who have their own opinion on the various rights – real and imagined – associated with these objects, had a way to go after the downloaders. They could point to the physical objects and say “hey, you stole that (object)”, or “you didn’t pay for that (object)”. They could even write worms and rootkits to dig into our computers looking for the objects, getting lists of them to hold up in court. And they had a point: where did you get that physical object after all?

But their argument, the frame of reference that shapes the debate, rests on ancient arguments: agreements and conventions regarding physical objects. Much of the law is based on these things.

The frame of reference might be due to change radically, kicking the legs out from under the music industry.

Imagine you’re walking down the street. You pass under a balcony and see open doors leading back into an apartment. There’s great music coming out of the doors, and you can hear it clearly down in the street. You stop to listen. Have you committed a crime? Would anyone even suggest that you had?

Someone comes out onto the balcony to stand in the sun. You call up and ask what the music is. They tell you, and you say how much you like it. They tell you they have other albums – and would you like to hear another song? You say yes, and stand down in the street while they put on another track. No crime there, right?

Suppose this balcony is in the building right next to yours. You go home and open your own balcony doors to be able to hear the music. You do that every day. Once in a while you bump into the neighbor in the street and comment on something else, maybe make a request. In the end the neighbor even suggests running a speaker wire into your apartment so you can hear their music whenever you like, even if it’s raining and everyone has their balcony doors closed. You buy a speaker with a volume control on it. Once in a while you even call your neighbor on the phone to ask them to play something again, or to put on a special track.

There’s no crime there, not even the hint of one. The media companies would probably like to protest. But the frame of reference has totally changed. We’ve gone from the mindset of physical possession of an object of questionable origin to that of walking down the street and hearing music.

And so it will go with increasing broadband. I’ve been listening to Clem Snide all day on Grooveshark. It’s streaming into my computer and directly to my speakers without being stored as a physical object on my machine. Entire tracks are not being physically stored: the music coming out of my speakers and the data on my machine are just as ephemeral as they would be if I were walking down the street overhearing Clem Snide from someone else’s balcony.

Have I committed a crime? I find it very hard to argue that I have. OTOH, if I download a file and store it on my machine (which I have done many times BTW) it’s very easy to argue that there is a crime of some sort being committed. It’s easy to ignore that feeling too, but that’s not the point.

The reality is, I think, that we don’t actually want to own the physical objects. I don’t want a shelf full of physical CDs, and I don’t want a hard drive that’s 80% full of music files that I worry about and even back up.

How many times do you watch a DVD anyway? For many people it’s silly to buy a DVD because you can rent it much more cheaply, and you’re probably only going to watch it once or maybe twice. Music, for me at least, is different as I’ll sometimes listen to a single track 100-200 times. But I still don’t need or want to own it if I can just pull it up on demand via Grooveshark. I’d rather it was their disk space than mine, and the bandwidth interference with my normal work due to the streaming audio is increasingly hard to detect.

We may just be in a temporary uncomfortable stage that will be solved by the thing that got us here – increasing broadband access.

As bandwidth increases and becomes cheaper it seems like there will be a trend towards just streaming media and not downloading it to have and to hold until the RIAA or MPAA do us part.

At that point the frame of reference will change. It will become very difficult to maintain that a crime has been committed. To do so you’ll also have to argue that walking down the street and overhearing your neighbor’s music is a crime. Good luck making that argument.

The first loud ding-dong of time and doom

Tuesday, May 5th, 2009

But not for a little while yet; for a little while yet the sparrows and the pigeons; garrulous myriad and independent the one, the other uxorious and interminable, at once frantic and tranquil – until the clock strikes again which even after a hundred years, they still seem unable to get used to, bursting in one swirling explosion out of the belfry as though the hour, instead of merely adding one puny infinitesimal more to the long weary increment since Genesis, had shattered the virgin pristine air with the first loud ding-dong of time and doom.

The final paragraph of Act 1 (The Courthouse) in Faulkner’s Requiem for a Nun.