Archive for the ‘Essence’ Category

Tickery, for programmers

Thursday, January 21st, 2010

If you’re a programmer and you’ve played around with Tickery, it should be clear that Tickery is functionally very simple when looked at from a traditional database perspective.

Tickery looks like an application in its own right. It tries to offer something so simple that anyone (at least any Twitter user) can understand and use the Simple “Enter two Twitter usernames” functionality.

But we actually designed and built Tickery mainly as a demo of what’s possible with FluidDB (description, API). So here are some first details on how Tickery uses FluidDB, and how you can use it too.

FluidDB objects

The most important thing to understand about FluidDB initially is that it maintains a collection of tagged objects that are not owned. Tags have permissions and a tag on an object can optionally have a value associated with it. Here’s a conceptual view of an object that has two tags on it that were added by Tickery.

The long identifier is the object’s unique id in FluidDB. The left column shows tag names, such as twitter.com/users/screen_name, and the right column shows the value (if any) of the tag on the object. Because objects in FluidDB are not owned, anyone (which is to say any application) can put additional tags onto this object. I’m going to ignore permissions in what follows – that’s a subject for a separate posting.

Any application can find this object in FluidDB, using a simple query, like twitter.com/users/id = 42983 or twitter.com/users/screen_name = "terrycojones".

Now let’s suppose Tickery adds @esteve to FluidDB, and wants to indicate that Esteve currently follows Terrycojones. Tickery creates a new tag, twitter.com/friends/esteve in FluidDB, and adds it to the above object. The object then looks like this:

Similarly, Tickery adds a twitter.com/friends/esteve tag to the objects representing all the Twitter users Esteve follows. At this point it is easy to retrieve all those users via the FluidDB query has twitter.com/friends/esteve (i.e., get me all the objects that have a twitter.com/friends/esteve tag, irrespective of the tag’s value, if any).

Suppose Tickery now adds another Twitter user, @fergusstothart, who currently follows me. It adds another tag to the object, resulting in

and also puts a twitter.com/friends/fergusstothart tag onto the objects for the other users that Fergus follows. It finds these objects via a FluidDB query (using the Twitter id of Fergus, obtained from the Twitter API). If Tickery needs to tag an object for a user that it hasn’t created yet, it simply creates a new object for that user, and tags it.

Given the above, we’ve seen enough to know how Tickery does most of its work. For example, getting the set of people Esteve and Fergus follow in common is just an and query: has twitter.com/friends/esteve and has twitter.com/friends/fergusstothart. The set of people Esteve follows but Fergus does not is has twitter.com/friends/esteve except has twitter.com/friends/fergusstothart, and so on.
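
To make the set logic concrete, here is a minimal in-memory sketch in Python 3. It does not talk to FluidDB: the dictionary, the object ids and the "alice" and "bob" users are made up for illustration, and only the tag names come from the examples above.

```python
# In-memory stand-in for FluidDB objects: object id -> {tag name: value}.
# The ids and the "alice"/"bob" users are hypothetical.
objects = {
    "obj-1": {"twitter.com/users/screen_name": "terrycojones",
              "twitter.com/friends/esteve": None,
              "twitter.com/friends/fergusstothart": None},
    "obj-2": {"twitter.com/users/screen_name": "alice",
              "twitter.com/friends/esteve": None},
    "obj-3": {"twitter.com/users/screen_name": "bob",
              "twitter.com/friends/fergusstothart": None},
}


def has(tag):
    """Return the ids of all objects carrying `tag`, whatever its value."""
    return {oid for oid, tags in objects.items() if tag in tags}


esteve = has("twitter.com/friends/esteve")
fergus = has("twitter.com/friends/fergusstothart")

both = esteve & fergus         # has A and has B
only_esteve = esteve - fergus  # has A except has B
```

Here "and" becomes set intersection and "except" becomes set difference, which is all the two queries above amount to.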

Where we come in

That’s all well and good, but it’s all about Tickery. What if other programmers, who perhaps don’t care or even know about Tickery, want to add data and search on it too? In a normal database or with a normal application, you’d probably expect to have to ask permission. Then, supposing it was granted, you could only do the kinds of things that had been anticipated and provided for.

But in FluidDB it’s completely different. Any application can come along and put whatever it likes onto the above object, or any other object (that it can find). As a trivial example, Esteve and I have also added tags to the FluidDB objects to indicate which of the people we follow we have also met in person. Esteve has an esteve/met tag, and because we’ve met (in fact we built FluidDB together), he has put that tag onto the above object:

Think about what just happened. An unknown 3rd party (well, let’s pretend Esteve was unknown) just came along, sometime after the Tickery data already existed in FluidDB, and added something completely new and unanticipated to an existing object, without asking for permission, and without in any way disturbing the original content. Esteve, or anyone else who can read his tag, can now do interesting searches, like has twitter.com/friends/esteve except has esteve/met, which shows the people he follows but has not yet met. Further, his searches seamlessly combine the existing Tickery data with his own data, and could also include other tags that other applications add.

That kind of unanticipated use of information, flexibility in representation and search, and change of control is what’s at the heart of FluidDB.

Twitter lists

If you think about it, Esteve adding esteve/met tags to the objects for Twitter users is exactly like making a list in Twitter using their new lists function. But it’s more useful for two main reasons.

Firstly, you can query across lists, e.g., has terrycojones/met and has esteve/met will find people we have both met in person, and (has twitter.com/friends/esteve and has terrycojones/met) except has esteve/met will find people Esteve follows that I have met in person but whom he has not. As you can see, querying on lists makes them much more useful.
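
If you build queries like these in a program, a few small string helpers keep the parentheses straight. The sketch below is Python 3; the helper names (has, all_of, except_) are hypothetical and not part of FluidDB or any client library.

```python
def has(tag):
    """Query term matching objects that carry `tag`."""
    return "has " + tag


def all_of(*terms):
    """Parenthesised conjunction of query terms."""
    return "(" + " and ".join(terms) + ")"


def except_(left, right):
    """Objects matching `left` but not `right`."""
    return left + " except " + right


# People Esteve follows that Terry has met but Esteve has not.
query = except_(
    all_of(has("twitter.com/friends/esteve"), has("terrycojones/met")),
    has("esteve/met"),
)
```

The resulting string is the same text as the second query in the paragraph above, ready to be typed into the Tickery Advanced tab or sent to FluidDB.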

Secondly, you can use the FluidDB permissions system to control who can see or read your tags. So you can have not only a completely private list or a public one, but also a list that’s visible just to some friends, or one that you let certain other people add to (by giving them write permission on the tag involved).

Permissions in FluidDB are very simple and very flexible, and because they apply at the level of the tag (not the object), you can control who can do what to individual pieces (tags) of an object. That’s a subject for another post, as I mentioned. You might like to have a read of the FluidDB permissions docs and/or check out Nicholas Radcliffe’s post Permissions Worth Getting Excited About, plus see the comment by Nicholas Tollervey, who writes FluidDB’s "killer feature" is actually its permissions system and the implications thereof. It is so important that I’ll save that topic for its own blog post later.

More Tickery tags

Tickery also saves a few more Twitter user details onto objects in FluidDB. The object above has some additional tags:

You can put these into queries too, of course. The Tickery Advanced tab lets you type them in, e.g., I can see which of the people that Jack follows are very popular: has twitter.com/friends/jack and twitter.com/users/followers_count > 100000.

Running on ahead of Tickery

Finally, here’s a subtle but very important point. What if you write an application that uses FluidDB to store data, and you want it to interact fully with Tickery, but you want to store information about a user that Tickery doesn’t know about yet?

This is crucially important because it’s about information control, and if control is completely in the hands of Tickery, other developers will be less likely to want to add information. Exactly this scenario plays out in many domains: e.g., suppose Amazon released something that let you indicate which books you own, but that you own things that are not in the Amazon database. How can you run on ahead of Amazon to insert your data before they create an object for the book, if ever? How can you do it in a way so that when they finally do create the book your data and theirs are seamlessly joined without anyone having to lift a finger or even be aware of the other party?

This is one area in which the special FluidDB “about” tag (full name fluiddb/about) makes all the difference. You can read about it here, and be sure to check out Nicholas Radcliffe’s blog which is titled, not coincidentally, About Tag.

Tickery uses the FluidDB about tag to hold a Twitter user id, like this:

There’s a ton that could be written about this. Very briefly, the about tag is immutable and can only be set on an object when it’s created (in fact the about tag shown above was put there when the object was made). So, if you want to add data to a Twitter user that Tickery hasn’t gotten to, just look up the user’s Twitter id (say XXXX) with the Twitter API, create the object in FluidDB with about tag twitter.com:uid:XXXX, and put your tags onto that object. If FluidDB doesn’t have an object with that about tag, it will make one for you. When/if Tickery gets around to adding its information for that user, it will put it in the same place. Magic.
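
The get-or-create behaviour is easy to model. The sketch below is a toy in-memory store in Python 3, not FluidDB itself: the AboutStore class and the myapp/note tag are hypothetical, and only the twitter.com:uid:XXXX convention and the immutability of the about tag come from the text above.

```python
import uuid


class AboutStore:
    """Toy model of about-tag semantics: at most one object per about value."""

    def __init__(self):
        self._id_by_about = {}
        self._tags_by_id = {}

    def create(self, about):
        """Return the id of the object for `about`, creating it if needed."""
        if about not in self._id_by_about:
            oid = str(uuid.uuid4())
            self._id_by_about[about] = oid
            self._tags_by_id[oid] = {"fluiddb/about": about}
        return self._id_by_about[about]

    def tag(self, oid, name, value=None):
        """Put a tag on an object; the about tag cannot be changed."""
        if name == "fluiddb/about":
            raise ValueError("the about tag is immutable")
        self._tags_by_id[oid][name] = value

    def tags(self, oid):
        return dict(self._tags_by_id[oid])


def twitter_about(uid):
    """About value Tickery uses for a Twitter user id."""
    return "twitter.com:uid:%d" % uid


store = AboutStore()
mine = store.create(twitter_about(42983))    # your app gets there first
store.tag(mine, "myapp/note", "met at a conference")  # hypothetical tag
theirs = store.create(twitter_about(42983))  # Tickery arrives later
store.tag(theirs, "twitter.com/users/screen_name", "terrycojones")
```

Both calls to create return the same id, so the two sets of tags end up on one object without either party knowing about the other.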

Convenience API

As a convenience, though note that it’s optional and its use is up to you, Tickery provides a small API that you can use to have it put its twitter.com/users/screen_name and twitter.com/users/id tags onto objects for you and give you the FluidDB object id it’s using for a user.

E.g., if you do an HTTP GET on http://tickery.net/api/screennames/terrycojones you’ll see the object id from our examples above. Or if you happened to know my Twitter user id (42983) via the Twitter API, you could do a GET on http://tickery.net/api/uids/42983 and receive the same thing.
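
A small Python 3 sketch for building those two URLs is shown here. The function names are made up, and the code only constructs the URLs; it doesn't fetch them, since the response format isn't described in this post.

```python
from urllib.parse import quote

TICKERY_API = "http://tickery.net/api"


def screenname_url(screen_name):
    """URL to GET for the object id of a user, by Twitter screen name."""
    return "%s/screennames/%s" % (TICKERY_API, quote(screen_name, safe=""))


def uid_url(uid):
    """URL to GET for the object id of a user, by numeric Twitter id."""
    return "%s/uids/%d" % (TICKERY_API, int(uid))
```

Pass either URL to whatever HTTP client you prefer.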

Truly social data

This API is just for convenience. Tickery uses the about tag in order to be able to share FluidDB objects with other apps – including apps that want to add information about a user that Tickery has not gotten to yet. Just like FluidDB, Tickery wants to encourage what we call Truly Social Data. Tickery doesn’t place itself in the center, doesn’t make its data more important than anyone else’s, and doesn’t act as a gatekeeper.

In fact, it gets even better: a normal user can turn around and stop Tickery from reading the data that Tickery stored on the user’s behalf. That’s as it should be. Users should have control over their data, and a choice of application shouldn’t result in lock in.

Getting access

FluidDB is still very new, and we’re in a private alpha phase. If you’d like to use the API, there are two steps: 1) reserve a username, and 2) send us email mentioning the name you reserved and a line or two about what you’d like to do. We apologize for this early restriction – please rest assured that we’re planning to open FluidDB up to everyone before too long. That’s the whole point.

More soon. Thanks very much for reading!

Meet Tickery

Thursday, January 21st, 2010

We’re very excited to present Tickery: a fun tool for exploring sets of Twitter friends and finding new people to follow.

Tickery starts off very simply, letting you see who pairs of Twitter users follow in common. For example, who do Tim O’Reilly and Tim Bray follow in common? Even simple queries like this are interesting, because they’re a great way to find interesting Twitter users you may want to follow too. In addition, on the Tickery page just below where you enter the two user names, you’ll get to see whether the two users follow each other. So if you find yourself asking “I wonder if X follows Y?” you can use Tickery to find out immediately, which beats scrolling through multiple pages on twitter.com.

Please be patient with Tickery if you do a query on a user we haven’t added yet. Tickery uses the Twitter API to get information about users, and there are restrictions on how fast we can make those API calls.

Tickery lets you sign in via Twitter – see the button on the top right of the tab bar. If you sign in you can filter results to show just the people you’re following (or not), you can click to follow new people, and you can send out tweets with links to Tickery results of interest.

Tickery’s Intermediate tab offers a big jump in power. Enter simple queries using Twitter user names, and simple terms like and, or and except. For example, the query (jack and ev and biz) except terrycojones shows me Twitter users that all the three Twitter founders follow, but who I do not. Or get possible hints on which entrepreneurs are being followed by one firm and missed by another: for example who does everyone at Union Square Ventures follow that no-one at Betaworks does?
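
For programmers, one plausible way to read an Intermediate query is as shorthand for a FluidDB query in which each user name stands for a has twitter.com/friends/NAME term. The Python 3 sketch below does that expansion. It is a guess at the idea, not Tickery's actual code, and lower-casing the names is an assumption.

```python
import re

KEYWORDS = {"and", "or", "except"}


def expand(query):
    """Expand user names into 'has twitter.com/friends/<name>' terms.

    Assumes names are lower-cased in tag names; a user whose screen name
    is literally "and", "or" or "except" cannot be expressed this way.
    """
    def replace_word(match):
        word = match.group(0).lower()
        if word in KEYWORDS:
            return word
        return "has twitter.com/friends/" + word

    # Twitter screen names are letters, digits and underscores.
    return re.sub(r"[A-Za-z0-9_]+", replace_word, query)


expanded = expand("(jack and ev and biz) except terrycojones")
```

Parentheses and spacing pass through untouched, so the grouping of the original query is preserved.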

Tickery also has an Advanced tab, which gives you another big jump in power. I’ll save a description of that for another post, but to whet your appetite, here are users Tickery knows about with a Twitter id less than 1000 and people I follow on Twitter that I have also met in person. Or see the description and examples in the "huh?" button on the Advanced tab.

Powered by FluidDB

The most important and interesting thing about Tickery is that it’s built on top of FluidDB (description, API). Tickery is great fun all by itself, but it was built to show what we at Fluidinfo think the relationship between applications, their users, and their data will come to look like. That’s also the subject of an upcoming post, but here are a few bullet points to give you an idea of why applications like Tickery, written on top of FluidDB, may look normal but are in fact very different. Such applications, in combination with an information architecture like FluidDB will:

  • Leave users in control of their data, which includes letting them use other applications to work on it and, if you’re really serious, being able to turn off access to the application that stored it for you.
  • Make the world writable by default, by allowing anyone to add anything to the underlying data in any way they like.
  • Selectively protect individual aspects of data objects on a user-by-user and application-by-application basis.
  • Allow users and applications to put their name (like an internet domain name) on pieces of data, thereby stamping that data with trust and reputation.
  • Let you combine and organize your data, in isolation or with anyone else’s, via search and tagging.
  • Let other applications add more data, in a compatible and integrated way, without needing the permission or advance knowledge of the original application.
  • Explicitly allow for, and encourage, the flexible evolution of data structure conventions, similar to the way that we see evolution of tagging and hashtags.

You can read about this in the context of Tickery on the About tab.

These are the kinds of ideas that people have recently been writing about as the future of data. For example, see some of these articles: The Future: Operating System And Application-Neutral Data, We need a Wikipedia for data, Can Twitter Survive What is About to Happen to It?, Shared data services – the next frontier?, and Robert Scoble’s Twitter to turn on advertising “you will love” (here’s how: SuperTweet). While you’re at it, you might enjoy Scoble’s article The unfundable world-changing startup, which he wrote about Fluidinfo a year ago.

Stay tuned, there’s much more to come. If you’d like to find out how to write programs that can augment and use the data Tickery has stored in FluidDB, have a read of Tickery, for programmers.

Meanwhile, have fun with Tickery! Check back here, or follow me on Twitter for more news on Tickery, FluidDB, and Fluidinfo.

Putting metadata onto tweets with FluidDB

Tuesday, December 1st, 2009

Various articles have recently discussed adding metadata to Twitter tweets – see the posts by Nova Spivack, Robert Scoble, and Dave Winer (who also suggests we need a programming language built into a Twitter client).

These are the sorts of things that FluidDB was designed to support, and you can do them today. If you want a password to start playing with the FluidDB API, send email to api at fluidinfo dot com and we’ll set you up.

In the meantime, here are some examples. I’m doing this at the iPython command line, using the Fluid Object Mapper (FOM) library, written by Ali Afshar. FOM provides a natural way to work with FluidDB objects, namespaces, tags, etc. But you could use any client-side software you like. The FluidDB API is just HTTP.

First, let’s get a connection to FluidDB:

from fom.session import Fluid

fdb = Fluid()
fdb.db.client.login('terrycojones', 'PASSWORD')
fdb.bind()

That last line is a bit of internal FOM magic that makes interactive use simpler in what follows. Ignore it for now.

To put metadata onto a tweet, we’ll first ask FluidDB for the object that’s about a particular tweet. Let’s take the one in the image above by @novaspivack. That tweet has a URL of http://twitter.com/novaspivack/status/4999653280. We ask FluidDB to give us the object “about” that URL:

from fom.mapping import Object

o = Object()
o.create('http://twitter.com/novaspivack/status/4999653280')
o.uid
>>> u'ab7fa032-06df-45be-9bb2-859c18c4d342'

The argument in the o.create call is the value of the FluidDB about tag. If an object with that about tag already exists, FluidDB gives it to us. Otherwise, a new object with that about tag is created. As you can see, the object also has an identifier (o.uid). In case you’re not familiar with Python, the “u” printed in front of the id indicates that the value is a unicode string.

This is a first point of interest. We’ve just created a FluidDB object corresponding to an arbitrary tweet. We didn’t ask for permission, we just did it. It’s a bit like a wiki: you can ask a wiki for its page on anything, and if no such page exists, the wiki just makes you a new one. FluidDB does the same thing with its objects and about tag. If you want to think about FluidDB in all its generality, you should now consider that the about tag above could have been for any tweet, including tweets that don’t exist (or don’t yet exist), for any URL, in fact for any string. We also could have followed Nova’s suggestion and used an about value like "twitter.com/id=4999653280". But we’re getting ahead of ourselves.

FluidDB has a simple query language, so let’s quickly confirm that we can find this object with a search:

fdb.objects.get('fluiddb/about = "http://twitter.com/novaspivack/status/4999653280"')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

The 200 is an HTTP OK status telling us the call succeeded, and you can see one object matched the search and that its id is as expected.
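
Since the API is just HTTP, the same search can be expressed as a URL. The Python 3 sketch below assumes the query is passed as a query parameter on the /objects resource of fluiddb.fluidinfo.com. That detail isn't shown in this post, so check it against the API docs before relying on it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

FLUIDDB = "http://fluiddb.fluidinfo.com"


def query_url(query):
    """Build a GET URL for a FluidDB query (assumed /objects?query=...)."""
    return FLUIDDB + "/objects?" + urlencode({"query": query})


about_query = 'fluiddb/about = "http://twitter.com/novaspivack/status/4999653280"'
url = query_url(about_query)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

urlencode takes care of escaping the spaces, quotes and slashes in the query text.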

So how about some metadata? Let’s say I want to add a rating to the object. Here’s a bit of one-time setup. First I get my top-level namespace (which corresponds to my FluidDB user name). Then I create a new tag called rating in that namespace:

from fom.mapping import Namespace

ns = Namespace('terrycojones')
ns.create_tag("rating", "A tag for Terry's ratings.", False)

The False argument is telling FluidDB that I don’t want the tag to be indexed. Ignore that for now.

The magic of FOM lets us directly examine the tag using Python attributes. So you can get the tag and see its description like so:

rating = ns.tag('rating')
rating.description
>>> u"A tag for Terry's ratings."

At this point we have a new tag, or an abstract tag if you prefer, but we haven’t actually tagged any objects with it. So let’s tag the object we created above for Nova’s tweet:

o.set('terrycojones/rating', 6)

That was pretty easy! The FluidDB object that’s about Nova’s tweet now has some metadata on it, a ‘terrycojones/rating’ tag, with a value of 6. Let’s make sure we can get that value back:

o.get('terrycojones/rating')
>>> (6, None)

We get a 2-tuple whose second value is None when the tag’s value is a primitive Python type (in this case an integer).

Let’s do a couple of quick searches for objects with terrycojones/rating tags:

fdb.objects.get('terrycojones/rating = 6')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

fdb.objects.get('terrycojones/rating > 4')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

fdb.objects.get('has terrycojones/rating')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

In each case just that one object is returned, as expected. Note that the last query just tests for the presence of the tag, irrespective of the tag’s value (if any).

So there you have it: arbitrary metadata on tweets, and with a query language to help find things.

But let’s press on and see how things get more interesting.

First of all, you may have noticed that I didn’t have to deal with permissions at all in the above. I was able to create the FluidDB object about Nova’s tweet and to tag it without asking permission. In FluidDB that’s always the case.

But there is a permissions system. Let’s log in as a different user and try a few things to see how it works. First, I’ll log in as njr, another user whose password I happen to know:

fdb.db.client.login('njr', 'PASSWORD')

The njr user is actually Nicholas Radcliffe, who has written several great introductory articles about FluidDB over at About Tag.

Let’s try (as Nick) getting the terrycojones/rating tag from the object for Nova’s tweet:

o.get('terrycojones/rating')
>>> (6, None)

That still works, so we can infer that the terrycojones/rating tag is readable by the njr user. Let’s log in as terrycojones again and have a look at the permissions:

fdb.db.client.login('terrycojones', 'PASSWORD')
fdb.permissions.tag_values['terrycojones/rating'].get('read')
>>> (200, {u'exceptions': [], u'policy': u'open'})

We’ve asked FluidDB for read permissions on tag values for the tag terrycojones/rating. The result is a general policy (open), with exceptions (currently empty). Now I’ll put the njr user into the exceptions list, and confirm the result:

fdb.permissions.tag_values['terrycojones/rating'].put('read', 'open', ['njr'])
>>> (204, None)
fdb.permissions.tag_values['terrycojones/rating'].get('read')
>>> (200, {u'exceptions': [u'njr'], u'policy': u'open'})

The 204 status above is just the HTTP way of telling us that the call succeeded and that the reply has no content (as expected).

Now let’s reconnect as njr and try getting the terrycojones/rating tag again:

fdb.db.client.login('njr', 'PASSWORD')
o.get('terrycojones/rating')
>>>

You can see we got nothing back. If FOM handled non-OK HTTP responses a little more carefully, you’d see that this request actually got a 401 (Unauthorized) status, meaning permission was denied. FluidDB is now refusing to let njr read the tag.
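
The policy-plus-exceptions rule can be written down in a few lines. This Python 3 sketch is a model of the check, not FluidDB's implementation. The "closed" policy is included as the natural counterpart of "open" and is an assumption here, since only "open" appears in the output above.

```python
def allowed(user, policy, exceptions):
    """Decide whether `user` may perform an action on a tag.

    "open":   everyone is allowed except the users listed in `exceptions`.
    "closed": nobody is allowed except the users listed in `exceptions`.
    """
    if policy == "open":
        return user not in exceptions
    if policy == "closed":
        return user in exceptions
    raise ValueError("unknown policy: %r" % (policy,))


before = allowed("njr", "open", [])       # njr could read the tag at first
after = allowed("njr", "open", ["njr"])   # and is refused once listed
```

With an open policy the exceptions list works as a blacklist, which is why adding njr to it is enough to lock Nick out while everyone else can still read the tag.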

Nick already has a rating tag, called njr/rating, so let’s go get it, make sure there’s not one already on our object, and then tag our object with it:

ns = Namespace('njr')
rating = ns.tag('rating')
o.get('njr/rating')
o.set('njr/rating', 4)
o.get('njr/rating')
>>> (4, None)

Now things are getting interesting. We have tags from different users on the same object. That’s part of the point of FluidDB and it’s where the value comes from: putting information together allows you to do nice things, like query on it. After re-connecting as terrycojones, I can now do queries like this:

fdb.objects.get('terrycojones/rating > 5 and njr/rating > 3')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

fdb.objects.get('terrycojones/rating > 5 and njr/rating < 3')
>>> (200, {u'ids': []})

fdb.objects.get('has terrycojones/rating and njr/rating >= 4')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

fdb.objects.get('has terrycojones/rating and has njr/rating')
>>> (200, {u'ids': [u'ab7fa032-06df-45be-9bb2-859c18c4d342']})

fdb.objects.get('has terrycojones/rating except has njr/rating')
>>> (200, {u'ids': []})
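
If you want to check your reading of those five results without a FluidDB account, here is a small in-memory model in Python 3. It ignores permissions and is only a sketch. The has and where helpers are hypothetical, while the object id and tag values are the ones used above.

```python
import operator

OBJ = "ab7fa032-06df-45be-9bb2-859c18c4d342"

# One object carrying a rating from each of the two users.
objects = {OBJ: {"terrycojones/rating": 6, "njr/rating": 4}}


def has(tag):
    """Ids of objects that carry `tag`, whatever its value."""
    return {oid for oid, tags in objects.items() if tag in tags}


def where(tag, compare, value):
    """Ids of objects whose `tag` value satisfies the comparison."""
    return {oid for oid, tags in objects.items()
            if tag in tags and compare(tags[tag], value)}


results = [
    where("terrycojones/rating", operator.gt, 5) & where("njr/rating", operator.gt, 3),
    where("terrycojones/rating", operator.gt, 5) & where("njr/rating", operator.lt, 3),
    has("terrycojones/rating") & where("njr/rating", operator.ge, 4),
    has("terrycojones/rating") & has("njr/rating"),
    has("terrycojones/rating") - has("njr/rating"),
]
```

The five sets come out in the same order as the five queries: a match, no match, a match, a match, no match.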

There’s a lot more I could do too, like giving Nick permission to add the terrycojones/rating tag to objects. By the way, Nick has written some nice articles about the FluidDB permissions model. See Permissions Worth Getting Excited About and The Permissions Sketch.

For a final look at metadata, let’s put something totally different onto our object:

ns = Namespace('terrycojones')
page = ns.create_tag("page", "Terry's page tag.", False)
o.set('terrycojones/page', '<html><head><title>hi</title></head><body>Hello there!</body></html>', 'text/html')

I’ve just made a new tag called terrycojones/page and tagged our object with it. What’s different here is that the value is a string, and I’m passing a MIME type with it. If I retrieve the value of the tag on the object, you’ll see the MIME type comes back too:

o.get('terrycojones/page')
>>> ('<html><head><title>hi</title></head><body>Hello there!</body></html>',
 'text/html')

and as you might hope, if you go get that tag from that object using a browser, the MIME type is returned in the HTTP Content-type header, so you end up with a real web page, with a predictable URL. Try clicking: http://fluiddb.fluidinfo.com/objects/ab7fa032-06df-45be-9bb2-859c18c4d342/terrycojones/page. We can do the same for any MIME type at all – including ones you invent for your own convenience.
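
The predictable URL is just the object id followed by the tag path. The Python 3 sketch below builds it and models, in memory only, how a stored (value, MIME type) pair could turn into a response. The respond function and the stored dictionary are illustrative, not FluidDB code.

```python
BASE = "http://fluiddb.fluidinfo.com"
OBJ = "ab7fa032-06df-45be-9bb2-859c18c4d342"

# Toy storage: (object id, tag path) -> (value, MIME type).
stored = {
    (OBJ, "terrycojones/page"): (
        "<html><head><title>hi</title></head><body>Hello there!</body></html>",
        "text/html",
    ),
}


def tag_value_url(object_id, tag):
    """URL of a tag's value on an object."""
    return "%s/objects/%s/%s" % (BASE, object_id, tag)


def respond(object_id, tag):
    """Return (headers, body) as a server might for a GET on the tag value."""
    value, mime = stored[(object_id, tag)]
    return {"Content-Type": mime}, value


url = tag_value_url(OBJ, "terrycojones/page")
headers, body = respond(OBJ, "terrycojones/page")
```

Because the MIME type travels with the value, a browser that fetches the URL knows to render the body as HTML rather than show it as plain text.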

So there you go. That’s metadata on tweets. With a permissions model, with a query language, with user identity, with the freedom to add anything you want, and with typed data. We don’t need a new programming language for doing this sort of thing. What we need is a better data architecture.

FluidDB was designed with exactly this kind of use in mind. And it’s not specific to Twitter or tweets, or anything in fact. So you can put metadata onto anything you like, search on it, continue to own/control your own data, combine it as you like, and get data in and out using a simple HTTP API.

This is all live. It’s up and running, you can do this today. I should also add that FluidDB is still an early alpha, and is not yet particularly fast. For more information on FluidDB, start with the high-level description and if you’re a programmer, read the API docs.

Next time I’ll show you how we’re putting metadata onto Twitter users, and how you can too, of course! I might also start to talk about Tickery, our upcoming Twitter query application.

If you like all this, please pass on this article. We’d love to get the word out about FluidDB. It’s a little difficult from Barcelona.

Why are Post-it notes sticky?

Thursday, November 12th, 2009
Image: PabloBM

[Update: it has been politely pointed out (in the comments following this post) by David Semeria that the Post-it note analogy to FluidDB came from him, not me!]

It’s a pretty simple question, but you may have never thought about it directly. It’s the stickiness of Post-it notes that makes them so extraordinarily useful. The stickiness allows us to put a note in the place that makes the most sense, in the place where its information will be in context, and where it will have the greatest utility.

That’s trivial, agreed, but I nevertheless find it interesting and instructive.


Image: someToast

Using Post-it notes we can add information to things in a very wide range of ways. The information might be about the thing to which it’s attached, or the object might just be something we anticipate being encountered at a relevant future moment.

How often do people ask for permission to attach a Post-it note to an object? Probably not very often. I’ve certainly never done it.

Image: someToast

The information on the Post-it note can’t be presented by the object the note is on. If the object had been designed to carry that information, there’d be no need for a Post-it note. So Post-it notes are almost by definition for adding information to things in unanticipated ways.

And as for the content of Post-it notes – that’s clearly highly unpredictable. I’ve illustrated this posting with some fun examples.

Image: _nickd

It’s very useful to be able to put information in its most natural or useful place. We do it all the time. The other day I returned to my apartment and taped to the inside wall of the elevator was a form for the neighbors of my building to enter their gas meter readings. The elevator was the perfect place for the notice. But it certainly wasn’t designed for that purpose. The representative of the gas company didn’t ask permission, they just taped up the form.

Image: Iain Farrell

I love thinking about how we work with information in the real world – especially the kinds of things we do so frequently or naturally that we barely notice them.

Image: wrestlingentropy

All of which brings me, inevitably, to FluidDB. As I’ve mentioned before several times, FluidDB objects have no owner. That means that anyone can put the digital equivalent of a Post-it onto anything they like, for whatever purpose, without asking for permission, and without anyone having to anticipate that they would want to do so.

Image: mulmatsherm

That’s all I’ll say for now, as I’m trying to keep this short. I hope it will be thought provoking. Think about how tightly controlled, how unspontaneous, and how awkward our typical computational experiences are. There’s a reason for that, and it’s rooted in information architecture. Think about how Post-it notes, in a simplistic but important way, make the world writable. And then, for extra points, think about FluidDB :-)


I can’t resist one final image.

Image: Mr.Thomas

Digital hobgoblins

Sunday, October 4th, 2009

Much of the thinking behind FluidDB comes from thinking about how we work with information in the real world, and comparing that to how we do so in the computational world (aka Hobgoblin Hall). The differences are striking.

Over the years, I’ve often asked myself why things are so bizarre in the computational world and why we don’t do something about it. Without going into the answers to those questions, I’ll just say that I think we’ve all grown up in Hobgoblin Hall and despite the fact that we’re all perfectly familiar with the freedoms of the outside world, we take it for granted that things are deeply weird in our computational homes.

Here I’ll quickly outline a few of the more glaring oddities. (BTW, I’d be remiss not to point out that FluidDB has none of the following restrictions.)

Things must be named, and have one name. In the real world we have plenty of things that don’t have names. As I look around my desk right now, I can see dozens of things that don’t have names. We also often give things many names – first names, surnames, nicknames, abbreviated names, English/Spanish/Chinese/etc names. Flexibility in naming (no names, one name, multiple names, private names, etc) is obviously of great utility. Yet in computational systems we’re usually compelled to name things, and we’re restricted to a single name. These are just a couple of the problems I have with file systems – I have about 10 others, but will spare you.

Inconsistency and ambiguity are common in the real world. While they’re obviously often not helpful, there are times when it is very useful to have both. Things may become clearer over time. Systems evolve. In the natural world we use representations that allow high degrees of inconsistency and ambiguity, and it’s very useful to be able to do so – how else would we learn or get anything accomplished if not? Yet if you suggest a computational system that explicitly allows for any level of inconsistency and ambiguity in information, people start to get nervous or even upset. They’ll begin to argue with you, and suggest ways to “fix” the system to get rid of the undesirable qualities. Why is that?

Multiple organizations of the same information are very common in the natural world. We do it all the time. Computationally it’s rare that systems allow us to multiply organize things. That’s changing, thankfully, with the rise of tagging and with music collection software that allows multiple simultaneous playlists (or “smart” dynamic playlists) of the same underlying sound files. But those systems are the exception rather than the rule.

There’s an obsession with “meaning” and pinning down what things are “about” in the computational world. In the real world we don’t seem to care that much – we’re more concerned with utility. What’s a book for? Something to read? Something to stop other objects from blowing away? Something to be hollowed out to hold a gun? Something to create an intellectual impression? A decoration? Something to hold up other books? A hiding place? A book can be all these things, and we can move seamlessly between them. What’s a glass for? Is it a weapon? Something we can hold to the wall to hear a conversation? Something to use as an insect trap? A fingerprint capturing device? A musical instrument? Maybe even something to drink out of? We don’t really know, and it’s not important to know. We don’t obsess over the “meaning” of a glass or try to determine what single thing it might be “about” etc. We just use it as we see fit. Similarly we can’t anticipate how people will want to use information – and our storage architecture shouldn’t try. (You could counter by pointing out the FluidDB about tag. But usage of the about tag is entirely optional. It’s a convenience. You can make your own, or use none at all. And a FluidDB object can be about whatever you think it’s about and used for whatever you want to use it for – even if others have completely different interpretations of and uses for the same object. No problem.)

Later (meta?) data is often most usefully put with the original data. That’s what we commonly do in the real world. It’s convenient, easy, useful, and natural. For example, when you’re reading a book and you want to remember what page you’re up to, you can simply dog-ear the page or insert a bookmark. The extra information travels with the book. In the computational world you can’t do things like that unless a programmer has anticipated that you might want to and made provision for the extra information in the underlying data structure (or database). So we’re very often forced to put extra unanticipated information elsewhere – e.g., in a file, in our heads. Unfortunately, that later information is often the most important – because it’s generated by individuals who are trying to customize or personalize their computational world. I’ll have much more to say about that another time. For now: a writable architecture like FluidDB does not have this limitation because you can always put the new (meta?) information with the old (content?). And search on it, etc. That offers a fundamental change in how we work with information. I’ll blog about it at length one of these days. You get to think about it in the meantime :-)

FluidDB as a universal metadata engine

Saturday, October 3rd, 2009

Image: Jin Wicked

One way to use FluidDB, among many, is as a universal engine for metadata.

I’ll have to explain what I mean by that, especially seeing as some people got the impression from the earlier post on data vs metadata that we don’t think metadata is important, or that it doesn’t exist, or similar. I tried to make it clear in the post, and in responding to the comments that followed, that that’s not what was meant: In fact that’s one of the major initial goals of FluidDB – to be a metadata engine for everything. So that’s how important we think metadata is! The way to support metadata on anything is to have an underlying architecture that’s flexible enough to allow that to happen – without someone setting the thing up with an a priori determination of what’s meta- and what’s not. True support for metadata is too important for that – to do it properly you need the architecture to be neutral.

The question is: how can FluidDB be used as a universal metadata engine?

Metadata can be loosely defined as data that’s about other data. So to provide a universal metadata engine, any time any application wants to store some metadata (M) about anything (A), FluidDB should always have a place to put M. Moreover, the application shouldn’t have to stop to ask if it’s ok to store the metadata, and its needs shouldn’t have to be anticipated.

The key word in the above paragraph is about. FluidDB has an about tag that can (optionally) be added to objects to indicate what they’re about. There’s a lot that could be said about the about tag – in fact, the person who pushed for its inclusion in FluidDB, Nicholas Radcliffe, even started a blog of that name. The main thing to know for now is that the about tag on an object (if any) is immutable, and its value (always a string) is unique across all FluidDB objects.

To give some simple examples, there might be objects in FluidDB with about tags that have values such as isbn:140679239X or http://www.abebooks.com/servlet/BookDetailsPL?bi=588210745 or US:ZIP:90210 or info@fluidinfo.com or IP:207.171.166.252 or….. anything you like. That’s the point.

So when an application – any application – wants to store information about A, it just asks FluidDB for the object whose about tag has the value A. If the object already exists, FluidDB returns it. If not, FluidDB creates a new object, sets its about value to A, and returns it.
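To make the get-or-create behavior concrete, here is a minimal in-memory sketch in Python. It is not the FluidDB API: the ObjectStore class and its object_about, tag, and tags methods are names I've made up for illustration, and the tag names alice/rating and bob/comment are hypothetical too.

```python
import uuid


class ObjectStore:
    """Toy in-memory stand-in for the about-tag behavior (not the real API)."""

    def __init__(self):
        self._by_about = {}  # about value -> object id
        self._tags = {}      # object id -> {tag name: value}

    def object_about(self, about):
        """Return the id of the object about `about`, creating it if needed."""
        if about not in self._by_about:
            object_id = str(uuid.uuid4())
            self._by_about[about] = object_id
            self._tags[object_id] = {}
        return self._by_about[about]

    def tag(self, object_id, name, value=None):
        """Put a tag (with an optional value) onto an existing object."""
        self._tags[object_id][name] = value

    def tags(self, object_id):
        """Return a copy of the tags currently on the object."""
        return dict(self._tags[object_id])


store = ObjectStore()

# Two independent applications ask for the object about the same thing.
first = store.object_about("isbn:140679239X")
second = store.object_about("isbn:140679239X")

# Each puts its own (hypothetical) tag onto what turns out to be one object.
store.tag(first, "alice/rating", 5)
store.tag(second, "bob/comment", "worth a read")
```

Both applications end up holding the same object id, so their information lands in the same place without either one having to coordinate with the other.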

That’s the first part of being a universal metadata engine: if you want to store information about something, FluidDB gives you an obvious place to put that information (provided you can convert your particular A into a string of some kind). In this regard, FluidDB is like a wiki. When you use a wiki, you can ask it for the page on any subject, and if it doesn’t exist it will be created. As with a wiki, you can think of FluidDB as already having objects about everything; just like a wiki, FluidDB doesn’t actually create any particular object until someone asks for it.

The second crucial component is FluidDB’s model of control. As mentioned in the Information. Naturally. post, FluidDB objects do not have owners. That means that all applications are guaranteed that they can store information onto the FluidDB object about A.

Putting these two together, you get something that starts to look very much like a universal metadata engine. Got some metadata to store about something? FluidDB gives you an obvious place to put it and a guarantee, in advance, that you’ll be able to do so. This is what we mean when we say FluidDB makes the world more writable.

To give a couple of quick examples, Emanuel Carnevale has written two Javascript programs for the Firefox 3.5 Jetpack extension. These are just quick proofs of concept for now, but they will mature. One is fluidy-hood that offers functionality along the lines of Google’s Sidewiki (though more general), and the second is BRB, the Borthwick Remember Button, in honor of John Borthwick of Betaworks who asked for one. These are very simple pieces of code that use FluidDB as a universal metadata engine, in both cases putting information onto the object that’s about the URL you’re currently looking at in Firefox.

A final comment about the creation of value: These tiny apps have limited and unremarkable value to their individual users. Things get much more interesting, though, when you consider that these applications are creating truly social data. It is directly searchable via the FluidDB query language. It can be combined with other information, both homogeneous (created by the same app being run by someone else) and heterogeneous (related but different information about the same thing created by other apps). It can be accessed, augmented, and mashed up by others. And the person who created the information can continue to control it: share it, protect it, edit it, delete it, etc.

When you look at data and applications like this, you begin to see why we’re so excited about the kinds of changes in how we work with information that we think FluidDB can help to introduce.

There’s a lot more that can be said regarding the about tag, about how all this affects customization, personalization, and information organization in general, about ambiguity and its resolution, and about the creation of value via putting information into context. Those things will have to wait for later blog postings, though.

Stay tuned.

The myriad benefits of a simple query language

Thursday, September 10th, 2009

FluidDB has a simple query language. If you are familiar with any other query language, you can probably learn the entire FluidDB language in a couple of minutes. The image below shows a summary of the whole language. Without going into details, you can immediately tell there’s not much to it. Click on the image to read more. In contrast, SQL is massive. The SQL 2008 standard comes in 9 parts, the second of which is over 1300 pages.

FluidDB query language summary

The downside to having such a simple query language is that complicated data retrieval, processing, and organization are not done server-side. Applications have to request data in a simpler fashion, process it locally, and make further network requests if they need additional related data.

The strong upside is that a deliberately simple query language permits architectural simplicity. Because query processing is the most complex part of FluidDB, it bounds underlying complexity and has a direct influence on overall system implementation and architecture. Whereas a complex query language, such as SQL, makes it difficult to scale, a simple one makes scaling simpler—at least in theory; you still have to build it, of course!

The trick is getting the balance right: design a query language that’s practical and useful for a wide variety of common tasks, but whose simplicity confers important architectural advantages.

Here are a few ways in which the FluidDB query language and the resultant architecture give us hope that we’re building something that can grow.

  • Complex queries are not possible. You can make a big query in FluidDB or a deep query or a query that returns many results, but you can’t make a complex query, by which I mean the kind of query that can bring an SQL server to its knees. Just for starters, the FluidDB query language has no JOIN operation. When a query language is complex, the database is at the mercy of its applications: Applications can submit queries with JOINs that are so complex that the required data cannot reasonably be brought together (JOINed) in order for the selection to proceed.
  • All query resolution is simple. In the parse tree of any FluidDB query, all the leaves are simple. Each requires either a single lookup in a B-tree (or similar), or a single text match. The result of the processing at a leaf is always a set of object ids. The internal nodes of the query tree only require set operations (union, intersection, difference) on object ids. Below is a fragment of a query parse tree. There’s nothing else.

    A FluidDB query parse tree fragment

  • Parallelization is trivial. Because the values of FluidDB tags are stored separately, as in a column store, leaf queries are always sent in parallel to the independent servers that maintain the tags in question.
  • It scales horizontally. Because tag values are stored independently and internal query tree nodes are always simple set operations on object ids, the architecture is easy to scale horizontally. We built (and open-sourced) txAMQP to combine Thrift and AMQP with Twisted to give ourselves transparent messaging-mediated RPC. That means new servers can be deployed and run services that simply join or create the appropriate AMQP queues, and immediately begin receiving RPC calls. When more tag servers or set operation servers are needed, it is trivial to add them.
  • Unused tags can be taken offline. Because tags are stored independently, those that have not been used for some time can have their values serialized and stored in a cheaper medium for the interim. They need not occupy expensive and scarce RAM. When they’re next queried—if ever—they can rapidly be brought back online. This is an architectural advantage that’s mainly made possible by the system design, not the query language simplicity. I’ve included it nevertheless, because this kind of optimization might not be possible in a system with a query language that demanded a more complex underlying data organization.
  • It can scale down as well as up. Just as scaling up by adding servers is simple, servers can be taken down during quieter periods. Set operations servers can simply disappear. Tag servers can migrate management of their tags to other servers or just take tags offline – they will be re-animated by another tag server when next needed.
  • Adaptive affinity is straightforward. When tags are frequently being queried together, they can be migrated to the same tag server. Then an entire sub-query involving both can be sent to that server and the result, just a set of object ids, flows up through the query tree exactly as it would have had the leaves been processed on separate servers. And when things get too hot, i.e., tags being stored together have created a hotspot, they can be migrated to separate servers.
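The shape of query resolution described in the list above (simple leaves that each yield a set of object ids, internal nodes that only do set operations) can be sketched in a few lines of Python. This is an illustration under assumptions, not FluidDB's implementation: the TAG_VALUES index, the helper names has, greater_than, and_, or_, and except_, and the tags and object ids are all invented for the example.

```python
# Hypothetical per-tag storage: tag name -> {object id: value}.
# Each tag is kept independently of the others, as in a column store.
TAG_VALUES = {
    "alice/rating": {"obj1": 5, "obj2": 2, "obj3": 4},
    "bob/seen": {"obj1": None, "obj3": None, "obj4": None},
    "carol/dislikes": {"obj3": None},
}


# Leaves: a single lookup against one tag, always yielding a set of ids.
def has(tag):
    return lambda: set(TAG_VALUES.get(tag, {}))


def greater_than(tag, threshold):
    return lambda: {
        object_id
        for object_id, value in TAG_VALUES.get(tag, {}).items()
        if value is not None and value > threshold
    }


# Internal nodes: nothing but set operations on the ids from below.
def and_(left, right):
    return lambda: left() & right()  # intersection


def or_(left, right):
    return lambda: left() | right()  # union


def except_(left, right):
    return lambda: left() - right()  # difference


# Objects Alice rated above 3 that Bob has seen, minus those Carol dislikes.
query = except_(
    and_(greater_than("alice/rating", 3), has("bob/seen")),
    has("carol/dislikes"),
)
result = query()
```

Because every leaf touches exactly one tag and hands back only ids, each leaf could just as well run on a different server, and the internal nodes would not notice the difference.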

That’s enough for now. There are other, more detailed, advantages that I’ve omitted for brevity. I’m trying to keep each of these posts down to reasonable size.

Metadata vs Data: a wholly artificial distinction

Saturday, September 5th, 2009

Image: psd

Computer scientists are fond of talking about metadata. There often seems to be an assumption that drawing a distinction between metadata and data is useful and perhaps even necessary.

At an architectural level, I think that’s entirely wrong. Any storage architecture that maintains a distinction between metadata and data has real problems that will limit its flexibility and usefulness. Note that I’m not saying that an application shouldn’t maintain a distinction between metadata and data, or that applications shouldn’t present things to users in those terms, or that it’s not useful to think in terms of metadata and data. I’m also not claiming that every storage architecture needs to be flexible – there are obviously times where that appears unnecessary (though in many cases you may end up wanting more flexibility).

I’ll simply argue that if you aim to build a storage architecture with real flexibility, maintaining a distinction between data and metadata runs directly counter to your goal. Below I’ll outline some reasons why.

But first, consider the natural world. If you talk to a regular person — meaning someone who’s not a computer scientist, a librarian, an archivist etc. — and ask them if they know what metadata is, you’ll probably draw a blank. Why is that? It’s because the distinction between data and metadata is entirely artificial. It does not exist in the real world, and it’s clear that regular people can get by just fine without it. FluidDB draws its inspiration from the way we work with information in the natural world, and maintains no such distinction.

It’s interesting to speculate on the origins of the metadata vs data distinction. I’d love to know its full history. I suspect that it arose from early architectural constraints, from the relative design and programming ease of maintaining a set of constant-size chunks of information about files apart from the dynamic and variable-size memory required by the contents of files. I suspect it probably also has to do with architectural limitations and the slowness of early machines.

Here then are the main reasons why the distinction is harmful.

  • Two access methods: When metadata and data are stored separately, the way to get at those two different things is likely to be different. Consider inodes in a UNIX filesystem versus the disk blocks containing file data. They are stored differently and cannot be accessed in a uniform way. This causes internal complexity for the storage architecture.
  • Two permissions systems: There are likely to be two permissions systems governing changes to metadata and data. This is another source of internal complexity for the architecture.
  • Search across the two is complex or impossible: Why has it traditionally been so hard to find, for example, a file with “accounts” in its name and “automobiles” in the contents? Because this is a simultaneous search across file metadata and file content. The division between metadata (the name) and the data (the content) made such searches extremely difficult. Even with modern systems it’s awkward. Consider the UNIX find command which searches based on file metadata and the grep command which searches file contents. Combining the two is not easy. It’s at least possible in some systems these days, but that’s because those systems pull all the information together and build a separate index on it – i.e., they allow it by removing the division between metadata and data.
  • A central piece of content: Systems, especially document or file systems, usually maintain a distinction between the content and the metadata about the content. But the real world doesn’t work that way. You may possess information about something without having the thing. There may be no pieces of content, or there may be many.
  • Who decides?: If a system maintains a distinction between metadata and data, who decides which is which? Almost inevitably, it’s a programmer, a system architect, or a product manager who makes those decisions. There’s an implicit assertion that they know more about your information than you do. They decide what should be in the metadata. While there are systems that let users create metadata, they are usually limited in scope – someone has decided in advance how much metadata a regular user should be allowed to create, what kind of metadata it can be, how it will be used, how users will be allowed to search on it, etc. The intentions are good, but the whole thing smacks of parental control, of hand-holding, of “trust us, we know better than you do”.
  • Time dependency at creation: Systems maintaining the distinction also introduce an unnatural time dependency. Until the content (i.e., the data) is available, there’s nowhere to put the metadata. E.g., a file object has to be created before it can have metadata, a web page has to come into existence before you can tag it. But the real world doesn’t work that way. E.g., you can have an opinion about someone you’ve never met, or someone who’s dead or fictional. You can have a summary of a call agenda before the call happens, or notes about a meeting before the minutes of the meeting are prepared.
  • Time dependency at deletion: The awkward time dependency bites when the content is deleted too. The metadata necessarily vanishes because the architecture doesn’t allow it to persist: there’s literally nowhere to put it. Once again, the real world doesn’t work that way. E.g., you’re sent a large image file of someone’s pet cat – you take a look and, to show you care, make a mental note of its name and breed, but you delete the image because you don’t want to store it. Or suppose you give away or lose your copy of Moby Dick – you don’t therefore immediately forget the book’s title, its plot, the author, the name of the main character, an idea of how long it is, the book’s first line, etc. The “content” is gone, but the metadata remains. You may have never owned the book, you may think you have a copy but do not, you may have two copies – in the natural world it just doesn’t matter, and nor should it in a storage architecture. Interestingly, Amazon are currently being sued because they threw away someone’s metadata in the process of removing a copy of Orwell’s 1984 from a Kindle. You can bet the metadata was removed automatically when the content was removed.

OK, enough examples for now.

FluidDB has none of the problems listed above. It has absolutely no distinction between metadata and data. It has a single permissions system that mediates access to all information. When a tag (perhaps used or presented as the “content” by an application) is removed from an object, all the other tags remain. There is no distinction between important system information and the information stored by any regular user or application – they’re all on an equal footing, and that includes future applications and users. No-one gets to set the rules about what’s more important and what’s not, there’s simply no distinction. You can search on anything, using a single query language – the system uses the query language to find things it needs, just like any other application. The single permission system mediates who can do what – equally and uniformly.
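Here is a tiny Python sketch of the two time dependencies disappearing once “content” is just another tag. The object is modeled as a plain dict, and the tag names (terry/opinion, terry/first-line, library/text) are hypothetical; none of this is FluidDB code.

```python
# A hypothetical object: nothing more than a mapping of tag names to values.
book = {}

# Information can be attached before any "content" exists...
book["terry/opinion"] = "must read this"
book["terry/first-line"] = "Call me Ishmael."

# ...then an application adds what it chooses to present as the content...
book["library/text"] = "(full text of the novel)"

# ...and later removes it again. The other tags are untouched.
del book["library/text"]

remaining = sorted(book)
```

After the deletion, the opinion and the first line are still there, because nothing in the representation ever marked them as subordinate to the text.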

I used to argue that everything should just be considered data. But I think David Weinberger puts it better in Everything is Miscellaneous where he says it’s all metadata. Call it what you will, it’s clear (to me at least) that at a fundamental level there should be no distinction.

BTW, if you’re into self-reference, you might also be interested to know that FluidDB uses itself to implement its permissions system. Permissions are just more information, after all. FluidDB stores that information for tags, namespaces, and users onto the regular FluidDB objects that are about those things. There truly is no metadata / data distinction. It’s a little like Lisp: once you have the core system in place, you can (and should) use it to implement the wider system.

Information. Naturally.

Friday, August 28th, 2009

Image: Mary Hodder

From the Fluidinfo home page:

Humans are diverse and unpredictable. We create, share, and organize information in an infinity of ways. We’ve even built machines to process it. Yet for all their capacity and speed, using computers to work with information is often awkward and frustrating. We are allowed very little of the spontaneity that characterizes normal human behavior. Our needs must be anticipated in advance by programmers. Far too often we can look, but not touch.

Why isn’t it easier to work with information using a computer?

At Fluidinfo we believe the answer lies in information architecture. A rigid underlying platform inhibits or prevents spontaneity. A new information architecture could be the basis for a new class of applications. It could provide freedom and flexibility to all applications, and these advantages could be passed on to users.

FluidDB does not attempt to directly model information accumulation and use in the real world. It simply provides an information architecture that is more flexible than the ones we’re used to. It provides a fairly simple answer to the question of how we might work with information more naturally when using a computer. It does not claim to be the final word on the subject, but points out a fruitful direction for advance. And it provides a concrete implementation that can be used today.

The fruitful direction

The computational world is too read-only, and too tightly controlled. Most of the time we spend using a computer, we are either 1) strictly in read-only mode or 2) using an application that allows us to write, but only in predetermined ways. In contrast, in our normal dealings with the natural world, when using our brains, we are never in read-only mode. We are constantly processing information and adding (i.e., writing) to our mental models. I’m talking about everyday things, like noticing something and remembering that you noticed. Or seeing something you like, and being able to recall that fact later. Even in these trivial acts we are in some sense writing—laying down memories that can later be recalled, sorted amongst, shared, organized, merged, or put aside for long periods or even forever.

In thinking about this extreme fluidity, I find it illustrative to consider how we work with concepts. As I wrote in Kaleidoscope: 10 takes on FluidDB:

Concepts are very fluid: they don’t have owners, you don’t ask for permission to add to them, they have no formal structure or central piece of content, they can be organized in many ways, and they have no pre-defined set of qualities or attributes. Exactly the same can be said of FluidDB objects.

The fruitful direction—and the mission of Fluidinfo, if I may be so grandiose and dramatic—is to engineer an information architecture with a fluidity similar to that of concepts, in order to make the world more writable. The question (and the point of this posting) is how?

Objects without owners

The answer is actually very simple: FluidDB provides support for information objects that do not have top-level owners. These objects are composed of tags (with values), for which there is a flexible and secure permissions model. Because the objects don’t have owners, anyone, or any application, can add tags to any object it can find. These objects have all the nice properties of concepts mentioned above.

That’s it?

While there’s a lot more to FluidDB than having objects with no owners, this single change is the key to the architecture and is responsible for its generality and flexibility. It’s almost embarrassingly simple.

It takes a while for the implications of this twist to sink in. I’m not going to go into the details in this post. I’ll just point out that a simple change in representation can have a surprisingly profound effect. I’ve already written about this, though without giving details of FluidDB.

I’m fascinated by representation and its role in problem solving. How can such a simple piece of mental jujitsu result in fundamental change? I’ll describe the consequences in later posts. There are several of them, and I think they’re important. For now though, if you’re interested, please follow the link above to see some simple examples of the power of changing an underlying representation.

Kaleidoscope: 10 takes on FluidDB

Tuesday, August 25th, 2009

I’ve been asked what FluidDB is hundreds of times. I’ve never really known how to answer because it can be looked at from many different angles. As I try to answer, I often feel like I’m holding up a large opaque crystal in front of me, turning it this way and that, until I find an angle that makes sense for this particular listener. I came slowly to the realization that there is no perfect answer, and that FluidDB can be many things to many different people. It’s like looking through a kaleidoscope: keep turning it until you see it in a way that’s attractive.

I’ll try to explain some of these points of view in later posts. For now, I’ll just give the flavor of some of them.

The many possible views of FluidDB do not mean that it is complex. It’s actually very simple. But it has a flexibility and generality that obviously make it difficult to grasp. Even when you do understand it, it’s common to be able to imagine two or three ways you might use it to solve a particular problem. I’m going to save a description of the object model for later, too.

In no particular order then, here are 10 ways of thinking about what FluidDB might be, or become.

1. A database with the heart of a wiki: The wiki analogy is strong in some ways: Anyone can add data (but see below), applications can collaborate, data put in a shared place is more valuable, and abstractly there are FluidDB objects for every purpose just like there are wiki pages for every subject – all waiting for someone to create them. But the analogy is very poor in others: FluidDB has a strong permissions system that can prevent others from changing or even seeing your data, there is a query language, and content is typed. So FluidDB has the flavor of a wiki, but when you get right down to it almost everything is quite unlike a wiki.

There is an interesting related question here. The world of encyclopedias was tightly controlled, and very few people were allowed to write – the encyclopedia was the ultimate read-only authority. Comically, it seemed, Wikipedia was the exact opposite. Yet in the space of a few years, the unthinkable had happened: Wikipedia had eclipsed even the mighty Encyclopedia Britannica. Can FluidDB do for applications and traditional databases what Wikipedia did for humans and traditional encyclopedias? You don’t have the rigid tables or schema, anyone can write, content can evolve, and there is no top down control.

2. A metadata store for everyone and everything: FluidDB has a special about tag. You can use it to ask for the object that’s about something, like a URL or an ISBN number. That gives applications an easy shared place to put things about other things – i.e., to store metadata. The metadata can be users’ customization or personalization information, ratings, opinions, whatever.

3. A store of concepts: Concepts are very fluid: they don’t have owners, you don’t ask for permission to add to them, they have no formal structure or central piece of content, they can be organized in many ways, and they have no pre-defined set of qualities or attributes. Exactly the same can be said of FluidDB objects.

4. A platform for mashups: When a programmer makes a mashup, combining information from different sources to create information about something, where should that new information be stored? The usual answer today is to put the new information in a database, behind a new API, to document it, to get a server, to keep it running. In effect this is just making another hoop for future programmers to jump through to make an additional mashup. In FluidDB you can put the new information with the old—because objects don’t have owners. And that’s where it’s probably most valuable, because it’s then immediately available to be mashed up with other data on the same object, and search can target heterogeneous (i.e., mashed up) data on the same object.

5. A way of storing social graphs: Because users each have a FluidDB object associated with them, it is very easy to build social graphs. For example, user Andy might put an andy/i-follow tag onto the object for user Betty. If you have a few people doing that, interesting queries are then immediately possible—both within and across social networks.

6. A new way of organizing information: When we organize things, we are creating new information. Normally we store that new information elsewhere. When you can store it on the objects that are being organized, lots of nice things happen. I have an upcoming blog post that I’m tempted to title “Multiple simultaneous non-conflicting dynamic sharable organizations.” A bit of a mouthful, but nevertheless true.

You can also build all data structures from tags in FluidDB. They’re slower to use than data structures in a typical program, but where you lose in speed you gain in flexibility.

7. Something that frees us from APIs/UIs: APIs and UIs are usually regarded in a positive way: they make getting to information easy for programs and people. But they also control us. We can only do what they allow and what they anticipate. Tight limits are imposed on us in getting to our own data. FluidDB can change this: you can own your own data, you can always add data and customize, and you can directly search on anything you like.

8. A communication system: You can look at FluidDB objects as places for cooperating applications to exchange information. The information could be messages, jobs and results, etc. Nicholas Radcliffe, who has understood FluidDB for years, today found a new pleasing angle to look at it from, as a Twitter for data.

You could easily use a FluidDB object as a voting box, with some nice properties (e.g., retract or change your vote, verify that someone voted without being able to read their vote). And you can do more complex things, too.

9. An evolutionary data system: FluidDB allows reputation, trust, and convention to evolve. Its namespaces, tags, and users all have objects, and these give natural places to accumulate fitness information. Conventions will evolve for naming and tag values, just as they do for tags and hashtags. Selection pressure will take care of fixing ambiguities exactly to the extent that it’s important and worthwhile to fix them.

10. An alist on everything: One of the oddest moments ever in trying to explain FluidDB came when talking to Paul Graham. After I’d spent at least 10 minutes trying to find an angle for him, he finally said “Oh, I get it. It’s an alist on everything.” I smiled, breathed a sigh of relief, and said yes. “Well, why didn’t you say so?” replied Paul. Just goes to show you can never have too much background in computer science.
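To make take 5 (social graphs) a little more tangible, here is a rough Python sketch of follow tags placed on user objects. Everything in it is assumed for illustration: the tagged index, the add_tag and has helpers, the user:... object ids, and the carl/i-follow tag are my own inventions, following the andy/i-follow pattern from the text.

```python
from collections import defaultdict

# Hypothetical index: tag name -> set of ids of the objects carrying that tag.
tagged = defaultdict(set)


def add_tag(tag, object_id):
    """Put `tag` onto the object (no value needed for a follow relation)."""
    tagged[tag].add(object_id)


def has(tag):
    """Ids of all objects carrying `tag`, irrespective of any value."""
    return set(tagged.get(tag, ()))


# Andy and Carl each tag the objects for the users they follow.
add_tag("andy/i-follow", "user:betty")
add_tag("andy/i-follow", "user:dave")
add_tag("carl/i-follow", "user:betty")
add_tag("carl/i-follow", "user:erin")

# Who do both follow? Who does Andy follow that Carl doesn't?
both_follow = has("andy/i-follow") & has("carl/i-follow")
only_andy = has("andy/i-follow") - has("carl/i-follow")
```

The interesting queries fall out as plain set operations over tags that different people added independently to the same objects.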