Archive for the ‘Awesomeness’ Category

How we made an API for BoingBoing in an evening

Thursday, January 27th, 2011

Yesterday the folks over at boingboing.net posted eleven years’ worth of posts as a zipped up XML file. XML is good, but having a searchable database of posts is better. So I (ntoll) am in the process of importing all the data into Fluidinfo. 🙂

When finished, every post and author in the boingboing data dump will be represented by an object in Fluidinfo and tagged with useful information. The diagram below shows a representation of what a typical object about a boingboing.net post looks like:

Tags on an object representing a boingboing.net post.

The object (the red blob with a unique ID written inside it) has several tags attached to it (named “boingboing.net/author” and “boingboing.net/comment_count” for example) with associated values (“Mark Frauenfelder” and “53” respectively).

Furthermore, while cleaning and preparing the data for upload I made sure to extract every domain name and URL referenced in each post, and to store the publication date as computer-friendly values rather than just a human-readable date.

An instant win is the ability to query data. For example, you’ll be able to search for all posts that link to techcrunch.com written in 2010 by Cory Doctorow. This is how to write the query in Fluidinfo’s super simple query language:

boingboing.net/domains contains "techcrunch.com" and
boingboing.net/year = 2010 and
boingboing.net/author = "Cory Doctorow"

The result will depend on how you make the query, but let’s assume you’re using a /values based call in Fluidinfo’s REST API and you’ve asked for each post’s title, publication date and a list of domains mentioned. You’ll get back some JSON encoded data that looks something like this:

{
    "results" : {
        "id" : {
            "05eee31e-fbd1-43cc-9500-0469707a9bc3" : {
                "boingboing.net/title" : {
                    "value" : "This is a made up title for illustrative purposes"
                },
                "boingboing.net/created_on" : {
                    "value" : "2010-08-19 13:23:41"
                },
                "boingboing.net/domains" : {
                    "value": [
                        "techcrunch.com",
                        "microsoft.com"
                    ]
                }
            },
            "0521e31e-fbd1-43cc-9500-046974569bc3" : {
                … more results …
            }
        }
    }
}
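To make the round trip a little more concrete, here is a minimal Python sketch of how such a /values request might be put together and how the response could be flattened into something easier to work with. It makes no network call. The host name, the "query" and "tag" parameter names and the helper functions are my assumptions for illustration, so check the API docs before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumption: the host below is illustrative only; use the one in the API docs.
BASE_URL = "http://fluiddb.fluidinfo.com"

QUERY = (
    'boingboing.net/domains contains "techcrunch.com" and '
    "boingboing.net/year = 2010 and "
    'boingboing.net/author = "Cory Doctorow"'
)

# The tag values we would like returned for every matching object.
TAGS = [
    "boingboing.net/title",
    "boingboing.net/created_on",
    "boingboing.net/domains",
]


def build_values_url(query, tags, base_url=BASE_URL):
    """Build a GET URL for /values (parameter names are assumed)."""
    params = [("query", query)] + [("tag", tag) for tag in tags]
    return base_url + "/values?" + urlencode(params)


def flatten_results(payload):
    """Turn {"results": {"id": {...}}} into {object_id: {tag: value}}."""
    posts = {}
    for object_id, tag_values in payload["results"]["id"].items():
        posts[object_id] = {
            tag: detail["value"] for tag, detail in tag_values.items()
        }
    return posts


# A cut-down copy of the made-up response shown above.
SAMPLE = json.loads("""
{
    "results": {
        "id": {
            "05eee31e-fbd1-43cc-9500-0469707a9bc3": {
                "boingboing.net/title": {
                    "value": "This is a made up title for illustrative purposes"
                },
                "boingboing.net/domains": {
                    "value": ["techcrunch.com", "microsoft.com"]
                }
            }
        }
    }
}
""")

if __name__ == "__main__":
    print(build_values_url(QUERY, TAGS))
    for object_id, tags in flatten_results(SAMPLE).items():
        print(object_id, tags["boingboing.net/title"])
```

Flattening the response keeps the object id as the key, which is handy if you later want to add your own tags to the same objects.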
 



Wait a minute..!?!? This is just as if boingboing.net had an API.

Actually, by importing the flat XML file into Fluidinfo they do have an API – for free! Because of Fluidinfo’s open nature anyone can now make use of boingboing’s data via a few simple and easy to construct RESTful calls to Fluidinfo.

But that’s not all..!

Fluidinfo isn’t just openly readable – it’s openly writeable too.

Huh..?

Any user of Fluidinfo can tag data to any object. For example, I control a couple of tags called “ntoll/rating” and “ntoll/comment” which I could attach to any of the objects representing boingboing.net posts. By tagging an object with associated values I’m indicating what I thought about the post.

Importantly, I know which object I want to tag because it has a special unique tag called “about” whose value is the URL to the boingboing.net post in question. Other people who want to add information about this post will know to use the same object as me because the about tag-value tells them, er, what the object is about.
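Here is a rough Python sketch of what attaching a rating to a post via its about value might look like. It only builds the HTTP request and never sends it. The host name, the /about/ABOUT/TAG path layout, the content type and the post URL are all assumptions made for illustration, and authentication is left out entirely.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: illustrative host only; use the one given in the API docs.
BASE_URL = "http://fluiddb.fluidinfo.com"

# Assumption: the MIME type used for primitive (JSON) tag values.
VALUE_TYPE = "application/vnd.fluiddb.value+json"


def build_tag_request(about, tag_path, value, base_url=BASE_URL):
    """Build (but do not send) a PUT that sets tag_path on the object
    whose about value is `about`."""
    # The about value is percent-encoded so it fits in a single path segment.
    url = "{}/about/{}/{}".format(base_url, quote(about, safe=""), tag_path)
    body = json.dumps(value).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": VALUE_TYPE},
    )


# Hypothetical post URL, made up for this example.
POST_URL = "http://boingboing.net/2011/01/26/example-post.html"

request = build_tag_request(POST_URL, "ntoll/rating", 8)

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
    print(request.data)
```

Because the object is looked up by its about value, anyone else who builds the same request with their own tag (say, a rating in their own namespace) ends up annotating the same object.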

This brings me to the killer point: accessing data from boingboing.net is good, but the facility to annotate, discover and re-use everyone’s data about boingboing.net posts is better. That’s why we sometimes say we’re trying to do to databases what Wikipedia did to encyclopaedias.

Users of Fluidinfo don’t have to retrieve information about boingboing.net posts by building queries using just boingboing.net tags. It’s possible to search using other people’s tags. For example, here’s how to search for posts that I’ve given a relatively high rating and commented on:

ntoll/rating > 6 and has ntoll/comment and
has boingboing.net/title

And users don’t have to just ask for boingboing.net related tag-values either. It’s possible to ask objects for all their tags that you have permission to see. For example, you could retrieve a matching post’s title, body, author and any comments I make about the post with the ntoll/comment tag.

I’m only scratching the surface here so I’ll follow up with another post soon with some example code and use cases. In the meantime, if you want to find out more feel free to get in touch with us. We’re more than happy to help.

If you’re a developer and want to play with the boingboing.net data you should take a read of my last post explaining how to explore Fluidinfo’s API with Python.

In case you were wondering, it really was only half an evening’s work to prepare the data and write the import script. 🙂

Note: The import is currently running but should be complete later this afternoon. Not all posts will be in Fluidinfo yet (so far we have everything up to the end of September 2008).

Image credits: Diagram generated by abouttag written by Nick Radcliffe and the “API Sign” is © 2006 ulybug under a Creative Commons license.

Introducing the Fluidinfo Explorer

Monday, January 10th, 2011

Normally users will use applications that use Fluidinfo and are unaware that the application is using Fluidinfo. Programmers will use Fluidinfo through its API. So, what if you’re a non-programmer and not using an application and you just want to have a look around inside Fluidinfo? Pier-Andre Parent has written the Fluidinfo Explorer – a web-based “explorer” GUI. If you’re not a developer, this is probably your best way of starting to interact with Fluidinfo without having to get into all the nitty-gritty details of the API. It’s like the file-system explorer you find on Windows, Mac or Linux.

We’ve found the Explorer so useful that we’ve made it available via the explorer.fluidinfo.com name. The URL you visit in your browser is very important. The pattern is http://explorer.fluidinfo.com/INSTANCE/NAMESPACE where INSTANCE is either “fluidinfo” or “sandbox” and NAMESPACE is the name of the user, organisation or application you’re interested in. For example, the following link will display my (ntoll’s) top-level namespace in the explorer: http://explorer.fluidinfo.com/fluidinfo/ntoll

The result will look something like the following:

Notice how the namespace / tag structure is displayed in a collapsible tree control on the left hand side. The main body of the user interface contains a helpful welcome message, and at the top right hand side are the login button and a search box for queries written in Fluidinfo’s query language.

Right click a namespace or tag to view and update its attributes, to create and delete namespaces and tags, and to set permissions on them. Clicking on a tag on the left hand side fills the main area of the UI with all the objects that it has tagged:

So far, so simple…

But what about exploring the tags on a specific object? Click on an object’s id in the result set to display a list of all the tags attached to it. Click “Load all tag values” to display the associated values. Notice how the explorer differentiates between primitive (numbers, booleans, strings etc.) and opaque (images, audio, binary files etc.) values – primitive values are displayed whereas the cells for opaque values contain a description of the type of value stored in Fluidinfo:

Click the “open” link next to an opaque tag value to trigger a pop-up with the opaque value presented therein. In this example the value is an image:

Finally, if you follow the “View visual representation” link a rather nice graphical representation of the object is presented to you:

These diagrams are automatically generated by yet another third party application created by Nicholas Radcliffe and hosted on Google’s AppEngine. Given an appropriate URL a rather cool image is generated, e.g., http://abouttag.appspot.com/id/butterfly/8c2860e1-0d3f-47aa-9064-8a682cea6154.

The great thing about the explorer is that it provides an intuitive and visual representation of how Fluidinfo is structured. Have fun exploring!

Watering a Peace Lily with Fluidinfo

Tuesday, November 23rd, 2010

I (ntoll) belong to a nascent hackerspace called NortHACKton. It’s an opportunity to learn new skills and to collaborate with a great bunch of people who create cool stuff. I’m going to describe just such a collaboration with Stephen Bridges, one of the organisers of the hackerspace.

Our aim was to combine a simple hardware project with Fluidinfo and do it in such a way that others could repeat, extend and enhance what we’d been up to. We decided to connect an Arduino to a sensor and put the resulting reading into Fluidinfo at regular intervals. In the end we built something to make a moisture reading of the soil in Stephen’s plant pot and update a value in Fluidinfo every 10 minutes.

The Arduino has an Ethernet shield so the device can communicate autonomously with Fluidinfo via the HTTP API. The support circuitry is adapted from Botanicalls.com (Creative Commons) and Stephen created the sensor from a pen lid, sticky tape and a couple of wires. 🙂

The source code can be found on GitHub and contains two parts:

  1. A generic and reusable layer that handles basic interaction with Fluidinfo
  2. The application logic that takes the reading and controls the Arduino.

From Fluidinfo’s point of view, there is an object that represents Stephen’s peace lily (its about tag value is “Stephen’s Peace Lily (houseplant)”) and the tag widget/ffm/reading attached to this object is updated with the appropriate value.

Interestingly, I’ve also added some tags to the object representing the peace lily which hold html, css and javascript values. This is a classic case of putting information in context since the peace lily’s web page is a tag-value attached to its object in Fluidinfo. So it’s possible to view the peace lily’s current status with your browser.

The whole thing was hacked together in an afternoon over a drink in a pub in Northampton. Unfortunately for Stephen my mobile phone takes video so I press-ganged him into the following explanation:

You can find Stephen’s write-up on the NortHACKton wiki. If you’re interested in doing something similar with Fluidinfo please don’t hesitate to drop in on our IRC channel (#fluidinfo on Freenode – connect via the web) and ask questions. Alternatively, drop by either the fluidinfo-users or fluidinfo-discuss mailing lists. We’d be more than happy to help.

Coming soon to a FluidDB near you…

Monday, November 15th, 2010

Today (Monday 15th November) commencing from 10am GMT (11am Western Eurozone, 5am EST) the main instance of FluidDB will be offline for several hours while we roll out a major update.

We’re excited to announce the following new features and changes:

  • /about added to HTTP API – It will be possible to access FluidDB objects that have a fluiddb/about tag value with requests whose path starts with /about. For example, the object about “Barcelona” can be reached directly via /about/Barcelona. The behaviour of /about, when given an about value, is exactly like that of /objects when given an object id. More information will be available in the API docs at http://api.fluidinfo.com/. Many thanks to Holger Dürer (http://twitter.com/hd42) for suggesting this improvement.
  • /values added to HTTP API – It is now possible to manipulate multiple tag values in a single API request to /values via the PUT, GET and DELETE HTTP methods. From the user’s perspective, this will result in a significant improvement in performance. More information can be found in the API docs at http://api.fluidinfo.com/.
  • “SEE” permission replaced with “READ” – the permissions system has been simplified. FluidDB now uses only the READ permission on tags to decide whether API calls accessing the tag values should be allowed to proceed. Anything that used the SEE permission now uses READ. For example, when you do a GET on an object to retrieve the names of its tags, you will only receive those for which you have READ permission. Many thanks to Jamu Kakar (http://twitter.com/jkakar) for suggesting this simplification.
  • Deleting a tag instance now always returns an HTTP 204 (No content) code – DELETEing a tag value from an object that did not have that tag used to result in a “404 (Not found)” status. This will be changed to simply return the non-error “204 (No Content)”.
  • “Content-MD5” header for checking payload content – It will be possible to send a checksum of a payload to FluidDB via the “Content-MD5” header. FluidDB will attempt to validate the checksum with the payload and return a “412 (Precondition failed)” status in the case of a mismatch.
  • Cross Origin Resource Sharing (CORS) added to HTTP API – it will be possible to make cross origin requests as specified by http://www.w3.org/TR/cors/ rather than rely on JSONP. FluidDB will have an almost complete implementation of this emerging standard although we expect to make changes and improvements as the specification matures.
  • Text indexing of fluiddb/about tag values – text indexing is coming to FluidDB but is definitely a work in progress. This release is just the very first step: the fluiddb/about tag will be indexed from the update onwards (existing fluiddb/about tag values will be indexed over the coming days/weeks).
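
For client library authors, here is a minimal Python sketch of how a “Content-MD5” value could be computed before sending a payload. It follows RFC 1864, which defines the header as the base64 encoding of the binary MD5 digest. That FluidDB expects exactly this encoding is an assumption on our part, and the function names are made up for the example.

```python
import base64
import hashlib


def content_md5(payload: bytes) -> str:
    """Return the RFC 1864 Content-MD5 value for payload:
    base64 of the raw 16-byte MD5 digest (not the hex string)."""
    digest = hashlib.md5(payload).digest()
    return base64.b64encode(digest).decode("ascii")


def payload_matches(payload: bytes, header_value: str) -> bool:
    """Mimic a server-side check: does the header match the payload?"""
    return content_md5(payload) == header_value


if __name__ == "__main__":
    body = b'"Stephen\'s Peace Lily (houseplant)"'
    headers = {"Content-MD5": content_md5(body)}
    print(headers)
    # A corrupted body no longer matches, which is the case where the
    # server would answer with "412 (Precondition failed)".
    print(payload_matches(body + b" ", headers["Content-MD5"]))
```

The checksum is calculated over the exact bytes that go on the wire, so compute it after the payload has been serialised and encoded.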

For those of you who have written or maintain a client library for FluidDB we’d like to refer you to the changes we’ve made to the Fluid Object Mapper (FOM) library as a reference for what you might want to do with your own library.

To encourage people to add the new FluidDB capabilities to libraries, we’re going to extend the FluidDB Weekend of Code offer to library authors. Let us know when you’re working on your library and where we can find it (Github, Bitbucket, Sourceforge etc) and we’ll order you a pizza and send you a book of your choice from Amazon.

Finally, we’re moving to a four-week development cycle so expect regular updates, pro-active bug squashing and lots of progress in the coming months. We’ve got lots of exciting stuff in the pipeline and we can’t wait to see how the FluidDB community reacts.

Tim O’Reilly joins the Fluidinfo advisory board

Tuesday, August 10th, 2010

Tim O'Reilly - out standing in his field

The wild rumors are all true. Tim O’Reilly has joined the Fluidinfo advisory board!

I’m an unabashed fan of Tim, his company, and everyone I’ve ever met who works at or has worked at O’Reilly (especially Sara Winge). I’ve been talking to Tim on and off about FluidDB since March of 2007 after being introduced by Esther Dyson. Tim would blog about something, and I’d email him and say “FluidDB will be able to do that” or (more recently) “FluidDB can do that.” When we first met, we spent 90 minutes together and I showed him a demo of a few things. He drilled down hard in the first 2 minutes: “Tell me what’s different about it.” So I went for it. When the meeting was over, Tim left the room and went to the elevator to leave. I was packing up my stuff when suddenly he was back, wanting to ask and suggest more. He came back to the room four times, lastly to get my phone number 🙂

Since then, Tim has been extremely generous, introducing us to many great people. You can see who very easily in the graph of introductions I put together over the years as I talked to people about FluidDB. He made the introduction that led through Gerry Campbell to John Borthwick and Andy Weissman at Betaworks who led the Fluidinfo investment. Tim invited me to the Social Graph Foo camp in 2008, to a Science Foo camp, and to two general Foo camps, and he’s been helping me in the (ongoing) attempt to get a US visa.

It’s a personal thrill to have Tim formally involved with Fluidinfo. During years working on what at times seemed like the dark side of the tech moon in Barcelona, to have had Tim and Esther (and others!) behind us along the way has been wonderful. I often feel I want us to succeed as much for them as for anyone else.