Fluidinfo

March 21, 2011

Announcing a writable API for O’Reilly books and authors

Filed under: Awesomeness,Events,Writable APIs — Terry Jones @ 9:41 am

Today we’re excited to announce the release of a writable API for O’Reilly books and authors. There’s far too much news and information around this release to pack into a single blog post. Here’s a summary of what’s new today and where to find out more.

Here’s an extract from the press release:

General manager and publisher Joe Wikert is excited by the opportunities that a writable API provides to O’Reilly and other publishers. “It’s like LEGOs for publishing,” he says of the new malleability in his industry. “It’s as though we’ve been selling plastic children’s toys and the pieces were all glued together so customers could only use them the way we intended them to be used,” he adds. “Now we’ve decided to break the pieces into their component parts and let customers build whatever they want.”

Last but not least: if you want a modern, writable API for your data, drop us a line at info at fluidinfo com, and let’s talk.

O’Reilly Fluidinfo Chrome extension

Filed under: Howto,People,Programming,Writable APIs — Nicholas Tollervey @ 9:40 am

To help people get going with the API competition announced today on the O’Reilly Radar site, Emanuel Carnevale has written a cool extension for Google’s Chrome browser. The extension shows some of the non-O’Reilly tags on the book objects and also lets you indicate which O’Reilly books you own. It does this by putting tags onto the objects representing O’Reilly books in Fluidinfo.

To install the extension onto your Chrome browser click on the following link (from within Chrome): https://fluiddb.fluidinfo.com/about/oreilly.com/fluidinfo/chrome-extension.crx. Your browser will guide you through what to do. It’s pretty obvious stuff. Once it’s installed you’ll see a new icon in the top right hand corner of the browser window between the address bar and the little spanner icon:

Click the icon and sign in with your Fluidinfo credentials. If you don’t yet have an account on Fluidinfo you can sign up here.

How do you use it..?

Simple. Go visit the O’Reilly catalog and click on one of the books you own. For example, I happen to be the proud owner of Natural Language Processing with Python. If you visit the page for the book you’ll notice a new small Fluidinfo icon in the book details:


Click the icon and you’ll see a pop-up like this:

You can click on the appropriate statement at the bottom to indicate whether or not you own the book.

The writable API gives us all a voice

The extension uses an “owns” tag in your top-level Fluidinfo namespace to indicate book ownership on the objects in Fluidinfo. For example, my tag is called “ntoll/owns”. The extension attaches this tag to the object representing the O’Reilly book whose page you are visiting.

Because the extension tags the exact same Fluidinfo objects that have the O’Reilly information, I can start to do some really cool searches. For example, I happen to know Terry has a particularly large O’Reilly “zoo” as do I (in fact, doesn’t every developer..?). We can see what books we both own about Python with the following query:

oreilly.com/title matches "Python" and has terrycojones/owns and has ntoll/owns

The following code snippet for running this query uses the fluidinfo.py client library from within the Python shell. Alternatively, you can see the result directly if you visit this URL.

>>> import fluidinfo
>>> import pprint
>>> headers, result = fluidinfo.call('GET', '/values', tags=['oreilly.com/title',], query='oreilly.com/title matches "Python" and has terrycojones/owns and has ntoll/owns')
>>> pprint.pprint(result)
{u'results': {u'id': {
                      u'01371c03-9097-4267-a137-ae88a23790ef': {u'oreilly.com/title': {u'value': u'Python Pocket Reference, Fourth Edition'}},
                      u'4e9c42b6-68cb-43f5-9b75-60af9c0bd5a7': {u'oreilly.com/title': {u'value': u'Programming Python, Fourth Edition'}},
                      u'cd0838db-96ae-42ae-98c9-248a1507e2bb': {u'oreilly.com/title': {u'value': u'Python in a Nutshell, Second Edition'}}}}}

This illustrates how anyone can add tags to the objects being used by O’Reilly, and can then search based on their own additions and those of others. That’s why we say that Fluidinfo provides writable APIs. Cool 🙂
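If you'd rather have a plain list of titles than the nested dictionary, a small helper can walk the result. This is only a sketch: the function name titles_from_values is made up for illustration, it is written in modern Python 3 syntax rather than the Python 2 shell shown above, and it assumes the response has the same shape as the output printed there.

```python
def titles_from_values(result, tag="oreilly.com/title"):
    """Return the sorted values of `tag` from a /values-style result dict."""
    objects = result.get("results", {}).get("id", {})
    return sorted(tags[tag]["value"] for tags in objects.values() if tag in tags)


# The result shown above, copied in so the example runs without network access.
sample = {
    "results": {
        "id": {
            "01371c03-9097-4267-a137-ae88a23790ef": {
                "oreilly.com/title": {"value": "Python Pocket Reference, Fourth Edition"}
            },
            "4e9c42b6-68cb-43f5-9b75-60af9c0bd5a7": {
                "oreilly.com/title": {"value": "Programming Python, Fourth Edition"}
            },
            "cd0838db-96ae-42ae-98c9-248a1507e2bb": {
                "oreilly.com/title": {"value": "Python in a Nutshell, Second Edition"}
            },
        }
    }
}

titles = titles_from_values(sample)
for title in titles:
    print(title)
```

Objects that lack the requested tag are skipped rather than raising an error, which seemed the friendlier choice for a quick script.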

Run with it!

There’s obviously a lot more that could be done with this extension. We kept it simple mainly because we wanted to give an example of how such an extension could be written. We hope it can provide a basis for your own efforts, especially if you’re entering the O’Reilly API competition. Emanuel has released the source code for the extension so you can grab it from GitHub and take it from there!

February 23, 2011

Putting domain names onto data with Fluidinfo

Filed under: Data,Essence,Writable APIs — Terry Jones @ 11:15 am

Internet domain names can be thought of as a mechanism for attaching trust and reputation to digital information. We do this in two major ways: (1) by using domain names in the URLs of web pages, and (2) by putting them in the sender’s “From” address of email messages.

To give a concrete example, suppose you see some shoes for sale on a web page. If you look at the page URL and see the amazon.com or zappos.com domain name, trust and reputation knowledge springs instantly to mind. You know the quality is probably good, the price competitive, and that if the shoes are lost in shipping you’ll be sent another pair for no charge. On the other hand, if you see ebay.com in the URL, a different matrix of trust and reputation knowledge will spring to mind. A similar thing happens if you get email from someone you’ve never met. If you see stanford.edu or forbes.com in the email “From” line, reputation information springs to mind.

Looked at in this way, domain names are small tokens that we send alongside other pieces of content such as web pages and emails. The domain name carries vital trust and reputation information. Recognition of, and trust in, domain names is globally distributed, spread variously through the brains of most of the people on the planet, with its integrity guaranteed by DNS. Domain names make the internet useful. Without them, digital information online would be almost useless, as we could not confidently trust any third-party data.

Question: given that we can attach domain names to web pages and email messages, can we find a way to attach them to other things?

Domain names on data

We’re excited to announce that Fluidinfo now makes it possible to put domain names onto individual pieces of data.

To illustrate, the image on the right shows a fanciful example book object in Fluidinfo (large version). The tag names on the object are colored. You’ll see that some of them contain domain names: amazon.com/price, barnesandnoble.com/price and vintage.com/epub. Tags in Fluidinfo can have values, as illustrated by the amazon.com/price tag whose value for this book is $19.

The combination of a Fluidinfo tag name containing a domain and an associated tag value is exactly like a URL containing a domain name and an associated HTML value (i.e., a web page) or an email message with a domain name in its From line.

Because Fluidinfo objects don’t have owners (their tags do, though), any number of domain owners are free to put their information, branded with their domain name, onto any Fluidinfo object.
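To make the "objects have no owners, tags do" idea concrete, here is a toy in-memory model. It is emphatically not how Fluidinfo is implemented: the ToyStore class, its put method, the about value and the value given to vintage.com/epub are all invented for illustration, and the rule that only the user named in the first part of a tag path may write it is a simplification of the real permissions system.

```python
class ToyStore:
    """Toy model: objects are keyed by an 'about' value and have no owner.

    Each tag path starts with the name of the user who controls it,
    for example 'amazon.com/price' is controlled by 'amazon.com'.
    """

    def __init__(self):
        self.objects = {}

    def put(self, user, about, tag, value):
        owner = tag.split("/", 1)[0]
        if owner != user:
            # Simplification: only the tag's owner may write it.
            raise PermissionError(f"{user} may not write {tag}")
        # Anyone may add their own tags to any object, creating it if needed.
        self.objects.setdefault(about, {})[tag] = value


store = ToyStore()
book = "a fanciful example book"  # placeholder about value

store.put("amazon.com", book, "amazon.com/price", "$19")
store.put("vintage.com", book, "vintage.com/epub", True)  # placeholder value

print(sorted(store.objects[book]))
```

Both domains end up with their own branded tags on the same object, and neither needed the other's permission to put them there.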

A killer combination: writable APIs with domain-branded data

Fluidinfo automatically provides a writable API for all its data. By allowing for domain names on data, domain holders who want to publish information about their products can now do so with an API that has three major advantages:

  • Your data is branded with your domain name.
  • Your data lives in a writable ecology of related data, collecting on the same Fluidinfo objects. This allows for search across data from different users and domains, put there by different applications. It allows for additional data of all kinds, for mashups, and for customization, personalization, and filtering.
  • Fluidinfo has a flexible permissions system at the level of its tags, so you maintain full control of your own data. You can make it public or private, or can allow or disallow access for specific others.

Because Fluidinfo objects are fine grained, composed simply of tags with values as in the image above, applications can fetch, search on, or combine specific pieces (or combinations) of data provided by different trusted sources with single requests. There is a general principle here: information becomes more useful and valuable when it is stored in context. This is illustrated vividly by Google, which collects web pages into one place to enable search, and by Wikipedia, which allows people to pool related information. Although these examples have very different models of trust and reputation, they both illustrate the underlying principle.

Getting your domain name in Fluidinfo

To start using your domain in Fluidinfo, first sign up, using your domain name as your user name. Our sign-up system will recognize that the username is a domain and will send you an email telling you how to prove that you control the domain. Once that’s done, you can begin using Fluidinfo to upload information branded with your domain and to provide an API for others (or for your own company) to find your products or otherwise use the information you make available.

In other words, all Fluidinfo usernames that correspond to actual internet domains are automatically reserved for their owners. Besides preventing a chaotic land grab, this is how we can guarantee to people seeing information in Fluidinfo that the value of a Fluidinfo tag whose name includes a domain name can be trusted exactly as it would be if that domain appeared in a web page URL or email From address.

So there you have it… domain names on data. We’re very excited to see where this will lead and we’re actively building out some writable APIs with domain-branded data. You can too. Claim your domain name in Fluidinfo right now.

ReadWriteWeb ReadWriteAPI

Filed under: APIs,Data,Writable APIs — Nicholas Tollervey @ 11:04 am

Over the weekend I scraped the 11300 or so articles in the ReadWriteWeb archive. These are a great source of technology news and analysis covering stories from 2003 to the present day. Rather than keep this to myself (and rather unsurprisingly) I imported the metadata about each article into Fluidinfo. Hey presto, another instant API emerges!

Here’s how it works. For each article in the ReadWriteWeb archive there is an object in Fluidinfo. Each object has a unique “about” tag-value: the URL of the article. Furthermore, each object is annotated with information using tags found under the readwriteweb.com top level namespace. Tags include title, extract, date, categories and so on. In other words, you might visualize each object something like this:

I’ve also created and annotated objects about each of the authors of ReadWriteWeb articles and tagged objects representing each website ever mentioned by ReadWriteWeb.

So, it’s now possible to use the API like this:

>>> import fluidinfo
>>> returnTags = ['readwriteweb.com/title', 'readwriteweb.com/author-name', 'readwriteweb.com/extract', 'readwriteweb.com/date', ]
>>> query = "readwriteweb.com/year = 2010 and readwriteweb.com/month = 5 and readwriteweb.com/day = 5"
>>> head, result = fluidinfo.call('GET', '/values', tags=returnTags, query=query)
>>> head['status']
'200'
>>> result
{u'results': 
    {u'id': 
        {u'05936b9b-4c20-4887-9607-f63752e7f274':
            {u'readwriteweb.com/author-name': {u'value': u'Sarah Perez'},
              u'readwriteweb.com/date': {u'value': u'May  5, 2010  7:24 AM'},
              u'readwriteweb.com/extract': {u'value': u"Feel like hacking your phone today? If you've got about 10 minutes to spare, you can turn your iPhone into a Wi-Fi hotspot using a combination of the ..."},
              u'readwriteweb.com/title': {u'value': u'How To Turn Your iPhone into a Wi-Fi Hotspot'}},
        ... etc....

What’s just happened..? I used a client library (fluidinfo.py) to ask Fluidinfo to return the author name, publication date, title and an extract of all ReadWriteWeb articles published on the 5th May 2010.
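If you want to run the same search for other days, the query string is easy to generate. The helper below is a sketch in Python 3; query_for_day is a name I've made up, and it assumes only the readwriteweb.com/year, /month and /day tags used in the query above.

```python
import datetime


def query_for_day(day, namespace="readwriteweb.com"):
    """Build a Fluidinfo query matching articles tagged with the given date."""
    return (
        f"{namespace}/year = {day.year} and "
        f"{namespace}/month = {day.month} and "
        f"{namespace}/day = {day.day}"
    )


query = query_for_day(datetime.date(2010, 5, 5))
print(query)
```

The resulting string can be passed as the query argument in the fluidinfo.call example above.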

Being able to search and extract data from an API is cool, especially since you get this by virtue of simply hosting your data in Fluidinfo. But this is ReadWriteWeb we’re talking about, so reading alone isn’t enough: we ought to be able to write as well. Happily, Fluidinfo can accommodate.

>>> fluidinfo.login('ntoll', 'mysecretpassword') # change as appropriate
>>> headers, result = fluidinfo.call('PUT', ['about', 'http://www.readwriteweb.com/archives/android_app_growth_on_the_rise_9000_new_apps_in_march_2010.php', 'ntoll', 'rating'], 10)
>>> headers
{'cache-control': 'no-cache',
 'connection': 'keep-alive',
 'content-type': 'text/html',
 'date': 'Wed, 23 Feb 2011 15:07:29 GMT',
 'server': 'nginx/0.7.65',
 'status': '204'}

The example above shows how I sign in and annotate the object “about” the article http://www.readwriteweb.com/archives/android_app_growth_on_the_rise_9000_new_apps_in_march_2010.php with a tag called ntoll/rating and an associated value of 10 (obviously I enjoyed this article). The HTTP 204 response status tells me the tag value was successfully set.
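For the curious, the list passed to fluidinfo.call corresponds to a URL path of the form /about/<about value>/<namespace>/<tag>. The sketch below shows how such a URL could be assembled in Python 3. The function about_tag_url is hypothetical, and percent-encoding each path component separately is my assumption about what a client needs to do, not a description of what fluidinfo.py actually does internally.

```python
from urllib.parse import quote

BASE = "https://fluiddb.fluidinfo.com"


def about_tag_url(about, tag_path, base=BASE):
    """Build the URL for a tag on the object with the given about value."""
    parts = ["about", about] + tag_path.split("/")
    # Encode each component on its own so that slashes inside the about
    # value (for example in a URL) are not mistaken for path separators.
    return base + "/" + "/".join(quote(part, safe="") for part in parts)


article = (
    "http://www.readwriteweb.com/archives/"
    "android_app_growth_on_the_rise_9000_new_apps_in_march_2010.php"
)
rating_url = about_tag_url(article, "ntoll/rating")
print(rating_url)
```

A PUT to a URL like this, with credentials and a JSON-encoded value, is roughly what the client library call above amounts to.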

Let’s just pause here for a moment and consider what I’ve just been able to do. Because Fluidinfo is openly writable I’m able to annotate the objects about ReadWriteWeb articles with my own data. Since objects in Fluidinfo don’t have owners or permissions attached to them I didn’t have to ask ReadWriteWeb for permission to augment the data about the article in question. Furthermore, if I only want my buddies to see what my ratings are I can set the tag to be only visible to a specific group of people. In this way Fluidinfo remains openly writable yet I still retain ownership and control over my data.

We’ve seen “read” and “write”, but what about “web”..?

Well it turns out I can stretch this analogy even further. Because everyone is tagging the same objects (identified by their “about” tag values) the data is being linked by virtue of the context of the object. We’re starting to get a web of linked data (yeah, I know, bear with me on this one…).

Since I can search and retrieve using any of the tags for which I have “read” permission I can start to create really cool mash-ups of data like this:

>>> header, result = fluidinfo.call('GET', '/values', tags=['fluiddb/about', 'boingboing.net/mentioned', 'readwriteweb.com/mentioned'], query="has boingboing.net/mentioned and has readwriteweb.com/mentioned and has unionsquareventures.com/portfolio")
>>> header
{'cache-control': 'no-cache',
 'connection': 'keep-alive',
 'content-length': '23528',
 'content-location': 'https://fluiddb.fluidinfo.com/values?query=has+boingboing.net%2Fmentioned+and+has+readwriteweb.com%2Fmentioned+and+has+unionsquareventures.com%2Fportfolio&tag=fluiddb%2Fabout&tag=boingboing.net%2Fmentioned&tag=readwriteweb.com%2Fmentioned',
 'content-type': 'application/json',
 'date': 'Wed, 23 Feb 2011 15:24:36 GMT',
 'server': 'nginx/0.7.65',
 'status': '200'}
>>> len(result['results']['id'])
4
>>> for r in result['results']['id'].values():
...     print r['fluiddb/about']['value']
... 
http://www.twitter.com
http://www.etsy.com
http://www.boxee.tv
http://www.meetup.com

What..? I’ve just asked Fluidinfo for all the articles from BoingBoing and ReadWriteWeb about companies backed by Union Square Ventures that both BoingBoing and ReadWriteWeb have covered. It turns out there are four companies: Twitter, Etsy, Boxee and Meetup.
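You don't need a client library to make this request. The content-location header above shows the URL that was actually fetched, and the standard library can rebuild it. Here's a sketch in Python 3; values_url is a name I've invented, and the only thing it assumes is the /values URL layout visible in that header.

```python
from urllib.parse import urlencode


def values_url(query, tags, base="https://fluiddb.fluidinfo.com"):
    """Build a /values URL with one query parameter and repeated tag parameters."""
    params = [("query", query)] + [("tag", tag) for tag in tags]
    return base + "/values?" + urlencode(params)


url = values_url(
    "has boingboing.net/mentioned and has readwriteweb.com/mentioned "
    "and has unionsquareventures.com/portfolio",
    ["fluiddb/about", "boingboing.net/mentioned", "readwriteweb.com/mentioned"],
)
print(url)
```

Passing the parameters as a list of pairs keeps the repeated tag parameters in order, which a plain dictionary could not express.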

What does one of these results look like..?

{u'boingboing.net/mentioned': 
    {u'value': [u'http://boingboing.net/2009/11/06/vampireotherkinenerg.html',
                     u'http://boingboing.net/2010/01/11/ny-times-on-urban-ca.html',
                     u'http://boingboing.net/2010/10/26/ron-paul-supporter-w.html',
                     u'http://boingboing.net/2002/06/27/meetup-meatspace-cam.html',
                     u'http://boingboing.net/2004/03/17/wired-rave-awards.html',
                     u'http://boingboing.net/2006/01/05/net-pug-nabbed-by-cr.html']},
u'fluiddb/about': 
    {u'value': u'http://www.meetup.com'},
u'readwriteweb.com/mentioned': 
    {u'value':  [u'http://www.readwriteweb.com/archives/meetup_the_secret_campaign_weapon.php']}}
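Since each "mentioned" tag holds a list of article URLs, it only takes a few lines to summarise how much coverage each site has given a company. The sketch below (Python 3, with a made-up function name mention_counts) works on a single result object shaped like the Meetup one above.

```python
def mention_counts(obj):
    """Map each '<site>/mentioned' tag on one result object to its article count."""
    return {
        tag: len(info["value"])
        for tag, info in obj.items()
        if tag.endswith("/mentioned")
    }


# The Meetup result shown above, copied in so the example runs offline.
meetup = {
    "boingboing.net/mentioned": {
        "value": [
            "http://boingboing.net/2009/11/06/vampireotherkinenerg.html",
            "http://boingboing.net/2010/01/11/ny-times-on-urban-ca.html",
            "http://boingboing.net/2010/10/26/ron-paul-supporter-w.html",
            "http://boingboing.net/2002/06/27/meetup-meatspace-cam.html",
            "http://boingboing.net/2004/03/17/wired-rave-awards.html",
            "http://boingboing.net/2006/01/05/net-pug-nabbed-by-cr.html",
        ]
    },
    "fluiddb/about": {"value": "http://www.meetup.com"},
    "readwriteweb.com/mentioned": {
        "value": [
            "http://www.readwriteweb.com/archives/meetup_the_secret_campaign_weapon.php"
        ]
    },
}

counts = mention_counts(meetup)
print(counts)
```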

What was involved in making such a cool query possible..? Simply importing data into Fluidinfo.

I’ll say no more and let you ponder the implications of what I’ve just demonstrated…

February 15, 2011

How I made a writable API for Union Square Ventures in an hour

Filed under: APIs,Essence,Howto,Programming,Writable APIs — Terry Jones @ 9:11 am

Image: Eric Archivell

I was mailing Fred Wilson and Albert Wenger of Union Square Ventures late last year, talking about Fred’s article Giving every person a voice. Fred said

I hadn’t really thought that we are all about shrinking the minimal viable publishing object, but that may well be true in hindsight.

I wanted to illustrate Fluidinfo as doing both: providing a minimal viable way to publish data (with an API), and also giving everyone a voice. So I decided to build Union Square Ventures a minimal API, and to then add my voice. In an hour.

A minimal viable API for USV

USV currently has 30 investments. If you wanted a list of the 30 company URLs, how would you get it? A non-programmer would have no choice but to go to the USV portfolio page, click on each company in turn, right-click on the link to each company’s home page, copy the link address, and add that URL to a list. That process is boring and error prone.

If you’re a programmer, though, you’d find this ridiculously manual. You’d much rather do it in one command, for example if you’re collecting information on VC company portfolios, perhaps for research or to get funded. Or you might be building an application, perhaps to do what Jason Calacanis is doing as part of collecting who’s funding whom on Twitter and Facebook. You want your application to be able to fetch the list of USV company URLs in one simple call.

So I made a unionsquareventures.com user in Fluidinfo (sign up here), did the repetitive but one-time work of getting their portfolio companies’ URLs out of their HTML (so you wouldn’t have to), and added it to Fluidinfo. I put a unionsquareventures.com/portfolio tag onto the Fluidinfo object about each of those URLs. In other words, because Fluidinfo has an object for everything (including all URLs), I asked it to tag that object.

That was just 7 lines of code using the elegant and simple Python FOM library for Fluidinfo written by Ali Afshar:

import sys
from fom.session import Fluid

fdb = Fluid()
fdb.login('unionsquareventures.com', 'password')
# Read portfolio URLs from stdin, one per line, skipping blank lines
urls = [line.strip() for line in sys.stdin if line.strip()]

for url in urls:
    # Tag the object "about" each URL; the value True simply marks membership
    fdb.about[url]['unionsquareventures.com/portfolio'].put(True)

As a result, using the jsongrep script I wrote to get neater output from JSON, I can now use curl and the Fluidinfo /values method to get the list of USV portfolio companies in the blink of an eye:

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio&tag=fluiddb/about' |
jsongrep.py results . . fluiddb/about value | sort
u'http://amee.cc'
u'http://getglue.com'
u'http://stackoverflow.com'
u'http://tumblr.com'
u'http://www.10gen.com'
u'http://www.boxee.tv'
u'http://www.buglabs.net'
u'http://www.clickable.com'
u'http://www.cv.im'
u'http://www.disqus.com'
u'http://www.edmodo.com'
u'http://www.etsy.com'
u'http://www.flurry.com'
u'http://www.foursquare.com'
u'http://www.hashable.com'
u'http://www.heyzap.com'
u'http://www.indeed.com'
u'http://www.meetup.com'
u'http://www.oddcast.com'
u'http://www.outside.in'
u'http://www.returnpath.net'
u'http://www.shapeways.com'
u'http://www.simulmedia.com'
u'http://www.soundcloud.com'
u'http://www.targetspot.com'
u'http://www.twilio.com'
u'http://www.twitter.com'
u'http://www.workmarket.com'
u'http://www.zemanta.com'
u'http://zynga.com'

There you have it, a sorted list of all Union Square Ventures portfolio companies’ URLs, from the command line. I can do it, you can do it, and any application can do it.

The jsongrep.py program can also be used to pull out selective pieces of the output. For example, which of the companies have “ee” in their URL?

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio&tag=fluiddb/about' |
jsongrep.py results . . fluiddb/about value '.*ee' | sort
u'http://amee.cc'
u'http://www.boxee.tv'
u'http://www.indeed.com'
u'http://www.meetup.com'

So maybe, in order to be funded by USV, it helps to have “ee” in your URL? 🙂
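The same kind of filtering is easy to do in Python once you have the about values in a list. This sketch (Python 3) doesn't reproduce jsongrep.py, whose internals aren't shown here. It just applies the '.*ee' pattern with re.match to a handful of the portfolio URLs listed above, and grep_values is a name invented for the example.

```python
import re


def grep_values(values, pattern):
    """Return the sorted values that match the regular expression from the start."""
    rx = re.compile(pattern)
    return sorted(value for value in values if rx.match(value))


# A few of the portfolio URLs from the listing above.
urls = [
    "http://www.meetup.com",
    "http://getglue.com",
    "http://amee.cc",
    "http://www.twitter.com",
    "http://www.indeed.com",
    "http://www.boxee.tv",
]

with_ee = grep_values(urls, ".*ee")
for url in with_ee:
    print(url)
```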

What about USV companies that don’t have “.com” URLs?

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio&tag=fluiddb/about' |
jsongrep.py results . . fluiddb/about value '.*(?

OK, these things are geeky, but that's part of the point of an API: to enable applications to do things. We've made the portfolio available programmatically, and you can immediately see how to do fun things with it that you couldn't easily do before. In fact, it's quite a bit more interesting than that. As a result of doing this work, I can tell you that there was a company listed a couple of months ago on the portfolio page that is no longer there. And there's a company that's been invested in that's not yet listed. That's a different subject, but it does illustrate the power of doing things programmatically.

This is a minimal viable API for USV because there's only one piece of information being made available (so far). But an API it is, and it's already useful.

It's also writable.

Giving everyone a voice

In a sense we've just seen that everyone has a voice. USV put a tag onto the Fluidinfo objects that correspond to the URLs of their portfolio companies and they didn't have to ask permission to do so.

But what about me? I'm a person too. I've met the founders of some of those companies, so I'm going to put a terrycojones/met-a-founder-of tag onto the same objects. Fluidinfo lets me do that because its objects don't have owners. Its permission system is instead based at the level of the tags on the objects.

So I wrote another 7 line program, like the one above, and added those tags. I also added another USV tag, called unionsquareventures.com/company-name. Let's pull back just the names of the companies whose founders I've met:

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio%20and%20has%20terrycojones/met-a-founder-of&tag=unionsquareventures.com/company-name' |
jsongrep.py results . . . value | sort
u'Bug Labs'
u'Foursquare'
u'GetGlue'
u'Meetup'
u'Shapeways'
u'Stack Overflow'
u'Tumblr'
u'Twitter'
u'Zemanta'

Isn't that cool? I do indeed have a voice!
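Queries like this one are just "has" clauses joined with "and", so an application could build them from a list of tags. Here's a sketch in Python 3. The names has_all and curl_values_url are hypothetical, and the %20-style encoding simply mirrors the curl command above rather than anything Fluidinfo requires.

```python
from urllib.parse import quote


def has_all(tags):
    """Build a query matching objects that carry every one of the given tags."""
    return " and ".join(f"has {tag}" for tag in tags)


def curl_values_url(query, tag, base="http://fluiddb.fluidinfo.com"):
    """Build a /values URL for one returned tag, encoding spaces as %20."""
    return f"{base}/values?query={quote(query, safe='/')}&tag={quote(tag, safe='/')}"


query = has_all(["unionsquareventures.com/portfolio", "terrycojones/met-a-founder-of"])
url = curl_values_url(query, "unionsquareventures.com/company-name")
print(url)
```

Swap in your own username and tag name and the same two functions give you the URL for your own voice.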

You have one too. If you sign up for a Fluidinfo account you can add your own tags and values to anything in Fluidinfo. And you can use Fluidinfo, just as I've illustrated above, to make your own writable API. See also: our post from yesterday, What is a writable API?

February 14, 2011

What is a writable API?

Filed under: APIs,Essence,Writable APIs — Terry Jones @ 9:46 am

When we released the Fluidinfo API for Boing Boing two weeks ago, Simon Willison noted on his blog:

“Fluidinfo really is a fascinating piece of software.” …. “Writable APIs are much less common than read-only APIs – Fluidinfo instantly provides both.”

If you search online to try to discover what people mean by a “writable API”, it’s hard to find anything that merits the name. So what did Simon mean? What is a writable API?

Both Simon and the team at Fluidinfo think “writable API” should be a kind of shorthand for an API that provides access to underlying data that is writable. This is not meant in the trivial already-possible sense wherein you pass data to an API method that stores them into a database you can’t otherwise access. We mean it in a more fundamental sense: that the underlying data is writable. That anyone or any application can directly access the data storage layer and add new information to it – without the knowledge of the people who stored the original data. That sounds pretty radical. But if you have a model of control in which objects are not owned but their pieces are, it’s not scary at all. In fact it’s liberating.

And, you guessed it, Fluidinfo has exactly that model of control. Any information stored into Fluidinfo instantly has a writable API in the sense just described. Let’s see a concrete example from the recent Boing Boing data imported into Fluidinfo.

Below is an illustration of an object in Fluidinfo, showing a subset of the tags that are on every Fluidinfo object representing a Boing Boing article. (The image was generated using Nick Radcliffe’s fun About Tag image generator for Fluidinfo objects. Click the image to see all its tags.)

An object

Simply by virtue of being stored in Fluidinfo, Boing Boing instantly got an API for all their articles. The API lets you find Boing Boing articles, as represented by objects in Fluidinfo, via querying on tags such as those shown on the object above. For example, you can use the API to find Boing Boing articles published in December 2008 that were written by Cory Doctorow. Or you can get a list of all the Boing Boing articles that contain a reference to the domain www.whitehouse.gov. (You can see details of these sorts of queries in our article on Mining the Boing Boing API.)

Those kinds of searches on Boing Boing data were not previously possible. We put the whole thing together in a single evening, which illustrates how simple it can be to make a Fluidinfo-fueled API for your own information. As cool as these examples are, though, they’re just reading & searching Boing Boing controlled data, as with a traditional API. What about writing?

Writing the Boing Boing data – without stopping to ask permission

The tags on the object above were put there by the Fluidinfo user named boingboing.net. That user controls those tags, and has given the rest of us read permission. But no-one owns the Fluidinfo object that the tags are on. As a result, anyone with a Fluidinfo account (sign up here) can add any information to the exact same object.

To give a very simple example, suppose someone wrote a simple browser extension (or extensions) that let Boing Boing readers mark stories as being funny or not suitable for work. Two users, Alice and Bob, might then put alice/funny and bob/nsfw tags onto the above object. Assuming I had read permission on those tags, I could then find Boing Boing articles by Cory Doctorow that Alice enjoyed and Bob found too risqué for work. Someone else could write a browser extension that popped up a warning about NSFW content based on Bob’s tag. In fact, if you take a proper look at the object above, you’ll see that I have added a terrycojones/nsfw tag to it (terrycojones is my username in Fluidinfo).
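The decision such a warning extension would make is tiny. As a sketch in Python 3 (the function should_warn and the idea of a list of users whose judgement you trust are both my own illustration, not an existing extension), it boils down to checking whether any trusted user has put an nsfw tag on the object:

```python
def should_warn(object_tags, trusted_users):
    """Return True if any trusted user has put an nsfw tag on the object."""
    return any(f"{user}/nsfw" in object_tags for user in trusted_users)


# Tag names on one object, as in the Alice and Bob example above.
tags_on_object = {"alice/funny", "bob/nsfw", "terrycojones/nsfw"}

print(should_warn(tags_on_object, ["bob"]))
print(should_warn(tags_on_object, ["alice"]))
```

Because each reader chooses whose tags to trust, two people looking at the same object can get different answers, which is exactly the kind of personalization described here.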

That’s customization and personalization – in our hands. It’s adding data to the exact same objects that Boing Boing created, combining their data and ours as we please, and all without stopping to ask permission or requiring that a database administrator or programmer anticipate our idiosyncratic needs. Boing Boing, and any applications they create, may not be aware of, care about, or even be able to detect the new data (depending on permissions).

In other words, we can say that Boing Boing has a writable API, because other people and other applications are always free to add information to the same objects that the Boing Boing API is providing access to. The same applies to any application or API that uses Fluidinfo. A writable API opens the door onto a very different world, allowing unlimited possibilities for mash-ups, new applications, extensions, widgets, etc. It allows arbitrary customization and personalization. Fluidinfo acts like a universal metadata engine, providing guaranteed write access to anything, with a permissions system at the level of the tag, not the object.

We’ll give another example of a simple but fun writable API tomorrow. Next week we’ll release a much more substantial one at the LAUNCH conference in San Francisco. We’re really excited about it, and have a series of not-to-be-missed upcoming blog posts on what we’ve been up to.

Stay tuned!
