How I made a writable API for Union Square Ventures in an hour

February 15th, 2011 by Terry Jones. Filed under APIs, Essence, Howto, Programming, Writable APIs.

Image: Eric Archivell

I was emailing Fred Wilson and Albert Wenger of Union Square Ventures late last year, talking about Fred’s article Giving every person a voice. Fred said:

I hadn’t really thought that we are all about shrinking the minimal viable publishing object, but that may well be true in hindsight.

I wanted to illustrate Fluidinfo as doing both: providing a minimal viable way to publish data (with an API), and also giving everyone a voice. So I decided to build Union Square Ventures a minimal API, and to then add my voice. In an hour.

A minimal viable API for USV

USV currently has 30 investments. If you wanted a list of the 30 company URLs, how would you get it? A non-programmer would have no choice but to go to the USV portfolio page, click on each company in turn, right-click on the link to that company’s home page, copy the link address, and add the URL to a list. That process is boring and error prone.

If you’re a programmer though, you’d find this ridiculously manual. You’d much rather do it in one command, for example if you’re collecting information on VC company portfolios, perhaps for research or to get funded. Or you might be building an application, perhaps to do what Jason Calacanis is doing in collecting who’s funding whom on Twitter and Facebook. You want your application to be able to fetch the list of USV company URLs in one simple call.

So I made a unionsquareventures.com user in Fluidinfo (sign up here), did the repetitive but one-time work of getting their portfolio companies’ URLs out of their HTML (so you wouldn’t have to), and added them to Fluidinfo. I put a unionsquareventures.com/portfolio tag onto the Fluidinfo object about each of those URLs. In other words, because Fluidinfo has an object for everything (including all URLs), I asked it to tag each of those objects.

That was just 7 lines of code using the elegant and simple Python FOM library for Fluidinfo written by Ali Afshar:

import sys
from fom.session import Fluid

fdb = Fluid()
fdb.login('unionsquareventures.com', 'password')
urls = [line.strip() for line in sys.stdin if line.strip()]  # Read portfolio URLs from stdin, skipping blank lines

for url in urls:
    fdb.about[url]['unionsquareventures.com/portfolio'].put(True)
 

As a result, using the jsongrep script I wrote to get neater output from JSON, I can now use curl and the Fluidinfo /values method to get the list of USV portfolio companies in the blink of an eye:

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio&tag=fluiddb/about' |
jsongrep.py results . . fluiddb/about value | sort
u'http://amee.cc'
u'http://getglue.com'
u'http://stackoverflow.com'
u'http://tumblr.com'
u'http://www.10gen.com'
u'http://www.boxee.tv'
u'http://www.buglabs.net'
u'http://www.clickable.com'
u'http://www.cv.im'
u'http://www.disqus.com'
u'http://www.edmodo.com'
u'http://www.etsy.com'
u'http://www.flurry.com'
u'http://www.foursquare.com'
u'http://www.hashable.com'
u'http://www.heyzap.com'
u'http://www.indeed.com'
u'http://www.meetup.com'
u'http://www.oddcast.com'
u'http://www.outside.in'
u'http://www.returnpath.net'
u'http://www.shapeways.com'
u'http://www.simulmedia.com'
u'http://www.soundcloud.com'
u'http://www.targetspot.com'
u'http://www.twilio.com'
u'http://www.twitter.com'
u'http://www.workmarket.com'
u'http://www.zemanta.com'
u'http://zynga.com'

There you have it, a sorted list of all Union Square Ventures portfolio companies’ URLs, from the command line. I can do it, you can do it, and any application can do it.
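If you would rather stay in Python than pipe through jsongrep.py, the same extraction can be sketched in a few lines. This is only a sketch: the response shape (results, then two levels of arbitrary keys, then fluiddb/about and value) is an assumption inferred from the jsongrep path used above, and the sample payload and its object IDs are made up for illustration.

```python
import json


def about_values(payload):
    """Collect every fluiddb/about value from a /values response body.

    Assumes the JSON nests as results -> <key> -> <key> -> tag -> value,
    mirroring the "results . . fluiddb/about value" path given to jsongrep.
    """
    data = json.loads(payload)
    values = []
    for group in data["results"].values():
        for tags in group.values():
            values.append(tags["fluiddb/about"]["value"])
    return sorted(values)


# Hypothetical response body; the object IDs are invented placeholders.
sample = json.dumps({
    "results": {
        "id": {
            "object-id-1": {"fluiddb/about": {"value": "http://www.etsy.com"}},
            "object-id-2": {"fluiddb/about": {"value": "http://amee.cc"}},
        }
    }
})

for url in about_values(sample):
    print(url)
```

In a real application the payload would be the body returned by the curl call above rather than a hand-built sample.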

The jsongrep.py program can also be used to pull out selective pieces of the output. For example, which of the companies have “ee” in their URL?

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio&tag=fluiddb/about' |
jsongrep.py results . . fluiddb/about value '.*ee' | sort
u'http://www.meetup.com'
u'http://amee.cc'
u'http://www.indeed.com'
u'http://www.boxee.tv'

So maybe, in order to be funded by USV, it helps to have “ee” in your URL? :-)

What about USV companies that don’t have “.com” URLs?

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio&tag=fluiddb/about' |
jsongrep.py results . . fluiddb/about value '.*(?<!\.com)$'
u'http://www.outside.in'
u'http://amee.cc'
u'http://www.cv.im'
u'http://www.buglabs.net'
u'http://www.returnpath.net'
u'http://www.boxee.tv'
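The two patterns used above can be tried out on their own in Python. This is a small sketch that assumes the pattern is applied with re.match, which is a guess about how jsongrep.py works rather than something shown here; the grep helper is hypothetical, and the URLs are a handful taken from the list above.

```python
import re

urls = [
    "http://www.meetup.com",
    "http://amee.cc",
    "http://www.etsy.com",
    "http://www.boxee.tv",
    "http://www.outside.in",
]


def grep(pattern, values):
    """Keep the values that the pattern matches from the start of the string."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.match(value)]


# ".*ee" matches any URL containing "ee" somewhere.
has_ee = grep(r".*ee", urls)

# "(?<!\.com)$" is a negative lookbehind: the end of the string must not be
# immediately preceded by ".com".
not_dot_com = grep(r".*(?<!\.com)$", urls)

print(has_ee)
print(not_dot_com)
```

The lookbehind is what makes the second query work: a plain character class could not express "does not end with this four-character suffix" as directly.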

OK, these things are geeky, but that’s part of the point of an API: to enable applications to do things. We’ve made the portfolio available programmatically, and you can immediately see how to do fun things with it that you couldn’t easily do before. In fact, it’s quite a bit more interesting than that. As a result of doing this work, I can tell you that there was a company listed a couple of months ago on the portfolio page that is no longer there. And there’s a company that’s been invested in that’s not yet listed. That’s a different subject, but it does illustrate the power of doing things programmatically.

This is a minimal viable API for USV because there’s only one piece of information being made available (so far). But an API it is, and it’s already useful.

It’s also writable.

Giving everyone a voice

In a sense we’ve just seen that everyone has a voice. USV put a tag onto the Fluidinfo objects that correspond to the URLs of their portfolio companies and they didn’t have to ask permission to do so.

But what about me? I’m a person too. I’ve met the founders of some of those companies, so I’m going to put a terrycojones/met-a-founder-of tag onto the same objects. Fluidinfo lets me do that because its objects don’t have owners; its permission system is instead based on the tags on the objects.

So I wrote another 7-line program, like the one above, and added those tags. I also added another USV tag, called unionsquareventures.com/company-name. Let’s pull back just the names of the companies whose founders I’ve met:

curl 'http://fluiddb.fluidinfo.com/values?query=has%20unionsquareventures.com/portfolio%20and%20has%20terrycojones/met-a-founder-of&tag=unionsquareventures.com/company-name' |
jsongrep.py results . . . value | sort
u'Bug Labs'
u'Foursquare'
u'GetGlue'
u'Meetup'
u'Shapeways'
u'Stack Overflow'
u'Tumblr'
u'Twitter'
u'Zemanta'

Isn’t that cool? I do indeed have a voice!
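The query URLs in the curl commands above are just "has" clauses joined with "and" and then percent-encoded. Here is a minimal sketch of building them in Python; the values_url helper and its parameter names are hypothetical, and only the base URL, the query syntax and the tag names come from the commands above.

```python
from urllib.parse import quote

BASE = "http://fluiddb.fluidinfo.com/values"


def values_url(required_tags, wanted_tag):
    """Build a /values URL asking for wanted_tag on objects carrying every required tag."""
    query = " and ".join("has " + tag for tag in required_tags)
    # quote() leaves "/" alone by default and turns each space into %20.
    return BASE + "?query=" + quote(query) + "&tag=" + quote(wanted_tag)


url = values_url(
    ["unionsquareventures.com/portfolio", "terrycojones/met-a-founder-of"],
    "unionsquareventures.com/company-name",
)
print(url)
```

Adding your own tag to the list of required tags is all it takes to query across USV’s data, my data and yours at once.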

You have one too. If you sign up for a Fluidinfo account you can add your own tags and values to anything in Fluidinfo. And you can use Fluidinfo, just as I’ve illustrated above, to make your own writable API. See also: our post from yesterday, What is a writable API?

  • http://www.fridnet.com/slava/blog SF

    This is definitely interesting. I particularly like the possibility of an openly-evolving API ecosystem, since you can allow others to contribute to the API. The fact is that since the API is based on tags (great insight from the Fluidinfo folks) it is easily accessible to non-technical users. I am a big fan of the IBM Mashup Center product, but I have not seen people use it in this way – although it is capable. That’s the difference between enterprise and start-up space I guess – crazy ideas that sometimes bear fruit.

  • http://twitter.com/djbressler David Bressler

    Great post, thanks!

    Quick question just to make sure I understand.

    You’re “scraping” the USV site, and turning the “scrape” into an API? Then, through Fluidinfo you’re enriching the call by adding whether or not you’ve met the company?

    So, if USV changes their site, it breaks this API you’ve created? And, breaks anyone’s app that uses your API. (Lucky for you, it looks like a USV failure.)

    And, to be really clear, you’re not in any way (professionally) associated with USV right?

    (OK, four questions)

    Thanks!

    David

  • http://blogs.fluidinfo.com/terry terrycojones

    Hi David

    Yes, I just grabbed the data from their site (by hand). It’s not enriching the call, it’s enriching the Fluidinfo object that corresponds to the URL of the portfolio company. There’s USV data on that object, and there’s my data on it. Likewise, you can add data to it, and query across any subset of the tags on the objects, including my tags, USV’s tags, etc.

    If they change their site and they wanted the API to be up to date, they’d need to update Fluidinfo (or we’d do it for them). It’s not so much that it breaks; it’s just that the data would be out of date. BTW, the USV portfolio page *is* a bit out of date, I think (no Tasty Labs).

    And no, no professional relationship with USV, though we’ve met many times. If we had one, they’d be writing about us, not vice versa! :-)

    Terry

  • http://blogs.fluidinfo.com/terry terrycojones

    Hi SF – thanks for the comment & compliment :-)

    I agree strongly about being accessible to non-technical users. I’ve been thinking about that a huge amount over the last months, and have reached some conclusions that I find really exciting (and which I plan to blog about). I gave an Ignite talk in NYC a week ago that touched on the issues. The video’s not online yet, but my slides are at http://blogs.fluidinfo.com/fluidinfo/2011/02/10/what-the-post-it-note-can-teach-us-about-apps-and-data/ and I’ll link the video when it’s out.

    Summary is: humans are actually *great* at working with data. It’s apps that are confusing. Conclusion: apps should be as transparent as possible, and keeping a data model as simple as possible is important because it makes that possible. Information storage architecture should be as simple as using a post-it note.

    So I think you’re right on the money with your comment. Thanks again.

  • http://twitter.com/djbressler David Bressler

    Terry,

    Thanks! That makes sense…

    It’s hard to keep stuff up to date, I always wonder how advertising companies knew how to keep billboards up to date – if you don’t pass by them, you’re never reminded.

    Seriously, makes more sense, and I really like the idea of enriching the object. That’s really powerful. Thanks for the explanation.

    David

  • http://blogs.fluidinfo.com/terry terrycojones

    Thanks – and you’re welcome. We’re going to release another writable API next week (a much more substantial one). The idea in that case is that the API becomes the official one for the company, and each time they produce something new they automatically push the metadata into Fluidinfo, thereby keeping the API current. We’re really excited about it; it will be announced at the LAUNCH conference.

  • http://twitter.com/djbressler David Bressler

    Terry,

    Good luck. I’ll keep an eye out. I’m contracting for OpusGrid (http://opusgrid.com/blog) and have recently had an article picked up by ProgrammableWeb (http://bit.ly/fiH4DL).

    I have a background in the Integration->SOA->API space — I was with TIBCO during the IPO, and was with Actional through the merger with security company WestBridge and then the Progress Acquisition (blogs.progress.com/business_making_progress/david_bressler/).

    What I’m trying to say is, I’m in the API space myself (and in NYC), so please stay in touch.

    David

  • http://blogs.fluidinfo.com/terry terrycojones

    Will do :-) I’m terry at fluidinfo com if you feel like saying hi in email.

  • http://twitter.com/pauldprice Paul Price

    Very cool post. As someone who used to do a lot of screen scraping for a living, this is a very interesting evolution in turning any site into an open API.

  • http://blogs.fluidinfo.com/terry terrycojones

    Thanks Paul – glad you like it. BTW, have you seen scraper wiki? I didn’t use it for this project, but we did for the thing we have coming out soon. It’s great. http://scraperwiki.com/

  • http://www.fridnet.com/slava/blog SF

    Terry,

    Great point – I wonder how much application design would change if that – humans are great at working with data – was the default assumption. My guess is – a LOT. I wonder if the issue is not so much “keeping data model as simple as possible” but “matching data access and model to real life mental models/usage” – *that* would make it “simple” to users by default.

    If so, then it is the developers who have been screwing up for a few decades because, if you think about it, we kept making data “simple” for the computers to store and retrieve. What else is 3NF if not an offering on the altar of Big O notation?! This would also help explain the rise of NoSQL – as computers try to handle more human-like tasks – dealing with conversations and millions of human users (not dozens of trained corporate users – trained to work the way a computer would be able to handle!) – they would evolve into models that actually *are* similar to what humans would use themselves…

    Does that make sense? :) I think you are on to something big that has been discussed, but is not talked about enough yet.

  • http://blogs.fluidinfo.com/terry terrycojones

    Hi. Yes, that makes sense to me, most of it anyway! :-) I also wonder about your first point. And the start of the second paragraph is something I’ve been pondering too…. that the received wisdom for application developers is that “users are stupid” because you can’t assume they’ll know how to do anything. But maybe it really is the other way round.

    Thanks for commenting. Sorry for the slow reply.