Fluidinfo

December 20, 2010

Delicious to FluidDB

Filed under: Howto,Programming — Nicholas Tollervey @ 11:21 am

In case you’ve missed the brouhaha, Yahoo were rumoured to be shutting down the rather excellent delicious bookmark/tagging service. Since reading this post in the Washington Post and checking from the horse’s mouth it looks like the rumours are mistaken. Nevertheless a plethora of tools for grabbing and backing-up data from delicious have been posted “just in case”.

Since I (ntoll) have always wanted to use FluidDB (our openly writable shared database – sign up here) as a delicious clone, the rumours prompted me to quickly knock together a script to extract my bookmarks from delicious and store them in FluidDB. I’m not the only person to have thought of this: Fluidinfo advisor Nick Radcliffe described one method for achieving this aim last year. Over the weekend both Nick and I have been thinking hard about how to organise the imported data within FluidDB.

The result is a simple standalone Python script called delicious2fluid that does exactly what its name implies. The source code is hosted at GitHub and I’ve added it as a package on PyPI (the Python Package Index). The rest of this post explains how to use delicious2fluid then describes some of the benefits of using FluidDB (a flexible schema, simple yet powerful queries, values associated with tags etc).

There are two options for installation:

  1. Download the source code and run the installation script:
    $ git clone git://github.com/ntoll/delicious2fluid.git
    $ cd delicious2fluid
    $ python setup.py install
  2. Use PyPI with pip or easy_install:
    $ pip install -U delicious2fluid
    or
    $ easy_install delicious2fluid

Once installed you simply need to run the command and answer the questions:

$ delicious2fluid
Delicious username: ntoll
Delicious password:
FluidDB username: ntoll
FluidDB password:
FluidDB path (hit return to default to root namespace: ntoll)
2010-12-17 21:09:12,601 - d2f - INFO - Grabbing bookmarks from delicious
2010-12-17 21:09:29,223 - d2f - INFO - 200 OK
2010-12-17 21:09:29,492 - d2f - INFO - Creating delicious namespace in FluidDB
... etc ...

The username and password for both services are not stored in any way, shape or form. As you might have guessed, the script writes a log of what it’s doing to stdout. If you do encounter any problems then the d2f.log file will contain lots of debug information (bug reports and suggestions most welcome!).

The script will ignore private bookmarks since we don’t want it to be responsible for leaking information but it will import all the tags you use even if they’re attached to private bookmarks. It’s important to note that the existence of tags in FluidDB is public since every tag has an associated object with an appropriate “about-tag” value that identifies it as an object about a specific tag (you have been warned!).

After grabbing an XML dump of your bookmarks from delicious the script creates the following tags in your root namespace (override the default location of the tags by providing a namespace path for the final question that the script asks you):

  • USERNAME/title
  • USERNAME/notes

Metadata from delicious is stored with tags created under the delicious namespace:

  • USERNAME/delicious/hash
  • USERNAME/delicious/time
  • USERNAME/delicious/meta
  • USERNAME/delicious/tag

FluidDB stores the tag names as a set of strings in the tag named USERNAME/delicious/tag. Furthermore, each tag you create in delicious will be recreated in FluidDB under your root namespace:

  • USERNAME/TAGNAME

Obviously, “USERNAME” is replaced with your username on FluidDB (i.e. your root namespace if you’ve not overridden the default location). These tags annotate objects representing bookmarks in FluidDB (one object per bookmark). The object’s about tag value is simply the URL that the bookmark references so everyone else can easily find and tag it.

For example, say I (ntoll) only used three tags (“foo”, “bar” and “baz”) then the following tags will be created in FluidDB:

  • ntoll/foo
  • ntoll/bar
  • ntoll/baz

These tags are automatically added to the correct objects to indicate how the original bookmark was tagged. Of course there is nothing to stop anyone from adding more tags and information or creating more objects to represent bookmarks that might not have originated from delicious.
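As a rough illustration of the default layout described above, the sketch below maps one delicious bookmark onto FluidDB tag paths. It is not the code delicious2fluid uses: the bookmark_to_tags helper, the dictionary keys (modelled on the attributes found in a delicious XML export) and the choice of None as the value for each per-tag tag are all assumptions made for the example.

```python
def bookmark_to_tags(username, bookmark):
    """
    Return (about, tags) for one bookmark, where tags maps a FluidDB tag
    path to the value to store on the bookmark's object.
    """
    tag_names = bookmark.get("tag", "").split()
    tags = {
        username + "/title": bookmark.get("description", ""),
        username + "/notes": bookmark.get("extended", ""),
        username + "/delicious/hash": bookmark.get("hash", ""),
        username + "/delicious/time": bookmark.get("time", ""),
        username + "/delicious/meta": bookmark.get("meta", ""),
        # The tag names are kept together as a set of strings (a sorted
        # list here so that the printed output is stable).
        username + "/delicious/tag": sorted(tag_names),
    }
    # One FluidDB tag per delicious tag; storing None is an assumption.
    for name in tag_names:
        tags[username + "/" + name] = None
    # The bookmarked URL becomes the object's about value.
    return bookmark["href"], tags


example = {
    "href": "http://example.com/",
    "description": "An example page",
    "extended": "",
    "tag": "foo bar baz",
}
about, tags = bookmark_to_tags("ntoll", example)
print(about)
for path in sorted(tags):
    print(path, "=", tags[path])
```

Running it lists ntoll/foo, ntoll/bar and ntoll/baz alongside the title, notes and delicious metadata tags, all attached to the object about http://example.com/.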

I’ve succeeded in importing all my tags and bookmarks (it took a couple of hours for c. 2000 tags and 1800 bookmarks). If you’re interested, use the FluidDB Explorer to take a look at a user-friendly view of my delicious tags. Open the tree view on the left hand side and click on the tags to find the associated objects/bookmarks. You’ll also see the query used to generate the results (usually something along the lines of “has ntoll/delicious/tags/FOO”).

You’ll also notice that I’ve actually put all my tags in the ntoll/delicious/tags namespace and ignored the default “schema”. Why have I done this? Three reasons:

  1. It helps to indicate the origin of the data.
  2. It stops my root namespace from getting polluted with (potentially) thousands of tags.
  3. It indicates that all the tags under the “delicious” namespace are to be used just like in the delicious web-application.

But won’t that mean I’ve broken FluidDB since I’ve ignored the precedent set by Nick Radcliffe in the blog posts I mentioned earlier..?

Not at all! One of the strengths of FluidDB is that it works well across different or evolving schema. For example, I can still find interesting bookmarks with queries such as:

has njr/fluidinfo and has ntoll/delicious/tags/fluidinfo

Which leads me to an object with the id f3f80612-7015-4a61-a1ba-94087e9aa582 and fluiddb/about value of “http://paulerb.typepad.com/infosharing/2009/01/is-metadata.html” (a really great blog post, by the way). I’ve used Nick’s visualisation tool to create the following representation of the object:

If you’re eagle-eyed you’ll have spotted that I’ve also added an “ntoll/rating” tag to this object with an associated value of 10 (it’s at the bottom left hand side). This demonstrates several important aspects of FluidDB:

  • I’m not limited to using a pre-defined schema. I can annotate any object with any tag linking it to any type of data – be it a primitive (searchable) value like an integer or something more opaque like a PDF document (contrast this with delicious’s value-less tags).
  • It’s possible to ask the tag for its description, in which case this particular one will return “An indication of how I rate something”. Since I am the only person who could have created this tag you know “I” = ntoll.
  • Because the tag is openly readable you can use it in your queries. For example, you might want a list of all the delicious bookmarks to which I’ve given a high rating:
    has ntoll/delicious/description and ntoll/rating >7
    (In fact you could use any tag for which you have “read” permission no matter who created it.)
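
For readers who want to run such queries outside the FluidDB Explorer, here is a minimal sketch of how a query string might be turned into a request URL for the HTTP API. The host name and the /objects?query= form are assumptions on my part rather than something this post specifies, so check the API docs before relying on them. Nothing is sent over the network; the code only builds and prints the URLs.

```python
from urllib.parse import urlencode

# Assumed base URL of the main FluidDB instance.
FLUIDDB = "http://fluiddb.fluidinfo.com"


def query_url(query, base=FLUIDDB):
    """Return the URL for a GET request that runs the given query."""
    # urlencode takes care of the spaces, slashes and ">" in the query.
    return base + "/objects?" + urlencode({"query": query})


for query in (
    "has njr/fluidinfo and has ntoll/delicious/tags/fluidinfo",
    "has ntoll/delicious/description and ntoll/rating > 7",
):
    print(query_url(query))
```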

In conclusion, it’s early days for this script and whilst its original purpose as a backup for delicious’s demise is no longer relevant it has provided an opportunity to demonstrate some of the interesting ways in which the openly writable, social and evolutionary approach of FluidDB adds value to a collection of bookmarks.

November 23, 2010

Watering a Peace Lily with Fluidinfo

Filed under: Awesomeness,Programming — Nicholas Tollervey @ 9:54 am

I (ntoll) belong to a nascent hackerspace called NortHACKton. It’s an opportunity to learn new skills and to collaborate with a great bunch of people who create cool stuff. I’m going to describe just such a collaboration with Stephen Bridges, one of the organisers of the hackerspace.

Our aim was to combine a simple hardware project with Fluidinfo and do it in such a way that others could repeat, extend and enhance what we’d been up to. We decided to connect an Arduino to a sensor and put the resulting reading into Fluidinfo at regular intervals. In the end we built something to make a moisture reading of the soil in Stephen’s plant pot and update a value in Fluidinfo every 10 minutes.

The Arduino has an Ethernet shield so the device can communicate autonomously with Fluidinfo via the HTTP API. The support circuitry is adapted from Botanicalls.com (Creative Commons) and Stephen created the sensor from a pen lid, sticky tape and a couple of wires. 🙂

The source code can be found on GitHub and contains two parts:

  1. A generic and reusable layer that handles basic interaction with Fluidinfo
  2. The application logic that takes the reading and controls the Arduino.

From Fluidinfo’s point of view, there is an object that represents Stephen’s peace lily (its about tag value is “Stephen’s Peace Lily (houseplant)”) and the tag widget/ffm/reading attached to this object is updated with the appropriate value.
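
To make the update concrete, here is a hedged Python sketch of the kind of HTTP request involved. The real code runs on the Arduino and lives on GitHub; everything below (the host name, addressing the object through an /about/ path, the application/vnd.fluiddb.value+json content type, the credentials and the reading of 512) is an assumption for illustration. The request is only built, never sent.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request

FLUIDDB = "http://fluiddb.fluidinfo.com"  # assumed host name


def reading_request(about, tag, value, username, password):
    """Build (but do not send) a PUT that sets `tag` on the object `about`."""
    url = "%s/about/%s/%s" % (FLUIDDB, quote(about, safe=""), tag)
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    headers = {
        # Assumed content type for primitive values such as numbers.
        "Content-Type": "application/vnd.fluiddb.value+json",
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
    }
    body = json.dumps(value).encode("utf-8")
    return Request(url, data=body, headers=headers, method="PUT")


req = reading_request(
    "Stephen's Peace Lily (houseplant)",  # about value of the plant's object
    "widget/ffm/reading",                 # tag that holds the reading
    512,                                  # made-up moisture reading
    "widget", "secret",                   # placeholder credentials
)
print(req.get_method(), req.full_url)
print(req.data)
```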

Interestingly, I’ve also added some tags to the object representing the peace lily which hold HTML, CSS and JavaScript values. This is a classic case of putting information in context since the peace lily’s web page is a tag-value attached to its object in Fluidinfo. So it’s possible to view the peace lily’s current status with your browser.

The whole thing was hacked together in an afternoon over a drink in a pub in Northampton. Unfortunately for Stephen my mobile phone takes video so I press-ganged him into the following explanation:

You can find Stephen’s write-up on the NortHACKton wiki. If you’re interested in doing something similar with Fluidinfo please don’t hesitate to drop in on our IRC channel (#fluidinfo on Freenode – connect via the web) and ask questions. Alternatively, drop by either the fluidinfo-users or fluidinfo-discuss mailing lists. We’d be more than happy to help.

November 19, 2010

Importing data into FluidDB with Flimp

Filed under: Programming,Progress — Nicholas Tollervey @ 5:26 am

We’d like to introduce you to “Flimp” (the FLuiddb IMPorter) – a tool that makes it easy to import data into FluidDB.

It works in two ways:

  1. Given a source file containing a data dump (in either json, yaml or csv format), Flimp will create the necessary FluidDB namespaces and tags and then import the records. (We expect to provide more file formats soon.)
  2. Given a filesystem path, Flimp will create the necessary FluidDB namespaces (based on directories) and tags (based on file names) and then import file contents as values tagged on a single FluidDB object.
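
The second mode can be pictured with a few lines of Python. This is only a sketch of the mapping from directories and files to namespaces and tags, not Flimp's own code: the filesystem_tags helper is hypothetical, and it assumes file names are used unchanged as tag names.

```python
import os
import tempfile


def filesystem_tags(directory, root_namespace):
    """
    Map every file below `directory` to a (tag path, file path) pair:
    sub-directories become namespaces and file names become tags.
    """
    result = []
    for dirpath, dirnames, filenames in os.walk(directory):
        relative = os.path.relpath(dirpath, directory)
        parts = [] if relative == "." else relative.split(os.sep)
        for filename in filenames:
            tag_path = "/".join([root_namespace] + parts + [filename])
            result.append((tag_path, os.path.join(dirpath, filename)))
    return sorted(result)


# Build a tiny throwaway directory tree and show the tags it would yield.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "css"))
    for name in ("index.html", os.path.join("css", "style.css")):
        open(os.path.join(tmp, name), "w").close()
    for tag_path, source in filesystem_tags(tmp, "ntoll/site"):
        print(tag_path)
```

The contents of each file would then be stored as the value of the corresponding tag on the single target object.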

Flimp can be configured to do custom pre-processing (e.g. cleaning, normalizing or modifying) before data is imported into FluidDB. It’s important to note that Flimp is in active development and that we welcome comments, ideas, and bug reports. Flimp is built on fom (the Fluid Object Mapper) created by my colleague Ali Afshar.

As a test, we’ve imported all the metadata from data.gov and data.gov.uk using Flimp and made it publicly readable. The rest of this article explains exactly how we did it so you can also start importing data into FluidDB using Flimp.

Open Government Data

[Image: Open linked government data. Source: http://www.flickr.com/photos/opensourceway/4371001268/]

Governments are making their data openly available to citizens. This has resulted in a tidal wave of hitherto unavailable information flowing onto the Internet.

Unfortunately, it’s very easy to be swamped by both the sheer amount and diversity of what is available. Furthermore, despite progress in this area, it is still difficult to search and explore the data. Plus, governments publish data in many different ways making it difficult to link, annotate and search datasets.

Both the US and UK government data sites provide a dump of their metadata (data describing the data they have available). Finding this invaluable information is hard, so for the record here’s a link to the US dump and here’s a link to the UK dump. These are the sources Flimp imported into FluidDB. No doubt there are more from other governments and when found they’ll also mysteriously find their way into FluidDB.

Get Flimp

Flimp is written in the Python programming language. You’ll need to have this installed first along with setuptools. Once you have these requirements there are two ways to get Flimp:

  1. If you want the latest and greatest “bleeding edge” version then go visit the project’s website and follow the appropriate links/instructions.
  2. If you’d rather use the current packaged stable release then follow the instructions below. The rest of this article deals with Flimp version 0.6.1.

To install the latest stable release open a terminal and issue the following commands (Flimp depends on fom and PyYaml):

$ easy_install fom
$ easy_install PyYaml
$ easy_install flimp

Once installed you can check Flimp has installed correctly by using the “flimp” command like this:

$ flimp --version
flimp 0.6.1

That’s it! You have both the “flimp” command line tool installed and the associated libraries used for importing data into FluidDB.

Help is always available via the command line tool:

$ flimp --help
Usage: flimp [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -f FILE, --file=FILE  The FILE to process (valid filetypes: .json, .csv,
                        .yaml)
  -d DIRECTORY, --dir=DIRECTORY
                        The root directory for a filesystem import into
                        FluidDB
  -u UUID, --uuid=UUID  The uuid of the object to which the filesystem import
                        is to attach its tags
  -a ABOUT, --about=ABOUT
                        The about value of the object to which the filesystem
                        import is to attach its tags
  -p, --preview         Show a preview of what will happen, don't import
                        anything
  -i INSTANCE, --instance=INSTANCE
                        The URI for the instance of FluidDB to use
  -l LOG, --log=LOG     The log file to write to (defaults to flimp.log)
  -v, --verbose         Display status messages to console
  -c, --check           Validate the data file containing the data to import
                        into FluidDB - don't import anything

Importing from data.gov.uk

First, we registered the user “data.gov.uk”. Because we’ll be using tags only associated with the data.gov.uk user you can be sure that the source of the data is legitimate. (We’d love this user to be under the control of someone from data.gov.uk – contact us if this applies to you.)

Next, we downloaded a json dump of the UK’s metadata. A quick look at the raw file indicated that it was already in a remarkably good state but we wanted to make sure. Flimp helps out:


$ flimp --file=uk_data_dump.json --check
Working... (this might take some time, why not: tail -f the log?)
The following MISSING fields were found:

geographical_granularity
temporal_coverage-from
temporal_coverage_to
geographic_granularity
temporal_coverage_from
taxonomy_url
import_source
temporal_coverage-to

Full details in the missing.json file

Flimp uses the first item in the json dump as a template for the schema. The “--check” flag tells Flimp to make sure all the items match the schema. In this case we notice that some items don’t have all the fields. This isn’t a problem and if we were to open the “missing.json” file then we’d see which items these are. Importantly, Flimp also checks if any of the items have extra fields associated with them. This would be more of an issue but Flimp would help by giving details of the problem items allowing you to rectify the problem.
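
The idea behind the check can be sketched in a few lines of Python. This illustrates the technique rather than Flimp's implementation: the check_records helper and the sample records are made up, and nested dictionaries are flattened into slash-separated field names on the assumption that this mirrors how nested fields become namespaces.

```python
def flatten(record, prefix=""):
    """Return the set of field names in a record, descending into dicts."""
    fields = set()
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            fields |= flatten(value, name + "/")
        else:
            fields.add(name)
    return fields


def check_records(records):
    """
    Use the first record as the template and report, for every other
    record, which template fields are missing and which fields are extra.
    """
    template = flatten(records[0])
    missing, extra = {}, {}
    for index, record in enumerate(records[1:], start=1):
        fields = flatten(record)
        if template - fields:
            missing[index] = sorted(template - fields)
        if fields - template:
            extra[index] = sorted(fields - template)
    return missing, extra


records = [
    {"id": "a", "title": "First", "extras": {"agency": "x", "precision": "y"}},
    {"id": "b", "title": "Second", "extras": {"agency": "z"}},
    {"id": "c", "title": "Third", "extras": {"agency": "x", "precision": "y"}, "url": "u"},
]
missing, extra = check_records(records)
print("missing:", missing)
print("extra:", extra)
```

Here the second record lacks extras/precision and the third carries an unexpected url field, which is the same distinction as the one Flimp draws between missing and extra fields.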

It is also possible to preview what Flimp would do when importing the data:

$ flimp --file=uk_data_dump.json --preview
FluidDB username: data.gov.uk
FluidDB password:
Absolute Namespace path (under which imported namespaces and tags will be created): data.gov.uk/meta
Name of dataset (defaults to filename) [uk_data_dump]: data.gov.uk:metadata
Key field for about tag value (if none given, will use anonymous objects): id
Description of the dataset: Metadata from data.gov.uk
Working... (this might take some time, why not: tail -f the log?)
Preview of processing 'uk_data_dump.json'

The following namespaces/tags will be generated.

data.gov.uk/meta/relationships
data.gov.uk/meta/ratings_average
data.gov.uk/meta/maintainer
data.gov.uk/meta/name
data.gov.uk/meta/license
data.gov.uk/meta/author
data.gov.uk/meta/url
data.gov.uk/meta/notes
data.gov.uk/meta/title
data.gov.uk/meta/maintainer_email
data.gov.uk/meta/author_email
data.gov.uk/meta/state
data.gov.uk/meta/version
data.gov.uk/meta/resources
data.gov.uk/meta/groups
data.gov.uk/meta/ratings_count
data.gov.uk/meta/license_id
data.gov.uk/meta/revision_id
data.gov.uk/meta/id
data.gov.uk/meta/tags
data.gov.uk/meta/extras/national_statistic
data.gov.uk/meta/extras/geographic_coverage
data.gov.uk/meta/extras/geographical_granularity
data.gov.uk/meta/extras/external_reference
data.gov.uk/meta/extras/temporal_coverage-from
data.gov.uk/meta/extras/temporal_granularity
data.gov.uk/meta/extras/date_updated
data.gov.uk/meta/extras/agency
data.gov.uk/meta/extras/precision
data.gov.uk/meta/extras/geographic_granularity
data.gov.uk/meta/extras/temporal_coverage_to
data.gov.uk/meta/extras/temporal_coverage_from
data.gov.uk/meta/extras/taxonomy_url
data.gov.uk/meta/extras/import_source
data.gov.uk/meta/extras/temporal_coverage-to
data.gov.uk/meta/extras/department
data.gov.uk/meta/extras/update_frequency
data.gov.uk/meta/extras/date_released
data.gov.uk/meta/extras/categories

4023 records will be imported into FluidDB

The “--preview” flag does exactly what you’d expect: it asks you the same questions as if you were importing the data for real but instead lists the new namespace/tag combinations that will be created and the number of new objects to be annotated.

It’s important to understand how Flimp generates the “about” tag value (unsurprisingly, the about tag value indicates what each object in FluidDB is about). It needs to be unique and descriptive of what the object represents. As a result Flimp asks you to identify a field in your data containing unique values and appends this to the end of the name of the dataset (in the example above, “id” was identified as the key field):


fluiddb/about = "data.gov.uk:1ea4bfa9-9ae1-4be0-ae73-e0c4a26caa6c"

If you don’t provide a field for unique values Flimp simply generates a new object without an associated “about” value.
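
In code the rule amounts to something like the following sketch. The about_value helper is hypothetical and the colon separator is inferred from the example above, so treat it as an illustration rather than a description of Flimp's internals.

```python
def about_value(dataset_name, record, key_field=None):
    """
    Return the about value for a record, or None when no key field is
    given (in which case the object is created without an about value).
    """
    if not key_field:
        return None
    return "%s:%s" % (dataset_name, record[key_field])


record = {"id": "1ea4bfa9-9ae1-4be0-ae73-e0c4a26caa6c", "title": "Example"}
print(about_value("data.gov.uk", record, "id"))
print(about_value("data.gov.uk", record))
```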

Nicholas Radcliffe’s About Tag blog is a great source of further information about the emerging conventions surrounding the “about” tag.

Since Flimp has satisfied us that the json data was in a good state we simply issued the following command to start the actual import:

$ flimp --file=uk_data_dump.json
FluidDB username: data.gov.uk
FluidDB password:
Absolute Namespace path (under which imported namespaces and tags will be created): data.gov.uk/meta
Name of dataset (defaults to filename) [uk_data_dump]: data.gov.uk:metadata
Key field for about tag value (if none given, will use anonymous objects): id
Description of the dataset: Metadata from data.gov.uk
Working... (this might take some time, why not: tail -f the log?)

Notice how Flimp interrogates you for sensitive information so you don’t have to have username/password credentials stored in a configuration file.

After the import completed it left a record of exactly what it did in the “flimp.log” file located in the current directory.

Importing from data.gov

Just as with the UK data, we’ve used an appropriate FluidDB username for importing the US data: data.gov (and the same applies – the data.gov user should be under the control of someone from data.gov – please contact us if this applies to you).

We took a different approach to the US metadata. They provide either an rdf document or a csv file. Since Flimp understands csv we used this as the source.

We wanted to make sure that the headers in the csv file (which get transformed into the names of tags in FluidDB) were cleaned and normalized appropriately since they contained lots of whitespace and non-alphanumeric characters. The snippet of Python code below demonstrates how we re-used Flimp in our own import script to achieve this end.

from flimp.utils import process_data_list
from flimp.parser import parse_csv
from fom.session import Fluid

def clean_header(header):
    """
    A function that takes a column header and normalises / cleans it into
    something we'll use as the name of a tag
    """
    # remove leading/trailing whitespace, replace inline whitespace with
    # underscore and any slashes with dashes.
    return header.strip().replace(' ', '_').replace('/', '-')

csv_file = open("data_gov.csv", "r")
data = parse_csv.parse(csv_file, clean_header)

# data now contains the normalized input from the csv file

# Use fom to create a session with FluidDB - remember flimp uses fom for
# connecting to FluidDB
fdb = Fluid() # defines a session with FluidDB
fdb.login('data.gov', 'secretpassword123') # replace these with something that works
fdb.bind()

root_path = 'data.gov/meta' # Namespace where imported namespaces/tags are created
name = 'data.gov:metadata' # used when creating namespace/tag descriptions
desc = 'Metadata from data.gov' # a description of the dataset
about = 'URL' # field whose value to use for the about tag

# the following function call imports the data
result = process_data_list(data, root_path, name, desc, about)
print(result)
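
To see what clean_header does to a messy header, and where it stops, the snippet below repeats the function so that it runs on its own and applies it to some made-up column names. The stricter clean_header_strict variant is not part of the import script above; it is one possible way to also deal with tabs, repeated whitespace and other non-alphanumeric characters.

```python
import re


def clean_header(header):
    # Same logic as the function in the import script.
    return header.strip().replace(' ', '_').replace('/', '-')


def clean_header_strict(header):
    # Collapse every run of characters other than letters, digits,
    # underscore and dash into a single underscore.
    return re.sub(r'[^A-Za-z0-9_-]+', '_', header.strip()).strip('_')


for header in ("  Agency Name ", "Date Released/Updated", "Unique ID (URL)"):
    print(repr(header), clean_header(header), clean_header_strict(header))
```

Note that the original function leaves the brackets in the last header untouched, so whether the stricter version is needed depends on what actually appears in the csv file.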

Conclusion

By importing the metadata into FluidDB we immediately gain the following:

  • FluidDB’s consistent, simple and elegant RESTful API as a view into the data.
  • The possibility of simple yet powerful queries across all the metadata.
  • The opportunity to annotate, link and augment the existing data with contributions from other sources.

Any application can now access the newly imported government data. In a future post I’ll demonstrate how to build a web-based interface for this data that is also hosted within FluidDB. I’ll also show how to query, annotate and link data yourself and re-use the contributions of others.

November 15, 2010

Coming soon to a FluidDB near you…

Filed under: Awesomeness,Happiness,Programming,Progress — Nicholas Tollervey @ 4:51 am

Today (Monday 15th November) commencing from 10am GMT (11am Western Eurozone, 5am EST) the main instance of FluidDB will be offline for several hours while we roll out a major update.

We’re excited to announce the following new features and changes:

  • /about added to HTTP API – It will be possible to access FluidDB objects that have a fluiddb/about tag value with requests whose path starts with /about. For example, the object about “Barcelona” can be reached directly via /about/Barcelona. The behaviour of /about, when given an about value, is exactly like that of /objects when given an object id. More information will be available in the API docs at http://api.fluidinfo.com/. Many thanks to Holger Dürer (http://twitter.com/hd42) for suggesting this improvement.
  • /values added to HTTP API – It will be possible to manipulate multiple tag values in a single API request to /values via the PUT, GET and DELETE HTTP methods. From the user’s perspective, this will result in a significant improvement in performance. More information can be found in the API docs at http://api.fluidinfo.com/.
  • “SEE” permission replaced with “READ” – the permissions system has been simplified. FluidDB now uses only the READ permission on tags to decide whether API calls accessing the tag values should be allowed to proceed. Anything that used the SEE permission now uses READ. For example, when you do a GET on an object to retrieve the names of its tags, you will only receive those for which you have READ permission. Many thanks to Jamu Kakar (http://twitter.com/jkakar) for suggesting this simplification.
  • Deleting a tag instance now always returns an HTTP 204 (No content) code – DELETEing a tag value from an object that did not have that tag used to result in a “404 (Not found)” status. This will be changed to simply return the non-error “204 (No Content)”.
  • “Content-MD5” header for checking payload content – It will be possible to send a checksum of a payload to FluidDB via the “Content-MD5” header. FluidDB will attempt to validate the checksum with the payload and return a “412 (Precondition failed)” status in the case of a mismatch.
  • Cross Origin Resource Sharing (CORS) added to HTTP API – it will be possible to make cross origin requests as specified by http://www.w3.org/TR/cors/ rather than rely on JSONP. FluidDB will have an almost complete implementation of this emerging standard although we expect to make changes and improvements as the specification matures.
  • Text indexing of fluiddb/about tag values – text indexing is coming to FluidDB but is definitely a work in progress. This release is just the very first step: the fluiddb/about tag will be indexed from the update onwards (existing fluiddb/about tag values will be indexed over the coming days/weeks).
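
For client library authors, the Content-MD5 value is straightforward to compute. The sketch below assumes FluidDB follows the usual HTTP convention (RFC 1864), in which the header carries the base64 encoding of the payload's binary MD5 digest; the content_md5 helper, the content type and the example payload are illustrative.

```python
import base64
import hashlib
import json


def content_md5(payload):
    """Return the Content-MD5 header value for a payload given as bytes."""
    return base64.b64encode(hashlib.md5(payload).digest()).decode("ascii")


# The checksum must be computed over the exact bytes that are sent.
payload = json.dumps("Barcelona").encode("utf-8")
headers = {
    "Content-Type": "application/vnd.fluiddb.value+json",  # assumed
    "Content-Length": str(len(payload)),
    "Content-MD5": content_md5(payload),
}
print(headers)
```

If the payload is altered in transit the checksum no longer matches, which is the case the “412 (Precondition failed)” status is meant to signal.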

For those of you who have written or maintain a client library for FluidDB we’d like to refer you to the changes we’ve made to the Fluid Object Mapper (FOM) library as a reference for what you might want to do with your own library.

To encourage people to add the new FluidDB capabilities to libraries, we’re going to extend the FluidDB Weekend of Code offer to library authors. Let us know when you’re working on your library and where we can find it (Github, Bitbucket, Sourceforge etc) and we’ll order you a pizza and send you a book of your choice from Amazon.

Finally, we’re moving to a four-week development cycle so expect regular updates, pro-active bug squashing and lots of progress in the coming months. We’ve got lots of exciting stuff in the pipeline and we can’t wait to see how the FluidDB community reacts.

September 22, 2010

Betaworks is a Strange Attractor

Filed under: Essence — Terry Jones @ 8:04 am

I’ve just arrived back at Betaworks, who led the investment in Fluidinfo, after a 3-month absence. It’s an amazing environment, somehow buzzing with both intensity and diversity. A lot of people are trying to get an answer to the question “What is Betaworks?” The most obvious (and wrong) answer is that Betaworks is an incubator. I’ve spent time on and off thinking about the question, and I don’t think there’s a single good / conventional answer. Betaworks is something else. So here’s my informal answer, which I think works pretty well:

Betaworks is a Strange Attractor.

From that Wikipedia page: An attractor is a set towards which a dynamical system evolves over time. Of course I don’t mean Betaworks is literally a strange attractor, though their logo looks like an outward-facing version of the attractor on the right (more images). I’ll illustrate what I mean by describing who I ran into at Betaworks the last couple of times I arrived in New York and went into the office.

On June 17, 2010, I flew to NY from Vegas on a very early flight and got to Betaworks mid-morning. Just inside the door of the office I ran into Brady Forrest and Mike Loukides, both of whom work for O’Reilly. I went for coffee with Brady and spent at least an hour talking to Mike. Later in the afternoon Tim O’Reilly showed up. That’s a pretty interesting batch of folks to have run into at Betaworks. But it’s much more interesting than that. Get this: none of the three O’Reilly people knew that the others were going to Betaworks. At least one of the three didn’t know the others were even in town. Think about it. I find it quite extraordinary. NYC is a big place, and though there’s a ton of stuff going on here, there’s apparently only one place to be. Here was the first really tangible evidence for me that Betaworks have become a strange attractor of some kind, and a powerful one.

Second example: Yesterday when I arrived in the office, after several months away, I immediately ran into a bunch of people I know – none of whom works out of the Betaworks offices: Caterina Fake (Flickr, now Hunch), Jyri Engeström (Jaiku, now Google), Reshma Sohoni (Seedcamp), Iain Dodsworth (Tweetdeck), and Mika Salmi (many things). Like me, Mika also lives in Barcelona. But where do I unexpectedly run into him…? At Betaworks. How is that possible? Why were all these great people in the office today? Was there some event going on? No. So what’s going on here?

These are just a couple of days, chosen because they were both first days back in town. There are many similar days. Betaworks is an attractor in other senses too – my examples omit the companies and people who are in the office every day, creating the gravitational pull that’s attracting so many extraordinary visitors.

That’s it for now. I don’t think anyone can pin down exactly what Betaworks is, and it doesn’t really matter. But we can all be strangely attracted.

August 23, 2010

Fluidinfo receives an additional $170K in Series A second closing

Filed under: Progress — Terry Jones @ 12:00 pm

We’re happy to announce a second closing of Fluidinfo‘s Series A investment round. We’ve raised another $170K, taking the round to just under $1M in total. Some of the people investing in the second closing are:

Michael Parekh: A Wall Streeter for over 20 years and former partner at Goldman Sachs, Michael founded and helped to build the Internet Research effort at the firm (twitter, more info).

Esther Dyson: who seed funded Fluidinfo in late 2007, and who’s been a huge source of help ever since (and before). We’re thrilled to have her following on in this round (twitter, more info).

David Snow: the Editor in Chief of PEI Media. David also participated in the seed funding of Fluidinfo (twitter, more info).

Ted Carroll and Earl Macomber: who were both also seed-stage backers of Fluidinfo. Ted and Earl are the managing principals of traditional information and media focused private equity firm Noson Lawen Partners, and have again made personal investments (twitter, more info).

Ed Carroll: who was also a seed stage Fluidinfo investor. Ed is now entering his senior year in high school and hopes to attend USC next year as a freshman at The Marshall School. Ed spent a month at Marshall this summer and walked away with Top Five honors in their Entrepreneurialism program. Good luck Ed!

There are three other new investors who also came into the round, but who prefer not to be mentioned publicly at this stage (so you’ll have to ask us about them privately :-)). The above all join Betaworks, IA Ventures, RRE Ventures, Lerer Ventures, Chris Dixon & Founder Collective, Joshua Schachter, Andrew Rasiej, Ross Williams, and Esther Speight as Fluidinfo Series A investors.

Our thanks to everyone!

August 10, 2010

Tim O’Reilly joins the Fluidinfo advisory board

Filed under: Awesomeness,Happiness — Terry Jones @ 7:42 pm

[Image: Tim O'Reilly - out standing in his field]

The wild rumors are all true. Tim O’Reilly has joined the Fluidinfo advisory board!

I’m an unabashed fan of Tim, his company, and everyone I’ve ever met who works at or has worked at O’Reilly (especially Sara Winge). I’ve been talking to Tim on and off about FluidDB since March of 2007 after being introduced by Esther Dyson. Tim would blog about something, and I’d email him and say “FluidDB will be able to do that” or (more recently) “FluidDB can do that.” When we first met, we spent 90 minutes together and I showed him a demo of a few things. He drilled down hard in the first 2 minutes: “Tell me what’s different about it.” So I went for it. When the meeting was over, Tim left the room and went to the elevator to leave. I was packing up my stuff when suddenly he was back, wanting to ask and suggest more. He came back to the room four times, lastly to get my phone number 🙂

Since then, Tim has been extremely generous, introducing us to many great people. You can see who very easily in the graph of introductions I put together over the years as I talked to people about FluidDB. He made the introduction that led through Gerry Campbell to John Borthwick and Andy Weissman at Betaworks who led the Fluidinfo investment. Tim invited me to the Social Graph Foo camp in 2008, to a Science Foo camp, and to two general Foo camps, and he’s been helping me in the (ongoing) attempt to get a US visa.

It’s a personal thrill to have Tim formally involved with Fluidinfo. During years working on what at times seemed like the dark side of the tech moon in Barcelona, to have had Tim and Esther (and others!) behind us along the way has been wonderful. I often feel I want us to succeed as much for them as for anyone else.

July 29, 2010

Top tweeters as followed by HN readers now in FluidDB

Filed under: Programming — Terry Jones @ 5:57 pm

Yesterday Jeff Miller posted some interesting data on the Twitter users most followed by readers of Hacker News.

I just took those top 100 Tweeters and added Jeff’s data (their rank and the fraction of HN readers who follow them) to FluidDB. The tags I used in FluidDB are ycombinator.com/top-100 and ycombinator.com/follow-percent. The top-100 tag holds the Twitter user’s rank (from 1 to 100), and the follow-percent tag holds the (floating point) percentage of Hacker News readers that follow that Twitter user, as found by Jeff.
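As a rough sketch of the data layout just described (this is not the code from the hackernews repo), each Twitter user's object ends up carrying two tag values. The screen names and numbers below are made-up placeholders rather than Jeff's data, and tags_for is a hypothetical helper named only for this illustration:

```python
# Illustrative sketch only: the rows are placeholders, not Jeff's real data.
TOP_100_TAG = "ycombinator.com/top-100"
FOLLOW_PERCENT_TAG = "ycombinator.com/follow-percent"


def tags_for(rank, follow_percent):
    """Return the tag -> value mapping to put on one Twitter user's object."""
    if not 1 <= rank <= 100:
        raise ValueError("rank must be between 1 and 100")
    return {
        TOP_100_TAG: int(rank),                      # integer rank, 1 to 100
        FOLLOW_PERCENT_TAG: float(follow_percent),   # floating point percentage
    }


# Hypothetical rows: (screen name, rank, percent of HN readers following).
rows = [("example_user_a", 1, 25.5), ("example_user_b", 2, 19)]
tagged = {name: tags_for(rank, pct) for name, rank, pct in rows}

for name, tags in tagged.items():
    print(name, tags)
```

Keeping the rank as an integer and the percentage as a float matters because it is what makes numeric comparisons on those tags possible in queries.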

What does this all mean?

It means you can now query on Jeff’s data using FluidDB. And because FluidDB contains various other pieces of information about Twitter users, you can combine his data with other data in searches – including searches that Jeff probably never anticipated (and, because of FluidDB, never had to anticipate).

It also means you can add to the data. All you need is a FluidDB account (sign up) and then you can take the FluidDB API for a spin (docs).

To see the kinds of things that are possible, you can also do some queries using the advanced tab of Tickery.

For example, Who are more than 20.0 percent of HN readers following that have a TunkRank score of at least 60?

Or, Who is in the HN top 100 that I have met?

Or, Who of the top 100 do I follow?
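For programmers, the first question above might be written as a FluidDB query roughly as follows. This is a sketch under assumptions: the tunkrank.com/score tag name, the fluiddb.fluidinfo.com host, and the query_url helper are all assumed for illustration and should be checked against the API docs. The code only builds the request URL; it does not contact FluidDB.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for querying objects; check the FluidDB API docs.
BASE_URL = "http://fluiddb.fluidinfo.com/objects"


def query_url(query):
    """Build a GET URL asking for the objects that match `query`."""
    return BASE_URL + "?" + urlencode({"query": query})


# More than 20.0 percent of HN readers follow them AND TunkRank is at least 60.
# "tunkrank.com/score" is an assumed tag name used only for this sketch.
query = "ycombinator.com/follow-percent > 20.0 and tunkrank.com/score >= 60"
url = query_url(query)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```

The point of the sketch is that the two tags come from different sources (Jeff's data and TunkRank) yet sit on the same objects, so one query can combine them.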

The possibilities are endless. The main point of FluidDB is that you can play too. You can add your own data (any data) to the exact same objects that I’ve put Jeff’s data onto and which Tickery and TunkRank and We Met At are all using – and you don’t have to ask permission.

We’ve written plenty more on this subject. See also Tickery, for programmers, TunkRank scores added to FluidDB, Putting metadata onto tweets with FluidDB and FluidDB as a universal metadata engine.

You can get all the code I used to put the data into FluidDB from our hackernews repo on GitHub. It was about 90 minutes of work from start to finish.

Have fun, and please comment below!

July 20, 2010

Open sourcing Tickery

Filed under: Programming — Terry Jones @ 6:23 am

Today we’re excited to announce that we’ve open sourced Tickery under the Apache License. You can download the source from the Fluidinfo repository on Github. If you’re not familiar with Tickery, you can go play with it and also read our two blog posts, Meet Tickery and Tickery, for programmers.

We’ve open sourced Tickery in order to show other developers the insides of a non-trivial application that uses FluidDB. Tickery was written over a three-month period (November 2009 to January 2010), and much of it was done at a fairly fast pace. While the code could be cleaner and better documented, it’s not bad. We’re of course keen to help people understand the code, so please feel free to join the FluidDB users mailing list, or join us in #fluiddb on irc.freenode.net. Naturally we’ll be happy to receive improvements or patches, and you can of course run your own instance of Tickery.

Tickery is written entirely in Python, and was built using a number of other open-source tools, including Twisted, Pyjamas, txFluidDB, txRDQ, txJSON-RPC, and Ply. Thanks to all those projects for their openness and support.

We also had the benefit of lots of help from Luke Leighton and the other Pyjamas developers – thanks!

June 10, 2010

Maintaining startup energy –or– learning to unicycle together

Filed under: Uncategorized — Terry Jones @ 11:13 am

Image: ibikenz

Learning to unicycle is really hard. The first time you take hold of the seat of a unicycle you immediately appreciate how impossibly difficult riding one will be. It feels like it shouldn’t be possible, and if you didn’t know by example that it is, I suspect very few people would seriously consider it.

Unicycles have so many degrees of freedom – it’s hard enough even to lean one against something static without it falling over, let alone put yourself on top of one, with no means of support. As opposed to something like juggling, where any idiot can toss a couple of balls and probably catch them, there’s not much of an in-between zone in unicycling: either you can or you can’t. It’s always a pleasure to hand your unicycle to a loud-mouth passerby who comments how it can’t be that hard, etc.

A funny thing about learning to unicycle though is that you can have two people, neither of whom can unicycle alone, and put them side-by-side holding hands or shoulders, and the two can ride. It’s a great way to get beginners riding and learning together, and in a way it’s remarkable – neither person can ride alone, but they can reliably do it together without much trouble. The reason is that while they’re both having a hard time and are very often unbalanced or somewhat out of control, they’re very rarely falling in the same direction at the same time.

I think that’s a great analogy for how people at a startup can support one another in the very early stages. There are so many degrees of freedom – so many ways in which things can go wrong, from the mundane to the spectacular. Most apt, you may be low on energy or feeling negative or even doomed, but probably not in the same way or at the same time as your partner(s). Together you might each be able to do something that you couldn’t have done otherwise.

I was describing this analogy to Esteve Fernández (not for the first time), Nicholas Tollervey and Jamu Kakar over lunch yesterday, and decided to finally write it up. Without knowing it, Esteve kept me in a positive mood quite a few times as we spent 18 months building FluidDB. Thanks 🙂
