Archive for the ‘companies’ Category

Daylight robbery: Barclays skims €170 off a 5K EUR -> GBP transfer

Tuesday, February 5th, 2013

Last month (on Jan 18, 2013) someone I’m doing some work for initiated a transfer of €5,000 into my UK bank account. According to xe.com the mid-market rate that day was 1.1937940679 euros per pound.

So you might innocently expect to receive about 5,000 / 1.1937940679 = £4,188 minus any transfer fees.

The transfer went through an intermediate bank, who charged €17. Barclays charged “our commission” of a mere £6.

But the amount that arrived in my bank was not roughly £4,188 – £20 = £4,168 as you might hope.

The amount that arrived was £4,017.74.

The friendly banks decided that the appropriate exchange rate for me that day was 1.23840, about 3.7% above the mid-market 1.19379 rate. That spread cost me roughly £150 (about €180).

Sure, I know there’s a bid/ask spread in currency and the mid-market rate isn’t what you’d get in any transaction. But taking £143 from your own customer just because you can is pretty fucking nasty. And so, via today’s arbitrary setting of the greed parameter in a bank computer, the voracious banking industry gobbles up just a little bit more of the money made by regular people. People who actually worked to earn that money.
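For the record, here’s the arithmetic as a quick check (amounts and rates from above; exactly how the banks netted their own figures is anyone’s guess):

```python
# Recomputing the cost of the exchange-rate spread (figures from the post).
amount_eur = 5000.0
mid_rate = 1.1937940679   # mid-market EUR per GBP on Jan 18, 2013
applied_rate = 1.23840    # the rate the banks actually applied

mid_gbp = amount_eur / mid_rate          # what you'd get at mid-market
applied_gbp = amount_eur / applied_rate  # what the applied rate gives
spread_cost_gbp = mid_gbp - applied_gbp  # cost of the spread, before fees
markup = applied_rate / mid_rate - 1

print('Mid-market: %.2f GBP' % mid_gbp)
print('Applied:    %.2f GBP' % applied_gbp)
print('Spread cost: %.2f GBP (markup %.1f%%)' % (spread_cost_gbp, markup * 100))
```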

It’s no wonder people hate their banks and that the financial system in general is so despised.

Destructive, invasive, and dangerous behavior by UK ISP TalkTalk (aka StalkStalk)

Wednesday, December 5th, 2012

Today I spent several hours trying to figure out what was going wrong with a web service I’ve been building. The service uses websockets to let browsers and the server send messages to each other over a connection that is held open.

I built and tested the service locally and it worked fine. But when I deployed it to a remote server, the websocket connections were mysteriously dropped 30-35 seconds after they were established. The error messages on the server were cryptic, as were those in the browser. Google came to the rescue, and led me to the eye-opening explanation.

It turns out TalkTalk, my ISP, are also fetching the URLs my browser fetches, after a delay of 30-35 seconds! I guess they’re not doing it with all URLs, probably because they figure the sites are “safe” and not full of content that might be deemed objectionable. All the accesses come from a single IP address (62.24.252.133), and if you block that address or otherwise deny its connection attempts, the websocket problem goes away immediately.
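In case it helps anyone hitting the same thing, here’s a minimal sketch of the workaround (the IP is from above; the helper and the accept-loop shape are hypothetical – in practice a firewall rule does the same job more simply):

```python
# Refuse connections from TalkTalk's crawler at accept time.
# (Hypothetical sketch; an iptables rule is the simpler fix.)
BLOCKED_IPS = frozenset(['62.24.252.133'])

def should_drop(peer_ip):
    """True if a new connection from peer_ip should be closed immediately."""
    return peer_ip in BLOCKED_IPS

# Inside a server's accept loop you might then do:
#   conn, (ip, port) = server_socket.accept()
#   if should_drop(ip):
#       conn.close()
#       continue
```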

Dear TalkTalk – this is a very bad idea. Here are 3 strong reasons why:

  1. First of all, you’re breaking the web for your own customers, as seen above. When your customers try to use a new start-up service based on websockets, their experience will be severely degraded, perhaps to the point where the service is unusable. Time and money will be spent trying to figure out what’s going on, and people will not be happy to learn that their ISP is to blame.
  2. Second, there’s a real privacy issue here. I don’t really want to go into it, but I don’t trust my ISP (any ISP) to securely look after data associated with my account, let alone all the web content I look at. I have Google web history disabled. I don’t want my ISP building up a profile of what the people in this house look at online. There’s a big difference between recording the URLs I go to and actually retrieving their content.
  3. Third, it’s downright dangerous. What if I were controlling a medical device via a web interface and TalkTalk were interfering by killing my connections or by replaying my requests? What if there was a security system or some other sensitive controller on the other end? There’s no way on earth TalkTalk should be making requests with unknown effects to an unknown service that they have not been authorized to use. The TalkTalk legal team should consider this an emergency. Something is going to break, perhaps with fatal consequences, and they are going to get sued.

If you want to read more opinions on this issue, try Googling TalkTalk 62.24.252.133. Lots of people have run into this problem and are upset for various reasons.

See also: Phorm.

Apple channeling Microsoft?

Tuesday, February 1st, 2011

Image by Lara64

Apple’s behavior, as described today in the New York Times and in Ars Technica, reminds me of Microsoft building MSIE into Windows. When that happened, other browser manufacturers cried foul. They argued that this was bundling, that few people would want to use a non-native browser, and that Microsoft was using its platform monopoly to tilt the browser playing field.

Here again we have a vendor (Apple), with an operating system platform (iOS), with a piece of extremely valuable functionality (the App Store) built in by the vendor, who are now strong-arming others writing applications for the platform into always offering access through their functionality. That all reminds me of Microsoft.

While it might now be difficult to think of the iPhone without the App Store, the iPhone existed for about a year before the App Store came along: the iPhone went on sale in June 2007 (it was announced that January), and the App Store launched in July 2008. Windows and MSIE also started life as independent entities; it was about two years before they were fused and optimistically declared inseparable.

The two cases are obviously not the same in detail, but I find the similarities striking and thought-provoking.

Just for fun, imagine a court case aimed at forcing Apple to make their App Store separable from their operating system platform. To allow others to build their own app stores. To give the user the choice to install/uninstall whatever app stores they liked. Imagine Apple claiming that such a separation is technically impossible and that the App Store is fundamental to the iPhone experience.

Couldn’t possibly happen, right?

Faulkner on splendid failure

Wednesday, September 29th, 2010

I always enjoy running across writing that is not about entrepreneurialism but which seems directly relevant. A couple of snippets that I’ve blogged before are The entrepreneurial spirit in literature (from Conrad‘s Heart of Darkness) and Orwell on T. S. Eliot and the path from existential angst to serial entrepreneur.

Here’s another. It’s Faulkner’s address upon receiving the National Book Award for fiction in 1955. Taken from William Faulkner Essays, Speeches & Public Letters. Random House 1965, pp 143-5.

It makes me think about what I consider Faulkner’s crowning masterpiece, Absalom, Absalom! and the effort that must have gone into its creation. It also puts me in mind of Tim O’Reilly’s exhortation to entrepreneurs to “work on stuff that matters”.

By artist I mean of course everyone who has tried to create something which was not here before him, with no other tools and material than the uncommerciable ones of the human spirit; who has tried to carve, no matter how crudely, on the wall of that final oblivion beyond which he will have to pass, in the tongue of the human spirit ‘Kilroy was here.’

That is primarily, and I think in its essence, all that we ever really tried to do. And I believe we will all agree that we failed. That what we made never quite matched and never will match the shape, the dream of perfection which we inherited and which drove us and will continue to drive us, even after each failure, until anguish frees us and the hand falls still at last.

Maybe it’s just as well that we are doomed to fail, since, as long as we do fail and the hand continues to hold blood, we will try again; where, if we ever did attain the dream, match the shape, scale that ultimate peak of perfection, nothing would remain but to jump off the other side of it into suicide. Which would not only deprive us of our American right to existence, not only inalienable but harmless too, since by our standards, in our culture, the pursuit of art is a peaceful hobby like breeding Dalmations, it would leave refuse in the form of, at best indigence and at worst downright crime resulting from unexhausted energy, to be scavenged and removed and disposed of. While this way, constantly and steadily occupied by, obsessed with, immersed in trying to do the impossible, faced always with the failure which we decline to recognize and accept, we stay out of trouble, keep out of the way of the practical and busy people who carry the burden of America.

So all are happy—the giants of industry and commerce, and the manipulators for profit or power of the mass emotions called government, who carry the tremendous load of geopolitical solvency, the two of which conjoined are America; and the harmless breeders of the spotted dogs (unharmed too, protected, immune in the inalienable right to exhibit our dogs to one another for acclaim, and even to the public too; defended in our right to collect from them at the rate of five or ten dollars for the special signed editions, and even at the rate of thousands to special fanciers named Picasso or Matisse).

Then something like this happens—like this, here, this afternoon; not just once and not even just once a year. Then that anguished breeder discovers that not only his fellow breeders, who must support their mutual vocation in a sort of mutual desperate defensive confederation, but other people, people whom he had considered outsiders, also hold that what he is doing is valid. And not only scattered individuals who hold his doings valid, but enough of them to confederate in their turn, for no mutual benefit of profit or defense but simply because they also believe it is not only valid but important that man should write on the wall ‘Man was here also A.D. 1953, or ’54 or ’55’, and so go on record like this this afternoon.

To tell not the individual artist but the world, the time itself, that what he did is valid. That even failure is worth while and admirable, provided only that the failure is splendid enough, the dream splendid enough, unattainable enough yet forever valuable enough, since it was of perfection.

So when this happens to him (or to one of his fellows; it doesn’t matter which one, since all share the validation of the mutual devotion) the thought occurs that perhaps one of the things wrong with our country is success. That there is too much success in it. Success is too easy. In our country a young man can gain it with no more than a little industry. He can gain it so quickly and easily that he has not had time to learn the humility to handle it with, or even to discover, realise, that he will need humility.

At what point does an Amazon EC2 reserved instance become worth it?

Friday, January 8th, 2010

If you purchase an Amazon EC2 reserved instance, you’ll pay a certain amount up front (pricing). If you don’t use the instance much, it will be more expensive per hour than a regular on-demand instance. E.g., if you paid $227.50 to reserve a small instance for a year but then only used it for a single day, you’d be paying almost $10/hr and it would obviously be much cheaper to just get an on-demand instance and pay just 8.5 cents per hour.

OTOH, if you ran a small instance for a year at the on-demand price, you’d pay about $745, so it would obviously be cheaper to pay the up-front reservation price ($227.50) plus a year of the low per-hour pricing (365 * 24 * $0.03 = $262.80), for a total of about $490.

So for how long do you have to run an instance in order for it to be cheaper to pay for a reserved instance? (Note that I’m ignoring the time value of money, what you might do with the up-front money in the meantime if you didn’t give it to Amazon in advance, etc.)

The answer is pretty simple: for a one-year reservation you need to run the instance for about 6 months to make it worthwhile. For a three-year reservation you need to run the instance for at least 3 months per year, on average.
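The break-even point drops straight out of the prices (2010 US N. Virginia figures, as used above):

```python
# When does a 1-year small reserved instance beat on-demand?
# Prices from the post: $0.085/hr on demand; $227.50 up front + $0.03/hr reserved.
ON_DEMAND = 0.085
RESERVED_HOURLY = 0.03
UPFRONT = 227.50

# Each hour of use saves the hourly difference; the reservation wins once
# those savings have repaid the up-front fee.
break_even_hours = UPFRONT / (ON_DEMAND - RESERVED_HOURLY)
break_even_months = break_even_hours / (365 * 24 / 12.0)
print('Break-even after %.0f hours (about %.1f months)' % (
    break_even_hours, break_even_months))
```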

Here’s a fragment from a simple spreadsheet I made, based on the US N. Virginia prices:

(image: fragment of the EC2 pricing spreadsheet)

Fault-tolerant Python Twisted classes for getting all Twitter friends or followers

Thursday, October 22nd, 2009

It’s been forever since I blogged here. I just wrote a little Python to grab all of a user’s friends or followers (or just their user ids). It uses Twisted, of course. There were two main reasons for doing this: 1) I want all friends/followers, not just the first bunch returned by the Twitter API, and 2) I wanted code that is fairly robust in the face of various 50x HTTP errors (I regularly experience INTERNAL_SERVER_ERROR, BAD_GATEWAY, and SERVICE_UNAVAILABLE).

If you want to use the code below and you’re not familiar with the Twitter API, consider whether you can use the FriendsIdFetcher and FollowersIdFetcher classes: they return 5,000 results per API call instead of 100. If you can live with user ids, plus the occasional fetch of a full user, you’ll make far fewer API calls.
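To put numbers on that, for a hypothetical account with a million followers:

```python
# Rough call counts implied by the page sizes mentioned above:
# id endpoints return 5,000 results per call, user endpoints 100.
# The follower count here is hypothetical.
import math

followers = 1000000
id_calls = int(math.ceil(followers / 5000.0))
full_calls = int(math.ceil(followers / 100.0))
print('%d calls for ids vs %d for full user objects' % (id_calls, full_calls))
```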

For the FriendsFetcher and FollowersFetcher classes, you get back a list of dictionaries, one per user. For FriendsIdFetcher and FollowersIdFetcher you get a list of Twitter user ids.

Of course there’s no documentation. Feel free to ask questions in the comments. Download the source.

import sys

from twisted.internet import defer
from twisted.web import client, error, http

if sys.hexversion >= 0x02060000:
    # Python 2.6+ has json in the standard library.
    import json
else:
    import simplejson as json

class _Fetcher(object):
    baseURL = 'http://twitter.com/'
    URITemplate = None  # Override in subclass.
    dataKey = None  # Override in subclass.
    maxErrs = 10
    okErrs = (http.INTERNAL_SERVER_ERROR,
              http.BAD_GATEWAY,
              http.SERVICE_UNAVAILABLE)

    def __init__(self, name):
        assert self.baseURL.endswith('/')
        self.results = []
        self.errCount = 0
        self.nextCursor = -1
        self.deferred = defer.Deferred()
        self.URL = self.baseURL + (self.URITemplate % {'name': name})

    def _fail(self, failure):
        failure.trap(error.Error)
        self.errCount += 1
        if (self.errCount < self.maxErrs and
            int(failure.value.status) in self.okErrs):
            self.fetch()
        else:
            self.deferred.errback(failure)

    def _parse(self, result):
        try:
            data = json.loads(result)
            self.nextCursor = data.get('next_cursor')
            self.results.extend(data[self.dataKey])
        except Exception:
            self.deferred.errback()
        else:
            self.fetch()

    def _deDup(self):
        raise NotImplementedError('Override _deDup in subclasses.')

    def fetch(self):
        if self.nextCursor:
            d = client.getPage(self.URL + '?cursor=%s' % self.nextCursor)
            d.addCallback(self._parse)
            d.addErrback(self._fail)
        else:
            self.deferred.callback(self._deDup())
        return self.deferred

class _FriendsOrFollowersFetcher(_Fetcher):
    dataKey = u'users'

    def _deDup(self):
        # Keep the first occurrence of each user, preserving order.
        seen = set()
        result = []
        for userdict in self.results:
            uid = userdict['id']
            if uid not in seen:
                result.append(userdict)
                seen.add(uid)
        return result

class _IdFetcher(_Fetcher):
    dataKey = u'ids'

    def _deDup(self):
        # Keep the ids in the order we received them.
        seen = set()
        result = []
        for uid in self.results:
            if uid not in seen:
                result.append(uid)
                seen.add(uid)
        return result

class FriendsFetcher(_FriendsOrFollowersFetcher):
    URITemplate = 'statuses/friends/%(name)s.json'

class FollowersFetcher(_FriendsOrFollowersFetcher):
    URITemplate = 'statuses/followers/%(name)s.json'

class FriendsIdFetcher(_IdFetcher):
    URITemplate = 'friends/ids/%(name)s.json'

class FollowersIdFetcher(_IdFetcher):
    URITemplate = 'followers/ids/%(name)s.json'
 

Usage is dead simple:

fetcher = FriendsFetcher('terrycojones')
d = fetcher.fetch()
d.addCallback(...)  # etc.
 

Enjoy.

Python code for retrieving all your tweets

Wednesday, June 24th, 2009

Here’s a little Python code to pull back all a user’s Twitter tweets. Make sure you read the notes at bottom in case you want to use it.

import sys, twitter, operator
from dateutil.parser import parse

twitterURL = 'http://twitter.com'

def fetch(user):
    data = {}
    api = twitter.Api()
    max_id = None
    total = 0
    while True:
        statuses = api.GetUserTimeline(user, count=200, max_id=max_id)
        newCount = ignCount = 0
        for s in statuses:
            if s.id in data:
                ignCount += 1
            else:
                data[s.id] = s
                newCount += 1
        total += newCount
        print >>sys.stderr, "Fetched %d/%d/%d new/old/total." % (
            newCount, ignCount, total)
        if newCount == 0:
            break
        max_id = min([s.id for s in statuses])
    return data.values()

def htmlPrint(user, tweets):
    for t in tweets:
        t.pdate = parse(t.created_at)
    key = operator.attrgetter('pdate')
    tweets = sorted(tweets, key=key)
    f = open('%s.html' % user, 'wb')
    print >>f, """<html><title>Tweets for %s</title>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <body><small>""" % user
    for i, t in enumerate(tweets):
        print >>f, '%d. %s <a href="%s/%s/status/%d">%s</a><br/>' % (
            i, t.pdate.strftime('%Y-%m-%d %H:%M'), twitterURL,
            user, t.id, t.text.encode('utf8'))
    print >>f, '</small></body></html>'
    f.close()

if __name__ == '__main__':
    user = 'terrycojones' if len(sys.argv) < 2 else sys.argv[1]
    data = fetch(user)
    htmlPrint(user, data)
 

Notes:

Fetch all of a user’s tweets and write them to a file username.html (where username is given on the command line).

Output is to a file instead of to stdout as tweet texts are unicode and sys.stdout.encoding is ascii on my machine, which prevents printing non-ASCII chars.

This code uses the Python-Twitter library. You need to get (via SVN) the very latest version, and then you need to fix a tiny bug, described here. Or wait a while and the SVN trunk will be patched.

This worked flawlessly for my 2,300 tweets, but only retrieved about half the tweets of someone who had over 7,000. I’m not sure what happened there.

There are tons of things that could be done to make the output more attractive and useful. And yes, for nitpickers, the code has a couple of slight inefficiencies :-)

2 cents

Friday, June 5th, 2009

My bank account hits rock bottom, at 2 cents, while building Fluidinfo.

(image: BBVA bank statement with the 2-cent balance highlighted)

Loose cannon

Monday, January 26th, 2009

Today I referred to myself as a loose cannon to Esteve. Tonight I recalled describing a former boss that way in an email (company name obscured with xxxxx). Here it is:

i thought
hey, hold on
where the hell is quality control?
and i knew i was it
but still it’s so tiring to try to stop her
and almost impossible to make her listen
and understand
and my energy for that is limited
plus it’s just amusing to watch her rocketing along
read this, read this, read this
passing you papers printed at semi-random
from the web

  
to me
the phrase ‘loose cannon’
can be perfectly applied to her
she’s rolling around on the xxxxx deck
(wheel in deep sea fishing analogy here)
our most powerful weapon
at once capable of taking out a whole fleet of enemy ships
but also equally capable of shooting down the mast
blasting the crew
taking out the sail
firing on the powder room
and just generally causing any number
of unpredictable and inevitably high-impact results

 i know
we have to be the leather belts
that strap her to the deck
as much fun as it is to see her rolling about,
the recoil from one explosion
spinning her randomly to point at her next target
all further confused by the rolling waves
and her small metal wheels

 confused and terrified sailors
in blue-striped togs
run scrambling over the sea-slapped decks
holding their heads their ears
the ship’s wheel
half shot away by a stray ball
spins wildly
as we broach another peak

  
all hands on deck!

FluidDB domain names available early (and free) for Twitter users

Saturday, January 24th, 2009

Sometime in the next few months, Fluidinfo will launch an alpha version of FluidDB, the database with the heart of a wiki. It’s a big engineering task, and there will still be a lot to do when we go into alpha, so we’ll initially only have a small number of applications being built on FluidDB.

But that doesn’t mean you can’t get into the action early.

Starting today, we’re pleased to offer FluidDB domains for free to Twitter users. This is perhaps the simplest way you’ll ever sign up for a new web service – if you’re a Twitter user:

Simply follow FluidDB on Twitter.

Yes, that’s it. You’re done.

Later, when we create your FluidDB domain, we’ll send you your FluidDB password via a direct message in Twitter. Note that we haven’t asked for your real name, your email, a password, sent you a cookie, or asked you to fill out a pesky sign-up form. The point here is simply to give you an early opportunity to trivially claim your preferred name.

Feel free to tweet the URL of this posting (http://bit.ly/bezc). You can follow me too for extra credit. If you’re not already a Twitter user and you want a free FluidDB domain name, sign up for Twitter, and then follow FluidDB.

Mini FAQ:

Why would I do this? By following FluidDB you will reserve your (Twitter) user name as your domain name in FluidDB.

Is there any charge? No.

What is a FluidDB domain? Sorry, but you’ll have to wait to find out the answer to this. We can tell you though that FluidDB domains will have many uses, and that they won’t all be free.

What if I change my mind? Just unfollow FluidDB on Twitter.

Why Twitter? Because we like Twitter. We may do a similar thing for other services, allowing users to later claim their domain via OpenID, but that introduces the potential of naming conflicts.

Finally, please note that we can’t give an iron-clad guarantee that you’ll get your Twitter user name as your FluidDB domain name, but we’ll do our best. At this early stage of the game, we reserve the right to do whatever we want :-)

Who signed up for Twitter immediately before/after you?

Wednesday, January 14th, 2009

This is just a quick hack, done in about 20 minutes in 32 lines of Python. The following script will print out the Twitter screen names of the people who signed up immediately before and after a given user.

import sys
from twitter import Api
from operator import add
from functools import partial

inc = partial(add, 1)
dec = partial(add, -1)
api = Api()

def getUser(u):
    try:
        return api.GetUser(u)
    except Exception:
        return None

def do(name):
    user = getUser(name)
    if user:
        for f, what in (dec, 'Before:'), (inc, 'After:'):
            i = user.id
            while True:
                i = f(i)
                u = getUser(i)
                if u:
                    print what, u.screen_name
                    break
    else:
        print 'Could not find user %r' % name

if __name__ == '__main__':
    for name in sys.argv[1:]:
        do(name)
 

I’m happy to have reached the point in my Python development where I can pretty much just type something like this in without really having to think, including the use of operator.add and functools.partial.

BTW, the users who signed up immediately before and after I did were skywalker and kitu012.

The above is just a hack. Notes:

  1. If it can’t retrieve a user for any reason, it just assumes there is no such user.
  2. Twitter periodically deletes accounts of abusers, so the answer will skip those.
  3. Twitter had lots of early hiccups, so there may be no guarantee that user ids were actually assigned sequentially.
  4. This script may run forever.
  5. I’m using the Python Twitter library written by DeWitt Clinton. It’s been a while since it was updated, and it doesn’t give you back the time a user was created in Twitter. It would be fun to print that too.
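On notes 1 and 4: a bounded version of the neighbor search would avoid looping forever over deleted or never-assigned ids. Something like this sketch (the step function and user lookup are passed in; the names are mine, not from the script above):

```python
# Give up after max_tries missing ids instead of searching indefinitely.
def find_neighbor(start_id, step, get_user, max_tries=100):
    """Return the first existing user found by repeatedly applying step
    to start_id, or None if max_tries ids in a row don't exist."""
    i = start_id
    for _ in range(max_tries):
        i = step(i)
        user = get_user(i)
        if user is not None:
            return user
    return None
```

With `step=inc` or `step=dec` and `get_user=getUser` from the script, this slots straight into `do()`.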

As you were.

10,000 things: Andrew Hensel lives (on Twitter)

Monday, January 5th, 2009

Andrew Hensel was an extraordinary human being.

We were graduate students together at The University of Waterloo in Canada in 1986-88. I met him on my first day there and we spent many hours together on a daily basis over the next 2.5 years. I don’t want to try to say too much about him now. It occurred to me a few days ago that I might post a few stories here. We did lots of crazy things. At one point I had wanted to write something titled “100 things to a Hensel” and I made a bunch of notes, but it went no further.

I wrote about him in my Ph.D. acknowledgments in 1995:

Andrew Hensel, with whom I shared so much of my two and a half years at Waterloo, was the most original and creative person I have ever known well. Together, we dismantled the world and rebuilt it on our own crazy terms. We lived life at a million miles an hour and there was nothing like it. Five years ago, Andrew killed himself. There have been few days since then that I have not thought of him and the time we spent together.

I still think about him frequently. Today I was remembering one of his many, many oddball projects (most of which went unfinished), which he called “10,000 things”. It was to be a list of 10,000 things that he thought of. By the time he started sending them to me we had both dropped out of Waterloo. He was back in Australia and I was in Munich.

He only sent me 300 of the to-be 10,000. Of course I still have them. They’re all very short. At the risk of being thought macabre I’ve decided to bring Andrew back a very little and post them to Twitter, chosen at random, one a day. You can follow adhensel to get just a glimpse of his mind. The first tweet, people being planted into earth, is already up.

There are at least half a dozen twitterers who knew Andrew, including one who knew him probably better than anybody. Once in a while I get email from someone who finds my online mentions of him. Invariably they also found him extraordinary.

What would Andrew have made of Twitter? I have no doubt at all that he’d have immediately dismissed it as “weak”. That was one of his favorite adjectives. Almost everything was weak. It’s a small miracle to me to partly bring him back to life 18 years after he died, by posting just some of his 10,000 things to Twitter.

And… my apologies to anyone who knew Andrew and who finds this upsetting.

Not alone

Friday, December 5th, 2008

Robert Scoble has just written a really nice article about Fluidinfo, calling us both “world-changing” and “unfundable”. Funnily, Tim O’Reilly said something similar when I talked to him at OATV. He said something like: “This could take over the world” and in the very same sentence “but I don’t see how we could fund you.” The two things an entrepreneur most and least wants to hear, all in one sentence. I’ll never forget it.

A few people have mailed me to say that the Scoble videos create the incorrect impression that I’m building FluidDB alone. So I wanted to clear that up. Others who are actively involved in Fluidinfo are:


Esteve Fernandez is doing the most difficult coding. Esteve and I are the only two employees of the company. We even have modest salaries. We spend most of our time apart, writing code, swapping email. Once or twice a week we meet in person to talk about architecture, current problems, or for him to gently explain to me how I could have written my code more elegantly and usefully. I usually try to stay out of his way, as he’s a force of nature and I just slow him down. He left a solid and secure job that he liked in Barcelona and then said no to Google to join Fluidinfo.

Esther Dyson invested in Fluidinfo just over a year ago. Esther is an incredible investor to have involved for a company like Fluidinfo. I won’t try to summarize, except to say that without her support we probably wouldn’t be here today. After a year of trying to find investors, I’m more keenly aware than ever of how extraordinary Esther is.

Russell Manley is the other company director. It was Russell who pointed Delicious out to me a few years ago and got me back onto working on this project after I’d put it aside for 6 years. Russell is a finance guy with a ton of experience in operations and running companies. He’s an investment director at Land Securities in London, and sits on over 30 boards. He’s also a close friend, incredibly smart, and widely read. I hope one day we’ll be able to get him into Fluidinfo, though that will take some doing.

Nicholas Radcliffe is an old friend and advisor. He’s the founder and CEO of Stochastic Solutions. He was also a founder, CTO, and then CEO of Quadstone, raising tens of millions of pounds along the way. Quadstone was acquired a couple of years ago. He’s into algorithmic approaches to targeted direct marketing, and he’s very successful. He has a Ph.D. in physics, so you don’t want to mess with Nick. He’s also an advisor to Scottish Equity Partners. Nick is my harshest and most unrelenting critic.

That’s it for now. There are probably a dozen others who are peripherally involved, but not on a day-to-day basis. I’m very happy to have just two people on payroll right now. We’re pretty much recession-proof. I went through 2000-2004 as CTO of Eatoni in New York, and we survived by cutting every possible cost and keeping our headcount as low as possible. So operating on a shoestring comes pretty naturally. I feel we’re strong and small like a hard nut, and not really exposed to the economic downturn. It’s a great time to be tiny and to be focussed on building a product.

It would of course be nice to be properly funded. But I’ve always been confident that’s just a matter of time. The main thing, perhaps the only thing, is to get an alpha version of FluidDB released so people can start building things on it.

Twittendipity: a chance interview with Robert Scoble

Thursday, December 4th, 2008

On Monday Tim O’Reilly posted a Twitter tweet suggesting to Robert Scoble that he contact me while in Barcelona.

First off, Tim is very generous in doing this. He’s ultra connected and he spends a significant amount of his time in Twitter pointing things out, connecting people, and re-tweeting stuff he finds interesting. Re-tweeting is really important because when you tweet you only reach the people who are already following you. But when someone re-tweets you, you reach new people who likely have no idea of your existence. And when Tim does the re-tweeting there can be a big impact: 24 hours after his message to Robert I had 50 new followers. Tim explicitly tries to help people doing things he finds interesting, but who have just a small number of Twitter followers. He filters and amplifies information, broadcasting it out to his 16,000+ followers.

Robert was in a hotel about 10 minutes’ walk from my place and I had no idea. A mutual friend in California noticed and took a minute to connect us. That’s really something, and it perfectly illustrates some of the value of Twitter.

I met Robert yesterday afternoon and we spent 6 hours together. It was great. You can see at once why he’s been so successful: he’s smart, he’s thoughtful, he’s sympathetic, and he’s a careful listener. I had no idea what to expect, and seeing as what we’re building can take some time to sink in, I wondered what sort of an audience he’d be.

After we’d climbed around up in the Sagrada Familia (official site, wikipedia), Robert came back to my place to see a demo of the things I’d been describing. We sat down and he pulled out his cell phone and asked if he could film me. I didn’t really think about it and said of course. It didn’t dawn on me that we were doing an informal interview, and I was totally unprepared – which is probably a good thing.

In the end we filmed 4 segments: parts one, two, three, and four. There’s also been some discussion here on Robert’s FriendFeed page.

So if you’ve been wondering what we’re building in here, go watch the videos.

I had no idea all this was about to come down. The Fluidinfo web site (a generous word) was a single page with no contact information, no nuthin’. We simply haven’t needed a web site of any description yet. I went and added a box so you can sign up to receive news of the alpha launch.

And then there was this, posted on Twitter, and which I have absolutely no shame in reproducing (this is a blog, after all):

Wow, what @terrycojones showed me last night (a new kind of database that he’s been workng on for 11 years) blew me away. Uploading vids now

Now I have to put my head back down with Esteve to get the alpha out the door ASAP.

Amazon SimpleDB a complete flop?

Tuesday, December 2nd, 2008

Today Amazon slashed the price of storage in SimpleDB from $1.50 per GB per month to just $0.25 per GB per month.

Note that you can buy a 1TB hard drive these days for $75. That’s 7.5 cents per GB for as long as the drive lasts. So Amazon were charging 20 times the price of retail hard disk storage per month. Yes, the AWS storage is replicated, and you don’t need a data center or employees, but a 20X markup (per month) seemed a bit excessive. Until last night, that $1.50 figure was the first price in the pricing section of the SimpleDB page – not a smart move (sticker shock). The storage price is now the last thing in the pricing section.
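That comparison is simple arithmetic; here’s a back-of-the-envelope sketch using the prices quoted above (it deliberately ignores drive lifetime, replication, power, and staffing, so it’s a rough floor, not a fair total-cost comparison):

```python
# Rough comparison of SimpleDB storage pricing vs. a retail hard drive,
# using the prices quoted above: a monthly service fee vs. a one-off drive cost.

simpledb_old = 1.50       # $/GB/month, SimpleDB before the price cut
simpledb_new = 0.25       # $/GB/month, SimpleDB after the price cut
drive_per_gb = 75 / 1000  # $75 for a 1TB drive = $0.075/GB, paid once

print(f"old markup: {simpledb_old / drive_per_gb:.0f}x per month")
print(f"new markup: {simpledb_new / drive_per_gb:.1f}x per month")
```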

I spend a bunch of time talking to folks working at other startups. I hear about EC2 and S3 usage all the time, but I’ve never heard of anyone using SimpleDB. I hadn’t really thought about it too much. I had noticed that the price for storage in SimpleDB is (was) 10 times higher than for storage in S3, and thought that created an opportunity for Fluidinfo. But that huge difference is now gone – in fact SimpleDB is now free for everyone for the first 6 months following the public beta.

I found myself asking “What’s going on?” It’s not like Amazon to suddenly offer their services for free. The free offer coming with the service entering beta seemed pretty thin. If anything it should get more expensive, or stay the same, not suddenly transition to free.

Then I began to explicitly wonder just how many people are actually using SimpleDB. So I just ran some sample Google queries to get an idea. The results are amazing:

Query                            # Hits
“using amazon simpleDB”              68
“using simpleDB”                   1010
“simpleDB sucks”                      3
“love simpleDB”                       1
“hate simpleDB”                       0
“recommend simpleDB”                  0
“we are using simpleDB”               0
“we are using amazon simpleDB”        0
“we use amazon simpleDB”              1
“we use simpleDB”                     4

Note that all queries are entered into Google in quotes.

Given just these results, and the knowledge that SimpleDB was launched a year ago, I think you’d have to conclude that SimpleDB is a complete flop. Either that or Google is playing evil tricks due to their own App Engine offering. That seems unlikely. Plus, the numbers for the obviously popular S3 and EC2 are much, much higher: try the same queries with S3 or EC2 in place of SimpleDB and you’ll see 5K, 10K, even 15K results.

I find the above numbers astounding. I’m deadly curious to know what’s going on here. Was SimpleDB just too expensive to consider using? Is its model too awkward? If it sucked, people would say so. But there’s virtually nothing out there. It’s as though developers took one look and completely ignored it. That would be my guess (in fact it’s what I did, so I’m probably biased in my explanation of what others may have done).

At least we can say that more people love SimpleDB than hate it :-)

It’s not my intention to bash Amazon or AWS. I love and use S3 and EC2 every single day. They’ve changed the world, and this is only the beginning. But I have no use at all for SimpleDB. I’d always assumed it was a big success too, but it looks like that may be wrong.

Comments very welcome. Do you know anyone using SimpleDB?

Changing POV under Twitter

Wednesday, November 26th, 2008

One thing I’d like to be able to do in Twitter is change my point of view. That is, see what Twitter looks like from the POV of another user.

Given Twitter’s asymmetric follower model and the prevalence of @ messaging, it’s very common to run across a fragment of a conversation that seems potentially interesting. It’s also common not to be following the full set of people who are interacting.

For example, four people might be exchanging tweets on a subject, and you may follow just one of them. So you’ll see roughly one quarter of the thread. Right now, to get the context for the discussion you need to go take a look at the archives of the various people and try to piece the conversation together. You have to do this one tweeter at a time. Or you could temporarily follow the people involved and then page backwards through time to see the flow of tweets. With some work on the server side, Twitter could let you see this using the Twitter search interface (you’d need to put in the names of the various parties though).

It would be much simpler and much cooler to just click a link beside a user’s name and get that user’s POV. You’d see what they see, except for the people whose tweets are private and which you’re prevented from seeing by the Twitter permission system. Not only could you see more or all of a conversation, I bet it would be really interesting to see Twitter from someone else’s POV. You could click on the @Replies tab to see all replies to that user, etc. There’s no reason why not – it’s all public data, and you can easily fetch the @replies using the search interface. I think wandering around inside the Twitterverse, jumping from the POV of one identity to another, would be fascinating. It reminds me of wandering around inside the Wayback Machine, except it’s the present.

That would all be pretty easy to implement, even for a 3rd party using the Twitter API. It would be nice if Twitter were to implement it themselves. I could do the basics myself in a few hours, but I’d rather not. This is also something that could be accessed via a Firefox extension or Greasemonkey – install it and get an extra button next to every tweet. The button switches you to the POV of the tweeter.

All we need is someone to build it.

I have several more Twitter blog posts I’d love to write. The most interesting, to me, is all about evolutionary biology, sex, and the meaning of life itself. But no time, no time. I’ve finally added a Twitter category to this blog, and was surprised to find 14 posts that fit it. Am I obsessed?

As usual, make sure you follow me :-)

Passion and the creation of highly non-uniform value

Monday, November 10th, 2008

Here, finally, are some thoughts on the creation of value. I don’t plan to do as good a job as the subject merits, but if I don’t take a rough stab at it, it’ll never happen.

I’ll first explain what I mean by “the creation of highly non-uniform value”. I’m talking about ideas that create a lot of (monetary) value for a very small number of people. If you made a graph with all the people in the world on the X axis, sorted by how much they make from an idea, and the value each receives on the Y axis, we’re talking about distributions that look like the image on the right.

In other words, a setting in which a very small number of people try to get extremely rich. I.e., startup founders, a few key employees, their investors, and their investors’ investors. BTW, I don’t want to talk about the moral side of this, if there is one. There’s nothing to stop the obscenely rich from giving their money away or doing other charitable things with it.

So let’s just accept that many startup founders, and (in theory) all venture investors, are interested in turning ideas into wealth distributions that look like the above.

I was partly beaten to the punch on this post by Paul Graham in his essay Why There Aren’t More Googles? Paul focused on VC caution, and with justification. But there’s another important part of the answer.

One of the most fascinating things I’ve heard in the last couple of years is an anecdote about the early Google. I wrote about it in an earlier article, The blind leading the blind:

…the Google guys were apparently running around search engine companies trying to sell their idea (vision? early startup?) for $1M. They couldn’t find a buyer. What an extraordinary lack of… what? On the one hand you want to laugh at those idiot companies (and VCs) who couldn’t see the huge value. OK, maybe. But the more extraordinary thing is that Larry Page and Sergey Brin couldn’t see it either! That’s pretty amazing when you think about it. Even the entrepreneurs couldn’t see the enormous value. They somehow decided that $1M would be an acceptable deal. Talk about a lack of vision and belief.

So you can’t really blame the poor VCs or others who fail to invest. If the founding tech people can’t see the value and don’t believe, who else is going to?

I went on to talk about what seemed like it might be a necessary connection between risk and value.


Image: Lost Tulsa

Following on…

After more thought, I’m now fairly convinced that I was on the right track in that post.

It seems to me that the degree to which a highly non-uniform wealth distribution can be created from an idea depends heavily on how non-obvious the value of the idea is.

If an idea is obviously valuable, I don’t think it can create highly non-uniform wealth. That’s not to say that it can’t create vast wealth, just that the distribution of that wealth will be more widely spread. Why is that the case? I think it’s true simply because the value will be apparent to many people, there will be multiple implementations, and the value created will be spread more widely. If the value of an idea is clear, others will be building it even as you do. You might all be very successful, but the distribution of created value will be more uniform.

It probably also helps if an idea is hard to implement, or if you have some other barrier to entry (e.g., patents), or if you can create a barrier to adoption (e.g., users getting positive reinforcement from using the same implementation).

I don’t mean to say that an idea must be uniquely brilliant, or even new, to generate this kind of wealth distribution. But it needs to be the kind of proposition that many people look at and think “that’ll never work.” Even better if potential competitors continue to say that 6 months after launch and there’s only gradual adoption. Who can say when something is going to take off wildly? No-one. There are highly successful non-new ideas, like the iPod or YouTube. Their timing and implementation were somehow right. They created massive wealth (highly non-uniformly distributed in the case of YouTube), and yet many people wrote them off early on. It certainly wasn’t plain sailing for the YouTube founders – early adoption was extremely slow. Might Twitter, a pet favorite (go on, follow me), create massive value? Might Mahalo? Many people would have found that idea ludicrous 1-2 years ago – but that’s precisely the point. Google is certainly a good example – search was supposedly “done” in 1998 or so. We had Alta Vista, and it seemed great. Who would’ve put money into two guys building a search engine? Very few people.

If it had been obvious the Google guys were doing something immensely valuable, things would have been very different. But they traveled around to various companies (I don’t have this first hand, so I’m imagining), showing a demo of the product that would eventually create $100-150B in value. It wasn’t clear to anyone that there was anything like that value there. Apparently no-one thought it would be worth significantly more than $1M.

I’ve come to the rough conclusion that that sort of near-universal rejection might be necessary to create that sort of highly non-uniform wealth distribution.

There are important related lessons to be learned along these lines from books like The Structure of Scientific Revolutions and The Innovator’s Dilemma.

Now back to Paul’s question: Why aren’t there more Googles?

Part of the answer has to be that value is non-obvious. Given the above, I’d be willing to argue (over beer, anyway) that that’s true almost by definition.

So if value is non-obvious, even to the founders, how on earth do things like this get created?

The answer is passion. If you don’t have entrepreneurs who are building things out of sheer driving passion, then hard projects that require serious energy, sacrifice, and risk-taking simply won’t be built.

As a corollary, big companies are unlikely to build these things – because management is constantly trying to assess value. That’s one reason to rue the demise of industrial research, and a reason to hope that cultures that encourage people to work on whatever they want (e.g., Google, Microsoft Research) might one day stumble across this kind of value.

This gets me to a recent posting by Tim Bray, which encourages people to work on things they care about.

It’s not enough just to have entrepreneurs who are trying to create value. As I’m trying to say, practically no-one can consistently and accurately predict where undiscovered value lies (some would argue that Marc Andreessen is an exception). If it were generally possible to do so, the world would be a very different place – the whole startup scene and venture/angel funding system would be different, supposing they even existed.

Even if it looks like a VC or entrepreneur can infallibly put their finger on undiscovered value, they probably can’t. One-time successful VCs and entrepreneurs go on to attract disproportionately many great companies, employees, funding, etc., the next time round. You can’t properly separate their raw ability to see undiscovered value from the strong bias towards excellence in the opportunities they are later afforded.

Successful entrepreneurs are often refreshingly and encouragingly frank about the role of luck in their success – they can afford to be. VCs are much less forthcoming: they’re supposed to have natural talent, and they’re trying to manufacture the impression that they know what they’re doing. They have to do that in order to get their limited partners to invest in their funds. For all their vaunted insight, only about 25% of VCs provide returns that are better than the market. The percentage generating huge returns will of course be much smaller, as in turn will be those doing so consistently. I reckon the whole thing’s a giant crapshoot. We may as well all admit it.

I have lots of other comments I could make about VCs, but I’ll restrict myself to just one as it connects back to Paul’s article.

VCs who claim to be interested in investing in the next Google cannot possibly have the next Google in their portfolio unless they have a company whose fundamental idea looks like it’s unlikely to pan out. That doesn’t mean VCs should invest in bad ideas. It means that unless VCs make bets on ideas that look really good but which are, for example, clearly going to be hard to build, will need huge adoption to work, or appear to be very risky long-shots, they can’t be sitting on the next Google. It also doesn’t mean VCs must place big bets on stuff that’s highly risky. A few hundred thousand can go a long way in a frugal startup.

I think this is a fundamental tradeoff. You’ll very frequently hear VCs talk about how they’re looking for companies that are going to create massive value (non-uniformly distributed, naturally), with massive markets, etc. I think that’s pie-in-the-sky posturing unless they’ve already invested in, or are willing to invest in, things that look very risky. That should be understood. And so a question to VCs from entrepreneurs and limited partners alike: if you claim to be aiming for massive returns, where are the correspondingly risky investments that must go with them? Chances are you won’t find any.

There is a movement in the startup investment world towards smaller funds that make smaller investments earlier. I believe this movement is unrelated to my claim about non-obviousness and highly non-uniform returns. The trend is fuelled by the realization that lots of web companies are getting going without the need for traditional levels of financing. If you don’t get in early with them, you’re not going to get in at all. A big fund can’t make (many) small investments, because their partners can’t monitor more than a handful of companies. So funds that want to play in this area are necessarily smaller. I think that makes a lot of sense. A perhaps unanticipated side effect of this is that things that look like they may be of less value end up getting small amounts of funding. But on the whole I don’t think there’s a conscious effort in that direction – investors are strongly driven to select the least risky investment opportunities from the huge number of deals they see. After all, their jobs are on the line. You can’t expect them to take big risks. But by the same token you should probably ignore any talk of “looking for the next Google”. They talk that way, but they don’t invest that way.

Finally, if you’re working on something that’s being widely rejected or whose value is being widely questioned, don’t lose heart (instead go read my earlier posting), and don’t waste your time talking to VCs. Unless they’re exceptional and serious about creating massive non-uniformly distributed value, and they understand what that involves, they certainly won’t bite.

Instead, follow your passion. Build your dream and get it out there. Let the value take care of itself, supposing it’s even there. If you can’t predict value, you may as well do something you really enjoy.

Now I’m working hard to follow my own advice.

I had to learn all this the hard way. I spent much of 2008 on the road trying to get people to invest in Fluidinfo, without success. If you’re interested to know a little more, earlier tonight I wrote a Brief history of an idea to give context for this posting.

That’s it for now. Blogging is a luxury I can’t afford right now, not that I would presume to try to predict which way value lies.

Expecting and embracing startup rejection

Sunday, November 9th, 2008

When I was younger, I didn’t know what to make of it when people rejected my ideas. Instead of fighting it, trying again, or improving my delivery, I’d just conclude that the rejector was an idiot, and that it was their loss if they didn’t get it.

For example, I put considerable time and effort into writing academic papers, several of which were rejected, to my surprise. I’d never considered that the papers might not be accepted. When this happened, I wouldn’t re-submit them or try to re-publish them. By then I would usually have moved on to doing something else anyway.

When I applied for jobs, it never entered my mind that I might not be wanted. How could anyone not want me? After a couple of years working on my current ideas, I applied for a computer science faculty position at over 40 US universities. I refused to emphasize my well-received and published Ph.D. work, of which I was and am still proud, because I was no longer working in that area.

I was convinced the new ideas would be recognized as being strong.

But guess what? I was summarily rejected by all 40+ universities. I only got one interview, at RPI. No other school even wanted to meet me. I kept all the rejection letters. I still have them. (Amusingly, I was swapping emails with Ben Bederson earlier this year and it transpired that he’d had the same experience, also with 40 universities, and he too kept all his rejection letters!)

You never learn more than when you’re being humbled.

I’ve now returned to those same ideas and have been working on them for the last 3 years. In January 2007 I went and met with a couple of the most appropriately visionary VCs to tell them what I was building. I was naïve enough to think they might back me at that early point. Wrong. They suggested I come back with a demo to concretely illustrate what the system would allow people to do. That was easier said than done – the system is not simple. I spent 2007 building the core engine, a 90% fully-functional demo of the major application, several smaller demo apps (including a Firefox toolbar extension built by Esteve Fernandez), and added about 20 sample data sets to further illustrate possibilities.

That’ll show ‘em, right? I went out in November 2007 armed with all this, and began talking to a variety of potential investors. I was sure VCs would be falling over themselves to invest, especially given that we were working on some mix of innovative search, cloud computation, APIs, and various Web 2.0 concepts, and that tons of VCs claimed to be looking for the Next Big Thing in search, and for Passionate Entrepreneurs tackling Hard Problems who wanted to build Billion Dollar Companies, etc., etc.

You guessed it. Over the next year literally dozens of potential investors all said no. The demo wasn’t enough. Would people use it? Could we build the real thing? Would it scale? Where was the team? What are you doing in Barcelona? “Looks fascinating, do please let us know when you’ve released it and are seeing adoption,” they almost universally told me. The standout exception to this was Esther Dyson, who agreed to invest immediately after seeing the demo, and whose courage I hope I can one day richly reward.

What to make of all this rejection?

One thing that became clear is that if you’re smarter than average, you’ll almost by definition be constantly thinking of things too early. Maybe many years too early. Your ideas will seem strange, doubtful, and perhaps plain wrong to many people.

This makes you realize how important timing is.

Being right with an idea too early and trying to build a startup around it is similar to correctly knowing a company is going to fail, and immediately rushing out to short its stock. Even though you’re right, you can be completely wiped out if the stock’s value rises in the short term. You were brilliant, insightful, and 100% correct – but you were too early.

Getting timing right can clearly be partly based on calculation and reason. But given that many startups are driven by founder passion, I think luck in timing plays an extremely important role in startup success. And the smarter and more far-sighted you are, the greater the chance that your timing will be wrong.

So that’s the first thing to understand: if you’re smarter than average, your ideas will, on average, be ahead of their time. Some level of rejection comes with the territory.

But I’d go much further than that, and claim that if you are not seeing a very high level of rejection in trying to get a new idea off the ground, you’re probably not working on anything that’s going to change the world or be extremely valuable.

That might sound like an outrageous extrapolation (or even wishful thinking, given my history). Later tonight I plan to explain this claim in a post on the connections between passion, value, non-obviousness, and rejection. That’s the subject I really want to write about.

For now though, I simply want to say that I’ve come to understand that having one’s ideas regularly rejected is a good sign. It tells you you’re either on a fool’s errand, or that you’re doing something that might actually be valuable and important.

If you’re not going to let rejection get you down, you might content yourself by learning to ignore it. But you can do better. You can come to regard it as positive and affirming. Without becoming pessimistic or in any way accepting defeat, you can come to expect to be rejected and even to embrace it.

If you can do that, rejection loses its potential for damage. As Paul Graham pointed out, the impact of disappointment can destroy a startup. That’s an important observation, and a part of why startups can be so volatile and such a wild ride.

I don’t mean to suggest that you don’t also do practical things with rejection too – like learn from it. That’s very important and will help you shape your product, thoughts, presentation, expectations, etc. Again, see Paul’s posting.

But I think the mental side of rejection is more important than the practical. The mental side has more destructive potential. You have to figure out how to deal with it. If you look at it the right way you can turn it into something that’s almost by definition positive, as I’ve tried to illustrate.

In a sense I even relish it, and use it for fuel. There are little tricks I sometimes use to keep myself motivated. I even keep a list of them (and no, you can’t see it). One is imagining that some day all the people who rejected me along the way will wring their hands in despair at having missed such an opportunity :-)

I’ve not been universally rejected, of course. There are lots of people who know what we’re doing and are highly supportive (more on them at a later point). If I’d been universally rejected, or rejected by many well-known people whose opinions I value, I probably would have stopped by now.

I’ve had to learn to see a high level of rejection as not just normal but a necessary (but not sufficient!) component of a correct belief that you’re doing something valuable.

Stay tuned for the real point of this flurry of blogging activity.

Twitter’s amazing stickiness (with a caveat)

Friday, October 31st, 2008

I just followed a link to a site that shows the date of the first tweet of 50 early Twitter users. I wondered how many of these early users were still active users, and guessed many would be.

Instead of going and fetching each user’s last tweet by hand, I wrote a little shell script to do all the work:

# Scrape the user names listed on myfirsttweet.com, then fetch the date
# of each user's most recent tweet via the Twitter REST API.
for name in \
  `curl -s http://myfirsttweet.com/oldest.php |
   perl -p -e 's,<a href="http://myfirsttweet.com/1st/(\w+)">,\nNAME:\t$1\n,g' |
   egrep '^NAME:' |
   cut -f2 |
   uniq`
do
    echo $name \
      `curl -s "http://twitter.com/statuses/user_timeline/$name.xml?count=1" |
       grep created_at |
       cut -f2 -d\> |
       cut -f1 -d\<`
done

Who wouldn’t want to be a (UNIX) programmer!?

And the output, massaged into an HTML table:

User Last tweeted on
jack Thu Oct 30 03:41:49 +0000 2008
biz Thu Oct 30 22:24:12 +0000 2008
Noah Tue Oct 28 22:56:15 +0000 2008
adam Thu Oct 30 21:34:56 +0000 2008
tonystubblebine Fri Oct 31 00:53:38 +0000 2008
dom Thu Oct 30 20:36:31 +0000 2008
rabble Fri Oct 31 00:56:28 +0000 2008
kellan Fri Oct 31 00:32:44 +0000 2008
sarahm Thu Oct 30 22:45:37 +0000 2008
dunstan Thu Oct 30 23:59:57 +0000 2008
stevej Fri Oct 31 00:12:03 +0000 2008
lemonodor Thu Oct 30 18:21:43 +0000 2008
blaine Wed Oct 29 23:52:06 +0000 2008
rael Fri Oct 31 01:02:58 +0000 2008
bob Fri Oct 31 00:39:18 +0000 2008
graysky Fri Oct 31 00:23:21 +0000 2008
veen Thu Oct 30 19:47:40 +0000 2008
dens Fri Oct 31 00:13:12 +0000 2008
heyitsnoah Thu Oct 30 20:09:35 +0000 2008
rodbegbie Thu Oct 30 23:42:39 +0000 2008
astroboy Thu Oct 30 22:07:50 +0000 2008
alba Thu Oct 30 16:06:29 +0000 2008
kareem Thu Oct 30 20:20:14 +0000 2008
gavin Thu Oct 30 17:48:45 +0000 2008
nick Fri Oct 31 01:17:29 +0000 2008
psi Thu Oct 30 20:40:53 +0000 2008
vertex Fri Oct 31 00:44:09 +0000 2008
mulegirl Fri Oct 31 00:31:05 +0000 2008
thedaniel Thu Oct 30 20:00:31 +0000 2008
myles Thu Oct 30 15:50:31 +0000 2008
mike ftw Fri Oct 31 00:28:00 +0000 2008
stumblepeach Thu Oct 30 23:20:06 +0000 2008
bunch Sat Oct 25 20:46:42 +0000 2008
adamgiles com Thu Apr 10 17:22:52 +0000 2008
naveen Thu Oct 30 23:24:23 +0000 2008
nph Fri Oct 31 01:53:13 +0000 2008
caterina Tue Oct 28 18:07:32 +0000 2008
rafer Thu Oct 30 19:23:50 +0000 2008
ML Thu Oct 30 15:31:47 +0000 2008
brianoberkirch Thu Oct 30 20:21:43 +0000 2008
joelaz Thu Oct 30 22:03:59 +0000 2008
arainert Fri Oct 31 01:18:43 +0000 2008
tony Sun Oct 26 18:16:02 +0000 2008
brianr Fri Oct 31 01:57:27 +0000 2008
prash Tue Oct 28 22:14:24 +0000 2008
danielmorrison Thu Oct 30 21:37:41 +0000 2008
slack Fri Oct 31 01:26:08 +0000 2008
mike9r Thu Oct 30 21:17:29 +0000 2008
monstro Thu Oct 30 22:28:46 +0000 2008
mat Fri Oct 31 00:26:22 +0000 2008

Wow… look at those dates. Only one of these people has failed to update in the last week!

Here’s the caveat. We don’t know how many early Twitter users are in the My First Tweet database. The data looks suspicious: there are only 50 Twitter users in a 7 month period? That can’t be right. So it’s possible the My First Tweet database is built by finding currently active tweeters and then looking back to their first post. If so, my table doesn’t say much about stickiness.

But I find it fairly impressive in any case.

Digging into Twitter following

Monday, October 13th, 2008

This is just a quick post. I have a ton of things I could say about this, but they’ll have to wait – I need to do some real work.

Last night and today I wrote some Python code to dig into the follower and following sets of Twitter users.

I also think I understand better why Twitter is so compelling, but that’s going to have to wait for now too.

You give my program some Twitter user names and it builds a table showing the numbers of followers, following, etc., for each user. It distinguishes between people you follow who don’t follow you back, and people who follow you whom you don’t follow back.

But the really interesting thing is to look at the intersection of some of these sets between users.

For example, if I follow X and they don’t follow me back, we can assume I have some interest in X. So if I am later followed by Y and it turns out that X follows Y, I might be interested to know that. I might want to follow Y back just because it might bring me to the attention of X, who may then follow me. If I follow Y, I might want to publicly @ message him/her, hoping that he/she might @ message me back, and that X may see it and follow me.
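The underlying logic is just set arithmetic. A minimal sketch with invented usernames and hand-built sets (the real program fetched these via the Twitter API):

```python
# Hypothetical follower/following sets; in the real program these would
# come from the Twitter API rather than being hard-coded.
my_following = {"X", "kellan", "rabble"}
my_followers = {"kellan", "Y"}

# People I follow who don't follow me back -- I'm interested in them.
interested_in = my_following - my_followers

# Y has just followed me. Which people I'm interested in follow Y?
y_followers = {"X", "rabble"}
hook = interested_in & y_followers

# If non-empty, following Y back might bring me to their attention.
print(sorted(hook))  # prints ['X', 'rabble']
```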

Stuff like that. If you think that sort of thing isn’t important, or is too detailed or introspective, I’ll warrant you don’t know much about primate social studies. But more on that in another posting too.

As another example use, I plan to forward the mails Twitter sends me telling me someone new is following me into a variant of my program. It can examine the sets of interest and weight them. That can give me an automated recommendation of whether I should follow that person back – or just do the following for me.
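That weighting could be a simple score over the same set intersections. Here’s a hypothetical sketch – the weights and threshold are invented for illustration, not taken from my actual program:

```python
def follow_back_score(their_following, their_followers,
                      my_following, my_followers):
    """Score a new follower by how connected they are to people I care about.

    The weights below are arbitrary illustration values.
    """
    # People I follow who don't follow me back.
    interested_in = my_following - my_followers
    score = 0.0
    # They're followed by people I want to reach: strong signal.
    score += 2.0 * len(interested_in & their_followers)
    # We follow some of the same people: weaker signal of shared interests.
    score += 1.0 * len(my_following & their_following)
    return score

# Recommend a follow-back when the score clears a made-up threshold.
score = follow_back_score({"a", "b"}, {"a", "x"}, {"a", "x", "c"}, {"c"})
print(score >= 2.0)  # prints True
```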

There are lots of directions you could push this in, like considering who the person had @ talked to (and whether those people were followers or not) and the content of their tweets (e.g., do they talk about things I’m interested in or not?).

Lots.

For now, here are links to a few sample runs. Apologies to the Twitter users I’ve picked on – you guys were on my screen or on my mind (following FOWA).

I’d love to turn these into nice Euler diagrams, but I didn’t find any decent open source package to produce them.

I’m also hoping someone else (or other people) will pick this up and run with it. I’ve got no time for it! I’m happy to send the source code to anyone who wants it. Just follow me on Twitter and ask for it.

Example 1: littleidea compared to sarawinge.
Example 2: swardley compared to voidspace.
Example 3: aweissman compared to johnborthwick.

And finally here’s the result for deWitt, on whose Twitter Python library I based my own code. This is the output you get from the program when you only give it one user to examine.

More soon, I guess.