Archive for the ‘companies’ Category

Google maps miles off on Barcelona hotel

Tuesday, April 22nd, 2008

I’m a big fan of Google maps.

But sometimes they get things very very wrong. In January I posted this example of them getting the location of the San Francisco international airport way wrong.


The screenshot linked above is supposed to show the location of the hotel Princesa Sofia in Barcelona. They have the address right, the zip code looks about right, but the location is about 30 miles off.

Caveat turista.

Individuality, transparency, and the cult of impersonality

Thursday, April 3rd, 2008

I’ve been talking to people about raising money for Fluidinfo over the last 5 months. Along the way I’ve had plenty of time to reflect on the process. I have a series of blog posts saved up. They’re mainly about oddities and discrepancies between appearance and reality. I plan to write them up gradually. Here’s one I wrote earlier this year but which I never finished. It’s still unpolished – but what the hell. This is a blog, after all.

In September 2007, Fred Wilson posted asking whether VCs should blog. The first thing I thought about when I read his title was transparency.

Increased transparency is a side-effect of easier communication between people. There are many relatively opaque human institutions and professions that have persisted for decades or centuries, relying on the fact that their subjects or customers were unable to communicate easily, to self-organize, to be widely heard, etc. Exclusionary access to knowledge is the foundation of power. As barriers to communication begin to fall, openness and transparency increase. Cracks appear in the walls. At that point anything can happen. The typical response is a heavy-handed crackdown to maintain or regain control. Examples are so numerous and widespread that any small sample would be woefully inadequate. This never-ending dynamic is just a part of the human condition and the nature of power.

But in some arenas, especially when there’s a market or in repeated games (a rich area of game theory), there may be a competitive advantage to (usually) smaller players who act disruptively to deliberately increase transparency. Those players differentiate themselves by (often informally) defecting from the (often tacit) group of gatekeepers. Advantages may include potential clients tending to trust you more, wide attention, and better opportunities. If increased transparency gets a foothold, there can then follow a kind of race to the bottom as players reveal increasingly more formerly-inside knowledge. This is also a drama that has been played out many times, and it’s fascinating and educational to watch.

We’re now seeing the cracks open wide in the VC world. The rise of the VC blogger has provided us with hundreds of eye-holes through which we can get some view of the works. The VC bloggers are implicitly calling out their less open colleagues, challenging them to open up. An extreme example is Venture Hacks, written by VC industry insiders, whose aim is to “open source” VC strategy in order to aid entrepreneurs. Then there’s The Funded, which shook the VC world as formerly isolated entrepreneurs got together (and in relative privacy, no less!) to exchange opinions and experiences. While The Funded is unquestionably biased, and based on small sample sizes, part of the fuss was unquestionably about control.

I awoke yesterday with another thought about transparency, why VCs should blog, and the curious dynamics of the VC/entrepreneur dance.

VCs should also blog because it allows entrepreneurs to see who they are as people. That may sound trite, but I think it’s quite interesting.

I’ve attended probably 50 events where one or more VCs take the stage and give some kind of a presentation. The presentations are very often excruciatingly dull. That’s because they’re filled to bursting with VC clichés. Even when VCs make an effort to differentiate themselves they tend to use clichés! They’re active investors, they have deep experience, broad contacts, want to help management, etc. I sat in the audience at Le Web a couple of weeks ago while several investors were on stage doing their thing. I wound up laughing with the guy who sat next to me, who I’d never met before. We rolled eyes at each other, passed notes, and ended up whispering nasty and disrespectful comments during the presentation. We were obviously there because we were interested to learn more, but we were served up standard VC fare. Steak and eggs.

The interesting thing is that entrepreneurs are a wildly idiosyncratic bunch. One would therefore expect that they’d tend to highly appreciate signs of character and individuality in VCs. Meanwhile VCs tend to keep things buttoned down and insist on making dreary presentations.

If nothing else, the existing dynamics are amusing. Wild-eyed, power-hungry, idiosyncratic, unconventional, and often deeply weird entrepreneurs are trying to act straight, to project an image of reliability, stability, balance, good sense, etc., in order to get funded. Simultaneously, the VC companies the entrepreneurs are evaluating, and who partly rely on being attractive to entrepreneurs, go to lengths to homogenize themselves – in the process washing out the very thing that an entrepreneur might find most reassuring.

There’s opportunity in this discrepancy. VCs who blog about themselves, in addition to talking about their industry and flogging their portfolio companies, may have tapped into this. Allowing entrepreneurs to see what you’re like as a person is a differentiator.

Twitter dynamics: unfollowing guykawasaki, Scobleizer and cameronreilly

Saturday, March 22nd, 2008

I’ve only got so much time a day to read blogs, Twitters, etc.

With blogs I find that I tend to try to keep up with those that post at a frequency at or below what I can handle, irrespective of quality of content. There are lots of blogs that I really enjoy, but which post new material so often that I end up never going to their sites. E.g., BoingBoing or ReadWriteWeb. I tend to always go to new content at blogs I like that have about one new article a day. I have dozens of examples in both these categories.

With blogs it’s no problem if some of the sites you’re subscribed to have tons of content. If you never click through on the indicator that there are 500 unread postings, you never see them.

On Twitter though the dynamic is very different. I follow about 140 people. From time to time during the day – normally when I’m drinking a coffee like I am now, or eating food – I’ll go have a look at Twitter to see what’s up in the wider world.

Unlike with blogs, if someone posts hundreds of Twitter updates you’re going to see them all. You’re perhaps going to see something like the image above (click for larger version). That’s not what I want to see at all. I’m hoping to see a whole bunch of people posting a few things, not screen after screen of one person talking to many people I don’t know or follow. It’s worse than being in a room with someone talking loudly on a mobile phone, hearing just one side of the conversation – this is like being in a room with that same person, but they’re talking to multiple people at once.

So with some reluctance I have recently un-followed Scobleizer, guykawasaki and cameronreilly. I actually like much of their content, but they have much too much of an unbalancing effect on my overall Twitter experience.

Move along.

Another thought on Mahalo

Sunday, March 9th, 2008

Back in November I wrote some comments on Mahalo in an article titled The Mahalo-Wikipedia-Google love triangle.

I just read Jason’s comment that they’re staying up late to make a Mahalo page on the Super Smash Bros Brawl Walkthrough.

That’s interesting.

Mahalo is supposed to be making things so easy that our grandparents can use it. But my grandparents are all dead, and they certainly wouldn’t be playing Super Smash Bros Brawl if they were alive.

Thinking about this, I was struck by another thought.

Who’s this page for? Why stay up til after midnight to buy and play a kids’ game and document it on Mahalo? Couldn’t it wait? What are those guys smoking up there in LA?

Then the penny dropped. Another penny. If you’re the first site on the web that’s got the Super Smash Bros Brawl walkthrough, kids are going to go to your page first thing tomorrow. And they’re not just going to your page, they’re going to Mahalo.

They’re also going to link to your page, and we all know what that means.

Meanwhile, the latte-sipping crowd who like the feel of Wikipedia being an online encyclopedia can look down their noses and have joint editing catfights over just what should be on the Wikipedia page.

If Mahalo keep it up, might they not look like the default cool destination for baby-chino-sipping teens and pre-teens to go to to find things about their popular culture? Just like you and I head to Wikipedia to look things up, might not kids make Mahalo their destination of choice to find current cultural stuff? Might not Wikipedia look to them as slow-moving, quaint and, dare I say it, even out of date as a print encyclopedia now looks to us grown ups?

I think I’m very slow on the uptake on this one. But I don’t spend much time thinking about Mahalo, so perhaps I can be excused. My kids are into Webkinz and of course there’s Webkinz stuff in Mahalo. My grandparents aren’t into that either.

Jason’s point, that Mahalo isn’t designed for geeks like us, is well taken. But I’ve only heard examples of how older folks want a simplified and more guided experience. Going for teens and pre-teens would be nice too. It might even be worth staying up after midnight for.

Obvious in retrospect, I guess, but I was slow to see it.

Anything for him but mindless good taste

Friday, March 7th, 2008

I have about ten things I could blog about today. Hopefully I won’t.

I think I’m going to go out and make an impulse purchase a bit later. Can something be an impulse purchase if you blog about it first? I was in an Office Depot store today. Digital cameras are so cheap it’s ridiculous. Then throw in the value of the euro. For $129 I could pick up a 7M pixel Casio Exilim with a 3 inch screen. Why not? My old camera is a bit of a joke. Or there are nice Canon digital Elph cameras for $150. It’s hardly worth thinking about whether to buy one.

I’m getting glasses. Again. I had Lasik surgery in 2002 or so, and it’s been a wonderful 6 years. But my eyes are getting worse. I hated trying on glasses in the store today. I’ll hardly ever wear them I guess, but it’s clear (fuzzy?) to me that I’d be much better off with them.

I like women. They’re so much more interesting than men, not to mention a few other adjectives. I have a whole blog post on that one, but I’ll probably refrain.

I have two related postings: a book I’ll never write, and a Twitter app I’ll never build. I should write them down. Twitter has so much interesting and valuable information in it. I wish their API was richer so that more things could be built. I hope they’re building some of them.

I still don’t understand why it’s considered valuable to have an API that many people build on, killing your service, if it can’t be easily monetized.

I’m booking yet another US trip. I went Silver on Delta in just 2 months this year, and am about to go Gold. This could be a Platinum year. BUT, I have to stop traveling and plant my ass on my chair in Barcelona and write more code. Have to. Must stop talking.

John Cleese’s speech at Graham Chapman’s funeral service is so moving. Can someone please do that for me?

Worst of the web award: Cheaptickets

Thursday, February 14th, 2008

Here’s a great example of terrible (for me at least) UI design.

I was just trying to change a ticket booking at Cheaptickets. Here’s the interface for selecting what you want to change (click to see the full image).

As you can see, I indicated a date/time change on my return flight. When I clicked on the continue button, I got an error message:

An error has occurred while processing this page. Please see detail below. (Message 1500)

Please select flight attributes to change.

I thought there was some problem with Firefox not sending the information that I’d checked. So I tried again. Then I tried clicking a couple of the boxes. Then I tried with Opera. Then I changed machines and tried with IE on a windows box. All of these got me the exact same error.

I looked at the page several times to see if I’d missed something – like a check box to indicate which of the flights to change. I figured Cheaptickets must have an error server side. Then I thought come on, you must be doing something wrong.

Then I figured it out. Can you?

My twitter stats

Monday, February 11th, 2008

I seem to be done with Twitter, at least for now. The graphic shows my monthly usage graph (courtesy of tweetstats) – click for the full-sized image.

I do find Twitter valuable, but I don’t want to spend time on it. It’s a bit like TV or video games for me – I quite enjoy those things, but there are almost always better things to do.

I’ll probably subscribe to some form of Twitter alert or digest at some point. I do find it useful to know when people are coming to Barcelona. But I don’t want to monitor Twitter. Like IM, I find it too distracting and waste too much time just going to check if anything’s new.

Etc.

Social Graph foo camp was a blast

Saturday, February 9th, 2008

I spent last weekend at the Social Graph Foo camp held on the O’Reilly campus in Sebastopol, CA. The camp was organized by David Recordon and Scott Kveton, with sponsorship from various companies, especially including O’Reilly. I was lucky enough to have my airfare paid for, so lots of thanks to all concerned for that.

The camp was great. Very few people actually camped, almost everyone just found somewhere to sleep in the O’Reilly offices. Many of us didn’t sleep that much anyway.

There’s something about the modern virtual lifestyle that so many of us lead that leaves a real social hole. It’s been about 20 years since I really hung out at all hours with other coders. It’s something I associate most strongly with being an undergrad, with working at Micro Forté, and then in doing a lot of hacking as a grad student at The University of Waterloo in Canada.

So even though it was just 48 hours at the foo camp, it was really great. It’s not often I have the pleasurable feeling of being surrounded by tons of people who know way way more than I do about almost everything under discussion. That’s not meant to sound arrogant – I mean that I don’t get out enough, and I don’t live in SF, etc. It’s nice to have spent many years hanging around universities studying all sorts of relatively obscure and academic topics, and sometimes you wonder what everyone else was doing. Some of those people spent the years hacking really deeply on systems, and their knowledge appears encyclopedic next to the smattering of stuff I picked up along the way. It’s nice to bump into a whole bunch of them at once. It was extremely hard to get a word in in many of the animated conversations, which reminded me at times of discussions at the Santa Fe Institute. That’s a bit of a pain, but it’s still far better than some alternatives – e.g., not having a room full of super confident deeply knowledgeable people who all want to have their say, even if that means trampling all over others, ignoring what the previous speaker said, not leaving even 1/10th of a second conversational gap, and just plain old bull-dozering on while others try to jump in and wrest away control of the conversation.

I could write much more about all this.

I also played werewolf with up to 20 others on the Saturday night. In some ways I don’t really like the game, but it’s fun to sit around with a bunch of smart people of all ages who are all trying to convince each other they’re telling the truth when you know for sure some are lying. I was up until 4:30am that night. I went to the office I slept in on the Friday night, but found it had about 10 people still up, all talking about code. When I got up at 8am the next morning, they were all still there, still talking about code. I felt a bit guilty, like a glutton, for allowing myself three and a half hours sleep. Nice.

S3 numbers revisited: six orders of magnitude does matter

Tuesday, January 29th, 2008

OK…. I should have realized in my original posting that the Oct 2007 10,000,000,000,000 objects figure was the source of the problem. I knew S3 could not be doubling every week, and that Amazon could not be making $11B a month, but didn’t see the now-obvious error in the input.

So what sort of money are they actually making?

Don MacAskill pointed me to this article at Forbes which says the number of objects at the end of 2007 was up to 14B from 10B in October. So let’s suppose the number now stands at 15B (1.5e10) and that Amazon are currently adding about 1B objects a month.

I’ll leave the other assumptions alone, for now.

Amazon’s S3 pricing for storage is $0.15 per GB per month. Assume all this data is stored on their cheaper US servers and that objects take on average 1K bytes. So that’s roughly 1.5e10 * 1e3 / 1e9 = 1.5e4 gigabytes in storage, for which Amazon charges $0.15 per month, or $2250.

Next, let’s do incoming data transfer cost, at $0.10 per GB. That’s simply 2/3rds of the data storage charge, so we add another 2/3 * $2250, or $1500.

Then the PUT requests that transmit the new objects: 1B new objects were added in the last month. Each of those takes a PUT, and these are charged at $0.01 per thousand, so that’s 1e9 / 1e3 * $0.01, or $10,000.

Lastly, some of the stored data is being retrieved. Some will just be backups, and never touched, and some will simply not be looked at in a given month. Let’s assume that just 1% of all (i.e., not just the new) objects and data are retrieved in any given month.

That’s 1.5e10 * 1e3 * 0.01 / 1e9 = 150 GB of outgoing data, or 0.15 TB. That’s much less than 10TB, so all this goes out at the highest rate, $0.18 per GB, giving another $27 in revenue.

And if 1% of objects are being pulled back, that’s 1.5e10 * 0.01 = 1.5e8 GET operations, which are charged at $0.01 per 10K. So that’s 1.5e8 / 1e4 * $0.01 = $150 for the GETs.

This gives a total of $2250 + $1500 + $10,000 + $27 + $150 = $13,927 in the last month.
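For anyone who wants to poke at these numbers without a spreadsheet, here is a minimal Python sketch of the same arithmetic. The prices are the ones quoted above; the function name, parameter names and defaults are just labels chosen for illustration, and the flat $0.18 outgoing rate assumes all downloads stay in the under-10TB tier.

```python
# Back-of-envelope S3 revenue model, following the steps in this post.
# All names here are illustrative; only the prices come from the text.

def s3_monthly_revenue(total_objects=1.5e10, new_objects=1e9,
                       avg_object_bytes=1e3, fraction_retrieved=0.01):
    """Return a dict of estimated monthly revenue components in dollars."""
    stored_gb = total_objects * avg_object_bytes / 1e9

    storage = stored_gb * 0.15            # $0.15 per GB per month
    # As in the post, incoming transfer is charged on the same volume of
    # data as storage, at $0.10 per GB (2/3 of the storage charge).
    transfer_in = stored_gb * 0.10
    puts = new_objects / 1e3 * 0.01       # $0.01 per 1,000 PUTs
    # Assumption: all outgoing data is billed at the top rate of $0.18/GB.
    transfer_out = stored_gb * fraction_retrieved * 0.18
    gets = total_objects * fraction_retrieved / 1e4 * 0.01  # $0.01 per 10K GETs

    return {
        "storage": storage,
        "transfer_in": transfer_in,
        "puts": puts,
        "transfer_out": transfer_out,
        "gets": gets,
    }


if __name__ == "__main__":
    revenue = s3_monthly_revenue()
    for name, dollars in revenue.items():
        print(f"{name:>12}: ${dollars:,.0f}")
    print(f"{'total':>12}: ${sum(revenue.values()):,.0f}")
```

Changing `avg_object_bytes` or `fraction_retrieved` is the quickest way to see how sensitive the total is to the two inputs we know least about.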

And that doesn’t look at all like $11B!

Where did all that revenue go? Mainly it’s not there because Amazon only added 1e9 objects in the last month, not 1e15. That’s six orders of magnitude. So instead of $11B in PUT charges, they make a mere $11K. That’s about enough to pay one programmer.

I created a simple Amazon S3 Model spreadsheet where you can play with the numbers. The cells with the orange background are the variables you can change in the model. The variables we don’t have a good grip on are the average size of objects and the percentage of objects retrieved each month. If you increase average object size to 1MB, revenue jumps to $3.7M.

BTW, the spreadsheet has a simplification: regarding all data as being owned by one user, and using that to calculate download cost. In reality there are many users, and most of them will be paying for all their download data at the top rate. Also note that my % of objects retrieved is a simplification. Better would be to estimate how many objects are retrieved (i.e., including objects being retrieved multiple times) as well as estimating the download data amount. I roll these both into one number.

Google maps gets SFO location waaaay wrong

Monday, January 28th, 2008

Before leaving Barcelona yesterday morning, I checked Google maps to get driving directions from San Francisco International airport (SFO) to a friend’s place in Oakland.

Google got it way wrong. Imagine trying to follow these instructions if you didn’t know they were so wrong. Click on the image to see the full sized map. Google maps is working again now.


Amazon S3 to rival the Big Bang?

Monday, January 28th, 2008

Note: this posting is based on an incorrect number from an Amazon slide. I’ve now re-done the revenue numbers.

We’ve been playing around with Amazon’s Simple Storage Service (S3).

Adam Selipsky, Amazon VP of Web Services, has put some S3 usage numbers online (see slides 7 and 8). Here are some numbers on those numbers.

There were 5,000,000,000 (5e9) objects inside S3 in April 2007 and 10,000,000,000,000 (1e13) in October 2007. That means that in October 2007, S3 contained 2,000 times more objects than it did in April 2007. That’s a 26 week period, or 182 days. 2,000 is roughly 2^11. That means that S3 is doubling its number of objects roughly once every 182/11 = 16.5 days. (That’s supposing that the growth is merely exponential – i.e., that the logarithm of the number of objects is increasing linearly. It could actually be super-exponential, but let’s just pretend it’s only exponential.)
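The doubling-time estimate is easy to reproduce. This is a small sketch under the same pure-exponential assumption; the function name is made up for illustration. It uses the exact base-2 logarithm of the growth factor rather than rounding 2,000 to a power of two, so it lands a touch above 16.5 days.

```python
import math

def doubling_time_days(start_count, end_count, elapsed_days):
    """Days per doubling, assuming steady exponential growth."""
    doublings = math.log2(end_count / start_count)
    return elapsed_days / doublings


if __name__ == "__main__":
    # 5e9 objects in April 2007, 1e13 in October 2007, 182 days apart.
    days = doubling_time_days(5e9, 1e13, 182)
    print(f"doubling roughly every {days:.1f} days")
```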

First of all, that’s simply amazing.

It’s now 119 days since the beginning of October 2007, so we might imagine that S3 now has 2^(119/16.5) or about 150 times as many objects in it. That’s 1,500,000,000,000,000 (1.5e15) objects. BTW, I assume by object they mean a key/value pair in a bucket (these are put into and retrieved from S3 using HTTP PUT and GET requests).

Amazon’s S3 pricing for storage is $0.15 per GB per month. Assume all this data is stored on their cheaper US servers and that objects take on average 1K bytes. These seem reasonable assumptions. (A year ago at ETech, SmugMug CEO Don MacAskill said they had 200TB of image data in S3, and images obviously occupy far more than 1K each. So do backups.) So that’s roughly 1.5e15 * 1K / 1G = 1.5e9 gigabytes in storage, for which Amazon charges $0.15 per month, or $225M.

That’s $225M in revenue per month just for storage. And growing rapidly – S3 is doubling its number of objects every 2 weeks, so the increase in storage might be similar.

Next, let’s do incoming data transfer cost, at $0.10 per GB. That’s simply 2/3rds of the data storage charge, so we add another 2/3 * $225M, or $150M.

What about the PUT requests, that transmit the new objects?

If you’re doubling every 2 weeks, then in the last month you’ve doubled twice. So that means that a month ago S3 would have had 1.5e15 / 4 = 3.75e14 objects. That means 1.125e15 new objects were added in the last month! Each of those takes an HTTP PUT request. PUTs are charged at one penny per thousand, so that’s 1.125e15 / 1000 * $0.01.

Correct me if I’m wrong, but that looks like $11,250,000,000.

To paraphrase a scene I loved in Blazing Saddles (I was only 11, so give me a break), that’s a shitload of pennies.

Lastly, some of that stored data is being retrieved. Some will just be backups, and never touched, and some will simply not be looked at in a given month. Let’s assume that just 1% of all (i.e., not just the new) objects and data are retrieved in any given month.

That’s 1.5e15 * 1K * 1% / 1e9 = 15M GB of outgoing data, or 15K TB. Let’s assume this all goes out at the lowest rate, $0.13 per GB, giving another $2M in revenue.

And if 1% of objects are being pulled back, that’s 1.5e15 * 1% = 1.5e13 GET operations, which are charged at $0.01 per 10K. So that’s 1.5e13 / 10K * $0.01 = $15M for the GETs.

This gives a total of $225M + $150M + $11,250M + $2M + $15M = $11,642M in the last month. That’s $11.6 billion. Not a bad month.

Can this simple analysis possibly be right?

It’s pretty clear that Amazon are not making $11B per month from S3. So what gives?

One hint that they’re not making that much money comes from slide 8 of the Selipsky presentation. That tells us that in October 2007, S3 was making 27,601 transactions per second. That’s about 7e10 per month. If Amazon was already doubling every two weeks by that stage, then 3/4s of their 1e13 S3 objects would have been new that month. That’s 7.5e12, which is 100 times more transactions just for the incoming PUTs (no outgoing) than are represented by the 27,601 number. (It’s not clear what they mean by transaction – I mean what goes on in a single transaction.)
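Here is that sanity check as a short Python sketch. The 30-day month and the function names are assumptions for illustration; the 27,601 transactions per second and the 1e13 object count are the figures from the slides.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

def transactions_per_month(per_second):
    """Scale a per-second transaction rate up to a 30-day month."""
    return per_second * SECONDS_PER_MONTH

def implied_put_ratio(total_objects, fraction_new, per_second):
    """How many times larger the implied PUT count is than the
    reported number of transactions for the month."""
    implied_puts = total_objects * fraction_new
    return implied_puts / transactions_per_month(per_second)


if __name__ == "__main__":
    reported = transactions_per_month(27_601)
    ratio = implied_put_ratio(1e13, 0.75, 27_601)
    print(f"reported transactions per month: {reported:.2e}")
    print(f"implied PUTs are about {ratio:.0f}x the reported transactions")
```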

So something definitely doesn’t add up there. It may be more accurate to divide the revenue due to PUTs by 100, bringing it down to a measly $110M.

An unmentioned assumption above is that Amazon is actually charging everyone, including themselves, for the use of S3. They might have special deals with other companies, or they might be using S3 themselves to store tons of tiny objects. I.e., we don’t know that the reported number is of paid objects.

There’s something of a give away the razors and charge for the blades feel to this. When you first see Amazon’s pricing, it looks extremely cheap. You can buy external disk space for, e.g., $100 for 500GB, or $0.20 per GB. Amazon charges you just $0.18 per GB for replicated storage. But that’s per month. A disk might last you two years, so we could conclude that Amazon is e.g., 8 or 12 times more expensive, depending on the degree of replication. But you don’t need a data center or to grow (or shrink) a data center, cooling, employees, replacement disks—all of which have been noted many times—so the cost perhaps isn’t that high.

But…. look at those PUT requests! If an object is 1K (as above), it takes 500M of them to fill a 500GB disk. Amazon charges you $0.01 per 1000, so that’s 500K * $0.01 or $5000. That’s $10 per GB just to access your disk (i.e., before you even think about transfer costs and latency), which is about 50 times the cost of disk space above.
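The same calculation as a small Python sketch, so the object size can be varied. The function name and defaults are illustrative; the $0.01 per 1,000 PUTs price and the 500GB disk at $0.20 per GB are the figures used above.

```python
def put_cost_to_fill(disk_gb=500, object_bytes=1_000, price_per_1000_puts=0.01):
    """Return (objects needed, total PUT cost in $, PUT cost per GB in $)
    for filling disk_gb of storage with objects of object_bytes each."""
    objects_needed = disk_gb * 1e9 / object_bytes
    total_cost = objects_needed / 1000 * price_per_1000_puts
    return objects_needed, total_cost, total_cost / disk_gb


if __name__ == "__main__":
    objects, total, per_gb = put_cost_to_fill()
    print(f"{objects:.0e} PUTs, ${total:,.0f} total, ${per_gb:.2f} per GB")
    # Compare with buying a disk outright at $0.20 per GB.
    print(f"about {per_gb / 0.20:.0f}x the cost of the raw disk space")
```

Raising `object_bytes` shows how quickly the per-GB PUT cost falls once many small values are packed into larger objects.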

In paying by the PUT and GET, S3 users are in effect paying Amazon for the compute resources needed to store and retrieve their objects. If we estimate it taking 10ms for Amazon to process a PUT, then 1000 takes 10 seconds of compute time, for which Amazon charges $0.01. A 30-day month has about 2.6M seconds, so that’s nearly $2.6K per month being paid for a machine to do PUT storage, which is about 37 times more expensive than what Amazon would charge you to run a small EC2 instance for a month. Such a machine probably costs Amazon around $1500 to bring into service. So there’s no doubt they’re raking it in on the PUT charges. That makes the 5% margins of their retailing operation look quaint. Wall Street might soon be urging Bezos to get out of the retailing business.

Given that PUTs are so expensive, you can expect to see people encoding lots of data into single S3 objects, transmitting them all at once (one PUT), and decoding when they get the object back. That pushes programmers towards using more complex formats for their data. That’s a bad side-effect. A storage system shouldn’t encourage that sort of thing in programmers.

Nothing can double every two weeks for very long, so that kind of growth simply cannot continue. It may have leveled out in October 2007, which would make my numbers off by roughly 2^(119/16.5) or about 150, as above.

When we were kids they told us that the universe has about 2^80 particles in it. 1.5e15 is already about 2^50, so only 30 more doublings are needed, which would take Amazon just over a year. At that point, even if all their storage were in 1TB drives and objects were somehow stored in just 1 byte each, they’d still need about 2^40 disk drives. The earth has a surface area of 510,065,600 km² so that would mean over 2000 Amazon disk drives in each square kilometer on earth. That’s clearly not going to happen.
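The disk-drive arithmetic can be checked in a few lines of Python. This sketch reads the particle count as 2^80 and a 1TB drive as 2^40 bytes (a binary terabyte, which is an assumption that makes the powers of two line up); the function names are made up for illustration.

```python
import math

EARTH_SURFACE_KM2 = 510_065_600  # figure quoted in the text

def doublings_to_reach(current_objects, target_exp=80):
    """Doublings needed to grow from current_objects to 2**target_exp."""
    return target_exp - math.log2(current_objects)

def drives_per_km2(total_bytes_exp=80, drive_bytes_exp=40):
    """Drives per square kilometer of the earth's surface, storing
    2**total_bytes_exp bytes on drives of 2**drive_bytes_exp bytes."""
    drives = 2 ** (total_bytes_exp - drive_bytes_exp)
    return drives / EARTH_SURFACE_KM2


if __name__ == "__main__":
    n = doublings_to_reach(1.5e15)
    print(f"{n:.1f} doublings, about {n * 16.5:.0f} days at 16.5 days each")
    print(f"{drives_per_km2():,.0f} drives per square kilometer")
```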

It’s also worth bearing in mind that Amazon claims data stored into S3 is replicated. Even if the replication factor is only 2, that’s another doubling of the storage requirement.

At what point does this growth stop?

Amazon has its Q4 2007 earnings call this Wednesday. That should be revealing. If I had any money I’d consider buying stock ASAP.

Final straws for Mac OS X

Thursday, January 24th, 2008

I’ve had it with Mac OS X.

I’m going to install Linux on my MacBook Pro laptop in March once I’m back from ETech.

I’ve been thinking about this for months. There are just so many things I don’t like about Mac OS X.


Yes, it’s beautiful, and there are certainly things I do like (e.g., iCal). But I don’t like:

  • Waiting forever when I do a rm on a big tree
  • Sitting wondering what’s going on when I go back to a Terminal window and it’s unresponsive for 15 seconds
  • Weird stuff like this
  • Case insensitive file names (see above problem)
  • Having applications often freeze and crash. E.g. emacs, which basically never crashes under Linux

I could go on. I will go on.

I don’t like it when the machine freezes, and that happens too often with Mac OS X. I used Linux for years and almost never had a machine lock up on me. With Mac OS X I find myself doing a hard reset about once a month. That’s way too flaky for my liking.

Plus, I do not agree to trade a snappy OS experience for eye candy. I’ll take both if I can have them, but if it’s a choice then I’ll go back to X windows and Linux desktops and fonts and printer problems and so on – all of which are probably even better than they already were a few years back.

This machine froze on me 2 days ago and I thought “Right. That’s it.” When I rebooted, it was in a weird magnifying glass mode, in which the desktop was slightly magnified and moved around disconcertingly whenever I moved the mouse. Rebooting didn’t help. Estéve correctly suggested that I somehow had magnification on. But, how? WTF is going on?

And, I am not a fan of Apple.

In just the last two days, we have news that 1. Apple crippled its DTrace port so you can’t trace iTunes, and 2. Apple QuickTime DRM Disables Video Editing Apps so that Adobe’s After Effects video editing software no longer works after a QuickTime update.

It’s one thing to use UNIX, which I have loved for over 25 years, but it’s another thing completely to be in the hands of a vendor who (regularly) does things like this while “upgrading” other components of your system.

Who wants to put up with that shit?

And don’t even get me started on the iPhone, which is a lovely and groundbreaking device, but one that I would never ever buy due to Apple’s actions.

I’m out of here.

I just deactivated my Facebook account

Thursday, January 3rd, 2008

I just deactivated my Facebook account. This has nothing to do with Robert Scoble’s account being disabled earlier today, I’m just sick of Facebook. It does nothing whatsoever for me, except send messages that can and would otherwise have been sent in email. I don’t want to use a tool that encourages people to send me messages on a website that I then have to go log in to. I don’t want some website to hold my messages. I like them to be searchable with things like grep. I like to organize them my way. I like email. Apart from receiving messages in a totally unattractive way, Facebook is useless for me – just a steady stream of invitations to things I don’t want to attend from people I don’t know, plus a smattering of cream pies, flying sheep, etc. So I’m outta there. I wonder if I’ll manage to survive.

Amazon just billed me 14 cents

Wednesday, January 2nd, 2008

I’ve been messing around with Esteve setting up an Amazon EC2 machine.

We set up a machine the other day, ssh’d into it, took a look around, and then shut it down a little later. Amazon just sent me a bill:

Greetings from Amazon Web Services,

This e-mail confirms that your latest billing statement is available on the AWS web site. Your account will be charged the following:

Total: $0.14

Please see the Account Activity area of the AWS web site for detailed account information.

Isn’t that cool?

It would certainly cost more than 14 cents to get your hands on your own (virtual) Linux box any other way.

Pushing back on the elevator pitch

Saturday, December 1st, 2007

I’ve been out talking to people about raising money for Fluidinfo.

Over the last 7 years I’ve read literally thousands of articles on talking to potential investors, pitching, raising money, angels, VCs, dilution, control, rounds, boards, strategies, valuations, burn rates, equity, etc. I’ve bought and read dozens of related books. I’m a regular reader of about a dozen VC blogs and the blogs of several entrepreneurs. I’ve swapped stories in person and learned lessons from probably a hundred other entrepreneurs. I was CTO of Eatoni Ergonomics, a startup that raised $5M in NYC, and I sat on the board for 4 years.

I like to analyze things, to sit around thinking, to generalize, to look for lessons, to find patterns, etc. So I reckon I have a fairly good idea of what creating a startup and raising money is about.

Some aspects of doing that are relatively formulaic. But others have significant variation.

For example, what should you put in a business plan? You can spend many months working on business plans. It’s hard work to write well and concisely. Then you show it to VC A and they tell you they’d also like to see X and Y and Z, that are not in your plan. So you put them in. You show it to VC B, and they tell you the plan is way too long! That you should take out P, Q, R and S. That leaves you with a wholly new-looking plan and when you show that to VC C, they’ll tell you it’s incoherent and doesn’t flow and look at you like you’re some kind of innocent child who doesn’t even know how to structure its thoughts. When you tell them you actually already know all that and that you agree, they’ll think you’re even weirder. And so it continues.

Thinking is changing on the business plan front, though. Some entrepreneurs and some investors have realized that creating or insisting on a business plan too early is probably a waste of time. Everyone knows the market numbers and the financial projections are probably rubbish. People expect the business and the plan to change, etc., etc.

When someone asks me for a business plan, I (politely) tell them I don’t have one and don’t intend to write one. I tell them I’m looking for someone who wants to understand what I’m doing and fund it, without needing to see a formal written business plan. I suggest that if I reach the stage of looking for someone who wants the comfort of a better-thought-out plan, I will get back to them.

I think that’s a good change all round. You have to push back a little. A tiny engineering team focused on building a product probably shouldn’t stop, or be stopped, to write a business plan. I’m certainly not going to do that. I could spend that time writing code, working with people I’m paying to create more of a product, to get more online, to have more to point to, etc.

Elevator pitches

There’s definitely been a change with respect to business plans.

And now to the meat of this post, to a place where a similar change has yet to penetrate: the blind insistence on having an elevator pitch.

Almost universally, potential investors will want or expect an elevator pitch. Tons of VC sites will advise you that if you can’t describe your idea in a couple of sentences, it’s probably a non-starter. If you don’t have a compelling elevator pitch they won’t talk to you, won’t reply to email (even if you have been introduced), and they certainly won’t read any materials.

Some even go so far as to tell you that without an elevator pitch you won’t be able to communicate your ideas to your employees to motivate them! Uh, excuse me? Since when did the intelligent, driven, dig-in, curious, thoughtful, dedicated people who join startups acquire the attention span of gnats?

Listen. Some ideas can’t be summarized and/or grasped in a two-minute elevator ride. Sometimes you don’t even know yourself what the outcome will be. The history of science and innovation is full of examples. Imagine what the world would be like if, in order to get seed resources to push a new project along, all ideas had to be pre-vetted, each in 2 minutes, by a fairly general audience (I’m being polite again).

Entrepreneurs have to push back—where necessary—on the demand for an elevator pitch.

I’ve tried to put my round ideas into the square hole of an elevator pitch for long enough. I haven’t managed to do it and I don’t want to spend any more time trying.

Until tonight I’ve just been telling people I don’t have an elevator pitch, sorry. I’ve even told them (hi Nivi!) that instead of robotically insisting that I shape my ideas to their expectations, they should try being more open-minded about the process and try working on their expectations.

From now on, I’m going to give the following elevator pitch:

Here is a list of people. Each of them has had the curiosity, time, and patience to listen to my ideas for at least an hour. Ask them if I’m worth talking to further.

(See below for my list.)

If that’s not ok, then I agree that 1) if I reach the stage where I need to talk to people who really need an elevator pitch, and 2) you’re still interested, then I’ll try again to work on getting you what you need. Same goes for a business plan.

Up to this point I’ve tried to only talk to people who are willing to put the time in, to listen and think, to talk among themselves and draw their own conclusions. But I’ve still run into a bunch of people who won’t do that. That’s ok, of course. I also know what it’s like to be busy.

Here’s my list. I’m very happy and very thankful to have recently spent at least an hour, sometimes much more, with each of the following:

Bradley Allen,
Art Bergman,
Jason Calacanis,
Dick Costolo,
Daniel Dennett,
Esther Dyson (now an investor),
Brady Forrest,
Eric Haseltine,
David Henkel-Wallace,
Jim Hollan,
Steve Hofmeyr,
Mark Jacobsen,
Vicente Lopez,
Roger Magoulas,
Jerry Michalski,
Nelson Minar,
Roger Moody,
Ted Nelson,
Tim O’Reilly,
Norm Packard,
Jennifer Pahlka,
Andrew Parker,
Scott Rafer,
Clay Shirky,
Reshma Sohoni,
Graham Spencer,
Stefan Tirtey,
Mark Tluszcz,
David Weinberger, and
Fred Wilson.

That’s my new elevator pitch.

If you buy it, let’s talk properly sometime soon. If you don’t, but you’re still curious, talk to some of those folks. Take your pick.

And if you don’t know any of those people, maybe you should be sending me your elevator pitch.

Hacking Twitter on JetBlue

Saturday, November 24th, 2007

I have much better and more important things to do than hack on my ideas for measuring Twitter growth.

But a man’s gotta relax sometime.

So I spent a couple of hours at JFK and then on the plane hacking some Python to pull down tweets (is this what other people call Twitter posts?), pull out their Twitter id and date, convert the dates to integers, write this down a pipe to gnuplot, and put the results onto a graph. I’ve nothing much to show right now. I need more data.

But the story with Twitter ids is apparently not that simple. While you can get tweets from very early on (like #20 that I pointed to earlier), and you can get things like #438484102 which is a recent one of mine, it’s not clear how the intermediate range is populated. Just to get a feel for it, I tried several loops like the following at the shell:

i=5000

while [ $i -lt 200000 ]
do
  wget --http-user terrycojones --http-passwd xxx \
    http://www.twitter.com/statuses/show/$i.xml
  i=`expr $i + 5000`
  sleep 1
done

Most of these were highly unsuccessful. I doubt that’s because there’s widespread deleting of tweets by users. So maybe Twitter are using ids that are not sequential.

Of course if I wasn’t doing this for the simple joy of programming I’d start by doing a decent search for the graph I’m trying to make. Failing that I’d look for someone else online with a bundle of tweets.

I’ll probably let this drop. I should let it drop. But once I get started down the road of thinking about a neat little problem, I sometimes don’t let go. Experience has taught me that it is usually better to hack on it like crazy for 2 days and get it over with. It’s a bit like reading a novel that you don’t want to put down when you know you really should.

One nice sub-problem is deciding where to sample next in the Twitter id space. You can maintain something like a heap of areas – where area is the size of the triangle defined by two tweets: their ids and dates. That probably sounds a bit obscure, but I understand it :-) Gradient of the growth curve is interesting – you probably want more samples when the gradient is changing fastest. Adding time between tweets to gradient gives you a triangle whose area you can measure. There are simpler approaches too, like uniform sampling, or some form of binary splitting of interesting regions of id space. Along the way you need to account for pages that give you a 404. That’s a data point about the id space too.
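Here is a minimal sketch of the heap idea in Python. It is one reading of the description above, not the code I actually wrote on the plane. The assumptions: `fetch_date` is a hypothetical callable standing in for a request to the Twitter API (it returns an integer timestamp, or None for a 404), the two endpoint ids are assumed to exist, and the "area" is simply half the id gap times the time gap. When the midpoint of an interval gives a 404, the sketch records it and steps forward one id at a time. The `fake_fetch` function is invented data so that the example runs offline.

```python
import heapq

def triangle_area(p, q):
    """Half the id gap times the time gap between two (id, timestamp) samples."""
    return 0.5 * abs(q[0] - p[0]) * abs(q[1] - p[1])

def adaptive_sample(fetch_date, lo_id, hi_id, budget):
    """Spend `budget` fetches, always splitting the interval with the largest area.

    fetch_date(tweet_id) is a hypothetical callable returning an integer
    timestamp, or None for a 404. lo_id and hi_id are assumed to exist.
    Returns (samples, missing): {id: timestamp} and the list of 404 ids.
    """
    samples = {lo_id: fetch_date(lo_id), hi_id: fetch_date(hi_id)}
    missing = []
    heap = []

    def push(a, b):
        area = triangle_area((a, samples[a]), (b, samples[b]))
        # heapq is a min-heap, so store the negated area to pop the largest first.
        heapq.heappush(heap, (-area, a, b))

    push(lo_id, hi_id)
    while heap and budget > 0:
        _, a, b = heapq.heappop(heap)
        mid = (a + b) // 2
        found = None
        # Probe the midpoint; on a 404, note it and try the next id up.
        while a < mid < b and budget > 0:
            budget -= 1
            found = fetch_date(mid)
            if found is not None:
                break
            missing.append(mid)
            mid += 1
        if found is None:
            continue  # interval too small, or nothing found before running out
        samples[mid] = found
        push(a, mid)
        push(mid, b)
    return samples, missing

def fake_fetch(tweet_id):
    """Invented stand-in for the API: every 7th id is a 404."""
    return None if tweet_id % 7 == 0 else int(tweet_id ** 0.5)

samples, missing = adaptive_sample(fake_fetch, 20, 100_000, budget=50)
```

The 404s end up in `missing` rather than being thrown away, since they are data points about the id space too.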

Twitter creeps in

Wednesday, November 21st, 2007

I often notice little things about how I work that I think point out value. One sign that a piece of UI is right is when you start to look for it in apps that don’t have it. For example, after I had started using mouse gestures in Opera I’d find myself wanting to make mouse gestures in other applications. When mice first started to have a wheel, I was skeptical. Support of the mouse wheel was not universal across applications. When I found myself trying to scroll with the mouse wheel in applications that didn’t support it, I knew it was right.

Tonight I came home and went to my machine. The first thing I did was to check what was going on in Twitter. That’s pretty interesting, at least for someone like me. I’ve been sending email on pretty much a daily basis for 25 years. It’s pretty much always the first thing I look at when I come back to my machine. Occasionally these days I find myself first going into Google reader to see what’s new, but that’s pretty rare and I might be looking for something specific. Tonight, I think for the first time, Twitter was where I went to – and not just for the general news, but for communications between and about people I know or am interested in. Much more interesting than looking through my email.

I’m one of those who thought Twitter was pretty silly when I first signed up (Dec 2006). I only used it once, and also found it intolerably slow. But it’s grown on me. And I find definite value there.

A few examples:

  1. I’d mailed Dick Costolo a few times in the past. Then I saw him twittering that he was drinking cortados. So I figured he must be in Spain. I mailed him, and he was. As a result I ended up at the FOWA conference in London the next day and met a bunch of people.
  2. On Tuesday I went out and bought a Wii in Manhattan to take back to my kids in Spain. I twittered about heading out to do it. I got an email a bit later from @esteve telling me to take the Wii back as they are region-locked. So I did.
  3. A week or so ago I was reading some tweets and noticed that someone had just been out to dinner in Manhattan with someone else that I wanted to meet. So I sent a mail to the first person and was soon swapping mails with the second.
  4. I’ve noticed about 5 times that interesting people were going to be in Barcelona and so I’ve mailed them out of the blue. That’s really good – people on holiday are often happy to have a beer and a chat. I’d have had no idea they were going to literally be outside my door were it not for Twitter.

Not exactly Brownian motion in Manhattan

Monday, November 19th, 2007

Today after some meetings I went out for a walk. I’m staying on 12th Street between 5th and University in Manhattan.

I had intended to “just wander around” pretty much at random. That’s what I really felt like too. But in the back of my mind, not quite so far back that I wasn’t aware of it, my brain was making sure that, like it or not, I went to the Apple store on the corner of 5th Avenue and Central Park.

I really have no need of an Apple store. There’s nothing I would buy, nothing I need. But.

So off I wandered… Broadway, 6th Av, 5th Av. I stopped briefly in many stores, had a coffee and a muffin, tried to tell myself that I actually wasn’t going to the Apple store. But.

I saw iPods and iPhones aplenty along the way. Hundreds of them. All identically priced. Best Buy, CompUSA, Circuit City, all the small electronic shops on 5th Av. No need, no need at all to go to the Apple store. None.

I’m walking up 5th Av in the boring super-rich area, Cartier, Dunhill, DeBeers. There can be no doubt whatsoever that I am heading to the Apple store. Most of my mind doesn’t want to go, but my legs and body seem determined. They know I need it.

And there it is. Amazing. I’ve been in several of these stores before, including this one, but there’s something you just have to see and feel. Maybe it’s the church of the 21st century… people are drawn in to worship an abstract god, to kneel at the altar and finger the icons.

It really is amazing. To me the Apple store is about the hippest place in Manhattan. Here you get to see all sorts of cool cats just hanging out with their favorite hardware. The place is full. Full of people from all over the world who’ve come to buy Apple gear. The place has a very definite atmosphere, and it’s not the atmosphere of a regular computer store. There are hundreds of Apple products out, they’re all on, and people are using them – surfing the web, reading email, listening to music, marveling. Spend half an hour in there people watching, and you want to run out and buy AAPL stock.

Apple and Nokia are two companies that really understand the importance of appearance, design, and fashion in technology. I think Nokia were the first company to see clearly that a phone is not just a phone – it’s a statement about yourself. It’s something you take out and leave on the table at the cafe, or casually flip open when you need to impress someone or get laid. Apple understands it even better. I’ll walk nearly 50 blocks just to get a fix – not to buy, just to look at the products, look at the people, be amazed at it all.

Fortunately I’m old enough to know that I don’t really need any of those shiny objects. I have a first generation iPod that I never use. I have a dead-simple phone that I don’t feel any need to upgrade. I haven’t bought myself a computer in I don’t know how long – maybe 10 years (I always get them through work). I’m not even sure that I’d own a computer if I didn’t work from home. But I sure do like to look at hardware. The new iPod nano is extraordinarily beautiful – dimensions, sleekness, feel, everything about it is divine – and at $149 (4GB) or $199 (8GB) it doesn’t feel expensive. But I know I simply wouldn’t use it. What a pity!

The Mahalo-Wikipedia-Google love triangle

Sunday, November 18th, 2007

Lots of people seem to like dumping on Mahalo and Jason Calacanis. For example, Andrew Baron recently posted about Why Mahalo is Fundamentally Flawed.

Try Googling Mahalo sucks and you’ll get about 232,000 hits. Take your pick of the highly critical coverage.

Some of the negative commentary on Mahalo is probably due to professional and personal jealousy. Some of it is due to the fact that it’s early days yet. And I think some of it may be due to Jason happily telling people to look left while he goes right.

How can Jason raise money for Mahalo at valuations north of $100M? Surely there must be a revenue plan that holds water? If you want to argue that Mahalo is a failure and that Jason is simply a ceaseless self-marketer full of hot air, you’ll need to argue that some of the same things are true of Mahalo’s investors. Or maybe we’re in a bubble and they’ve all simply lost it.

Here’s what I think is going on.

Firstly, I think Jason is using a little smoke and mirrors when he calls Mahalo a search engine and frequently compares Mahalo’s “search” results to Google’s. With few exceptions, everyone seems to be buying it! With few exceptions, people compare Mahalo with Google – presumably because Jason tells them to and because he talks about being a search engine. And, with few exceptions, the technorati tell us that Mahalo is a pretty crappy search engine.

I agree, because Mahalo is not a search engine. Putting a box labeled “Search” on your web site to dig hits out of your own content does not make you a search engine – if it did, millions of sites would qualify. Passing queries off to Google and showing the results does not make you a search engine, either. Telling people to compare your content with Google’s results does not make you a search engine. Nor does putting the words “search engine” in your company’s strapline.

Mahalo will never be a search engine, and almost certainly does not want to be a search engine. That would be suicidal.

I believe their strategy is entirely different and that the relevant comparison is not with Google, but with Wikipedia.

Mahalo is a rapidly growing collection of carefully curated content. Mahalo is Wikipedia with a different model of control, ownership, and content creation. It’s a benevolent dictator with a purchase agreement instead of a loose anarchy with the GNU Free Documentation License.

If you want to compare Mahalo to something, compare it to Wikipedia. Jason is a huge fan of Wikipedia. And here he is begging Jimbo Wales not to leave $100M/yr on the table. Interesting.

Right now Mahalo has roughly 25K pages. Google has information on, let’s say, 10 billion pages. By this simplistic measure, Google is about 400,000 times bigger than Mahalo. You’re not going to catch or compete with Google using people to make content. Yes, you can use Google for things you don’t have static pages for, as Mahalo does. But Mahalo is not a search engine. Never will be.

Now consider Wikipedia. Wikipedia has 1.2M English pages. That means that, in English, Wikipedia is a mere 48 times larger than Mahalo! Now we’re talking. Mahalo are currently adding something like 1,000 pages a week. Suppose Jason manages to double that quite soon. That would be 100K pages a year, or about 8.3% of Wikipedia annually. So I think it’s conceivable that Mahalo could catch Wikipedia. Even if they keep a steady ship and only gain linearly they could easily be 35-40% the size of Wikipedia in 4 years’ time.
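For anyone who wants to poke at the arithmetic, here is a small Python sketch using the rough figures above. The constant names are mine, the page counts are the same ballpark numbers quoted in the text, and the projection assumes (unrealistically) that Wikipedia stays at its current size while Mahalo grows linearly.

```python
MAHALO_PAGES = 25_000
WIKIPEDIA_EN_PAGES = 1_200_000
GOOGLE_PAGES = 10_000_000_000
PAGES_PER_WEEK = 2_000  # double the current rate of roughly 1,000 a week

def size_ratio(bigger, smaller):
    """How many times larger one collection is than another."""
    return bigger / smaller

def mahalo_share_after(years, wikipedia_pages=WIKIPEDIA_EN_PAGES):
    """Mahalo's size as a fraction of Wikipedia's after linear growth.

    Assumes Wikipedia's page count does not change over the period.
    """
    pages = MAHALO_PAGES + PAGES_PER_WEEK * 52 * years
    return pages / wikipedia_pages

print(size_ratio(GOOGLE_PAGES, MAHALO_PAGES))        # Google vs Mahalo
print(size_ratio(WIKIPEDIA_EN_PAGES, MAHALO_PAGES))  # Wikipedia vs Mahalo
print(mahalo_share_after(4))                         # share after 4 years
```

Changing `PAGES_PER_WEEK` or passing a larger `wikipedia_pages` shows how sensitive the 4-year figure is to those two guesses.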

But sheer number of pages is only part of the story. Because the distribution of search requests will follow some kind of power law, you can pick up (say) half of all search requests by covering only a small number of the most popular topics, and, as always, leave the long tail to Google.

So with a small finite amount of work, you can cover a very large chunk of Wikipedia. And I think that’s exactly what Mahalo are aiming to do.
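To make the power-law point concrete, here is a rough Python sketch. It assumes requests follow a plain Zipf distribution (the topic of rank r gets weight 1/r^s, with exponent s = 1 by default). That is purely an assumption for illustration; I have no data on the real distribution of requests, and a different exponent changes the answer a lot.

```python
def zipf_coverage(top_k, n_topics, exponent=1.0):
    """Fraction of all requests that land on the top_k most popular topics."""
    weights = [1.0 / rank ** exponent for rank in range(1, n_topics + 1)]
    return sum(weights[:top_k]) / sum(weights)

def topics_needed(target, n_topics, exponent=1.0):
    """Smallest number of top topics whose combined share reaches `target`."""
    total = sum(1.0 / rank ** exponent for rank in range(1, n_topics + 1))
    running = 0.0
    for rank in range(1, n_topics + 1):
        running += 1.0 / rank ** exponent
        if running / total >= target:
            return rank
    return n_topics

# How many of 1.2M topics would cover half of all requests under this model?
print(topics_needed(0.5, 1_200_000))
```

Under this toy model the number that comes out is a tiny fraction of the 1.2M total, which is the whole point: the head of the distribution is cheap to cover.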

A few weeks ago I pulled down all of Mahalo’s URIs for another project. Here’s a tiny sample – and I really did pick this out at random:


http://www.mahalo.com/Valerie_Plame_Affair

http://www.mahalo.com/Violence_on_Television

http://www.mahalo.com/Violent_Crime_Rate

http://www.mahalo.com/Virginia_Tech_Report

http://www.mahalo.com/Voting_Machine_Controversy

http://www.mahalo.com/Walter_Reed_Army_Medical_Center

http://www.mahalo.com/War_Wounded

http://www.mahalo.com/Washington_D.C._Lobbying_Scandal

http://www.mahalo.com/Abdullah_Gul

http://www.mahalo.com/Alan_Garcia

http://www.mahalo.com/Alex_Salmond

http://www.mahalo.com/Angela_Merkel

So what? you might ask. Well, let’s replace www.mahalo.com with en.wikipedia.org/wiki in the above. We get:


http://en.wikipedia.org/wiki/Valerie_Plame_Affair

http://en.wikipedia.org/wiki/Violence_on_Television

http://en.wikipedia.org/wiki/Violent_Crime_Rate

http://en.wikipedia.org/wiki/Virginia_Tech_Report

http://en.wikipedia.org/wiki/Voting_Machine_Controversy

http://en.wikipedia.org/wiki/Walter_Reed_Army_Medical_Center

http://en.wikipedia.org/wiki/War_Wounded

http://en.wikipedia.org/wiki/Washington_D.C._Lobbying_Scandal

http://en.wikipedia.org/wiki/Abdullah_Gul

http://en.wikipedia.org/wiki/Alan_Garcia

http://en.wikipedia.org/wiki/Alex_Salmond

http://en.wikipedia.org/wiki/Angela_Merkel

And guess what? All those URIs actually work! See below for a possible reason for this uncanny coincidence.

Research question: what percentage of Mahalo URIs work as Wikipedia URIs with the above simple substitution? I may do this test when I get a little more time. I bet the answer is high.
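The substitution itself is only a few lines of Python. In this sketch the function names are mine, and `page_exists` is a hypothetical checker that you would have to supply (for instance, something that makes an HTTP request and looks at the status code); nothing here touches the network.

```python
from urllib.parse import urlsplit, urlunsplit

def mahalo_to_wikipedia(uri):
    """Rewrite http://www.mahalo.com/X as http://en.wikipedia.org/wiki/X."""
    parts = urlsplit(uri)
    if parts.netloc != "www.mahalo.com":
        raise ValueError("not a Mahalo URI: %r" % uri)
    return urlunsplit(
        (parts.scheme, "en.wikipedia.org", "/wiki" + parts.path,
         parts.query, parts.fragment)
    )

def fraction_working(mahalo_uris, page_exists):
    """Share of Mahalo URIs whose rewritten form passes page_exists().

    page_exists is a caller-supplied (hypothetical) function taking a URI
    and returning True or False.
    """
    uris = list(mahalo_uris)
    if not uris:
        return 0.0
    hits = sum(1 for uri in uris if page_exists(mahalo_to_wikipedia(uri)))
    return hits / len(uris)

print(mahalo_to_wikipedia("http://www.mahalo.com/Angela_Merkel"))
```

Feeding `fraction_working` the full list of Mahalo URIs and a real checker would answer the research question.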

Ask yourself again: does Mahalo look more like Google or more like Wikipedia?

The idea of Mahalo-as-search-alternative-to-Google is just Jason operating Mahalo in stealth mode in broad daylight. “Hey, Rocky, watch me pull a search engine out of my hat! Oops! That’s not a search engine. I swear there was a search engine in there somewhere.”

How is Mahalo different from Wikipedia?

A big one is that Mahalo owns all its content. If Mahalo puts one of your pages on its site, you’ll first sign a purchase agreement in which the

Seller hereby irrevocably sells, grants, assigns, conveys and transfers to Mahalo, exclusively and forever, Seller’s entire right, title and interest in and to the SeRPs

and in which you warrant that the content is legit, in which you fully indemnify Mahalo, and in which you agree to let them be your agent and attorney should they need to take some action to obtain or protect the content.

In consideration you get $10-$15 which you can have in cash. Or, in a wonderfully ironic and masterful gesture, you can have your earnings donated to the Wikimedia Foundation! That’s just brilliant, I love it. How can you not be in awe of that? The guy’s a genius.

Talking of genius, just look at the language on the payment details page at Mahalo: “A Greenhouse Guide begins their career in the Greenhouse…”. You see? Writing articles for Mahalo is the beginning of a career. George Lakoff would probably count that as a classic example of framing (also see here).

Unlike Wikipedia, Mahalo owns every word of its content. That means they can sell it. That they can be acquired. But who would want to acquire Mahalo? Wait.

What other differences are there between Wikipedia and Mahalo?

Another big one is the millions of links on the internet that point to Wikipedia pages. Those little tubules that make up the internets, with Google’s PageRank worming its way down each and every one, assigning and passing on credit.

There are two things here: 1) the links themselves and 2) the high consequent position Wikipedia’s pages have on Google.

Can Mahalo get large numbers of people to link to their pages? If the pages are any good (and they are), then why not? Plus, it may be that Mahalo can catch Wikipedia in terms of how many people link to them.

According to the Netcraft October 2007 Web Server Survey, the number of servers on the net has been growing at an amazing 5% per month!

That’s just the rate of increase of new servers, not the rate of new pages being put onto existing sites. Let’s assume the Netcraft server number isn’t too far from the overall growth, and that the web roughly doubles in size every two years. That means if the size today is X, then in 4 years, towards the end of Jason’s horizon, it will be size 4X. If so, there are 3X pages yet to come into existence. The creators of these will have a choice to point links at Wikipedia or Mahalo. If popular momentum can be shifted to Mahalo, it can grab a large chunk of the link pie graph.
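A quick Python sketch of the growth arithmetic, with function names of my own choosing. Note that a steady 5% per month, compounded, doubles in a bit over 14 months, so the doubling-every-two-years figure used above is the more conservative assumption.

```python
import math

def doubling_time_months(monthly_growth):
    """Months needed to double at a constant compound monthly growth rate."""
    return math.log(2) / math.log(1 + monthly_growth)

def growth_factor(years, doubling_years):
    """Size multiple after `years` if size doubles every `doubling_years`."""
    return 2 ** (years / doubling_years)

def share_yet_to_exist(years, doubling_years):
    """Fraction of the future web that does not exist today."""
    factor = growth_factor(years, doubling_years)
    return (factor - 1) / factor

print(doubling_time_months(0.05))  # doubling time at 5% per month
print(growth_factor(4, 2))         # the 4X figure in the text
print(share_yet_to_exist(4, 2))    # 3X out of 4X
```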

All of which brings us, inevitably, to Google.

Quick survey question: when you need to find something that you know you’d be happy to read in Wikipedia, do you first go to Wikipedia, find English (or your language), find their search box, enter your query, and click on the link? Or do you go to Google and take its Wikipedia link?

I thought so – you use Google. It’s a uniform way to get to things, it’s likely integrated into your browser, and they generally do a better and faster job of indexing sites’ content than the sites do themselves. So the existence and massive popularity of Wikipedia drives traffic to Google. And Google of course drives traffic to Wikipedia. The two of them are dating. But Wikipedia is not the perfect lover: they stubbornly refuse to put ads on their pages, to share the love. Along comes Jason Calacanis, then at AOL, to whom this is all very clear. He tells Wikipedia in no uncertain terms that with all that traffic they could make $100M per year from ads on just the home page. He points to a conservative estimate of the worth of Wikipedia at $600M, and his own estimate is $5B. Hmmmmm. What’s an entrepreneur to do when he sees someone leaving that much value on the table?

Back to Google. They would like to have more content. Traditionally, when you got back a page of their search results, you wouldn’t see links to pages on Google – that wouldn’t make sense: there were no pages on Google, after all. Google was supposed to point you to other pages. It was an index to help you find the things you actually wanted to look at. That was the old model. These days, Google is buying content (e.g., YouTube) and pointing their search results at their content, neatly taking the ad revenue in both places. All the better if the content comes with indemnification.

You can see where I’m going. Mahalo already does advertising with Google. In fact, they’re already a premium adsense publisher to the surprise of some. If ads on the single front page of Wikipedia could generate $100M annually, what could ads on all Mahalo pages generate if Mahalo grows to rival Wikipedia?

And… who weighs the importance of links (and other unknown factors) in Google’s results page? Yes, of course, Google does. According to this Fast Company article, Mahalo gets 65% of the revenue Google makes when Mahalo sends its users to Google. And Google makes money when it sends its users to Mahalo.

If there’s really (say) $1B of value to be had by building a successful commercial version of Wikipedia, you can see why Google might have some interest in nudging links to Mahalo a little higher in its results. Maybe even higher than the equivalent page for Wikipedia. Now would be a good moment to remember that I illustrated above just how trivial it can be to match up equivalent Mahalo and Wikipedia pages… Got it? User enters a query, Google does the search and finds a highly-linked Wikipedia page, then in an instant they can construct a link to the equivalent Mahalo page and display that instead, optionally displaying the Wikipedia page below the fold. Would that qualify as evil?

All of which leads to a very clear answer to my “who would want to acquire Mahalo?” question. Interestingly, Google will want to wait until Mahalo is big (they will know exactly when, supposing Mahalo keeps using adsense). They want Mahalo to be independent and with strong momentum before they turn the corporate intake valve in the Mahalo direction.

Can Jason build a viable alternative to Wikipedia? I bet he can. He has the lessons of Wikipedia. He doesn’t have the anarchy factor. He has no spam. He knows what he’s doing, and he’s in control. It’s a content play, and Jason is a content guy. An editor with a track record of building valuable content in this way. He’s playing to his strengths. The engineering is not nearly as daunting as building a Google. He has the money. As he ramps it up he’s going to have more money.

Who’s going to stop him? Certainly not Google – that’s not in their interest at all. Almost certainly not Wikipedia – unless they start putting up ads and funneling large amounts of money back to Google. And Jason is unlikely to shoot himself in the foot either – quite the reverse.

So if that’s the strategy, and if he’s on track with content (as he seems to be), and if the content is passably good, or better (which it is), and if he has a good understanding with his “friends at Google” (which you can bet he does — let’s not forget the Sequoia factor either), and if the revenue numbers are about right, then a $175M valuation for an upcoming round to accelerate things might look like a steal.

Along the way, Jason gets to have a quiet inner smile at all the people whining about how Mahalo is a crap search engine. He feeds the fire all the while, telling them to go ahead, make his day, and compare Mahalo’s results to Google’s (but not Wikipedia’s). Misdirecting attention towards Google and having people write him off probably suits him just fine. Meanwhile, they’re getting on with the real mission.

Flakey Twitter and the use of consecutive ids

Friday, November 16th, 2007

Twitter was just inaccessible for maybe a couple of hours. Prior to that there was a 9-day gap in their timeline, noticed by at least a few people. I quite regularly have twitters I send not show up at all.

I wonder what could be going on over there? Things certainly don’t feel very stable.

A friend signed up tonight. Using the Twitter API you can see her id. It’s a bit over 10 million. You can also see the id of her first twitter, a bit over 417 million. The earliest twitter available on the system is number 20 “just setting up my twttr” sent at 20:50:14 on Tue Mar 21 2006 by Jack Dorsey who has user id 12 (the lowest user I’ve seen).

Given that Twitter seem to be using consecutive ids for users and twitters, and that you can pull dates out of their API, it would be pretty easy to make graphs showing growth in users and twitters over time. You could probably also infer downtime by looking for periods when no twitters appeared. This would be pretty easy too. Beyond a certain point in time it would be very accurate (i.e., when there are so many twitters arriving that a twittering gap is suspicious), and you could calculate confidence estimates.
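As a rough Python sketch of the downtime idea: given a list of twitter timestamps (as integers, in seconds), flag any gap longer than some threshold. For the confidence estimate, the sketch assumes arrivals are a Poisson process with a known average rate, so that the chance of seeing no twitters for g seconds is exp(-rate * g). Both the function names and the Poisson model are my assumptions, not anything Twitter publishes.

```python
import math

def find_gaps(timestamps, threshold_seconds):
    """Return (start, end) pairs where consecutive timestamps are far apart."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > threshold_seconds
    ]

def gap_probability(rate_per_second, gap_seconds):
    """Chance of a silence at least this long arising naturally.

    Assumes Poisson arrivals at a constant rate; the smaller the result,
    the more a gap looks like downtime rather than a quiet spell.
    """
    return math.exp(-rate_per_second * gap_seconds)

# Invented timestamps, in seconds, with one suspicious silence.
print(find_gaps([0, 10, 20, 500, 510], threshold_seconds=60))
print(gap_probability(rate_per_second=0.1, gap_seconds=480))
```

In the early days the rate is so low that long gaps mean nothing; once twitters arrive every second or so, even a few minutes of silence becomes wildly improbable under this model.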

I don’t have time for all that though.

But I wonder if Google did something like that as part of their competitive analysis when they decided to buy Jaiku, or if Twitter’s investors did it, and how the numbers would match up with whatever Twitter management might claim. I’ve no idea or opinion at all about any of that btw. But I don’t think I’d be exposing all that information by using consecutive ids for users and their twitters.