Archive for October, 2007

Stagnant email address arms race

Wednesday, October 31st, 2007

I like arms races. But in an interesting arms race there’s frequent movement on both sides.

So I’m often surprised that the measures people and programs take to obscure email addresses haven’t changed much over the last 5(?) years.

There are still many software packages and web sites that do the bare minimum to obscure email addresses. For example, here’s a recent interesting posting from Vaughan Pratt on Simple Turing machines, Universality, Encodings, etc. The mailing list software is the extremely popular Mailman system. Vaughan’s email is “obscured” as pratt at cs.stanford.edu. That approach is so old that it can hardly be any more challenging for a spammer to harvest than if Mailman had simply included the actual address.

And Mailman is just one example. People do it too, using extremely transparent and repetitive schemes, like joe AT xyz DOT com.
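
To make the point concrete, here is a minimal sketch of how little work the two schemes above demand of a harvester. The function name `deobfuscate` and both regular expressions are illustrative assumptions, not taken from any real harvesting tool, and a real one would also have to filter out false matches such as ordinary uses of the word “at”.

```python
import re

def deobfuscate(text):
    """Hypothetical helper: undo the 'at' / 'AT ... DOT' schemes."""
    # "joe AT xyz DOT com" -> "joe AT xyz.com"
    text = re.sub(r"\s+dot\s+", ".", text, flags=re.IGNORECASE)
    # "pratt at cs.stanford.edu" -> "pratt@cs.stanford.edu"
    # Only fires when the right-hand side already looks like a dotted domain.
    return re.sub(
        r"(\w[\w.+-]*)\s+at\s+([\w-]+(?:\.[\w-]+)+)",
        r"\1@\2",
        text,
        flags=re.IGNORECASE,
    )

mailman_style = deobfuscate("pratt at cs.stanford.edu")
human_style = deobfuscate("joe AT xyz DOT com")
```

Two substitutions cover both examples, which is roughly the amount of effort these schemes cost an attacker.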

Given how much people dislike spam, how easy the above examples are to extract, and how creative humans can be, I find it amazing that the practice of obscuring email addresses has barely moved in the last few years. Do you suppose the spammers are standing still? Well, maybe they are, given the lack of advance on the obscuring side.

There is some ingenuity, like using terryblah@flahmydomain.com accompanied by an instruction (in English) to remove all instances of blah and flah to get the real address. Given that humans are so creative with language and that NLP doesn’t stand a snowball’s chance in hell, you’d think the humans would have little trouble staying ahead in this race. But right now I expect the address harvesters have the upper hand.

Here’s another example.

To get my personal email address, join the second to the last four letters of strawberry, add an at sign, add the tenth letter, then put on “on”, then a period. You get the final part by dropping the last letter of the acronym for Eastern Standard Time.

I.e., that’s “terry” plus “@” plus “j” plus “on” plus “.” plus “es”. Yes, this is overkill, but it illustrates how easy it is to create highly personalized but simple instructions for a human to follow that no program is ever going to handle. Even if an attack on the above could be automated, it’s clearly not worth the cost just to get one email address.
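
For the curious, the steps can be written out by hand in a few lines of Python. This is only a sketch of the decoding once a human has already read and understood the English, which is exactly the part a harvester can’t do. It assumes “the tenth letter” means the tenth letter of the alphabet, and the variable names are made up for illustration.

```python
import string

word = "strawberry"
local_part = word[1] + word[-4:]            # second letter joined to the last four letters
tenth_letter = string.ascii_lowercase[9]    # assumed: tenth letter of the alphabet
last_part = "EST".lower()[:-1]              # the acronym with its last letter dropped

address = local_part + "@" + tenth_letter + "on" + "." + last_part
```

Every line is trivial. The work is in turning one person’s free-form instructions into those lines, and that has to be redone for every address.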

Surely it’s time to move on.

Risk and reward, from the investor POV

Tuesday, October 30th, 2007

I’m very curious about the correlation of perceived early stage (seed, series A) startup risk and the eventual reward. That is, if you plotted a set of well-informed potential investors’ perception of a collection of startup companies’ risk against the eventual performance of those companies, what would the plot look like?

Would it be low-risk low-reward and high-risk high-reward? Would it be all over the place?

In an earlier post, the blind leading the blind?, I wrote about it being extremely hard (or impossible) to assess value. I’m sure that’s true in one-off cases, and that it’s true for entrepreneurs who by definition have relatively little practical experience with startups (sure, they may have read a lot, we’ve all read a lot – I mean in actually doing them).

But is the same true for investors? It’s certain that the bulk of investors get their results all over the map – some things that look safe end up worthless, some things that look like big risks end up paying off big time, and everything in between. If it were otherwise, the investment game wouldn’t be what it is – it would be much safer and more predictable.

One question I have relates to the relationship between extreme risk and extreme reward that I wrote about earlier. I.e., if risk and reward always go together, then you can’t reap a huge reward without taking huge risks.

But perception of risk is subjective. A run-of-the-mill VC might see something as hugely risky, not make the bet and so miss a huge payoff. But an exceptional VC, presented with the same investment opportunity, might see accurately that it in fact was destined to be big (i.e., consider it relatively low risk) and make the home-run investment. Do such VCs exist? If so, and we knew who they were, we’d be clamoring to pitch to them, to invest alongside them, to study their methods. Does Sequoia fall into this category? Or are they just lucky, or do they simply have access to better quality deals because of their track record, etc.?

That’s why I’d love to see a plot of perceived risk against reward. I think in general my feelings about entrepreneurs would also hold with investors – that the really rewarding things are strongly correlated with the really risky. An investor with that sort of profile doesn’t really know any more than the rest of us. But might there exist a class of investor who can look at things that are going to be hugely rewarding and not think that they’re also high risk?

Here’s another way to put it: Maybe risk and reward always go hand in hand, are always proportional, that to get high rewards you must take high risk. If so, a VC firm cannot possibly be sitting on the next Google (or whatever) unless they have one or more companies in their portfolio that scare the shit out of them. Or, might it be the case that while risk and reward appear to go hand in hand to many, there are a few superb investors who perceive risk differently and looking at their plots we’d see that they made tons of money by taking what looked to them like low-risk bets? Or maybe the entire premise is wrong and risk and reward actually aren’t well correlated, let alone perceived risk and reward.

This is a bit of a rambling post, I know. But it’s what I’m thinking of these days. I would at least know how to make the scatter plot I’m imagining, and it’s fun to speculate on its shape – both across multiple VCs and for them individually.

Succinct Python

Monday, October 29th, 2007

Apropos of nothing…

Having passed beyond the macho need to write obscure code, I’m not fond of
coding constructs that make me scratch my head. But I found this yesterday
in the Python Cookbook (2nd Ed.) p705.

    from itertools import izip
    def chop(iterable, length=2):
        return izip(*(iter(iterable),) * length)

It took me a few minutes to figure out exactly how it does what it does. Talk about succinct. It’s probably very efficient too.

One thing I really don’t like, and which is a chronic problem in perl, is reusing symbols for multiple purposes. In the above, the first * is unpacking a sequence into multiple arguments to izip, while the second * is multiplying (repeating) a tuple. Thankfully, Python is almost completely free of that problem.

There’s also the reliance on the precedence of the latter being higher than the former. I actually do approve of that – I think if you’re going to program seriously with a language you should at least have a fair grip on the precedence of its operators. Not doing so means your code winds up littered with unneeded parens. While it’s nice to be explicit, and “explicit is better than implicit” is one of the Python guidelines, the rules of precedence are already explicit. To put in parens where they’re not needed can make your code less easy to follow at a glance for someone who does know the language. That’s because when reading such code, you look at it more carefully, figuring that those parens must be there for some good reason, because the person who put them in obviously didn’t want the default precedence to apply. When you realize that they’re unnecessary, it’s frustrating and a worry, because you’ve just wasted time and you realize you’re reading the code of someone who either enjoys putting in unneeded syntax or doesn’t know the language well. And who wants to deal with either of those?

Anyway, even if you immediately know what the two * symbols are doing and about the precedence, it’s still nice to think all the way through the above. How/why does it work? When do the iterators stop? Who catches and deals with StopIteration? What happens if the length of iterable is not zero mod length?
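
If you’d rather poke at it than stare at it, here is a small sketch for experimenting. It is a Python 3 respelling, not the Cookbook’s code: itertools.izip no longer exists there, and the built-in zip is lazy in the same way. The sample inputs are arbitrary.

```python
def chop(iterable, length=2):
    # A single iterator object, referenced `length` times in one tuple.
    # zip pulls one item per reference in turn, so each output tuple is
    # made of `length` consecutive items from the same underlying stream.
    return zip(*(iter(iterable),) * length)

pairs = list(chop("abcdef"))
triples = list(chop(range(7), 3))   # 7 is not a multiple of 3
```

zip stops as soon as any of its arguments raises StopIteration, and since all the arguments are the same iterator, that happens when the input runs dry partway through building a tuple. Any leftover items already pulled for that incomplete tuple are silently dropped, which is why the final 6 never appears in triples.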

More old-fashioned telephone confusion

Sunday, October 28th, 2007

My 7-year-old daughter was here the other day and needed to make a call. She decided to use my landline, but found the old-fashioned phone (the one Telefonica installed, new, about 18 months ago) very confusing.

The handset had no numeric keypad on it. Plus, very weird, it had a cable. Then, when you picked up the phone unit, it had a cable on it too. So she couldn’t bring it to me to explain her difficulty. So I gave her instructions. Put the handpiece to your ear. Do you hear a high-pitched noise? (No, a woman’s voice. OK, hang up, now pick it up again.) Now push the buttons on the main unit.

All so old fashioned, though at least there were buttons involved, unlike when she ran into an old rotary phone.

There’s also a nice symmetry. When I first got a mobile phone, in 1999, Ana noticed that I always stopped when I was making a call. She had to point out to me that you can just keep walking, no need to stop. Now I have my daughter here trying to make a call and getting into trouble when she tries to walk across the room holding the fixed phone – to her great surprise she found that you can’t just walk around.

Which reminds me of the time she pointed to a bulky (switched off) monitor on my desk and asked what it was. I was surprised – she’d seen me using a computer thousands of times, but only ever a laptop. Now that flat screens are everywhere, she may never really run into an old computer with a CRT monitor. But she will probably remember having seen one, just like I remember having to use punch cards for one programming task (which they made us do at Sydney University, just so we’d know what it was like).

OK, nothing deep today. But I must feed the blog.

Cereal, coffee, bread

Saturday, October 27th, 2007

My company is fueled in large part by 3 ingredients: cereal, coffee, and bread. Mainly cereal.

In my kitchen there are 6 empty boxes of cereal stacked end to end. Kelloggs All Bran Fruta y Fibra if you must know. Nuts, dried fruit, sultanas. It’s crunchy. I like my milk super cold, I use tons of it, and I eat it fast (can’t stand soggy cereal).

I’ve eaten cereal pretty much every day of my life since I was probably ten years old.

Hmmm….. let’s say 50 spoonfuls of cereal per bowl (I have big bowls). Say 3 bowls a day. Say 350 days per year. 34 years. That’s about 1.8M spoonfuls of cereal.

On Andreessen on platforms

Friday, October 26th, 2007

[This is taken from my comment on Fred Wilson’s posting Andreessen on Platforms, in which he discussed Marc Andreessen’s posting The three kinds of platforms you meet on the Internet.]

I think Marc’s posting has two flaws. The first, which is serious, is that he didn’t put enough thought into it. The second, less of a problem, is that in several places it comes across as biased and a bit of a Level 3 sales pitch. I may be guilty of the former in what follows. Certainly my reply is a bit piecemeal – but there are only so many hours in the day.

In what follows, when I talk about “you”, I mean you the humble individual programmer.

Firstly, things become clearer if we categorize Marc’s Levels 1, 2 and 3 differently. Level 1 and 2 are two sides of the same coin:

  • Level 1: You write an app, and you call out to an API (a library of functions) that someone else has written.
  • Level 2: You write functions, and an app that someone else has written calls you (treats your code as a library function it can call).

To me these things are opposites. Within Level 2, there are two classes:

  • Level 2a: You write functions. An app that someone else has written calls your code, which runs on your server.
  • Level 2b: You write functions. An app that someone else has written calls your code, which runs on their server.

My Level 2b is what Marc calls Level 3. I’ll continue to use his terms.

Note that only in Level 1 are you really writing a full app. In Level 2 and 3 you’re writing functions that are called from an existing application (like Facebook or Photoshop) that you almost certainly didn’t write. To make you feel better, they give your functions pleasing names like “plug in” (Photoshop), “extension” (Firefox), and even “app” (Facebook).
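
The distinction is about who calls whom, and a toy sketch may make it clearer. Everything here is hypothetical (`third_party_api`, `HostApp`, `register`, `my_plugin`); none of it corresponds to a real Flickr, Facebook or Photoshop interface.

```python
# Level 1: you write the app, and you call someone else's API.
def third_party_api(query):
    # Stands in for a library or web service that you did not write.
    return f"result for {query}"

def my_app():
    return third_party_api("cats")


# Level 2 and 3: someone else wrote the app, and it calls your function.
class HostApp:
    """Stands in for the existing application that hosts plug-ins."""

    def __init__(self):
        self.plugins = []

    def register(self, func):
        self.plugins.append(func)
        return func

    def render_page(self, user):
        # The host decides when your code runs and what to do with the result.
        return [plugin(user) for plugin in self.plugins]


host = HostApp()

@host.register
def my_plugin(user):
    return f"box for {user}"

level_1_output = my_app()
level_2_output = host.render_page("ana")
```

Whether `my_plugin` executes on your server (Level 2) or on the host’s server (Level 3) changes where the code lives, not the direction of the call.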

To me that’s a more logical division of the 3 classes. I see no reason at all to call Level 1 a “platform”. You are writing an app. You’re calling someone else’s libraries – some of them will be local, some will be on the network. You’re not writing a platform. The only platform here is in the local OS of the machine your app is running on.

If we stop calling Level 1 a platform, it makes that word much less cloudy. That means that things like Photoshop, Firefox, and Facebook (Level 2), and Ning, Salesforce.com, and 2nd life (Level 3) all provide platforms for you. But Flickr, delicious, the Google maps API, etc., are not platforms and calling them that is just confusing. They’re just APIs or libraries that other apps can call (across the network, in these cases).

Next, virtually ALL applications in operation today are running in Level 3 platforms. Most of them run in the environment provided by operating systems.

Once you look at things that way, you see that the thing which is important is the runtime environment provided by the Level 3 platform you are already running on. Is it fast, secure, scalable, flexible, etc.? Can you write the kinds of things you want to write with it? Should you try something else?

I think Marc didn’t look at his Level 3 this way, or at least not clearly.

Now, traditionally in the field of computing, there has been a single main way of providing a platform. You provided a computer system — a mainframe, a PC operating system, a database, or even an ERP system or a game — that contained a programming environment that let people create and run code, plus an API that let them hook into the core system in various ways and do things.

The Internet — as a massive distributed system of many millions of internetworked computers running many different kinds of software — complicates things, and gives rise to three new models of platform that you see playing out in the Internet industry today.

I don’t think they’re all platforms, and I don’t think any of them are new :-)

But let me say up front — they’re all good. In no way do I intend to cast aspersions on what anyone I discuss is doing. Having a platform is always better than not having a platform, period. Platforms are good, period.

Hey, all platforms are great. But some are greater than others…

Level 1 is what I call an “Access API”.

This is undoubtedly a very useful thing and has now been proven effective on a widespread basis. However, the fact that this is also what most people think of when they think of “Internet platform” has been seriously confusing, as this is a sharply limited approach to the idea of providing a platform.

Do most people think of things like the Flickr API as being internet platforms? If it’s sharply limited (I agree), then please let’s not call it a platform.

What’s the problem? The entire burden of building and running the application itself is left entirely to the developer. The developer needs to provide her own runtime system, programming language, database, servers, storage, networking, bandwidth, and security, and needs to take responsibility for running all of the above — and then exposing the application to users. This is a very high bar in terms of both technical expertise and financial resources.

This is painting an overly bleak picture. Almost every application programmer on earth uses an off-the-shelf runtime system (e.g., an OS or a Java sandbox), off-the-shelf databases, servers, networking, etc. Yes they choose a programming language (as they do if they choose to use a Level 3 system). It’s work to pick these things out and combine them but that’s a very far cry from shouldering the _entire_ burden.

This is an example of what feels like salesmanship in Marc’s article. He’s right in general, but the way he puts it feels slanted.

As a consequence, you don’t see that many applications get built relative to what you’d think would be possible with these APIs — in fact, uptake of web services APIs has been nothing close to what you saw with previous widespread platforms such as Windows or the Mac.

And this isn’t a good comparison. It’s comparing use of a Level 1 API to use of what Marc later tells us is a Level 3 system (a traditional OS).

Because of this and because Level 1 platforms are still highly useful, notwithstanding their limitations, I believe we will see a lot more of them in the future — which is great. And in fact, as we will see, Level 2 and Level 3 platforms will typically all incorporate a Level 1-style access API as well.

Right. In fact Level 1 platforms (aka APIs) underpin all of Marc’s levels. Which is to say that even if he’s right, the Level 1 “platform” isn’t going away or lessening in importance – that’s because it’s not a platform at all. It’s an API, and libraries of functions exposed as APIs are useful things to have around. Likewise, APIs on the local OS aren’t about to go away either – in fact they’re crucial to the operation of the OS, just as they are to the operation of a Level 3 platform (which is also running in a Level 3 OS).

So Level 1 isn’t going anywhere, or getting less important.

When you develop a Facebook app, you are not developing an app that simply draws on data or services from Facebook, as you would with a Level 1 platform. Instead, you are building an app that acts like a “plug-in” into Facebook — your app literally shows up within the Facebook user experience, often as a box in the middle of a page that Facebook otherwise defines, such as a user profile page.

Here (as with Photoshop or Firefox), your code is like a library function you write that is called by another app. In this case, your code runs on your server, and the calling app (usually on another server, if it’s a web app) takes your results and displays them (often to a web browser).

Level 3 is what I call a “Runtime Environment”.

In a Level 3 platform, the huge difference is that the third-party application code actually runs inside the platform — developer code is uploaded and runs online, inside the core system. For this reason, in casual conversation I refer to Level 3 platforms as “online platforms”.

And here, your code is like a library function you write that is called by another app. In this case, your code runs on the platform’s server, and the calling app (on their server) takes your results and displays them (often to a web browser).

Obviously this is a huge difference from Level 2. And this difference — and what makes it possible — is why I think Level 3 platforms are the future.

And the past.

There follow a number of breathless paragraphs that describe exactly why it’s hard to build an OS, and what the advantages are once you manage it.

Then it’s acknowledged that yes, this is all… just like having an OS!

So those long paragraphs feel like Marc is either completely blind to an _extremely_ obvious and almost perfect analogy, or, like he’s a salesman trying out a snow job on just how incredibly amazing these totally new Level 3 platforms will be. It’s impossible to think #1, so I’m left feeling #2.

The Level 3 Internet platform approach is ironically much more like the computer industry’s typical platform model than Levels 2 or 1.

Back to basics: with a traditional platform, you take a computer, say a PC, with an operating system like Windows. You create an application. The application code runs right there, on the computer. It doesn’t run elsewhere — off the platform somewhere — it just runs right there — technically, within a runtime environment provided by the platform. For example, an application written in C# runs within Microsoft’s Common Language Runtime, which is part of Windows, which is running on your computer.

At which point you note that basically all programs already run in a Level 3 platform:

I say this is ironic because I’m not entirely sure where the idea came from that an application built to run on an Internet platform would logically run off the platform, as with Level 1 (Flickr-style) or Level 2 (Facebook-style) Internet platforms. That is, I’m not sure why people haven’t been building Level 3 Internet platforms all along — apart from the technological complexity involved.

But nothing is running “off platform”. It’s all already Level 3. Yes, there are differences in environment… coming up.

So who’s building Level 3 Internet platforms now?

First, I am — Ning has been built from the start to be a Level 3 platform.

Second, in a completely different domain, Salesforce.com is also taking a Level 3 platform approach.

Third, and again in a completely different domain, Second Life is a Level 3 platform.

Fourth, Amazon is — I would say — “sort of” building a Level 3 Internet platform with EC2 and S3. I say “sort of” because EC2 is more focused on providing a generic runtime environment for any kind of code than it is for building any specific kind of application — and because of that, there are no real APIs in EC2 that you wouldn’t just have on your own PC or server.

Ah, there’s a very interesting bias…

The generic traditional PC OS is a Level 3 platform, despite the fact that it’s not specifically geared towards any particular use. But EC2/S3 are somehow only sort of Level 3 precisely because they have the exact same property???

By this, I mean: Ning within our platform provides a whole suite of APIs for easily building social networking applications; Salesforce within its platform provides a whole suite of APIs for easily building enterprise applications; Second Life within its platform provides a whole suite of APIs for easily building objects that live and interact within Second Life. EC2, at least for now, has no such ambitions, and is content to be more of a generic hosting environment.

However, add S3 and some of Amazon’s other web services efforts to the mix, and you clearly have at least the foundation of a Level 3 Internet platform.

I might argue this the other way round. Things like Ning and 2nd life and Facebook are trying to be real Level 3 platforms to allow people to build a wide range of apps (i.e., 3rd party functions that they call), but they’re only “sort of” true Level 3 because they’re built for a specific purpose and so are only useful for that purpose – even if the purpose is broad, like “the” social network.

Things that are more generic, like EC2 and S3, are more like the generic computational environment provided by a traditional OS. And for that reason, one can expect them to be used for a wider range of applications (including standalone applications, not just code that lives within the Facebook or Ning world). For that reason you might expect that applications written against them will be longer-lived, as they will not die as fashion and coolness moves its fickle hand from MySpace to Facebook to Ning to…?

Would you buy a used Level 3 platform from this man?

Fifth and last, Akamai, coming from a completely different angle, is tackling a lot of the technical requirements of a Level 3 Internet platform in their “EdgeComputing” service — which lets their customers upload Java code into Akamai’s systems. The Java code then runs on the “edge” of the network on Akamai’s servers, and is distributed, managed, and secured so that it runs at scale and without stepping on other customers’ applications.

This is not a full Level 3 Internet platform, nor do I think Akamai would argue that it is, but there are significant similarities in the technical challenges, and it’s certainly worth watching what they do with their approach over time.

Why is it not a full Level 3 platform? Because it doesn’t have a particular focus?

I believe that in the long run, all credible large-scale Internet companies will provide Level 3 platforms. Those that don’t won’t be competitive with those that do, because those that do will give their users the ability to so easily customize and program as to unleash supernovas of creativity.

Oh my!

But having already said that Level 3 platforms will need underlying Level 2 and Level 1, it doesn’t seem like the Level 3 providers are driving the lesser levels out of the marketplace.

One might instead argue that it’s the Level 3 providers who are most likely to disappear. We’ve seen exactly that happen in the traditional Level 3 world (operating systems), while some applications and many great libraries hop happily from one Level 3 environment to the next.

I think there will also be a generational shift here. Level 3 platforms are “develop in the browser” — or, more properly, “develop in the cloud”. Just like Internet applications are “run in the browser” — or, more properly, “run in the cloud”. The cloud being large-scale Internet services run on behalf of users by large Internet companies and other entities. I think that kids coming out of college over the next several years are going to wonder why anyone ever built apps for anything other than “the cloud” — the Internet — and, ultimately, why they did so with anything other than the kinds of Level 3 platforms that we as an industry are going to build over the next several years — just like they already wonder why anyone runs any software that you can’t get to through a browser. Granted, I’m overstating the point but I’m doing so for clarity, and I’m quite confident the point will hold.

But everything _already_ runs “in the cloud” on a Level 3 platform. Your local OS has far more functionality, more speed, more libraries, more space, more flexibility, etc., for you to run your applications in. OK, I’m being a bit difficult, and understating the point. Maybe.

Now to the main point, which I think is valid, but which Marc doesn’t answer.

Before we had operating systems with all their benefits (see the long list of benefits Marc tells us will accrue from his Level 3 – ease of use! open source! buying and selling code that just runs!) a forward-looking person could have looked ahead and predicted the rise of the operating system. What sorts of programs, what supernovas of creativity might they have predicted?

Marc looks ahead…

A new platform typically enables a new set of applications that were not previously possible. Why else would there be a need for a new platform?

But: keep this in mind; look for the new applications that a new platform makes possible, as opposed to evaluating the new platform on the basis of whether or not you see older classes of applications show up on it right away.

But gives us no examples at all.

I’m extremely interested in this. What will these applications be?

Is it true that what we can build with these future systems is not “possible” without them? Or just not feasible? Where does their extra power come from? I think it’s NOT principally from the great diversity of apps that can be written to run on these platforms, but from what you gain by having a large number of apps running in the _same environment_ – be it in an OS with a file system, a process subsystem and communicating processes, or a Level 3 internet platform with whatever it provides.

In the fullness of time, whenever that is, we may see the rise of truly open internet Level 3 platforms that will challenge the well-funded closed commercial ones. Meanwhile, I’m happy to _only_ be working away at Level 1.

The value of APIs to startups

Friday, October 26th, 2007

[This is pulled from my comments and questions on Fred Wilson’s posting Every Product Is A Platform on September 10, 2007]

My question to VCs and others is where you see value in having others build on an API. I can see some arguments – visibility and branding, pushing maturity of the API, giving you an under-the-radar tap with which you can experiment with increasing traffic, maybe giving you ideas for products (if you’re the kind to take that route), finding (and then hiring) good hackers who love your product. These are all indirect benefits. I’m curious about why, from an investor’s POV, there’s value in having others build on the API. There are 250+ things built on the del.icio.us API. Were they of value? Did they increase revenue in any direct way? If you argue that there’s great direct value, can I therefore walk into your office, claim that thousands of people will write apps using my API and argue for a massive valuation? :-)

Do any of the companies offering an API have a strategy for monetizing it, or simply recouping costs for bandwidth, servers, etc.? Sure, the exposure is great. But, as I was once taught, you can die from over-exposure.

Here’s another way of looking at my question: if API traffic is 10x bigger than interactive web traffic, then just 1/11th of Twitter’s computing resources are being used to support their (arguably) most important customers. Maybe the site could have been many times faster if they had opened up API usage more slowly. I found the Twitter web interface unusably slow in the first 6 months after I heard about it – a feeling that many shared. Is that because they were actually using 90% of their resources supporting apps they didn’t write and didn’t benefit (directly, financially) from? That’s a very delicate line to choose to walk. At that level of diverting resources from normal users, there’s a huge risk of blowing it. Hence my question about value. Sure, the 3rd party apps are cool and exciting – but are they so important that it makes sense to give your front-line customers a miserable time by making your service extremely slow?
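
The 1/11th figure is just the ratio turned into shares of the total. A quick sketch of the arithmetic, assuming (as the argument above does) that resource use scales in proportion to traffic; the function name `traffic_shares` is made up for illustration.

```python
from fractions import Fraction

def traffic_shares(api_to_web_ratio):
    """Return (web_share, api_share) of total traffic for a given API:web ratio."""
    ratio = Fraction(api_to_web_ratio)
    web_share = 1 / (1 + ratio)      # web is 1 part out of (1 + ratio) parts
    return web_share, 1 - web_share

web_share, api_share = traffic_shares(10)
api_percent = round(float(api_share) * 100, 1)
```

So a 10:1 ratio leaves 1/11 for the web site and 10/11 for the API, a little over the 90% quoted above.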

To go to another extreme, imagine releasing an API that was so powerful that thousands of people wrote to it, but which had no user-facing component. How is that going to make you money unless you charge for it? E.g., Amazon’s S3. If you charge, like Amazon, I understand the model. If you don’t charge and the API is eating 90% of your resources, you may be shooting yourself in the foot rather severely.

It’s an interesting problem. As I said earlier, I agree with you that if you can do it, product should drive platform. Twitter could have followed that route, but apparently went the other way round. Or maybe things were just totally out of control and they unexpectedly found themselves in this 10:1 situation.

One thing’s for sure, if you’re using 10/11ths of your resources on your (non-paying) API customers, you should definitely make sure the rest of the world knows about it :-)

Promoting comments

Friday, October 26th, 2007

While this blog was out of action July-September 2007 I made some apparently “long” comments on other blogs. I’m going to pull them and post them here. While a comment in someone’s high-traffic blog is probably going to get more attention than a top-level posting here, I want to have my own thoughts in one place, not spread thinly over the blogosphere. Or, if you know me at all, I want them in both places at the same time.

Hence Fluidinfo.

Twisting the towel

Friday, October 26th, 2007

Russell and I met with Esther Dyson (a Fluidinfo investor) recently. After she’d listened to our presentation and seen the latest demo, she said that we’d “given the towel another half twist” and that we should carry on twisting.

She was referring to the process of tightening up and focusing company vision, strategy, business plan, etc.

I liked the analogy a lot. Twisting a wet towel is fun. It’s hard work, and it gets harder. But it’s surprising and satisfying to see just how much water you can get out of the thing before you let nature take its course and finish the job.

It also applies to writing documents. I spent most of 2005 writing a proposal to start a research institute for the computational study of infectious diseases (still in the works, though I’m no longer directly involved). Thanks to the repeated insistence of Derek Smith in Zoology at Cambridge, the document went through about 5 iterations, each more painful and difficult than the previous. It drove me nuts. But it was amazing how much better the thing became at each round, and the end result was hugely satisfying.

I’m going through the same process now with Fluidinfo as we prepare to raise our first round of outside financing. Putting together a slide show, executive summary, and demo is a ton of work. I’ve been round the loop a few times already. Earlier tonight I gave a presentation to Vicente López, general manager of the Barcelona Media Centre for Innovation. He poked holes in the presentation from start to finish. I took notes.

So I just spent the last 6 hours slowly twisting the towel. As a result the presentation is much improved. I figure we still have a couple of half twists left to do.

Meanwhile, I’ve paused to reward myself by knocking off today’s blog entry.

The opposite of shared nothing?

Thursday, October 25th, 2007

Shared nothing architectures are all the rage. And so, a little geeky joke to lighten the mood around here:

Q: What’s the opposite of shared nothing?

A: Shard nothing.

Look it up.

Target cheat sheet

Thursday, October 25th, 2007
B O H
C N K
R E W

I have a friend who sends me the Target puzzle from the Sydney Morning Herald every day. I’ve loved doing anagrams for as long as I can remember. I used to write many programs to process words for fun. At Waterloo I made lots of silly dictionaries with my friend Andrew Hensel. We used to make anagram dictionaries for fun, reference, and memorization.

Anyway, I decided to whip up an anagram dictionary maker in Python. Here’s the code:


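#!/usr/bin/env python3
import sys
from collections import defaultdict

# Map each sorted-letter key to the list of words that are anagrams of it.
words = defaultdict(list)

print('<html><head><title>Target cheat sheet.</title></head><body>')

# Read one word per line from stdin, dropping the trailing newline.
for word in (line.rstrip('\n') for line in sys.stdin):
    words[''.join(sorted(word.lower()))].append(word)

# Emit each key in alphabetical order, followed by its anagrams.
for letters in sorted(words):
    print('<strong>%s</strong> = ' % letters)
    for word in sorted(words[letters], key=str.lower):
        print(word)

print('</body></html>')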

I built the dictionary with the shell command

awk 'length($0) == 9 {print}' /usr/share/dict/web2 | ./anagram-dict.py > target.html

and you can see the result here. To use it, you take your anagram, sort its letters, and look up the result in that web page. For example, today’s anagram is “bohcnkrew”. Sorting those 9 letters we get “bcehknorw”. Looking at the results page (use the Find function in your browser!) we see two answers: “benchwork” and “workbench”.
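
The same lookup can be done directly in Python instead of searching the web page. What follows is only a minimal sketch: the three-word list is made up to stand in for web2, and the helper names anagram_key and build_index are illustrative, not part of the script above.

```python
from collections import defaultdict


def anagram_key(word):
    """Return the word's letters, lowercased and sorted, as one string."""
    return ''.join(sorted(word.lower()))


def build_index(words):
    """Group words by their sorted-letter key."""
    index = defaultdict(list)
    for word in words:
        index[anagram_key(word)].append(word)
    return index


# A tiny hypothetical word list standing in for /usr/share/dict/web2.
sample_words = ['workbench', 'blackbird', 'benchwork']
index = build_index(sample_words)

# Sort the puzzle's letters, then look the key up in the index.
key = anagram_key('bohcnkrew')
print(key)
print(sorted(index[key]))
```

With this sample list, the key printed is the sorted form of the puzzle letters and the lookup returns the two anagrams that share it. Any word whose key is absent simply yields an empty list, since the index is a defaultdict.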

Pocket battleship

Wednesday, October 24th, 2007

Last night I dreamed that Peter Roebuck, my cricket coach in 1981, had written an extremely long article about me, full of color pictures. Weirdly though, he had spelled my name “Terry Jun”, and so there were all these people trying to find out who the hell Terry Jun was.

What does this mean?

Roebuck had captained the Somerset county side, opened the batting, and was rumored (at least among our bunch of 17 year olds on the other side of the planet) to be under consideration for the England XI. He played with Viv Richards and Ian Botham and was full of stories.

Later I played with him in a local side. Other teams were always thrilled and despairing to hear that we had a borderline international batsman in our line up. In one case, this time not a dream, we played a side containing the then-famous Australian playwright Alex Buzo. As it turned out, Buzo was mad keen on cricket and had been long looking forward to the day when he might bowl to the famous Peter Roebuck. In the meantime, I had been looking forward to the day when I might bat against Alex Buzo, and was determined to hit him out of the park.

I prevailed, hitting him for consecutive sixes (yes, of course I tried for 3 in a row) and finishing the game – before Roebuck had a chance to bat! I was delighted, in fact am still delighted, that Buzo later wrote an article in a widely-read cricket magazine describing the time he almost bowled to the great Peter Roebuck. But, he wrote, the other team had sent in a “pocket battleship” who had given no quarter and denied him his chance.

Now I read with sadness that Alex Buzo is dead, and I wish things had been a little bit different. Maybe just one six, then a quick single, and then he could have bowled Roebuck for a golden duck.

Live from the Gatwick Express

Tuesday, October 23rd, 2007

I’ve flown in to the UK probably 30 times over the last 3 years. Here’s how
to cut a few corners:

  1. It should go without saying – don’t check baggage unless you have to.
  2. If you’re not checking baggage, print your boarding passes at home and
    go straight to the gate.
  3. Take a seat at the very front of the plane.
  4. Ask for and fill out a UK immigration form on the plane.
  5. On Easyjet you can buy Gatwick or Stansted (etc) Express train tickets
    on the plane. It’s faster and 20% cheaper.
  6. If arriving at Gatwick, take the stairs down to immigration –
    don’t follow the crowd walking down and around the gentle ramp – that’s
    about 4 times as long.
  7. If arriving at Stansted, when you go down the escalator look under and
    behind the elevator as early as you can – the shuttle train may be there
    already. You may be able to run for it.
  8. If arriving at Stansted, when you get on the shuttle train, position
    yourself in front of the crack between the doors on the far side of the
    train. That’s the side that will open when you arrive at immigration.
  9. If arriving at Gatwick without bags, go up the escalator to the
    left to get to baggage. It’s slightly closer to where you’ll exit
    at the top.
  10. If arriving at Gatwick, take the side lane “Arrivals from the European
    Union” out of the baggage area. It’s often empty.
  11. If arriving at Gatwick, there’s a small train ticket desk on your right
    immediately when you get into the terminal. Even if you have a ticket,
    check their screen for the time of your next train. You may need to run for it.

Perhaps the best suggestion is something I haven’t done yet – have an iris
scan that will let you skip the queues to enter the UK. I could have done
it today if I hadn’t been wandering aimlessly around the stores.

Nova Spivack really gets it

Tuesday, October 23rd, 2007

Usually when I hear about the thinking behind new web technology I dismiss it pretty quickly. That’s not because I dislike what people are doing or find it uninteresting; I just find that almost everything is some kind of application built on an old framework. I’m much more interested in trying to change the framework itself.

I’ve been aware of Radar Networks for some time. I talked to Tim O’Reilly about Fluidinfo in March 2007, and he compared what I was saying to Nova’s claims for Radar. Now that Radar have released Twine, I’ve gone and read some of Nova’s blog postings. I probably should have done that ages ago.

It turns out we agree on many things. Here’s one in particular: in an article entitled Understanding The Semantic Web: A Response to Tim O’Reilly’s Recent Defense of Web 2.0, he has a section entitled “THE SEMANTIC WEB IS THE DATA WEB”, which corresponds nicely to my posting on why data (information representation) is the key to the coming semantic web.

That’s pretty refreshing. And there’s more, including well-aligned and practical thinking about the word “semantic” and various other words.

I may say more in another posting.

Orwell on intellectuals

Monday, October 22nd, 2007

One of several things I admire about Orwell is that he doesn’t pull any punches and he turns his guns on all comers. Here’s a nice passage from a 1943 review of Beggar My Neighbour by Lionel Fielden.

In the last twenty years western civilization has given the intellectual security without responsibility, and in England, in particular, it has educated him in scepticism while anchoring him almost immovably in the privileged class. He has been in the position of a young man living on an allowance from a father whom he hates. The result is a deep feeling of guilt and resentment, not combined with any genuine desire to escape. But some psychological escape, some form of self-justification there must be, and one of the most satisfactory is transferring nationalism. During the nineteen-thirties the normal transference was to soviet Russia, but there are other alternatives, and it is noticeable that pacifism and anarchism, rather than Stalinism, are now gaining ground among the young. These creeds have the advantage that they aim at the impossible and therefore in effect demand very little. If you throw in a touch of oriental mysticism and Buchmanite raptures over Gandhi, you have everything that a disaffected intellectual needs. The life of an English gentleman and the moral attitudes of a saint can be enjoyed simultaneously.

And there’s more.

Resurrection

Monday, October 22nd, 2007

Today I resurrected as many of my old postings as I could find. I think I have about half. I’m still saddened by the loss of all those words. I can never believe it when I hear of writers who burn things, throw them away, etc. I even keep scraps of paper from 20 or more years ago that I wrote on. I don’t know why I place such value on simple words, but I do.

Anyway, I missed my blog. I miss some of the postings that are now gone forever.

I’m going to blog every single day, at least for today. Watch me.

I’m back

Sunday, October 21st, 2007

Well, I’m back.

The previous instantiation of this blog was washed away in a storm in early August. A server got hacked and in my hurry to have it decommissioned I forgot to pull out the MySQL database for my blog. I’m still annoyed at myself – partly because it’s so public and basic an error, but mainly because I care so much about words and now all those words are gone. A recovery operation using the Google cache and the Wayback Machine got me about half of the posts back. I may add them here at some point. I’m pissed that I lost so much stuff, and there’s no-one to blame but myself.