
Amants de Lulú

15:27 February 29th, 2008 by terry. Posted under barcelona, music. | 3 Comments »

I’ve not been blogging much recently, but that may change before long.

Today I was having a nap and awoke to hear music on the street downstairs. I went down to take a look, and took the camera along for your benefit. I should film more of this sort of thing; I like it so much, and there are lots of groups playing around here. But I’m often sitting around working in my slippers, and I figure the music will be over before I can get down there.

Here’s Amants de Lulú playing about 50 meters from my front door. I hope they don’t mind appearing on YouTube. I bought their self-titled CD for €10 after filming them for these few minutes.


Keynote is good

19:15 February 15th, 2008 by terry. Posted under tech. | 2 Comments »

[Image: roman numerals in Keynote]

I’ve been playing with Keynote to make a presentation. There are a lot of things I don’t really like about using a Mac, but Keynote is not one of them.

It makes really attractive presentations. It’s easy to use. The help actually helps. You can export to multiple formats (QuickTime, PowerPoint, PDF, images, Flash, HTML, iPod).

And, it’s fun to use. I’m going to miss it when I head back to Linux.



Worst of the web award: Cheaptickets

16:22 February 14th, 2008 by terry. Posted under companies, me, tech. | 10 Comments »

Here’s a great example of terrible (for me at least) UI design.

I was just trying to change a ticket booking at Cheaptickets. Here’s the interface for selecting what you want to change (click to see the full image).

[Image: the Cheaptickets change form]

As you can see, I indicated a date/time change on my return flight. When I clicked on the continue button, I got an error message:

An error has occurred while processing this page. Please see detail below. (Message 1500)

Please select flight attributes to change.

I thought there was some problem with Firefox not sending the information that I’d checked. So I tried again. Then I tried clicking a couple of the boxes. Then I tried with Opera. Then I changed machines and tried with IE on a windows box. All of these got me the exact same error.

I looked at the page several times to see if I’d missed something – like a check box to indicate which of the flights to change. I figured Cheaptickets must have a server-side error. Then I thought, come on, you must be doing something wrong.

Then I figured it out. Can you?


How the democrats could blow it. Again.

11:06 February 14th, 2008 by terry. Posted under politics. | 2 Comments »

The race to be the US democratic nominee is pretty interesting. As has been pointed out though, it’s going to come down to what the superdelegates decide to do. Here’s a timeline / recipe for the Democrats to really blow it.

  1. Obama’s popularity and momentum continue to climb. The popular vote is with him.
  2. Hillary is unable to accept that she’s not the chosen one, and refuses to concede.
  3. Instead, she and Bill pull strings and twist arms to get the superdelegates to vote for her.
  4. The superdelegates cave, and Hillary wins the nomination despite losing the popular vote. The irony is unbearable.
  5. Regular democratic voters are absolutely infuriated, the party is split, and blacks, the young, and all sorts of other Obama voters don’t bother to vote.
  6. John McCain is elected president.

I think that’s a not-unlikely scenario. And wouldn’t it be perfect? Once again, the triumph of politics-as-usual, money and influence over hope, change, fresh air, and the will of the people. Just what the jaded populace needs! The republicans, cashing in on the never-ending fumblings of the craven democrats. Hillary calling for party unity, for us to put our differences behind us, for one and all to support her, now that she’s been democratically elected as the nominee. Plus ça change.

I can see it all so vividly. It makes so much absurd sense and is so historically fitting that it must be inevitable.

One can always hope that the superdelegates will tell Hillary and Bill to take a hike. I wouldn’t bet on it though.

I think it’s more likely that by then Hillary’s campaign will have had the good sense to collapse around her.


The power of representation: Adding powers of two

17:42 February 13th, 2008 by terry. Posted under representation, tech. | 5 Comments »

[Image: an addition problem, in decimal]

On the left is an addition problem. If you know the answer without thinking, you’re probably a geek.

Suppose you had to solve a large number of problems of this type: adding consecutive powers of 2 starting from 1. If you did enough of them you might guess that 1 + 2 + 4 + … + 2^(n−1) is always equal to 2^n − 1. In the example on the left, we’re summing from 2^0 to 2^10 and the answer is 2^11 − 1 = 2047.

And if you cast your mind back to high-school mathematics you might even be able to prove this using induction.

But that’s a lot of work, even supposing you see the pattern and are able to do a proof by induction.

[Image: the same sum, in binary]

Let’s instead think about the problem in binary (i.e., base 2). In binary, the sum looks like the image on the right.

There’s really no work to be done here. If you think in binary, you already know the answer to this “problem”. It would be a waste of time to even write the problem down. It’s like asking a regular base-10 human to add up 3 + 30 + 300 + 3000 + 30000, for example. You already know the answer. In a sense there is no problem because your representation is so nicely aligned with the task that the problem seems to vanish.
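If you’d rather check than prove, here’s the same observation as a couple of lines of Python (purely a sanity check of the arithmetic above):

    n = 11
    total = sum(2 ** i for i in range(n))  # 1 + 2 + 4 + ... + 2**10

    print(total, 2 ** n - 1)  # 2047 2047 -- the closed form holds
    print(bin(total))         # 0b11111111111 -- eleven 1s: in binary the "problem" is already the answer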

Why am I telling you all this?

Because, as I’ve emphasized in three other postings, if you choose a good representation, what looks like a problem can simply disappear.

I claim (without proof) that lots of the issues we’re coming up against today as we move to a programmable web, integrated social networks, and as we struggle with data portability, ownership, and control will similarly vanish if we simply start representing information in a different way.

I’m trying to provide some simple examples of how this sort of magic can happen. There’s nothing deep here. In the non-computer world we wouldn’t talk about representation, we’d just say that you need to look at the problem from the right point of view. Once you do that, you see that it’s actually trivial.


Talking about Antigenic Cartography at ETech

13:34 February 12th, 2008 by terry. Posted under me, tech. | Comments Off on Talking about Antigenic Cartography at ETech

[Image: ETech 2008 “I’m speaking” badge]

Blogs are all about self-promotion, right? Right.

I’m talking at ETech in the first week of March in San Diego. The talk is at 2pm on Wednesday March 3, and is titled Antigenic Cartography: Visualizing Viral Evolution for Influenza Vaccine Design.

You can find out more about Antigenic Cartography here and here.

Here’s my abstract:

Mankind has been fighting influenza for thousands of years. The 1918 pandemic killed 50-100 million people. Today, influenza kills roughly half a million people each year. Because the virus evolves, it is necessary for vaccines to track its evolution closely in order to remain effective.

Antigenic Cartography is a new computational method that allows a unique visualization of viral evolution. First published in 2004, the technique is now used to aid the WHO in recommending the composition of human influenza vaccines. It is also being applied to the design of pandemic influenza vaccines and to the study of a variety of other infectious diseases.

The rise of Antigenic Cartography is a remarkable story of how recent immunological theory, mathematics, and computer science have combined with decades of virological and medical research and diligent data collection to produce an entirely new tool with immediate practical impact.

This talk will give you food for thought regarding influenza, and move on to explain what Antigenic Cartography is, how it works, and exactly how it is used to aid vaccine strain selection—all in layman’s terms, with no need for a biological or mathematical background.

In case you’re wondering, no, I didn’t go so far as to make the “I’m speaking” image above. I chose it from the conference speaker resources. Self-promotion has its limits.


Talking at TTI/Vanguard Smart(er) Data conference

15:51 February 11th, 2008 by terry. Posted under me, representation. | 6 Comments »

[Image: TTI/Vanguard]


I’ve been invited to speak at a TTI/Vanguard conference in Atlanta on SMART(ER) DATA on Feb 20/21.

Here’s the abstract.

Representation, Representation, Representation

In this talk I will argue for the importance of information representation. Choice of representation is critical to the design of computational systems. A good choice can simplify or even eliminate problems by reducing or obviating the need for clever algorithms. By making better choices about low-level information representation we can broadly increase the power and flexibility of higher-level applications, and also make them easier to build. In other words, we can more easily make future applications smarter if we are smarter about how we represent the data they manipulate. Despite all this, representation choice is often ignored or taken for granted.

Key trends in our experience with online data are signalled by terms such as mashups, data web, programmable web, read/write web, and collective databases and also by the increasing focus on APIs, inter-operability, transparency, standards, openness, data portability, and data ownership.

To fully realize our hopes for future applications built along these lines, and our interactions with the information they will present to us, we must rethink three important aspects of how we represent information. These are our model of information ownership and control, the distinction between data and metadata, and how we organize information. I will illustrate why these issues are so fundamental and demonstrate how we are addressing them at Fluidinfo.


My twitter stats

09:45 February 11th, 2008 by terry. Posted under me, twitter. | Comments Off on My twitter stats

[Image: my Twitter stats]

I seem to be done with Twitter, at least for now. The graphic shows my monthly usage graph (courtesy of tweetstats) – click for the full-sized image.

I do find Twitter valuable, but I don’t want to spend time on it. It’s a bit like TV or video games for me – I quite enjoy those things, but there are almost always better things to do.

I’ll probably subscribe to some form of Twitter alert or digest at some point. I do find it useful to know when people are coming to Barcelona. But I don’t want to monitor Twitter. Like IM, it’s too distracting, and I waste too much time just going to check if anything’s new.

Etc.


Social Graph foo camp was a blast

00:11 February 9th, 2008 by terry. Posted under companies, travel. | Comments Off on Social Graph foo camp was a blast

[Image: Foo Camp logo]

I spent last weekend at the Social Graph Foo Camp held on the O’Reilly campus in Sebastopol, CA. The camp was organized by David Recordon and Scott Kveton, with sponsorship from various companies, especially including O’Reilly. I was lucky enough to have my airfare paid for, so lots of thanks to all concerned for that.

The camp was great. Very few people actually camped, almost everyone just found somewhere to sleep in the O’Reilly offices. Many of us didn’t sleep that much anyway.

There’s something about the modern virtual lifestyle that so many of us lead that leaves a real social hole. It’s been about 20 years since I really hung out at all hours with other coders. It’s something I associate most strongly with being an undergrad, with working at Micro Forté, and then with doing a lot of hacking as a grad student at the University of Waterloo in Canada.

So even though it was just 48 hours at the foo camp, it was really great. It’s not often I have the pleasurable feeling of being surrounded by tons of people who know way way more than I do about almost everything under discussion. That’s not meant to sound arrogant – I mean that I don’t get out enough, and I don’t live in SF, etc. It’s nice to have spent many years hanging around universities studying all sorts of relatively obscure and academic topics, and sometimes you wonder what everyone else was doing. Some of those people spent the years hacking really deeply on systems, and their knowledge appears encyclopedic next to the smattering of stuff I picked up along the way. It’s nice to bump into a whole bunch of them at once. It was extremely hard to get a word in during many of the animated conversations, which reminded me at times of discussions at the Santa Fe Institute. That’s a bit of a pain, but it’s still far better than some alternatives – e.g., not having a room full of super-confident, deeply knowledgeable people who all want to have their say, even if that means trampling all over others, ignoring what the previous speaker said, not leaving even 1/10th of a second of conversational gap, and just plain old bulldozing on while others try to jump in and wrest away control of the conversation.

I could write much more about all this.

I also played werewolf with up to 20 others on the Saturday night. In some ways I don’t really like the game, but it’s fun to sit around with a bunch of smart people of all ages who are all trying to convince each other they’re telling the truth when you know for sure some are lying. I was up until 4:30am that night. I went to the office I slept in on the Friday night, but found it had about 10 people still up, all talking about code. When I got up at 8am the next morning, they were all still there, still talking about code. I felt a bit guilty, like a glutton, for allowing myself three and a half hours sleep. Nice.


S3 numbers revisited: six orders of magnitude does matter

17:18 January 29th, 2008 by terry. Posted under companies. | 3 Comments »

OK…. I should have realized in my original posting that the Oct 2007 10,000,000,000,000 objects figure was the source of the problem. I knew S3 could not be doubling every week, and that Amazon could not be making $11B a month, but didn’t see the now-obvious error in the input.

So what sort of money are they actually making?

Don MacAskill pointed me to this article at Forbes which says the number of objects at the end of 2007 was up to 14B from 10B in October. So let’s suppose the number now stands at 15B (1.5e10) and that Amazon are currently adding about 1B objects a month.

I’ll leave the other assumptions alone, for now.

Amazon’s S3 pricing for storage is $0.15 per GB per month. Assume all this data is stored on their cheaper US servers and that objects take on average 1K bytes. So that’s roughly 1.5e10 * 1e3 / 1e9 = 1.5e4 gigabytes in storage, for which Amazon charges $0.15 per month, or $2250.

Next, let’s do incoming data transfer cost, at $0.10 per GB. That’s simply 2/3rds of the data storage charge, so we add another 2/3 * $2250, or $1500.

Then the PUT requests that transmit the new objects: 1B new objects were added in the last month. Each of those takes a PUT, and these are charged at $0.01 per thousand, so that’s 1e9 / 1e3 * $0.01, or $10,000.

Lastly, some of the stored data is being retrieved. Some will just be backups, and never touched, and some will simply not be looked at in a given month. Let’s assume that just 1% of all (i.e., not just the new) objects and data are retrieved in any given month.

That’s 1.5e10 * 1e3 * 0.01 / 1e9 = 150 GB of outgoing data, or 0.15 TB. That’s much less than 10TB, so all this goes out at the highest rate, $0.18 per GB, giving another $27 in revenue.

And if 1% of objects are being pulled back, that’s 1.5e10 * 0.01 = 1.5e8 GET operations, which are charged at $0.01 per 10K. So that’s 1.5e8 / 1e4 * $0.01 = $150 for the GETs.

This gives a total of $2250 + $1500 + $10,000 + $27 + $150 = $13,927 in the last month.

And that doesn’t look at all like $11B!

Where did all that revenue go? Mainly it’s not there because Amazon only added 1e9 objects in the last month, not 1e15. That’s six orders of magnitude. So instead of $11B in PUT charges, they make a mere $11K. That’s about enough to pay one programmer.

I created a simple Amazon S3 Model spreadsheet where you can play with the numbers. The cells with the orange background are the variables you can change in the model. The variables we don’t have a good grip on are the average size of objects and the percentage of objects retrieved each month. If you increase average object size to 1MB, revenue jumps to $3.7M.
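If you’d rather poke at the model in code than in a spreadsheet, here’s a rough Python equivalent. The function name and its defaults are mine; the defaults are the assumptions above, and, like the post, it charges incoming transfer on all stored data (the “2/3 of the storage charge” shortcut carried over from the original post) rather than on just the new objects:

    def s3_monthly_revenue(
        objects_stored=1.5e10,   # total objects in S3 (assumed)
        objects_new=1e9,         # objects added this month (assumed)
        avg_object_bytes=1e3,    # average object size (assumed)
        pct_retrieved=0.01,      # fraction of objects/data fetched per month (assumed)
        storage_rate=0.15,       # $ per GB-month
        transfer_in_rate=0.10,   # $ per GB in
        transfer_out_rate=0.18,  # $ per GB out (top tier)
        put_rate=0.01 / 1e3,     # $ per PUT
        get_rate=0.01 / 1e4,     # $ per GET
    ):
        gb = 1e9  # bytes per GB, as in the post
        stored_gb = objects_stored * avg_object_bytes / gb

        storage = stored_gb * storage_rate          # $2,250 with the defaults
        transfer_in = stored_gb * transfer_in_rate  # the 2/3-of-storage shortcut: $1,500
        puts = objects_new * put_rate               # $10,000
        transfer_out = stored_gb * pct_retrieved * transfer_out_rate  # $27
        gets = objects_stored * pct_retrieved * get_rate              # $150
        return storage + transfer_in + puts + transfer_out + gets

    print(s3_monthly_revenue())                      # 13927.0
    print(s3_monthly_revenue(avg_object_bytes=1e6))  # 3787150.0, i.e. roughly $3.7M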

BTW, the spreadsheet has a simplification: regarding all data as being owned by one user, and using that to calculate download cost. In reality there are many users, and most of them will be paying for all their download data at the top rate. Also note that my % of objects retrieved is a simplification. Better would be to estimate how many objects are retrieved (i.e., including objects being retrieved multiple times) as well as estimating the download data amount. I roll these both into one number.


Google maps gets SFO location waaaay wrong

22:16 January 28th, 2008 by terry. Posted under companies, tech. | 1 Comment »

[Image: Google maps directions from SFO]

Before leaving Barcelona yesterday morning, I checked Google maps to get driving directions from San Francisco International airport (SFO) to a friend’s place in Oakland.

Google got it way wrong. Imagine trying to follow these instructions if you didn’t know they were so wrong. Click on the image to see the full sized map. Google maps is working again now.



Amazon S3 to rival the Big Bang?

00:40 January 28th, 2008 by terry. Posted under companies, tech. | 4 Comments »

Note: this posting is based on an incorrect number from an Amazon slide. I’ve now re-done the revenue numbers.

We’ve been playing around with Amazon’s Simple Storage Service (S3).

Adam Selipsky, Amazon VP of Web Services, has put some S3 usage numbers online (see slides 7 and 8). Here are some numbers on those numbers.

There were 5,000,000,000 (5e9) objects inside S3 in April 2007 and 10,000,000,000,000 (1e13) in October 2007. That means that in October 2007, S3 contained 2,000 times more objects than it did in April 2007. That’s a 26 week period, or 182 days. 2,000 is roughly 2^11. That means that S3 is doubling its number of objects roughly once every 182/11 = 16.5 days. (That’s supposing that the growth is merely exponential – i.e., that the logarithm of the number of objects is increasing linearly. It could actually be super-exponential, but let’s just pretend it’s only exponential.)

First of all, that’s simply amazing.

It’s now 119 days since the beginning of October 2007, so we might imagine that S3 now has 2^(119/16.5), or about 150, times as many objects in it. That’s 1,500,000,000,000,000 (1.5e15) objects. BTW, I assume by object they mean a key/value pair in a bucket (these are put into and retrieved from S3 using HTTP PUT and GET requests).
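Here’s that extrapolation as a few lines of Python, using the numbers from the slides (which, per the note at the top, turn out to be wrong):

    import math

    objects_apr_2007 = 5e9
    objects_oct_2007 = 1e13         # the (incorrect) figure from the slide
    days_apr_to_oct = 182

    doublings = math.log2(objects_oct_2007 / objects_apr_2007)  # ~11
    doubling_time = days_apr_to_oct / doublings                 # ~16.5 days

    days_since_oct = 119
    growth = 2 ** (days_since_oct / doubling_time)              # ~150
    print(objects_oct_2007 * growth)                            # ~1.5e15 objects today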

Amazon’s S3 pricing for storage is $0.15 per GB per month. Assume all this data is stored on their cheaper US servers and that objects take on average 1K bytes. These seem reasonable assumptions. (A year ago at ETech, SmugMug CEO Don MacAskill said they had 200TB of image data in S3, and images obviously occupy far more than 1K each. So do backups.) So that’s roughly 1.5e15 * 1K / 1G = 1.5e9 gigabytes in storage, for which Amazon charges $0.15 per month, or $225M.

That’s $225M in revenue per month just for storage. And growing rapidly – S3 is doubling its number of objects every 2 weeks, so the increase in storage might be similar.

Next, let’s do incoming data transfer cost, at $0.10 per GB. That’s simply 2/3rds of the data storage charge, so we add another 2/3 * $225M, or $150M.

What about the PUT requests, that transmit the new objects?

If you’re doubling every 2 weeks, then in the last month you’ve doubled twice. So that means that a month ago S3 would have had 1.5e15 / 4 = 3.75e14 objects. That means 1.125e15 new objects were added in the last month! Each of those takes an HTTP PUT request. PUTs are charged at one penny per thousand, so that’s 1.125e15 / 1000 * $0.01.

Correct me if I’m wrong, but that looks like $11,250,000,000.

To paraphrase a scene I loved in Blazing Saddles (I was only 11, so give me a break), that’s a shitload of pennies.

Lastly, some of that stored data is being retrieved. Some will just be backups, and never touched, and some will simply not be looked at in a given month. Let’s assume that just 1% of all (i.e., not just the new) objects and data are retrieved in any given month.

That’s 1.5e15 * 1K * 1% / 1e9 = 15M GB of outgoing data, or 15K TB. Let’s assume this all goes out at the lowest rate, $0.13 per GB, giving another $2M in revenue.

And if 1% of objects are being pulled back, that’s 1.5e15 * 1% = 1.5e13 GET operations, which are charged at $0.01 per 10K. So that’s 1.5e13 / 10K * $0.01 = $15M for the GETs.

This gives a total of $225M + $150M + $11,250M + $2M + $15M = $11,642M in the last month. That’s $11.6 billion. Not a bad month.
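Spelled out as arithmetic, this is just a straight transcription of the numbers above:

    storage      = 1.5e9 * 0.15            # 1.5e9 GB stored at $0.15 per GB-month: $225M
    transfer_in  = storage * 2 / 3         # incoming data at $0.10 per GB: $150M
    puts         = 1.125e15 / 1e3 * 0.01   # new objects at $0.01 per 1,000 PUTs: $11,250M
    transfer_out = 1.5e7 * 0.13            # 15M GB out at the lowest rate: ~$2M
    gets         = 1.5e13 / 1e4 * 0.01     # 1% of objects fetched at $0.01 per 10,000 GETs: $15M

    total = storage + transfer_in + puts + transfer_out + gets
    print(f"${total:,.0f}")                # $11,641,950,000 -- about $11.6 billion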

Can this simple analysis possibly be right?

It’s pretty clear that Amazon are not making $11B per month from S3. So what gives?

One hint that they’re not making that much money comes from slide 8 of the Selipsky presentation. That tells us that in October 2007, S3 was handling 27,601 transactions per second. That’s about 7e10 per month. If Amazon was already doubling every two weeks by that stage, then 3/4s of their 1e13 S3 objects would have been new that month. That’s 7.5e12, which is 100 times more transactions just for the incoming PUTs (no outgoing) than are represented by the 27,601 number. (It’s not clear what they mean by transaction – i.e., how much activity counts as a single one.)
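That cross-check, spelled out:

    tx_per_second = 27_601
    tx_per_month = tx_per_second * 30 * 24 * 3600  # ~7e10 transactions in a month

    objects_oct_2007 = 1e13
    new_that_month = objects_oct_2007 * 3 / 4      # if the count doubled twice that month: 7.5e12

    print(new_that_month / tx_per_month)           # ~100: far more PUTs than reported transactions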

So something definitely doesn’t add up there. It may be more accurate to divide the revenue due to PUTs by 100, bringing it down to a measly $110M.

An unmentioned assumption above is that Amazon is actually charging everyone, including themselves, for the use of S3. They might have special deals with other companies, or they might be using S3 themselves to store tons of tiny objects. I.e., we don’t know that the reported number is of paid objects.

There’s something of a give-away-the-razors-and-charge-for-the-blades feel to this. When you first see Amazon’s pricing, it looks extremely cheap. You can buy external disk space for, e.g., $100 for 500GB, or $0.20 per GB. Amazon charges you just $0.18 per GB for replicated storage. But that’s per month. A disk might last you two years, so we could conclude that Amazon is, e.g., 8 or 12 times more expensive, depending on the degree of replication. But you don’t need to build, grow, or shrink a data center, or pay for cooling, employees, and replacement disks—all of which have been noted many times—so the premium perhaps isn’t that high.

But…. look at those PUT requests! If an object is 1K (as above), it takes 500M of them to fill a 500GB disk. Amazon charges you $0.01 per 1000, so that’s 500K * $0.01 or $5000. That’s $10 per GB just to access your disk (i.e., before you even think about transfer costs and latency), which is about 50 times the cost of disk space above.
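In code, with the same assumed 1K objects:

    disk_gb = 500
    objects_per_gb = 1e9 / 1e3               # 1K objects, as assumed above
    puts_to_fill = disk_gb * objects_per_gb  # 5e8 PUT requests to fill the disk

    put_cost = puts_to_fill / 1e3 * 0.01     # $0.01 per 1,000 PUTs
    print(put_cost, put_cost / disk_gb)      # 5000.0 10.0 -- $5,000, i.e. $10 per GB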

In paying by the PUT and GET, S3 users are in effect paying Amazon for the compute resources needed to store and retrieve their objects. If we estimate it taking 10ms for Amazon to process a PUT, then 1000 PUTs take 10 seconds of compute time, for which Amazon charges $0.01. That works out to roughly $2,600 per month being paid to a machine doing nothing but PUT storage, which is around 36 times what Amazon would charge you to run a small EC2 instance for a month. Such a machine probably costs Amazon around $1500 to bring into service. So there’s no doubt they’re raking it in on the PUT charges. That makes the 5% margins of their retailing operation look quaint. Wall Street might soon be urging Bezos to get out of the retailing business.

Given that PUTs are so expensive, you can expect to see people encoding lots of data into single S3 objects, transmitting them all at once (one PUT), and decoding when they get the object back. That pushes programmers towards using more complex formats for their data. That’s a bad side-effect. A storage system shouldn’t encourage that sort of thing in programmers.

Nothing can double every two weeks for very long, so that kind of growth simply cannot continue. It may have leveled out in October 2007, which would make my numbers off by roughly 2^(119/16.5), or about 150, as above.

When we were kids they told us that the universe has about 2^80 particles in it. 1.5e15 is already about 2^50, so only 30 more doublings are needed, which would take Amazon just over a year. At that point, even if all their storage were in 1TB drives and objects were somehow stored in just 1 byte each, they’d still need about 2^40 disk drives. The earth has a surface area of 510,065,600 km², so that would mean over 2000 Amazon disk drives in each square kilometer on earth. That’s clearly not going to happen.
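For anyone who wants to check that back-of-the-envelope limit:

    import math

    doublings_left = 80 - math.log2(1.5e15)  # ~30 doublings from ~2**50 up to 2**80
    print(doublings_left * 16.5)             # ~490 days: just over a year at the current rate

    drives = 2 ** 80 / 2 ** 40               # 1-byte objects on 1TB (2**40-byte) drives: 2**40 drives
    earth_km2 = 510_065_600
    print(drives / earth_km2)                # ~2,150 drives per square kilometer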

It’s also worth bearing in mind that Amazon claims data stored into S3 is replicated. Even if the replication factor is only 2, that’s another doubling of the storage requirement.

At what point does this growth stop?

Amazon has its Q4 2007 earnings call this Wednesday. That should be revealing. If I had any money I’d consider buying stock ASAP.


The Black Swan

02:42 January 26th, 2008 by terry. Posted under books. | 4 Comments »

I got a copy of The Black Swan: The Impact of the Highly Improbable for xmas.

In London a couple of weeks ago I pointed it out to Russell as we wandered through a Waterstones. He picked it up, flipped it open, and immediately began to make deadly and merciless fun of it.

For me this is the kind of book I know I’ll want to read if it’s any good, and which I know I’ll (try to) read in any case because these days I’m meeting the kind of people who like to refer to this sort of book. Not wanting to look like I’m not up to speed on the latest popular science, I’ll read for as long as I can bear it.

There are lots of books in this category. E.g., The Tipping Point, which I enjoyed, The Wisdom of Crowds, which I found so annoying and bad that I had to stop reading it, and A Short History of Nearly Everything, which was semi-amusing and which I made myself finish despite having much better things to read. There’s also Everything is Miscellaneous, which I enjoyed a lot, and Predictably Irrational: The Hidden Forces That Shape Our Decisions, which I’ve yet to get hold of. You know the type.

I went to bed early (3am) the other night so I could read a bit of the Black Swan before I went to sleep.

I got about 2 pages in and found it so bad that I almost had to put it down. The prologue is a dozen pages long. I forced myself to read the whole thing.

It’s dreadful, it’s pretentious, it’s vague, it’s silly, it’s obvious, it’s parenthesized and qualified beyond belief, it’s full of the author’s made-up names for things (Black Swan, antiknowledge, empty suits, GIF, Platonicity, Platonic fold, nerdified, antilibrary, extremistan, mediocristan), it’s self-indulgent, it’s trite. It’s a painfully horrible introduction to what I’d hoped would be a good book.

It was so bad that I couldn’t believe it could go on, so I decided to keep reading. This is published by Random House, who you might hope would know better. But I guess they know a smash hit popular theme and title when they see it, and they’ll publish it, even if they know the style is appalling and for whatever reason they don’t have the leverage to force changes.

Fortunately though, the book improves.

The guy is obviously very smart and has been thinking about some of this for a long time; he has an unconventional take on many things, and he does offer insights. I am still finding the style annoying, but I have a feeling I will finish it and I know for sure I’ll take some lessons away. I’m up to page 56, with about 250 to go. I suppose I’ll blog about it again if it seems worthwhile.

If you’re contemplating reading it, I suggest jumping in at Chapter 3.

I’m off to read a bit more now.


Worst of the web award: MIT/Stanford Venture Lab

00:03 January 25th, 2008 by terry. Posted under tech. | 2 Comments »

[Image: vlab.org]

I’ve just awarded one of my coveted Worst of the Web awards to the MIT/Stanford Venture Lab.

Here’s why. They are hosting a video I’d like to watch. You can see it on their home page right now, Web 3.0: New Opportunities on the Semantic Web.

If you click on that link, wonderful things happen.

You get taken to a page with a Watch Online link. Clicking on it tells you that this is a “Restricted Article!” and that you need to register to see the video. Another click and you’re faced with a page that gives you four registration options: Volunteers, Board Members, Standard Members, or Sponsors. Below each of them it says “Rates: Membership price: $0.00”.

Ok, so we’re going to pay $0.00 to sign up for a free video. That takes me to a page with 15 fields, including “Billing Address”. If you leave everything blank and try clicking through, it tells you “A user account with the same email you entered already exists in the system.” But I left the email field empty.

When you fill in email and your name, you get to confirm your purchase: Review your order details. If all appears ok, click “Submit Transaction ->” to finalize the transaction. There’s a summary of the charges, with Price and Total columns, Sub-totals, Tax, Shipping, Grand Total – all set to $0.00. There’s a button labeled “Submit Transaction” and a warning: “Important: CLICK ONCE ONLY to avoid being charged twice.”

You then wind up on a profile page with no less than 54 fields! Scroll to the bottom, take yourself off the mailing list, then “Update profile”.

OK, so you’re registered. The top left of the screen has your user name, and the top right has a link labeled “Sign Out”. So you’re apparently logged in too.

Now you go back to the home page, and click on the link for the video. Then click on the Watch Online link. And it tells you this is a “Restricted Article!” and that if you’re already a member you can log in. But I thought I was logged in?

OK…. click to log in. There’s a field for email address and password. What password? Hmmm. I can click to have it reset, so I do that. A password and log-in link arrives in email.

I follow the link and log in. I go back to the home page. I click on the link to the video I want. I click on Watch Online.

Now I get a screen with a flash player in it. It says Please wait. Apparently forever. I wait ten minutes and begin to blog about my wonderful experience at the MIT/Stanford Venture Lab.

The video never loads.

I actually went through this process twice to verify the steps. The first time was a bit more complex, believe it or not, and involved a Captcha. Also, the two welcome mails I got from signing up were totally different! One looked like

Dear Terry,

Welcome to vlab.org. You are now ready to enjoy the many benefits our site offers its registered users.

Please login using:
Login: terry@xxxjon.es
Password: lksjljls

For your convenience, you can change your password to something more easily remembered once you sign in.

and the other also greeted me and then finally, as a footnote at the very end of the mail after the goodbye, said:

IMPORTANT: Your account is now active. To log in, go to http://www.vlab.org/user.html?op=login and use “i2nosjf3p” as your temporary password.

So weird.

And then, to top off the whole thing, I get a friendly email greeting which includes the following:

Dear Terry,

Thank you, and welcome to our community.

Your purchase of Standard Members for the amount of $0 entitles you to enjoy more of our activities, gain greater access to site functionality, and enhance your overall experience with us.

Your Standard Members is now valid and will expire on January 16th, 2038.

You couldn’t make this stuff up. It’s 2008. We’re trying to look at a free online video. Hosted by MIT/Stanford of all people. We’re prepared to jump through hoops! We’ll even risk being billed $0.00 multiple times! But no cigar.


Final straws for Mac OS X

16:54 January 24th, 2008 by terry. Posted under companies, tech. | 14 Comments »

I’ve had it with Mac OS X.

I’m going to install Linux on my MacBook Pro laptop in March once I’m back from ETech.

I’ve been thinking about this for months. There are just so many things I don’t like about Mac OS X.


Yes, it’s beautiful, and there are certainly things I do like (e.g., iCal). But I don’t like:

  • Waiting forever when I do an rm on a big tree
  • Sitting wondering what’s going on when I go back to a Terminal window and it’s unresponsive for 15 seconds
  • Weird stuff like this
  • Case-insensitive file names (see above problem)
  • Having applications often freeze and crash – e.g., emacs, which basically never crashes under Linux

I could go on. I will go on.

I don’t like it when the machine freezes, and that happens too often with Mac OS X. I used Linux for years and almost never had a machine lock up on me. With Mac OS X I find myself doing a hard reset about once a month. That’s way too flaky for my liking.

Plus, I do not agree to trade a snappy OS experience for eye candy. I’ll take both if I can have them, but if it’s a choice then I’ll go back to X windows and Linux desktops and fonts and printer problems and so on – all of which are probably even better than they already were a few years back.

This machine froze on me 2 days ago and I thought “Right. That’s it.” When I rebooted, it was in a weird magnifying glass mode, in which the desktop was slightly magnified and moved around disconcertingly whenever I moved the mouse. Rebooting didn’t help. Estéve correctly suggested that I somehow had magnification on. But, how? WTF is going on?

And, I am not a fan of Apple.

In just the last two days, we have news that 1. Apple crippled its DTrace port so you can’t trace iTunes, and 2. Apple QuickTime DRM Disables Video Editing Apps so that Adobe’s After Effects video editing software no longer works after a QuickTime update.

It’s one thing to use UNIX, which I have loved for over 25 years, but it’s another thing completely to be in the hands of a vendor who (regularly) does things like this while “upgrading” other components of your system.

Who wants to put up with that shit?

And don’t even get me started on the iPhone, which is a lovely and groundbreaking device, but one that I would never ever buy due to Apple’s actions.

I’m out of here.


Understanding high-dimensional spaces

18:46 January 23rd, 2008 by terry. Posted under other, tech. | 10 Comments »


I’ve spent lots of time thinking about high-dimensional spaces, usually in the context of optimization problems. Many difficult problems that we face today can be phrased as problems of navigating in high-dimensional spaces.

One problem with high-dimensional spaces is that they can be highly non-intuitive. I did a lot of work on fitness landscapes, which are a form of high-dimensional space, and ran into lots of cases in which problems were exceedingly difficult because it wasn’t clear how to navigate efficiently in such a space. If you’re trying to find high points (e.g., good solutions), which way is up? We’re all so used to thinking in 3 dimensions. It’s very easy to do the natural thing and let our simplistic lifelong physical and visual 3D experience influence our thinking about solving problems in high-dimensional spaces.

Another problem with high-dimensional spaces is that we can’t visualize them unless they are very simple. You could argue that an airline pilot in a cockpit monitoring dozens of dials (each dial gives a reading on one dimension) does a pretty good job of navigating a high-dimensional space. I don’t mean the 3D space in which the plane is flying, I mean the virtual high-dimensional space whose points are determined by the readings on all the instruments.

I think that’s true, but the landscape is so smooth that we know how to move around on it pretty well. Not too many planes fall out of the sky.

Things get vastly more difficult when the landscape is not smooth. In fact they get positively weird. Even with trivial examples, like a hypercube, things get weird fast. For example, if you’re at a vertex on a hypercube, exactly one half of the space is reachable in a single step. That’s completely non-intuitive, and we haven’t even put fitness numbers on the nodes. When I say fitness, I mean goodness, or badness, or energy level, or heuristic, or whatever it is you’re dealing with.

We can visually understand and work with many 3D spaces (though 3D mazes can of course be hard). We can hold them in our hands, turn them around, and use our visual system to help us. If you had to find the high-point looking out over a collection of sand dunes, you could move to a good vantage point (using your visual system and understanding of 3D spaces) and then just look. There’s no need to run an optimization algorithm to find high points, avoiding getting trapped in local maxima, etc.

But that’s not the case in a high-dimensional space. We can’t just look at them and solve problems visually. So we write awkward algorithms that often do exponentially increasing amounts of work.

If we can’t visually understand a high-dimensional space, is there some other kind of understanding that we could get?

If so, how could we prove that we understood the space?

I think the answer might be that there are difficult high-dimensional spaces that we could understand, and demonstrate that we understand them.

One way to demonstrate that you understand a 3D space is to solve puzzles in it, like finding high points, or navigating over or through it without crashing.

We can apply the same test to a high-dimensional space: build problems and see if they can be solved on the fly by the system that claims to understand the space.

One way to do that is the following.

Have a team of people who will each sit in front of a monitor showing them a 3D scene. They’ll each have a joystick that they can use to “fly” through the scene that they see. You take your data and give 3 dimensions to each of the people. You do this with some degree of dimensional overlap. Then you let the people try to solve a puzzle in the space, like finding a high point. Their collective navigation gives you a way to move through the high-dimensional space.

You’d have to allocate dimensions to people carefully, and you’d have to do something about incompatible decisions. But if you built something like this (e.g., with 2 people navigating through a 4D space), you’d have a distributed understanding of the high-dimensional space. No one person would have a visual understanding of the whole space, but collectively they would.
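Here’s a rough sketch of just the dimension-allocation step (the function, its name, and the choice of a one-dimension overlap are all mine and purely illustrative; it also ignores the harder problem of reconciling incompatible decisions):

    def overlapping_views(num_dims, view_size=3, overlap=1):
        """Split dimensions 0..num_dims-1 into 3D views, each sharing
        `overlap` dimensions with the previous one -- one view per person."""
        assert num_dims >= view_size
        step = view_size - overlap
        views = []
        start = 0
        while start + view_size <= num_dims:
            views.append(tuple(range(start, start + view_size)))
            start += step
        if views[-1][-1] != num_dims - 1:  # make sure the trailing dimensions are covered
            views.append(tuple(range(num_dims - view_size, num_dims)))
        return views

    print(overlapping_views(4))  # two people covering a 4D space: [(0, 1, 2), (1, 2, 3)]
    print(overlapping_views(7))  # three people: [(0, 1, 2), (2, 3, 4), (4, 5, 6)]

Each person would then only ever see the three columns of the data named in their view.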

In a way it sounds expensive and like overkill. But I think it’s pretty easy to build and there’s enormous value to be had from doing better optimization in high-dimensional spaces.

All we need is a web server hooked up to a bunch of people working on Mechanical Turk. Customers upload their high-dimensional data, specify what they’re looking for, the data is split by dimension, and the humans do their 3D visual thing. If the humans are distributed and don’t know each other they also can’t collude to steal or take advantage of the data – because they each only see a small slice.

There’s a legitimate response that we already build systems like this. Consider the hundreds of people monitoring the space shuttle in a huge room, each in front of a monitor. Or even a pilot and co-pilot in a plane, jointly monitoring instruments (does a co-pilot do that? I don’t even know). Those are teams collectively understanding high-dimensional spaces. But they’re, in the majority of cases, not doing overlapping dimensional monitoring, and the spaces they’re working in are probably relatively smooth. It’s not a conscious effort to collectively monitor or understand a high-dimensional space. But the principle is the same, and you could argue that it’s a proof the idea would work – for sufficiently non-rugged spaces.

Apologies for errors in the above – I just dashed this off ahead of going to play real football in 3D. That’s a hard enough optimization problem for me.


Giselle is served an apple martini, but she doesn’t drink it

12:35 January 20th, 2008 by terry. Posted under other. | 2 Comments »

Well that’s a relief.


One email a day

01:58 January 19th, 2008 by terry. Posted under me, tech. | Comments Off on One email a day

I’ve got my email inbox locked down so tightly that only one email made it through today. That’s down from several hundred a day just a few weeks ago.

All the email that doesn’t make it immediately into my inbox gets filed elsewhere. I deal with it all quickly – either deleting stuff (mailing lists), saving, or replying and then saving.

I’m spending way less time looking at my inbox wondering what I didn’t reply to in a list of a few thousand emails. That’s good. I’m spending less time blogging. I haven’t been on Twitter for ages.

In the productivity corner, I somehow managed (with help) to get a 3 meter whiteboard up here and onto the wall. It’s fantastic. I spend 2+ hours every morning talking with Estéve, drawing circles, lines, trees, and random scrawly notes. Today I sat talking to him in my chair while using my laser pointer (thanks Derek!) to point to things on the whiteboard. Ah, the luxury.


Free wifi at Stansted

17:19 January 11th, 2008 by terry. Posted under travel. | Comments Off on Free wifi at Stansted

I’m at Stansted heading back to Barcelona. There’s free wifi here – the first time I’ve seen it – on the merula network in the waiting area for gates 1-19. At first I didn’t understand their web page, but then I read the login box, which clearly says to enter merula as user name and password. It works.


Wifi on a bus

07:58 January 11th, 2008 by terry. Posted under travel. | Comments Off on Wifi on a bus

I’m on the X90 National Express bus from Oxford to London. At the bus stop before we left I pulled out my laptop to do some work on a presentation. I noticed there was an open wifi signal and thought I’d connect quickly to pick up my mail.

It turns out the wifi network is on the bus.

I’m now speeding down the motorway, it’s gray and raining outside, and I’m sitting here warm and online. I suppose all National Express buses have wifi. One day it will be a rarity not to have network access. Today is the first time I’ve had access from a bus. Nice.