
Worst of the web award: MIT/Stanford Venture Lab

00:03 January 25th, 2008 by terry. Posted under tech. 2 Comments »

I’ve just awarded one of my coveted Worst of the Web awards to the MIT/Stanford Venture Lab.

Here’s why. They are hosting a video I’d like to watch. You can see it on their home page right now, Web 3.0: New Opportunities on the Semantic Web.

If you click on that link, wonderful things happen.

You get taken to a page with a Watch Online link. Clicking on it tells you that this is a “Restricted Article!” and that you need to register to see the video. Another click and you’re faced with a page that gives you four registration options: Volunteers, Board Members, Standard Members, or Sponsors. Below each of them it says “Rates: Membership price: $0.00”.

OK, so we’re going to pay $0.00 to sign up for a free video. That takes me to a page with 15 fields, including “Billing Address”. If you leave everything blank and try clicking through, it tells you “A user account with the same email you entered already exists in the system.” But I left the email field empty.

When you fill in email and your name, you get to confirm your purchase: “Review your order details. If all appears ok, click ‘Submit Transaction ->’ to finalize the transaction.” There’s a summary of the charges, with Price and Total columns, Sub-totals, Tax, Shipping, Grand Total – all set to $0.00. There’s a button labeled “Submit Transaction” and a warning: “Important: CLICK ONCE ONLY to avoid being charged twice.”

You then wind up on a profile page with no less than 54 fields! Scroll to the bottom, take yourself off the mailing list, then “Update profile”.

OK, so you’re registered. The top left of the screen has your user name, and the top right has a link labeled “Sign Out”. So you’re apparently logged in too.

Now you go back to the home page, and click on the link for the video. Then click on the Watch Online link. And it tells you this is a “Restricted Article!” and that if you’re already a member you can log in. But I thought I was logged in?

OK…. click to log in. There’s a field for email address and password. What password? Hmmm. I can click to have it reset, so I do that. A password and log-in link arrives in email.

I follow the link and log in. I go back to the home page. I click on the link to the video I want. I click on Watch Online.

Now I get a screen with a Flash player in it. It says “Please wait”. Apparently forever. I wait ten minutes and begin to blog about my wonderful experience at the MIT/Stanford Venture Lab.

The video never loads.

I actually went through this process twice to verify the steps. The first time was a bit more complex, believe it or not, and involved a Captcha. Also, the two welcome mails I got from signing up were totally different! One looked like:

Dear Terry,

Welcome to vlab.org. You are now ready to enjoy the many benefits our site offers its registered users.

Please login using:
Login: terry@xxxjon.es
Password: lksjljls

For your convenience, you can change your password to something more easily remembered once you sign in.

and the other also greeted me and finally, as a footnote, at the very end of the mail after the goodbye:

IMPORTANT: Your account is now active. To log in, go to http://www.vlab.org/user.html?op=login and use “i2nosjf3p” as your temporary password.

So weird.

And then, to top off the whole thing, I get a friendly email greeting which includes the following:

Dear Terry,

Thank you, and welcome to our community.

Your purchase of Standard Members for the amount of $0 entitles you to enjoy more of our activities, gain greater access to site functionality, and enhance your overall experience with us.

Your Standard Members is now valid and will expire on January 16th, 2038.

You couldn’t make this stuff up. It’s 2008. We’re trying to look at a free online video. Hosted by MIT/Stanford of all people. We’re prepared to jump through hoops! We’ll even risk being billed $0.00 multiple times! But no cigar.


Final straws for Mac OS X

16:54 January 24th, 2008 by terry. Posted under companies, tech. 14 Comments »

I’ve had it with Mac OS X.

I’m going to install Linux on my MacBook Pro laptop in March once I’m back from ETech.

I’ve been thinking about this for months. There are just so many things I don’t like about Mac OS X.


Yes, it’s beautiful, and there are certainly things I do like (e.g., iCal). But I don’t like:

  • Waiting forever when I do an rm on a big tree
  • Sitting wondering what’s going on when I go back to a Terminal window and it’s unresponsive for 15 seconds
  • Weird stuff like this
  • Case-insensitive file names (see above problem)
  • Having applications often freeze and crash. E.g. emacs, which basically never crashes under Linux

I could go on. I will go on.

I don’t like it when the machine freezes, and that happens too often with Mac OS X. I used Linux for years and almost never had a machine lock up on me. With Mac OS X I find myself doing a hard reset about once a month. That’s way too flaky for my liking.

Plus, I do not agree to trade a snappy OS experience for eye candy. I’ll take both if I can have them, but if it’s a choice then I’ll go back to X windows and Linux desktops and fonts and printer problems and so on – all of which have probably improved since I left a few years back.

This machine froze on me 2 days ago and I thought “Right. That’s it.” When I rebooted, it was in a weird magnifying glass mode, in which the desktop was slightly magnified and moved around disconcertingly whenever I moved the mouse. Rebooting didn’t help. Esteve correctly suggested that I somehow had magnification on. But how? WTF is going on?

And, I am not a fan of Apple.

In just the last two days, we have news that (1) Apple crippled its DTrace port so you can’t trace iTunes, and (2) Apple’s QuickTime DRM disables video editing apps, so that Adobe’s After Effects video editing software no longer works after a QuickTime update.

It’s one thing to use UNIX, which I have loved for over 25 years, but it’s another thing completely to be in the hands of a vendor who (regularly) does things like this while “upgrading” other components of your system.

Who wants to put up with that shit?

And don’t even get me started on the iPhone, which is a lovely and groundbreaking device, but one that I would never ever buy due to Apple’s actions.

I’m out of here.


Understanding high-dimensional spaces

18:46 January 23rd, 2008 by terry. Posted under other, tech. 10 Comments »


I’ve spent lots of time thinking about high-dimensional spaces, usually in the context of optimization problems. Many difficult problems that we face today can be phrased as problems of navigating in high-dimensional spaces.

One problem with high-dimensional spaces is that they can be highly non-intuitive. I did a lot of work on fitness landscapes, which are a form of high-dimensional space, and ran into lots of cases in which problems were exceedingly difficult because it’s not clear how to navigate efficiently in such a space. If you’re trying to find high points (e.g., good solutions), which way is up? We’re all so used to thinking in 3 dimensions. It’s very easy to do the natural thing and let our simplistic lifelong physical and visual 3D experience influence our thinking about solving problems in high-dimensional spaces.

Another problem with high-dimensional spaces is that we can’t visualize them unless they are very simple. You could argue that an airline pilot in a cockpit monitoring dozens of dials (each dial gives a reading on one dimension) does a pretty good job of navigating a high-dimensional space. I don’t mean the 3D space in which the plane is flying, I mean the virtual high-dimensional space whose points are determined by the readings on all the instruments.

I think that’s true, but the landscape is so smooth that we know how to move around on it pretty well. Not too many planes fall out of the sky.

Things get vastly more difficult when the landscape is not smooth. In fact they get positively weird. Even with trivial examples, like a hypercube, things get weird fast. For example, if you’re at a vertex on a hypercube, exactly one half of the space is reachable in a single step. That’s completely non-intuitive, and we haven’t even put fitness numbers on the nodes. When I say fitness, I mean goodness, or badness, or energy level, or heuristic, or whatever it is you’re dealing with.

We can visually understand and work with many 3D spaces (though 3D mazes can of course be hard). We can hold them in our hands, turn them around, and use our visual system to help us. If you had to find the high point looking out over a collection of sand dunes, you could move to a good vantage point (using your visual system and understanding of 3D spaces) and then just look. There’s no need to run an optimization algorithm to find high points, avoiding getting trapped in local maxima, etc.

But that’s not the case in a high-dimensional space. We can’t just look at them and solve problems visually. So we write awkward algorithms that often do exponentially increasing amounts of work.
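To make the awkwardness concrete, here’s a minimal sketch in Python (the toy landscape and all names are my invention, not from this post) of the simplest such algorithm: a hill climber on bit strings that can only sense its immediate neighbors, and so happily parks itself on a local maximum that a visual system would see past at once.

```python
import random

def fitness(bits):
    # A toy rugged landscape (invented for illustration): reward runs of
    # matching adjacent bits, with a bonus peak at the all-ones string.
    score = sum(1 for a, b in zip(bits, bits[1:]) if a == b)
    if all(bits):
        score += 10  # the global maximum
    return score

def hill_climb(n=20, seed=0):
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n)]
    while True:
        # Examine all n one-bit-flip neighbors: "which way is up?"
        neighbors = []
        for i in range(n):
            nb = current[:]
            nb[i] ^= 1
            neighbors.append(nb)
        best = max(neighbors, key=fitness)
        if fitness(best) <= fitness(current):
            return current  # a local maximum -- not necessarily the global one
        current = best
```

Depending on where it starts, the climber may stop at all-zeros (a decent local maximum) without ever suspecting the all-ones peak exists, which is exactly the trap a glance at sand dunes avoids.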

If we can’t visually understand a high-dimensional space, is there some other kind of understanding that we could get?

If so, how could we prove that we understood the space?

I think the answer might be that there are difficult high-dimensional spaces that we could understand, and demonstrate that we understand them.

One way to demonstrate that you understand a 3D space is to solve puzzles in it, like finding high points, or navigating over or through it without crashing.

We can apply the same test to a high-dimensional space: build problems and see if they can be solved on the fly by the system that claims to understand the space.

One way to do that is the following.

Have a team of people who will each sit in front of a monitor showing them a 3D scene. They’ll each have a joystick that they can use to “fly” through the scene that they see. You take your data and give 3 dimensions to each of the people. You do this with some degree of dimensional overlap. Then you let the people try to solve a puzzle in the space, like finding a high point. Their collective navigation gives you a way to move through the high-dimensional space.

You’d have to allocate dimensions to people carefully, and you’d have to do something about incompatible decisions. But if you built something like this (e.g., with 2 people navigating through a 4D space), you’d have a distributed understanding of the high-dimensional space. No one person would have a visual understanding of the whole space, but collectively they would.
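The dimension-allocation step can be sketched in a few lines. Everything here is an assumption on my part (consecutive windows of 3 dimensions, overlapping by one); the post deliberately leaves the allocation scheme open.

```python
def split_dimensions(n_dims, window=3, overlap=1):
    """Assign overlapping runs of dimensions to people.

    Returns a list of index tuples, one per person, together
    covering all n_dims dimensions with the given overlap.
    """
    step = window - overlap
    views = []
    start = 0
    while start + window < n_dims:
        views.append(tuple(range(start, start + window)))
        start += step
    # Final person takes the last window (may overlap more than `overlap`).
    views.append(tuple(range(max(0, n_dims - window), n_dims)))
    return views

def views_of(point, views):
    # The 3D slice each person would see for one high-dimensional point.
    return [tuple(point[i] for i in view) for view in views]
```

With `n_dims=4` this yields two overlapping views, matching the 2-people/4D example above; each person sees a coherent 3D slice and the shared dimension is what lets their navigation decisions be stitched together.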

In a way it sounds expensive and like overkill. But I think it’s pretty easy to build and there’s enormous value to be had from doing better optimization in high-dimensional spaces.

All we need is a web server hooked up to a bunch of people working on Mechanical Turk. Customers upload their high-dimensional data, specify what they’re looking for, the data is split by dimension, and the humans do their 3D visual thing. If the humans are distributed and don’t know each other they also can’t collude to steal or take advantage of the data – because they each only see a small slice.

There’s a legitimate response that we already build systems like this. Consider the hundreds of people monitoring the space shuttle in a huge room, each in front of a monitor. Or even a pilot and co-pilot in a plane, jointly monitoring instruments (does a co-pilot do that? I don’t even know). Those are teams collectively understanding high-dimensional spaces. But they’re, in the majority of cases, not doing overlapping dimensional monitoring, and the spaces they’re working in are probably relatively smooth. It’s not a conscious effort to collectively monitor or understand a high-dimensional space. But the principle is the same, and you could argue that it’s proof that the idea would work – for sufficiently non-rugged spaces.

Apologies for errors in the above – I just dashed this off ahead of going to play real football in 3D. That’s a hard enough optimization problem for me.


One email a day

01:58 January 19th, 2008 by terry. Posted under me, tech. Comments Off on One email a day

I’ve got my email inbox locked down so tightly that only one email made it through today. That’s down from several hundred a day just a few weeks ago.

All the email that doesn’t make it immediately into my inbox gets filed elsewhere. I deal with it all quickly – either deleting stuff (mailing lists), saving, or replying and then saving.

I’m spending way less time looking at my inbox wondering what I didn’t reply to in a list of a few thousand emails. That’s good. I’m spending less time blogging. I haven’t been on Twitter for ages.

In the productivity corner, I somehow managed (with help) to get a 3-meter whiteboard up here and onto the wall. It’s fantastic. I spend 2+ hours every morning talking with Esteve, drawing circles, lines, trees, and random scrawly notes. Today I sat talking to him in my chair while using my laser pointer (thanks Derek!) to point to things on the whiteboard. Ah, the luxury.


Tagging in the year 3000 (BC)

19:44 January 4th, 2008 by terry. Posted under books, representation, tech. 4 Comments »

Jimmy Guterman recently called Marcel Proust an Alpha Geek and asked for thoughts on “what from 100 years ago might be the hot new technology of 2008?”

Here’s something about 5000 years older. As a bonus there’s a deep connection with what Fluidinfo is doing.

Alex Wright recently wrote GLUT: Mastering Information Through the Ages. The book is good. It’s a little dry in places, but in others it’s really excellent. I especially enjoyed the last 2 chapters, “The Web that Wasn’t” and “Memories of the Future”. GLUT has a non-trivial overlap with the even more excellent Everything is Miscellaneous by David Weinberger.

In chapter 4 of GLUT, “The Age of Alphabets”, Wright describes the rise of writing systems around 3000 BC as a means of recording commercial transactions. The details of the transactions were written onto a wet clay tablet, signed by the various parties, and then baked. Wright (p50) continues:

Once the tablet was baked, the scribe would then deposit it on a shelf or put it in a basket, with labels affixed to the outside to facilitate future search and retrieval.

There are two comments I want to make about this. One is a throwaway answer to Jimmy Guterman’s request, but the other deserves consideration.

Firstly, this is tagging. Note that the tags are attached after the data is put onto the clay tablet and it is baked. This temporal distinction is important – it’s not like other mentions of metadata or tagging given by Wright (e.g., see p51 and p76). Tags could presumably have different shapes or colors, and be removed, added to, etc. Tags can be attached to objects you don’t own – like using a database to put tags on a physically distant web page you don’t own. No-one has to anticipate all the tag types, or the uses they might be put to. If a Sumerian scribe decided to tag the best agrarian deals of 3000 BC or all deals involving goats, he/she could have done it just as naturally as we’d do it today.

Secondly, I find it very interesting to consider the location of information here and in other systems. The tags that scribes were putting on tablets in 3000 BC were stored with the tablets. They were physically attached to them. I think that’s right-headed. To my mind, the tag information belongs with the object that’s being tagged. In contrast, today’s online tagging systems put our tags in a physically separate location. They’re forced to do that because of the data architecture of the web. The tagging system itself, and the many people who may be tagging a remote web page, don’t own that page. They have no permission to alter it.

Let’s follow this thinking about the location of information a little further…

Later in GLUT, Wright touches on how the card catalog of libraries became separated from the main library content, the actual books. Libraries became so big and accumulated so many volumes that it was no longer feasible to store the metadata for each volume with the volume. So that information was collected and stored elsewhere.

This is important because the computational world we all inhabit has similarly been shaped by resource constraints. In our case the original constraints are long gone, but we continue to live in their shadow.

I’ll explain.

We all use file systems. These were designed many decades ago for a computing environment that no longer exists. Machines were slow. Core and disk memory were tiny. Fast indexing and retrieval algorithms had yet to be invented. Today, file content and file metadata are firmly separated. File data is in one place while file name, permissions, and other metadata are stored elsewhere. That division causes serious problems. The two systems need different access mechanisms. They need different search mechanisms.

Now would be a good time to ask yourself why it has traditionally been almost impossible to find a file based simultaneously on its name and its content.
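You can of course bolt the combined query on in user code – here’s a hedged Python sketch, nothing standard about it – but notice that it has to scan every candidate file’s content linearly, precisely because no traditional file system indexes name and content together:

```python
import os

def find(root, name_part, content_part):
    """Yield paths whose *name* contains name_part and whose *content*
    contains content_part -- the combined query that file systems
    traditionally can't answer from any single index."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for fn in filenames:
            if name_part not in fn:
                continue  # cheap: the "card catalog" part of the query
            path = os.path.join(dirpath, fn)
            try:
                with open(path, errors="ignore") as f:
                    if content_part in f.read():  # expensive: walk the stacks
                        yield path
            except OSError:
                pass  # unreadable file; skip it
```

The name test consults the catalog; the content test wanders the stacks. Keeping the two in one store would let a single index answer both halves at once.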

Our file systems are like our libraries. They have a huge card catalog just inside the front door (at the start of the disk), and that’s where you go to look things up. If you want the actual content you go fetch it from the stacks. Wandering the stacks without consulting the catalog is a little like reading raw disk blocks at random (that can be fun btw).

But libraries and books are physical objects. They’re big and slow and heavy. They have ladders and elevators and are traversed by short-limbed humans with bad eyesight. Computers do not have these characteristics. By human standards, they are almost infinitely fast and their storage is cheap and effectively infinite. There’s no longer any reason for computers to separate data from metadata. In fact there’s no need for a distinction between the two. As David Weinberger put it, in the real world “everything is metadata”. So it should be in the computer world as well.

In other words, I think it is time to return to a more natural system of information storage. A little like the tagging we were doing in 3000 BC.

Several things will have to change if we’re to pull this off. And that, gentle reader, is what Fluidinfo is all about.

Stay tuned.


More email customization

02:04 January 4th, 2008 by terry. Posted under tech. Comments Off on More email customization

My recent email changes are working out well. Yesterday morning I woke up and didn’t read email. That’s because I didn’t have any email!

Well, I did, but procmail had filed it all into mail/incoming/IN-20080103.spool because none of it needed immediate attention. I have set VM up so that it knows to look for an x.spool file if I ask it to visit a file called x. That’s one line of elisp in VM: (setq vm-spool-file-suffixes (list ".spool")).

I like this setup because 1) it keeps my main inbox almost empty, 2) it keeps non-essential emails out of my face, and 3) it puts pressure on me to quickly deal with stuff that collects in the daily file, because I know that if I don’t it’s going to be forgotten.

And how to get to the daily file when I do decide to go look? Yes, another little piece of code:

  (define-key vm-mode-map "i"
    '(lambda ()
       (interactive)
       (vm-visit-folder
        (expand-file-name
         (concat "~/mail/incoming/IN-"
                 (format-time-string "%Y%m%d"))))))

which simultaneously defines a function to take me (in VM, in emacs) to today’s file and puts that function on the “i” key in VM. So I just hit a single key and I’m automatically looking at the non-time-critical mail file for the day. I’ll probably write a little function to take me to yesterday’s too.

And yes, I guess this is all highly personalized, but these are things that I do many times a day every day of my life. So I’m happy to streamline them. And all the code is trivial. That’s the most interesting thing. With a tiny bit of code you can do so much and without it you can only do what other programmers thought you might want or need to be able to do.


Amazon just billed me 14 cents

00:35 January 2nd, 2008 by terry. Posted under companies, tech. 4 Comments »

I’ve been messing around with Esteve setting up an Amazon EC2 machine.

We set up a machine the other day, ssh’d into it, took a look around, and then shut it down a little later. Amazon just sent me a bill:

Greetings from Amazon Web Services,

This e-mail confirms that your latest billing statement is available on the AWS web site. Your account will be charged the following:

Total: $0.14

Please see the Account Activity area of the AWS web site for detailed account information.

Isn’t that cool?

It would certainly cost more than 14 cents to get your hands on your own (virtual) Linux box any other way.


My email setup

00:05 January 2nd, 2008 by terry. Posted under me, tech. 1 Comment »

I like customizing my environment. I’ve spent lots and lots of time doing that over the decades.

Some examples: My emacs environment has about 6000 lines of elisp that I’ve written to help me edit. I have over 500 shell scripts in my bin directory (30K lines of code), and certainly hundreds of other scripts around the place to help with other specific tasks. My bash setup is about 2000 lines of shell script.

That’s about 40K lines of code all written just to help me edit and work in the shell.

As a computer user, I’m damned happy I’m a programmer. I don’t think I can imagine what it would be like to be a computer user and not a programmer.

As a non-programmer you’re at the mercy of others. When you run into a problem you don’t have a solution for, you’re out of luck: you either spend huge amounts of time solving it in some contorted semi-manual or fully manual way, you hunt down someone else’s (likely partial) solution and maybe pay for it, you ask or pay someone else to solve it for you, or you wait for the thing you need to appear in some product. And all the while you’ve got a perfectly good high-speed general-purpose machine sitting right in front of you, likely with all the programming tools you’d need already installed… but you don’t know how to use it!!

How weird is that?

As a programmer when you run into a problem you don’t have a solution for, you can just write your own.

One thing that always surprises me is how little time most other programmers tend to spend customizing their environments. Given 1) that programmers probably spend a large percentage of each day in their editor, in email, and in the shell, 2) that those things can all be programmed (assuming you use emacs :-)), and 3) that programmers usually don’t like repeating themselves, doing unnecessary work or being inefficient, you’d think that programmers would all be spending vast amounts of time getting things set up just so.

FWIW, here’s a description of the email setup I’ve built up over the years.

But first some stats.

I’ve been saving all my incoming and outgoing email since Sept 19, 1989. I don’t know why I didn’t start earlier – I wish I had. My first 7 years of emailing are lost, almost certainly forever. Since then I’ve sent 125K emails and received 425K. I’ve got all my incoming email split into files by sender, with some overlap, in 6700 files. The total disk usage of all mail is just under 4G. I have 1.1G (compressed) of saved spam. I have 1250 mail aliases in my .mailrc file.

  1. I write mail in emacs, of course. Seeing as email is text, why would you use anything but your text editor to compose it? Not being able to use emacs to edit text is a show-stopper for me when it comes to using software products. Don’t try to make me use an inferior editor. Don’t ask me to edit text in my browser.
  2. All my outgoing mails get dumped into a single file. I occasionally move these files when they get too big. I keep things this way as it’s then really fast to look at stuff I’ve sent, which I do frequently. I have shell commands called o, oo, ooo etc., to show me the last (second last, etc) of these files (starting at bottom) instantly.
  3. I read mail in emacs (using VM). I could do that differently, but email is (usually) text and I want to copy it, paste it, edit it, reply to it, etc. I also use the emacs supercite package, smart paragraph filling, automatic alias expansion, etc. All that has been standard in emacs for at least 10 years, but it’s still not available in tons of “modern” email readers.
  4. VM recognizes the 37 email addresses I’ve used over the years as indicating a mail is from me (and so doesn’t put that address in any followup line).
  5. I do all my MIME decoding manually. VM knows how to handle most things, I just don’t let it do it until I want it done. That’s mainly a security thing – several years ago I predicted that PDF files would one day be used to trigger buffer overflows, as just happened. I don’t open any attachment of any form from anyone I don’t know (and don’t open them from some people I do know who like to pass along random crap from others).
  6. I have VM figure out exactly where each mail should be saved, based on sending email address. So I never have to make a decision about where to save anything.
  7. I have 154 virtual folders defined in VM. These let me dynamically make a mail folder based on fairly flexible rules (subject, sender, etc). They’re not folders on disk, but are composed from these on the fly. It’s a great feature of VM, highly useful. E.g., I have friends with multiple email addresses – my friend Emily has used 21 email addresses in the last 15 years and I can see all her incoming mail in one virtual folder no matter where she sends it from. Virtual folders can be used for much more than that though.
  8. I have an emacs function that detects if the person sending me mail also uses VM and, if so, lets me know if their version of VM is newer than mine. That way I don’t have to think about upgrading VM – when a friend does it, emacs tells me automatically.
  9. I have VM keys set up to send messages to SpamBayes to teach it that things are spam or ham.
  10. I have an emacs hook function that looks at the mail I’m currently looking at in VM and sets my email address accordingly. So if I’m reading mail from Cambridge it sets my address to be my Cambridge one, and similarly for Fluidinfo, for my jon.es domain and a couple of others. That means I pretty much never reply to an email using an address I didn’t want to use on that email. That’s all totally automatic and I never have to think about email identity, except when mailing someone for the first time.
  11. VM also does a bunch of other things for me, like add attachments, encrypt and decrypt mail, etc. But that’s all fairly standard now.
  12. I use a script I wrote to repeatedly use fetchmail to pull my incoming mails from half a dozen mailboxes.
  13. I use grepmail to search for emails. It’s open source, so I was able to speed it up, fix some problems I ran into, and add some enhancements I wanted in versions 4.72 and 4.80.
  14. In front of grepmail I run my own mail-to program, which knows where I store my outgoing mail, parses command-line from and to dates to figure out the relevant files to pass to grepmail, etc.
  15. I use cron and some scripts to maintain a list of all the email addresses I’ve ever received mail from or sent mail to (78,500 of these), and a list of just those I’ve received from (40K of these). Cron updates these files nightly, using another program that knows how to pull things that look like email addresses out of mail files.
  16. I have a shell script which looks in the received-mail address file to find email addresses. So if I’m wondering what someone’s address at, e.g., Siemens might be, I can run emails-of siemens and see 140 Siemens email addresses. Yes, I used to send a lot of mail to Siemens.
  17. I use procmail to filter my incoming mail. With procmail I do a bunch of things:
      • Procmail logs basic info on all my incoming mail to a file.
      • It looks for a special file in my home directory, and if it’s there it forwards mail to my mobile phone.
      • It also looks for mail from me with a special subject, and when found, either creates or removes the above file. This allows me to turn forwarding to my mobile phone on and off when I’m away from my machine.
      • It dumps some known spam addresses for me.
      • It runs incoming mail through a script I wrote that looks at the above file of all known (received) mail addresses. This adds a header to the mail to tell me it’s from a known former sender. Those mails then get favorable treatment as they’re very likely not spam.
      • It runs incoming mail through another program I wrote that looks at the From line and marks the mail as being something I want delivered immediately. If not, it gets put aside for later viewing.
      • It runs incoming mail through another program I wrote that looks at the overall MIME structure of the mail and flags it if it looks like image spam (hint: don’t send me a GIF image attachment).
      • It runs incoming mail through both SpamBayes and SpamAssassin.
      • I used to use procmail to auto-reply to anything considered spam (and then auto-drop the many bounces to this). But I turned that off as it was making too many mistakes replying to forged mails from mailing lists.
  18. I have a program that cron runs every night which goes through the day’s spam and summarizes the most interesting messages. It typically pulls out 15-20% of my spam into a summary mail which it sends me. The summary is sorted based on the mail address in the To line (my old mail addresses get scored very low). It also identifies common subjects (so I can kill them), and does some checks like tossing emails whose subjects are not composed of at least some recognizable words. This program is pretty severe – all these mails have already been classed as spam by one of the above programs, so this is just a safety check that I haven’t tossed anything I should keep. It generates a piece of emacs lisp for each message it pulls out so I can jump straight to the correct spam folder and message number in case I want to look at something. It also keeps a list of things to watch for that are definitely not spam. With this program in place I never go looking in my spam folders. I can also run this from the shell at any time.
  19. I have a program that summarizes the mail I’ve put aside (not for immediate delivery). Cron runs that nightly and mails me the result. I can also run this from the shell at any time.
  20. I have a simple program I use to grep for mail aliases in my .mailrc.
  21. I have a script which lists my received email files in reverse order of last update. I can pipe the output of that program into xargs grep to quickly search all incoming mail, in new-to-old order (for speed), mentioning any term.
  22. I have a script to send unrecorded mail (from the shell). That’s mail that doesn’t have my usual FCC line in it, in case I’m mailing out something large and don’t want a copy of it in my outgoing mail file.
  23. I have an emacs function to visit my current outgoing mail folder with backups disabled (the folders are often large and I rarely want to edit them).
  24. And I can’t resist pointing out that I wrote the Spamometer in 1997 to do probabilistic spam detection, and set up a Library of Spam (which attracted a hell of a lot of spam). This was 5 years before Paul Graham wrote his famous A Plan for Spam article about doing Bayesian filtering to detect spam. SpamAssassin is very similar in approach and design.
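As an illustration of how little code these pieces need, here’s a rough Python sketch of the address-harvesting idea behind the cron scripts and the emails-of lookup above. The regex, the function names, and the lookup behavior are all my own guesses – the real scripts aren’t published.

```python
import re

# A deliberately loose pattern: good enough for harvesting addresses out
# of mail headers and bodies, not for validating them.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def harvest(text, known=()):
    """Return a sorted, de-duplicated list of addresses seen in text,
    merged with any previously known addresses."""
    found = set(a.lower() for a in EMAIL_RE.findall(text))
    return sorted(found | set(known))

def lookup(addresses, fragment):
    # The emails-of behavior: substring search over the harvested list.
    return [a for a in addresses if fragment.lower() in a]
```

Run nightly over the day’s mail files with the previous list passed in as `known`, this maintains exactly the kind of ever-growing address file the procmail known-sender check and the emails-of script consult.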

User authentication in a world with no free will

02:43 December 13th, 2007 by terry. Posted under other, tech. 6 Comments »

I have a little background in user authentication. I wrote my undergrad CS honors thesis on Secrecy and Authentication. If you search Google hard enough you can even find mentions of the Seberry & Jones Scheme for implementing subliminal channels. I held a provisional patent with Sydney University on a biometric user authentication method based on typing style in 1985/6. The method turned out not to be original, has been re-invented multiple times since then, and was even published as new, years later, in CACM.

I therefore feel eminently qualified to speculate on what user authentication might look like in a world with no free will.

Note that I don’t care whether free will exists or not, and I certainly don’t want to waste my time thinking or talking about it. But if it doesn’t exist, then the following user authentication algorithm does exist. We couldn’t implement it, but it would certainly exist and it’s fun to consider instead of doing real work.

When a computer needs to verify who you are, it tells you to move the mouse around randomly for as long as you like. Or to just bang on the keyboard. The kind of thing you do when you’re generating randomness for the construction of a PGP/GPG key.

But if there’s no free will then it’s not random.

So the algorithm can just look up what you did in a big table to see who you are. As two users could conceivably do the same thing, it probably needs a little more information, like the time of day and your IP address – neither of which you’d have any control over.

That’s it. No need for anything fancy, just a lookup table. No-one would ever fail to be recognized, no-one would ever be mistaken for someone else, there’d be no identity theft, etc. Even if you just sat there and did nothing for a while the machine would know exactly who you were. You could always log in by just briefly doing nothing at all, and then continuing. The length of time you did nothing for would betray you.
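For fun, here’s the whole scheme as a toy Python sketch. Every name and table entry is of course invented – in a deterministic universe the table would be complete, and no one could actually write it down anyway.

```python
# The omniscient lookup table: (what you did, hour of day, IP) -> who you are.
# In a deterministic universe it would have an entry for every possible trace;
# this one has two, and both entries are made up.
THE_BOOK_OF_ALL_ACTIONS = {
    (('wiggle', 'wiggle', 'click'), 9, '10.0.0.1'): 'terry',
    ((), 23, '10.0.0.2'): 'esteve',  # logged in by doing nothing at all
}

def authenticate(input_trace, hour_of_day, ip):
    """No passwords, no failures: in a world without free will, the trace
    (plus time and IP, which you also don't control) identifies you."""
    return THE_BOOK_OF_ALL_ACTIONS.get((tuple(input_trace), hour_of_day, ip))
```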

All totally absurd, of course, and thinking about it quickly becomes highly circular. Just like the rest of the debate.

As you were.


Conference panels – self-indulgent, elitist, and smug

23:32 December 12th, 2007 by terry. Posted under tech. Comments Off on Conference panels – self-indulgent, elitist, and smug

I’ve been to too many conferences recently. FOWA in London, Web 2.0 in Berlin, and tonight I got back from Le Web in Paris.

Among various other dislikes, one thing I particularly don’t enjoy is the panels. I find panels very self-indulgent. Some small number of panelists sit on stage and have a conversation with each other, while the rest of us are supposed to sit there passively and lap it up. Then at the very end, the panelists take a question or two from the audience. Sometimes the questions are incoherent or require a bit of elaboration, and often the result is that the panelists end up being rude and dismissive. Most questions go unanswered simply due to lack of time.

That doesn’t feel right. I always wish the time balance were changed. And I wish I didn’t get the overwhelming feeling that the panelists are basking in their own glory, too busy for the common conference goer. It’s a pity, because I, and I suppose many others, are genuinely interested in the individuals on the panels. But the format doesn’t work at all for me.

I could say much more, but that would probably get too specific.


As I please: pizza margarita & 2 beers

23:50 December 11th, 2007 by terry. Posted under books, me, tech, travel. 2 Comments »

I’m in Paris for the Le Web conference. Tonight is the party, at La Scala, which looks like exactly the kind of place I hate. I never understand why people go to loud clubs.

So instead, I went out wandering and found a pizza place, ordered a margarita, drank a couple of Italian beers and took my time savoring more of Orwell. It’s such a pleasure, as with Gore Vidal essays or Proust, to read his thoughts on all manner of things. I’ve been slowly working through the 4 volumes of Collected Essays, Journalism and Letters (that link is to volume 1).

Here’s the last piece I read tonight, the May 19, 1944 As I Please column. Maybe you won’t find it extraordinary, but I do. It probably helps to have the context, to have read the previous volumes (I’m in the middle of vol. 3).


reCaptcha added

19:31 November 30th, 2007 by terry. Posted under tech, travel. Comments Off on reCaptcha added

I’m stuck in the Oakland airport with a 3 hour delay on a flight to Vegas. Bambi, who steadfastly refuses to blog for reasons unknown, has dinner waiting for me there. Bummer.

Meanwhile, Russell, who does blog and makes a mean Irish coffee, tells me I need to add a Captcha to this blog, so I’ve installed the very clever reCaptcha. Enjoy.

All in all a pretty thrilling night here at the airport. Battery #2 is halfway done. Me too.


Can’t stand perl

23:34 November 22nd, 2007 by terry. Posted under python, tech. 4 Comments »

I’ve just spent the last 7 hours working on a bunch of old Perl code that maintains a company equity plan. It’s been pain, pain, pain the whole way. I can’t believe I ever thought Perl was cool and fun. I can’t believe I wrote that stuff. I can’t believe it’s almost midnight.

But, I’m nearly done.


Twitter creeps in

21:30 November 21st, 2007 by terry. Posted under me, tech, twitter. 3 Comments »

I often notice little things about how I work that point to where the value is. One sign that a piece of UI is right is when you start to look for it in apps that don’t have it. For example, after I started using mouse gestures in Opera I’d find myself wanting to make mouse gestures in other applications. When mice first started to have a wheel, I was skeptical. Support for the mouse wheel was not universal across applications. When I found myself trying to scroll with the wheel in applications that didn’t support it, I knew it was right.

Tonight I came home and went to my machine. The first thing I did was to check what was going on in Twitter. That’s pretty interesting, at least for someone like me. I’ve been sending email on pretty much a daily basis for 25 years. It’s pretty much always the first thing I look at when I come back to my machine. Occasionally these days I find myself first going into Google reader to see what’s new, but that’s pretty rare and I might be looking for something specific. Tonight, I think for the first time, Twitter was where I went – and not just for the general news, but for communications between and about people I know or am interested in. Much more interesting than looking through my email.

I’m one of those who thought Twitter was pretty silly when I first signed up (Dec 2006). I used it only once, and found it intolerably slow. But it’s grown on me. And I find definite value there.

A few examples:

  1. I’d mailed Dick Costolo a few times in the past. Then I saw him twittering that he was drinking cortados. So I figured he must be in Spain. I mailed him, and he was. As a result I ended up at the FOWA conference in London the next day and met a bunch of people.
  2. On Tuesday I went out and bought a Wii in Manhattan to take back to my kids in Spain. I twittered about heading out to do it. I got an email a bit later from @esteve telling me to take the Wii back as they are region-locked. So I did.
  3. A week or so ago I was reading some tweets and noticed that someone had just been out to dinner in Manhattan with someone else that I wanted to meet. So I sent a mail to the first person and was soon swapping mails with the second.
  4. I’ve noticed about 5 times that interesting people were going to be in Barcelona and so I’ve mailed them out of the blue. That’s really good – people on holiday are often happy to have a beer and a chat. I’d have had no idea they were going to literally be outside my door were it not for Twitter.

No comment

14:14 November 20th, 2007 by terry. Posted under tech. Comments Off on No comment

For some reason I’m not receiving email notification of comments on this blog. I’ve just noticed a bunch of comments I’d not seen. I’m looking into it.


Not exactly Brownian motion in Manhattan

22:08 November 19th, 2007 by terry. Posted under companies, tech. 10 Comments »

Today after some meetings I went out for a walk. I’m staying on 12th Street between 5th and University in Manhattan.

I had intended to “just wander around” pretty much at random. That’s what I really felt like too. But in the back of my mind, not quite so far back that I wasn’t aware of it, my brain was making sure that, like it or not, I went to the Apple store on the corner of 5th Avenue and Central Park.

I really have no need of an Apple store. There’s nothing I would buy, nothing I need. But.

So off I wandered… Broadway, 6th Av, 5th Av. I stopped briefly in many stores, had a coffee and a muffin, tried to tell myself that I actually wasn’t going to the Apple store. But.

I saw iPods and iPhones aplenty along the way. Hundreds of them. All identically priced. Best Buy, CompUSA, Circuit City, all the small electronic shops on 5th Av. No need, no need at all to go to the Apple store. None.

I’m walking up 5th Av in the boring super-rich area, Cartier, Dunhill, DeBeers. There can be no doubt whatsoever that I am heading to the Apple store. Most of my mind doesn’t want to go, but my legs and body seem determined. They know I need it.

And there it is. Amazing. I’ve been in several of these stores before, including this one, but there’s something you just have to see and feel. Maybe it’s the church of the 21st century… people are drawn in to worship an abstract god, to kneel at the altar and finger the icons.

It really is amazing. To me the Apple store is about the hippest place in Manhattan. Here you get to see all sorts of cool cats just hanging out with their favorite hardware. The place is full. Full of people from all over the world who’ve come to buy Apple gear. The place has a very definite atmosphere, and it’s not the atmosphere of a regular computer store. There are hundreds of Apple products out, they’re all on, and people are using them – surfing the web, reading email, listening to music, marveling. Spend half an hour in there people watching, and you want to run out and buy AAPL stock.

Apple and Nokia are two companies that really understand the importance of appearance, design, and fashion in technology. I think Nokia were the first company to see clearly that a phone is not just a phone – it’s a statement about yourself. It’s something you take out and leave on the table at the cafe, or casually flip open when you need to impress someone or get laid. Apple understands it even better. I’ll walk nearly 50 blocks just to get a fix – not to buy, just to look at the products, look at the people, be amazed at it all.

Fortunately I’m old enough to know that I don’t really need any of those shiny objects. I have a first generation iPod that I never use. I have a dead-simple phone that I don’t feel any need to upgrade. I haven’t bought myself a computer in I don’t know how long – maybe 10 years (I always get them through work). I’m not even sure that I’d own a computer if I didn’t work from home. But I sure do like to look at hardware. The new iPod nano is extraordinarily beautiful – dimensions, sleekness, feel, everything about it is divine – and at $149 (4GB) or $199 (8GB) it doesn’t feel expensive. But I know I simply wouldn’t use it. What a pity!


Flakey Twitter and the use of consecutive ids

05:54 November 16th, 2007 by terry. Posted under companies, tech, twitter. 2 Comments »

Twitter was just inaccessible for maybe a couple of hours. Prior to that there was a 9-day gap in their timeline, noticed by at least a few people. Quite regularly, twitters I send don’t show up at all.

I wonder what could be going on over there? Things certainly don’t feel very stable.

A friend signed up tonight. Using the Twitter API you can see her id. It’s a bit over 10 million. You can also see the id of her first twitter, a bit over 417 million. The earliest twitter available on the system is number 20, “just setting up my twttr”, sent at 20:50:14 on Tue Mar 21 2006 by Jack Dorsey, who has user id 12 (the lowest user id I’ve seen).

Given that Twitter seem to be using consecutive ids for users and twitters, and that you can pull dates out of their API, it would be pretty easy to make graphs showing growth in users and twitters over time. You could also infer downtime by looking for periods when no twitters appeared. Beyond a certain point in time that inference would be very accurate (i.e., once there are so many twitters arriving that a gap is suspicious), and you could calculate confidence estimates.
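For anyone tempted: the inference itself is trivial once you have sampled (id, date) pairs from the API. A Python sketch – the API calls themselves are left out, and growth_rate and suspicious_gaps are just illustrative names:

```python
from datetime import datetime, timedelta

def growth_rate(samples):
    """Given [(twitter_id, created_at), ...] sorted by id, return the average
    number of new twitters per hour between the first and last sample."""
    (id0, t0), (id1, t1) = samples[0], samples[-1]
    hours = (t1 - t0).total_seconds() / 3600.0
    return (id1 - id0) / hours

def suspicious_gaps(samples, min_gap=timedelta(hours=6)):
    """Return (start, end) windows in which no sampled twitter appeared:
    candidate downtime, once volume makes silence implausible."""
    return [(t0, t1)
            for (_, t0), (_, t1) in zip(samples, samples[1:])
            if t1 - t0 >= min_gap]
```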

I don’t have time for all that though.

But I wonder if Google did something like that as part of their competitive analysis when they decided to buy Jaiku, or if Twitter’s investors did it, and how the numbers would match up with whatever Twitter management might claim. I’ve no idea or opinion at all about any of that, btw. But if it were me, I don’t think I’d expose all that information by using consecutive ids for users and their twitters.


Twittering from inside emacs

04:34 November 12th, 2007 by terry. Posted under python, tech, twitter. Comments Off on Twittering from inside emacs

I do everything I can from inside emacs. Lately I’ve been thinking a bit about the Twitter API and social graphs.

Tonight I went and grabbed python-twitter, a Python API for Twitter. Then I wrote a quick python script to post to Twitter:

import sys
import twitter  # python-twitter

# Post the first command-line argument as a status update.
twit = twitter.Api(username='terrycojones', password='xxx',
                   input_encoding='iso-8859-1')
twit.PostUpdate(sys.argv[1])

and an equally small emacs lisp function to call it:

(defun tweet (mesg)
  (interactive "MTweet: ")
  (call-process "tweet" nil 0 nil mesg))

so now I can M-x tweet from inside emacs, or simply run tweet from the shell.

Along the way I wrote some simple emacs hook functions to tweet whenever I visited a new file or switched into Python mode. I’m sure that’s not so interesting to my faithful Twitter followers, but it does raise interesting questions. I also thought about adding a mail-send-hook function to tweet every time I send a mail (and to whom). Probably not a good idea.

You can follow me on Twitter. Go on, you know you want to.

Anyway, Twitter is not the right place to publish information like this. Something more general would be nicer…


Multiplying with Roman numerals

18:14 November 10th, 2007 by terry. Posted under companies, representation, tech. 1 Comment »

I like thinking about the power of representation, particularly inside computers. I wrote about it earlier in the year and gave a couple of examples. Here’s another.

Think about how you might have done multiplication with Roman numerals. Why is it so difficult?

It’s not because multiplication is inherently so hard. Roman numerals were just a terribly awkward way to represent numbers. However, if you introduce the concept of a zero and use a positional representation, things become much easier.

Note that the problem hasn’t changed; only the representation has. A new representation can make things that look like problems go away.
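To make that concrete: the “hard” multiplication problem disappears the moment you translate out of Roman numerals and back. A small Python sketch:

```python
# Value/digit pairs, largest first, including the subtractive forms (CM, IV, ...).
ROMAN = [(1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'),
         (90, 'XC'), (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'),
         (5, 'V'), (4, 'IV'), (1, 'I')]

def to_roman(n):
    """Convert a positive integer to a Roman numeral string."""
    out = []
    for value, digits in ROMAN:
        while n >= value:
            out.append(digits)
            n -= value
    return ''.join(out)

def from_roman(s):
    """Convert a Roman numeral string to an integer."""
    n = 0
    for value, digits in ROMAN:
        while s.startswith(digits):
            n += value
            s = s[len(digits):]
    return n

def roman_multiply(a, b):
    """Change representation, multiply, change back. The hard part vanishes."""
    return to_roman(from_roman(a) * from_roman(b))
```

All the work is in the change of representation; the multiplication itself is the easy positional kind.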

I claim that we are still using Roman numerals to manage information online (and on the desktop for that matter). Until we do something about it, we’ll probably continue butting our heads against the same problems and they’ll probably continue to appear intractable.

At Fluidinfo, everything we do is based on a new way to represent information.


Learning to work on the road

11:57 November 7th, 2007 by terry. Posted under me, tech. Comments Off on Learning to work on the road

I’m not good at working away from my home setup. I like my Kinesis keyboard, my big flat-screen monitor, and even the convenience of an external mouse. I don’t really like working without them.

But I’m learning to deal with being away. For the last few days I’ve been sitting in talks at Web 2.0 Expo in Berlin. There’s wifi and the speed is quite good. So I sit here checking out code, writing and running unit tests, and generally getting a few things done. I’m surprised at how easy it is to work while keeping one ear on the presentation. The most obvious manifestation is when a speaker asks for a show of hands – I am surprised to find my hand going up (or not), without my really being very conscious of what’s going on. Before I hear the question, I’d have thought I wasn’t really listening. But it seems that I am.

That’s all good. One reason I don’t much like going to conferences is that they’re mainly down time. The talks are not good enough or fast enough or don’t have enough content, so if you sit in one it’s often frustrating. But in a way, lightweight talks are good, because they let you work in parallel. And if the talk does happen to be good you can always pay more attention.
