it’s long

02:00 June 14th, 2007 by terry. Posted under books, me. 2 Comments »

There are a few things that bug me on the internet.

One is that people often warn each other that articles are long, or apologize for writing long blog entries. There’s nothing inherently wrong with that. When it turns out though that these items are just a couple of screenfuls, you start to wonder what we’re all coming to. And yes, I know, it’s the 21st century, we’re all living at internet speed now, who’s got the time, etc.

OTOH, a word like “long” can be used to convey information. You can look at the word “long” and form some idea of just how long the long thing might be. And these days, it ain’t very long. Maybe we’re in the middle of a transition in which a word comes to mean its opposite.

Marc Andreessen recently began to blog, and the blogosphere is all abuzz. He writes tolerably well, and he’s got interesting comments on many things, but there’s a real down side: his posts are really long. Here’s a random example of someone who agrees.

That’s weird.

From where I sit, if someone writes well and is interesting or otherwise provocative, you wish they’d write more, not less. You want it to be long. Half a dozen web pages is not long. I read In Search of Lost Time last year. It took me 6 months and at 4300 pages or so, I think it qualifies as long. I’m reading Orwell’s letters, essays, and journalism. At 2200 pages, it seems fairly long too. I wished Proust was longer. I’ll probably wish Orwell was longer too. I tried reading The Decline and Fall of the Roman Empire (3500 pages), but the 7-volume “leatherette” set I bought stinks of old cigarette smoke and I couldn’t bear it.

How did we get from “long” meaning something like War and Peace (1100 pages) or Anna Karenin (850 pages) all the way to a 6-page (single narrow column) blog posting (with plenty of white space)?

What word should we now use for things that are longer than 6 pages or that require more than 5 minutes to read? Epic?

Orwell on T. S. Eliot and the path from existential angst to serial entrepreneur

18:06 June 7th, 2007 by terry. Posted under books, companies, me. 10 Comments »

I like George Orwell. A tired fool got me started on the four-volume collection of Orwell’s essays, journalism, and letters. It’s great. Among many things I could say, one is that you know you’re reading someone damned good if you’re fascinated by their thoughts on something you formerly had no interest or experience in. There’s the essay on Dickens that I mentioned earlier, essays on cheap vulgar postcards, boys’ magazines, and much else besides. Gore Vidal is similarly compelling, and I think I would take his collected essays even over those of Orwell. Christopher Hitchens is similarly provocative but not in the same class as a writer. Very few are.

Today I was reading an Orwell review of three T. S. Eliot poems. I’m not into Eliot and I’m not into poetry. Like Gore Vidal’s, Orwell’s reviews are wonderful – balanced and surgical skewerings. Anyway, I came across the following, which I enjoyed enormously and decided to post:

But the trouble is that conscious futility is something only for the young. One cannot go on ‘despairing of life’ into a ripe old age. One cannot go on and on being ‘decadent’, since decadence means falling and one can only be said to be falling if one is going to reach the bottom reasonably soon. Sooner or later one is obliged to adopt a positive attitude towards life and society. It would be putting it too crudely to say that every poet in our time must either die young, enter the Catholic Church, or join the Communist party, but in fact the escape from the consciousness of futility is along those general lines. There are other deaths besides physical death, and there are other sects and creeds besides the Catholic Church and the Communist Party, but it remains true that after a certain age one must either stop writing or dedicate oneself to some purpose not wholly aesthetic. Such a dedication necessarily means a break with the past:

every attempt
Is a wholly new start, and a different kind of failure

Because one has only learnt to get the better of words
For the thing one no longer has to say, or the way in which
One is no longer disposed to say it. And so each venture
Is a new beginning, a raid on the inarticulate
With shabby equipment always deteriorating
In the general mess of imprecision of feeling,
Undisciplined squads of emotion.

Apart from the fact that I am much too impatient to read poetry, one of my problems is that I never have any idea what it’s about. But at least the above is clear. It wonderfully captures the inevitable progression from the troubled search for meaning of existential youth to the amorphous struggles of the serial entrepreneur.

the blind leading the blind?

01:22 May 1st, 2007 by terry. Posted under companies, me. 2 Comments »

At some point I read a description of why entrepreneurs pitching VCs is a bad mix: because you have people who can’t explain anything meeting people who can’t understand anything. That’s unfair all round of course, but still…. Having done some of this and had people just not “get” it, you can’t but wonder why they don’t get it. Of course part of it can be about the pitch itself, the presentation, etc etc. But even if you suppose everything is ideal, investing (whether as a founder or as a financial backer) is an act of faith.

There’s lots of evidence for this. To begin with, if it were a science and there were quantifiable measures, those things would presumably be known and you’d have a lot fewer startups and a lot fewer investors, simply because failure would be rare.

Until recently I thought there was more of a lack of vision on the investor side. But now I’m not so sure. For example, the Google guys were apparently running around search engine companies trying to sell their idea (vision? early startup?) for $1M. They couldn’t find a buyer. What an extraordinary lack of….. what? On the one hand you want to laugh at those idiot companies (and VCs) who couldn’t see the huge value. OK, maybe. But the more extraordinary thing is that Larry Page and Sergey Brin couldn’t see it either! That’s pretty amazing when you think about it. Even the entrepreneurs couldn’t see the enormous value. They somehow decided that $1M would be an acceptable deal. Talk about a lack of vision and belief.

So you can’t really blame the poor VCs or others who fail to invest. If the founding tech people can’t see the value and don’t believe, who else is going to?

I usually enjoy Paul Graham’s essays. In a recent one, The Hacker’s Guide To Investors, he says:

Risk is always proportionate to reward. So the most successful startup of all is likely to have seemed an extremely risky bet at first, and that is exactly the kind VCs won’t touch.

Which is also pretty interesting. In some ways it’s like a horoscope – appealing to every dreamer who believes they’re sitting on a billion-dollar idea. But if the “always” in the above quote is true, then if you are in fact sitting on something that will bring huge rewards, it must by definition appear hugely risky.

If you throw in the initial observation that even the founders cannot assess value, then I think you get three things. One: a feeling of taking huge risk is a necessary part of building something that’s hugely rewarding (i.e., if you don’t have that feeling, then you’re probably not building such a thing). Two: even if you are building such a thing, you cannot know it. You just have to believe. Three: if you cannot know it, but can see huge risk, you can’t expect investors to see things any differently. So to get them to invest you really have to make them believers too.

why data (information representation) is the key to the coming semantic web

01:51 March 19th, 2007 by terry. Posted under me, representation, tech. 5 Comments »

In my last posting I argued that we should drop all talk about Artificial Intelligence when discussing the semantic web, web 3.0, etc., and acknowledge that in fact it’s all about data. There are two points in that statement. I was scratching an itch and so I only argued one of them. So what about my other claim?

While I’m not ready to describe what my company is doing, there’s a lot I can say about why I claim that data is the important thing.

Suppose something crops up in the non-computational “real-world” and you decide to use a computer to help address the situation. An inevitable task is to take the real-world situation and somehow get it into the computational system so the computer can act on it. Thus one of the very first tasks we face when deciding to use a computer is one of representation. Given information in the real world, we must choose how to represent it as data in a computer. (And it always is a choice.)

So when I say that data is important, I’m mainly referring to information representation. In my opinion, representation is the unacknowledged cornerstone of problem solving and algorithms. It’s fundamentally important and yet it’s widely ignored.

When computer scientists and others talk about problem solving and algorithms, they usually ignore representation. Even in the genetic algorithms community, in which representation is obviously needed and is a required explicit choice, the subject receives little attention. But if you think about it, in choosing a representation you have already begun to solve the problem. In other words, representation choice is a part of problem solving. But it’s never talked about as being part of a problem-solving algorithm. In fact though, if you choose your representation carefully the rest of the problem may disappear or become so trivial that it can be solved quickly by exhaustive search. Representation can be everything.

To illustrate why, here are a couple of examples.

Example 1. Suppose I ask you to use a computer to find two positive integers that have a sum of 15 and a product of 56. First, let’s pick some representation of a positive integer. How about a 512-bit binary string for each integer? That should cover it, I guess. We’ll have two of them, so that will be 1,024 bits in our representation. And here’s an algorithm, more or less: repeatedly set the 1,024 bits at random, add the corresponding integer values, to see if they sum to 15. If so, multiply them and check the product too.

But wait, wait, wait… even my 7-year-old could tell you that’s not a sensible approach. It will work, eventually. The state search space has 2^1024 candidate solutions. Even if we test a billion billion billion of them per second, it’s going to take much longer than a billion years.

Instead, we could think a little about representation before considering what would classically be called the algorithm. Aha! It turns out we could actually represent each integer using just 4 bits, without risk of missing the solution. Then we can use our random (or an exhaustive) search algorithm, and have the answer in about a billionth of a second. Wow.

Of course this is a deliberately extreme example. But think about what just happened. The problem and the algorithm are the same in both of the above approaches. The only thing that changed was the representation. We coupled the stupidest possible algorithm with a good representation and the problem became trivial.
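
To make that concrete, here is a minimal Python sketch of the "stupid" exhaustive search over the 4-bit representation. The function name find_pair and its bits parameter are mine, purely for illustration, and the a <= b check is only there so each pair gets reported once.

```python
from itertools import product

def find_pair(total, prod, bits=4):
    """Exhaustively try every pair of positive integers that fits in `bits` bits."""
    limit = 2 ** bits  # with 4 bits, the candidate values are 1..15
    for a, b in product(range(1, limit), repeat=2):
        # a <= b avoids reporting the same pair twice in the other order
        if a <= b and a + b == total and a * b == prod:
            return a, b
    return None  # no pair in this representation satisfies both conditions

print(find_pair(15, 56))  # (7, 8)
```

At most 15 x 15 = 225 candidate pairs get checked, which is why the choice of search algorithm hardly matters here.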

Example 2. Consider the famous Eight Queens problem (8QP). That’s considerably harder than the above problem. Or is it?

Let’s represent a chess board in the computer using a 64-bit string, and make sure that exactly 8 bits are set to one to indicate the presence of a queen. We’ll devise a clever algorithm for coming up with candidate 64-bit solutions, and write code to check them for correctness. But the search space is 2^64, and that’s not a small number. It could easily take a year to run through that space, so the algorithm had better be pretty good!

But wait. If you put a queen in row R and column C, no other queen can be in row R or column C. Following that line of thinking, you can see that all possibly valid solutions can be represented by a permutation of the numbers 1 through 8. The first number in the permutation gives the column of the queen in the first row, and so on. There are only 8! = 40,320 possible arrangements that need to be checked. That’s a tiny number. We could program it up, use exhaustive search as our algorithm, and have a solution in well under a second!
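
Here is a rough Python sketch of that permutation representation, assuming columns are numbered 0 through 7 rather than 1 through 8. The name queens_solutions is just for illustration. Because a permutation already guarantees one queen per row and per column, the only thing left to check is the diagonals.

```python
from itertools import permutations

def queens_solutions(n=8):
    """Yield each solution as a tuple: entry r is the column of the queen in row r."""
    for cols in permutations(range(n)):
        # Two queens share a diagonal exactly when r + c or r - c coincide.
        rising = {r + c for r, c in enumerate(cols)}
        falling = {r - c for r, c in enumerate(cols)}
        if len(rising) == n and len(falling) == n:
            yield cols

solutions = list(queens_solutions())
print(len(solutions))  # 92 distinct placements on the 8x8 board
```

The loop visits all 8! = 40,320 permutations, and the diagonal test is nothing more than two set-size comparisons.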

Once again, a change of representation has a radical impact on what people would normally think of as the problem. But the problem isn’t changing at all. What’s happening is that when you choose a representation you have actually already begun to solve the problem. In fact, as the examples show, if you get the representation right enough the “problem” pretty much vanishes.

These are just two simple examples. There are many others. You may not be ready to generalize from them, but I am.

I think fundamental advances based almost solely on improved representation lie just ahead of us.

I think that if we adopt a better representation of information, things that currently look impossible may even cease to look like problems.

There are other people who seem to believe this too, though perhaps implicitly. Web 3.0, whatever that is, can bring major advances without anyone needing to come up with new algorithms. Given a better representation we could even use dumb algorithms (though perhaps not pessimal algorithms) and yet do things that we can’t do with “smart” ones. I think this is the realization, justifiably exciting, that underlies the often vague talk of “web 3.0”, the “read/write web”, the “data web”, “data browsing”, the infinite possible futures of mashups, etc.

This is why, to pick the most obvious target, I am certain that Google is not the last word in search. It’s probably not a smart idea to try to be smarter than Google. But if you build a computational system with a better underlying representation of information you may not need to be particularly intelligent at all. Things that some might think are related to “intelligence”, including the emergence of a sexy new “semantic” web, may not need much more than improved representation.

Give a 4-year-old a book with a 90%-obscured picture of a tiger in the jungle. Ask them what they see. Almost instantly they see the tiger. It seems incredible. Is the child solving a problem? Does the brain or the visual system use some fantastic algorithm that we’ve not yet discovered? Above I’ve given examples of how better representation can turn things that a priori seemed to require problem solving and algorithms into things that are actually trivial. We can extend the argument to intelligence. I suspect it’s easy to mistake someone with a good representation and a dumb algorithm as being somehow intelligent.

I bet that evolution has produced a representation of information in the brain that makes some problems (like visual pattern matching) non-existent. I.e., not problems at all. I bet that there’s basically no problem solving going on at all in some things people are tempted to think of as needing intelligence. The “algorithm”, and I hesitate to use that word, might be as simple as a form of (chemical) hill climbing, or something even more mundane. Perhaps everything we delight in romantically ascribing to native “intelligence” is really just a matter of representation.

That’s why I believe data (aka information representation) is so extremely important. That’s where we’re heading. It’s why I’m doing what I’m doing.

the semantic web is the new AI

03:30 March 18th, 2007 by terry. Posted under me, representation, tech. Comments Off on the semantic web is the new AI

I’m not a huge fan of rationality. But if you are going to try to think and act rationally, especially on quantitative or technical subjects, you may as well do a decent job of it.

I have a strong dislike of trendy terms that give otherwise intelligent people a catchy new phrase that can be tossed around to get grants, get funded, and get laid. I spent years trying to debunk what I thought was an appalling lack of thought about Fitness Landscapes. At the Santa Fe Institute in the early 90s, this was a term that (with very few exceptions, most notably Peter Stadler) was tossed about with utter carelessness. I wrote a Ph.D. dissertation on Evolutionary Algorithms, Fitness Landscapes and Search, parts of which were thinly-veiled criticism of some of the unnecessarily colorful biological language used to describe “evolutionary” algorithms. I get extremely impatient when I sense a herd mentality in the adoption of a catchy new term for talking about something that in fact is far more mundane. I get even more impatient when widespread use of the term means that people stop thinking.

That’s why I’m fed up with the current breathless reporting on the semantic web. The semantic web is the new artificial intelligence. We’re on the verge of wonders, but everyone agrees these will take a few more years to realize. Instead of having intelligent robots to do our bidding, we’ll have intelligent software agents that can reason about stuff they find online, and do what we mean without even needing to be told. They’ll do so many things, coordinating our schedules, finding us hotels and booking us in, anticipating our wishes and intelligently combining disparate information from all over the place to…. well, you get the picture.

There are two things going on in all this talk about the semantic web. One is recycled rubbish and one is real. The recycled rubbish is the Artificial Intelligence nonsense, the visionary technologist’s wet dream that will not die. Sorry folks – it ain’t gonna happen. It wasn’t going to happen last century, and it’s not going to happen now. Can we please just forget about Artificial Intelligence?

It was once thought that it would take intelligence for a computer to play chess. Computers can now play grandmaster-level chess. But they’re not one whit closer to being intelligent as a result, and we know it. Instead of admitting we were wrong, or admitting that, since it obviously doesn’t take intelligence to play chess, maybe Artificial Intelligence as a field was chasing something that was not actually intelligence at all, we move the goalposts and continue the elusive search. Obviously the development of computers that can play better than human-level chess (is it good chess? I don’t think we can say it is), and other advances, have had a major impact. But they’ve nothing to do with intelligence, besides our own ingenuity at building faster, smaller, and cooler machines with better algorithms (and, in the case of chess, bigger lookup tables) making their way into hardware.

And so it is with the semantic web. All talk of intelligence should be dropped. It’s worse than useless.

But there has been real progress in the web in recent years. Web 2.0, whatever that means exactly, is real. Microsoft were right to be worried that the browser could make the underlying OS and its applications irrelevant. They were just 10 years too early in trying to kill it, and then, beautiful irony, they had a big hand in making it happen with their influence in getting asynchronous browser/server traffic (i.e., XMLHttpRequest and its Microsoft equivalent, the foundation of AJAX) into the mainstream.

Similarly, there is real substance to what people talk about as Web 3.0 and the semantic web.

It’s all about data.

It’s just that. One little and very un-sexy word. There’s no need to get all hot and bothered about intelligence, meaning, reasoning, etc. It’s all about data. It’s about making data more uniform, more accessible, easier to create, to share, to find, and to organize.

If you read around on the web, there are dozens of articles about the upcoming web. Some are quite clear that it’s all about the data. But many give in to the temptation to jump on the intelligence bandwagon, and rabbit on about the heady wonders of the upcoming semantic web (small-s, capital-S, I don’t mind). Intelligent agents will read our minds, do the washing, pick up the kids from school, etc.

Some articles mix in a bit of both. I just got done reading a great example, A Smarter Web: New technologies will make online search more intelligent–and may even lead to a “Web 3.0.”

As you read it, try to keep a clean separation in mind between the AI side of the romantic semantic web and simple data. Every time intelligence is mentioned, it’s vague and with an acknowledgment that this kind of progress may be a little way off (sound familiar?). Every time real progress and solid results are mentioned, it’s because someone had the common sense to take a bunch of data and put it into a better format, like RDF, and then take some other routine action (usually search) on it.

I fully agree with those who claim that important qualitative advances are on their way. Yes, that’s a truism. I mean that we are soon going to see faster-than-usual advances in how we work with information on the web. But the advances will be driven by mundane improvements in data architecture, and, just like computers “learning” to “play” chess, they will have nothing at all to do with intelligence.

I should probably disclose that I’m not financially neutral on this subject. I have a small company that some would say is trying to build the semantic web. To me, it’s all about data architecture.

conspiracy of sleepers

17:42 February 14th, 2007 by terry. Posted under me. Comments Off on conspiracy of sleepers

I don’t seem to need to sleep as much as some. I’m perfectly happy with 4 hours a night, can probably go indefinitely on 6 hours a night, and anything more is just cream. That’s not to say I don’t like sleep. I love it. I’ll happily stay in bed for 12 hours if I feel like it. I think I’ve been like this my whole life – my parents told me that when I was a kid they’d just leave me in bed awake when they went to sleep. My son is perhaps also like this, he never wants to go to sleep – though he’s often determined to sleep in.

When I’ve slept 4 hours, I don’t feel impaired in any way. I’m happy to then get up and work 16 hours – certainly not as efficiently as some people, but that’s just me, it’s not because I’m tired.

So I always wonder about sleep advocates who insist that all humans need 8 hours of sleep a night, or else. It feels like a conspiracy of people who really do need that much sleep, trying to stop those of us who can work much longer hours from getting ahead.

After all, have you ever heard that you really need to sleep 8 hours from someone who only needs 4 hours a night?

I didn’t think so.

go right

02:20 February 14th, 2007 by terry. Posted under me, travel. Comments Off on go right

Large passenger planes have two aisles. When leaving the plane, the right hand side always moves much faster than the left.

I think this happens because the door is at the front on the left. At the moment when the two lines meet, the people coming from the right side have some momentum up: they’re going straight ahead, and they don’t need to turn a corner and merge to get off. The people from the left side have to inject themselves into this stream. Everyone is tired and maybe the people from the right are less inclined to politely let someone in from the left.

Whatever it is, the effect is pronounced. On some flights the right side will drain completely while there are still dozens of people left on the left. I’ve watched this many times. I’ve asked a couple of stewards, and they agreed but hadn’t noticed or didn’t know why.

Random thought for the week.

unrewarding life as an airline customer

17:59 December 25th, 2006 by terry. Posted under me. Comments Off on unrewarding life as an airline customer

Why do all airlines treat their customers like shit?

I’ve wondered this from time to time, and I guess there are many reasons. I’ve been treated like shit so many times by so many airlines. It’s so common that it’s almost not worth mentioning. But, it’s Xmas and I’ve just had the usual run around, and seeing as that has left me away from my family as well as otherwise inconvenienced and out of pocket, I feel like a bit of a ramble.

Today wasn’t even a particularly egregious example, and I really don’t care that it’s Xmas. Today was just run-of-the-mill being treated like shit.

First of all, I manage to inconvenience several others in getting myself back to Barcelona last night – borrowing a car, filling it with gas, leaving the key somewhere exposed (risking theft), leaving Ana and the kids on Xmas eve, etc. All my choices. I set the alarm and get up early (for me) at 8am.

Taxi to the airport (25 euros). Check in. Coffee. Wait in departure lounge. Wait in departure lounge. Departure time comes and goes. Wait in departure lounge. Then there’s an announcement: flight canceled due to an unavailable part, no chance of a replacement, no chance of an alternate flight, maybe we can get you all on something tomorrow. It’s Xmas, you’ve got probably 150 people, many of whom are heading home to family, many of whom are leaving family.

Does a single Delta representative say a single word of apology, commiseration? Nope.

They announce a phone number to call in a few hours to check what’s going on. Everyone is to be herded to a hotel. Of course I head back to my apartment in a taxi (25 euros).

I call my hotel in NY and cancel the night. They charge me, in accordance with their published cancelation policy (USD 85.94).

I start calling the local Delta number, the one provided by the friendly folks at the gate who canceled the flight. A machine answers, telling me the normal working hours. It’s not normal working hours right now though. It’s Xmas. This seems to be standard practice: hand out a phone number that does not work. Shift the blame and the responsibility.

Finally I call the US. After navigating through the voice mail system (Press here if your flight was just unceremoniously canceled – NOT), I get a human on the other end. I tell her calmly what happened. She is an idiot. She asks me if I have already flown. I say no, I repeat that I am still in Barcelona. She asks if I am in a hotel and I repeat that I am in my apartment. She asks if I am still at the airport. She tells me my flight was BCN to Newark and I say no, to JFK. She tells me the flight was definitely to Newark and asks me if the flight left. No, the flight was canceled. I am paying in time and money (same thing?) to talk to this woman. Fortunately it turns out I have been booked, through Newark, on a Continental flight, tomorrow.

I tell the woman I’m hoping to be reimbursed for two taxi trips and one hotel room. She asks a few questions – why did I take a taxi if I’m in the hotel? No, I’m in my apartment. Did I book through hotels.com? No. Well I should talk to the ticket office in Barcelona at the airport and they will sort out the reimbursement. Fat chance. They will no doubt refer me to Delta elsewhere.

Does this woman say “Gee, I’ve just noticed it’s Xmas day, I’m really sorry for the inconvenience, it must be a bummer to have your plans ruined”? Nope. Does she say anything remotely human? Nope. In fact I had an easier time getting information into and out of the voice system that fielded the call.

I say that in my experience it’s not really worth the effort to try to get money back from airlines. She doesn’t answer. There’s just a silence on the other end of the line that lasts for about 20 seconds.

She’s one step above the voice mail system. She’s been trained not to say anything that might imply culpability. Someone might sue Delta if an employee were to admit that maybe they fucked up. So there’s just a silence on the line. I don’t remember who broke it, probably me saying goodbye.

And that’s it. I’m not even upset, and I far prefer an extra day in Barcelona instead of a day in NY in a hotel waiting for a flight to Chicago. This has all been extremely mild compared to some screw-ups.

Like I said, I think there are many reasons the airlines can and do treat their customers like shit. Blogging about it isn’t going to help much, but I don’t have the energy to do much more. And that’s part of it too.

In the US one very often runs across incredibly stupid “service” people on the phone (and in person – at the bank, buying fast food, etc). I don’t mean that in a nasty elitist way, even though I am nasty and elitist. These people are just dumb or bored or…. They speak great English, there’s no problem there. They’re just really dumb. On top of that, they are trained to be robots. Any attempt at humor or any unexpected remark is a deviation from the expected script and simply causes confusion. I’ve seen it so often, I can’t be bothered going into it here. I think legal fears also add a further level of bleaching anything human out of interactions. That’s ironic – I wonder if there would be fewer lawsuits if company representatives were a little more humane. I know I’d be a hell of a lot more forgiving if the representative were able to field a joke or make a self-deprecating remark about life as an airline customer, etc. Maybe we need a class-action lawsuit for being treated so poorly.

Blah.

fergus. beer.

01:03 December 7th, 2006 by terry. Posted under me. Comments Off on fergus. beer.

I’m sitting here drinking a beer with Fergus. That’s it.

another night down the tubes

04:23 October 4th, 2006 by terry. Posted under me. Comments Off on another night down the tubes

I’ve managed to somehow pass another night here not writing code. Ana told me once that unless I’m coding I always complain that I’m not working, not getting enough done. And added that I should change my idea of what work was. But I can’t shake it, nothing else feels like work, or at least doesn’t feel very productive.

I was hoping to do a bit of a Xapian versus PyLucene comparison tonight and perhaps begin to integrate one of them into my project. But that will have to wait. It was too important to set up a new blog.

more of the same

02:39 October 4th, 2006 by terry. Posted under me. Comments Off on more of the same

I decided I should blog more. I have another blog, which I update infrequently. It has various things about my kids and other random bits and pieces. I don’t really feel like polluting it with geeky or technical thoughts.

I spend time reading other people’s blogs, some of whom are my friends. It’s not a bad way to stay in touch – very minimal, you just read along if you want. There’s no need to address anyone in particular, to reply, to do anything. I enjoy it and so I figured others would probably be happy to read along as I do my thing.

I spend a great deal of time these days sitting up in the middle of the night hacking on obscure things. It’s fun, though I am too easily distracted (as you can see). Occasionally I wish I didn’t feel like I was laboring in obscurity, so I’m going to post random thoughts here and see what comes back, if anything. I’ll probably enjoy doing it no matter what.

More soon.
