the semantic web is the new AI

I’m not a huge fan of rationality. But if you are going to try to think and act rationally, especially on quantitative or technical subjects, you may as well do a decent job of it.

I have a strong dislike of trendy terms that give otherwise intelligent people a catchy new phrase that can be tossed around to get grants, get funded, and get laid. I spent years trying to debunk what I thought was an appalling lack of thought about Fitness Landscapes. At the Santa Fe Institute in the early 90s, this was a term that (with very few exceptions, most notably Peter Stadler) was tossed about with utter carelessness. I wrote a Ph.D. dissertation on Evolutionary Algorithms, Fitness Landscapes and Search, parts of which were thinly veiled criticism of some of the unnecessarily colorful biological language used to describe “evolutionary” algorithms. I get extremely impatient when I sense a herd mentality in the adoption of a catchy new term for talking about something that in fact is far more mundane. I get even more impatient when widespread use of the term means that people stop thinking.

That’s why I’m fed up with the current breathless reporting on the semantic web. The semantic web is the new artificial intelligence. We’re on the verge of wonders, but everyone agrees these will take a few more years to realize. Instead of having intelligent robots to do our bidding, we’ll have intelligent software agents that can reason about stuff they find online, and do what we mean without even needing to be told. They’ll do so many things, coordinating our schedules, finding us hotels and booking us in, anticipating our wishes and intelligently combining disparate information from all over the place to… well, you get the picture.

There are two things going on in all this talk about the semantic web. One is recycled rubbish and one is real. The recycled rubbish is the Artificial Intelligence nonsense, the visionary technologist’s wet dream that will not die. Sorry folks – it ain’t gonna happen. It wasn’t going to happen last century, and it’s not going to happen now. Can we please just forget about Artificial Intelligence?

It was once thought that it would take intelligence for a computer to play chess. Computers can now play grandmaster-level chess. But they’re not one whit closer to being intelligent as a result, and we know it. Instead of admitting we were wrong, or admitting that, since it obviously doesn’t take intelligence to play chess, maybe Artificial Intelligence as a field was chasing something that was not actually intelligence at all, we move the goalposts and continue the elusive search. Obviously the development of computers that can play better than human-level chess (is it good chess? I don’t think we can say it is), and other advances, have had a major impact. But they’ve nothing to do with intelligence, besides our own ingenuity at building faster, smaller, and cooler machines with better algorithms (and, in the case of chess, bigger lookup tables) making their way into hardware.

And so it is with the semantic web. All talk of intelligence should be dropped. It’s worse than useless.

But there has been real progress on the web in recent years. Web 2.0, whatever that means exactly, is real. Microsoft were right to be worried that the browser could make the underlying OS and its applications irrelevant. They were just 10 years too early in trying to kill it, and then, beautiful irony, they had a big hand in making it happen with their influence in getting asynchronous browser/server traffic (i.e., XMLHttpRequest and its Microsoft equivalent, the foundation of AJAX) into the mainstream.

Similarly, there is real substance to what people talk about as Web 3.0 and the semantic web.

It’s all about data.

It’s just that. One little and very un-sexy word. There’s no need to get all hot and bothered about intelligence, meaning, reasoning, etc. It’s all about data. It’s about making data more uniform, more accessible, easier to create, to share, to find, and to organize.

If you read around on the web, there are dozens of articles about the upcoming web. Some are quite clear that it’s all about the data. But many give in to the temptation to jump on the intelligence bandwagon, and rabbit on about the heady wonders of the upcoming semantic web (small-s, capital-S, I don’t mind). Intelligent agents will read our minds, do the washing, pick up the kids from school, etc.

Some articles mix in a bit of both. I just got done reading a great example, A Smarter Web: New technologies will make online search more intelligent–and may even lead to a “Web 3.0.”

As you read it, try to keep a clean separation in mind between the AI side of the romantic semantic web and simple data. Every time intelligence is mentioned, it’s vague and with an acknowledgment that this kind of progress may be a little way off (sound familiar?). Every time real progress and solid results are mentioned, it’s because someone had the common sense to take a bunch of data and put it into a better format, like RDF, and then take some other routine action (usually search) on it.
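
To make the "better format plus routine search" point concrete, here is a minimal sketch in plain Python. It is not real RDF tooling: the facts are stored as (subject, predicate, object) tuples, which is the basic shape of RDF data, and every name in it (the hotels, the predicates, the `match` helper) is made up for illustration.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# The hotels, cities, and predicate names are invented for this example.
triples = [
    ("hotel:ritz", "locatedIn", "city:paris"),
    ("hotel:ritz", "pricePerNight", 900),
    ("hotel:ibis", "locatedIn", "city:paris"),
    ("hotel:ibis", "pricePerNight", 80),
    ("hotel:savoy", "locatedIn", "city:london"),
]


def match(store, s=None, p=None, o=None):
    """Return every triple that agrees with the fields that are not None."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Find me hotels in Paris" is a filter over uniform data, nothing more.
paris_hotels = [s for s, _, _ in match(triples, p="locatedIn", o="city:paris")]
print(paris_hotels)  # ['hotel:ritz', 'hotel:ibis']
```

Nothing here reasons about anything. Once the data is in one uniform shape, the "smart" query is a three-field comparison in a loop.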

I fully agree with those who claim that important qualitative advances are on their way. Yes, that’s a truism. I mean that we are soon going to see faster-than-usual advances in how we work with information on the web. But the advances will be driven by mundane improvements in data architecture, and, just like computers “learning” to “play” chess, they will have nothing at all to do with intelligence.

I should probably disclose that I’m not financially neutral on this subject. I have a small company that some would say is trying to build the semantic web. To me, it’s all about data architecture.

