Posted Wednesday, January 23rd, 2008 at 6:46 pm under other, tech.

Understanding high-dimensional spaces


I’ve spent lots of time thinking about high-dimensional spaces, usually in the context of optimization problems. Many difficult problems that we face today can be phrased as problems of navigating in high-dimensional spaces.

One problem with high-dimensional spaces is that they can be highly non-intuitive. I did a lot of work on fitness landscapes, which are a form of high dimensional space, and ran into lots of cases in which problems were exceedingly difficult because it’s not clear how to navigate efficiently in such a space. If you’re trying to find high points (e.g., good solutions), which way is up? We’re all so used to thinking in 3 dimensions. It’s very easy to do the natural thing and let our simplistic lifelong physical and visual 3D experience influence our thinking about solving problems in high-dimensional spaces.

Another problem with high-dimensional spaces is that we can’t visualize them unless they are very simple. You could argue that an airline pilot in a cockpit monitoring dozens of dials (each dial gives a reading on one dimension) does a pretty good job of navigating a high-dimensional space. I don’t mean the 3D space in which the plane is flying, I mean the virtual high-dimensional space whose points are determined by the readings on all the instruments.

I think that’s true, but the landscape is so smooth that we know how to move around on it pretty well. Not too many planes fall out of the sky.

Things get vastly more difficult when the landscape is not smooth. In fact they get positively weird. Even with trivial examples, like a hypercube, things get weird fast. For example, if you’re at a vertex on a hypercube, exactly one half of the space is reachable in a single step. That’s completely non-intuitive, and we haven’t even put fitness numbers on the nodes. When I say fitness, I mean goodness, or badness, or energy level, or heuristic, or whatever it is you’re dealing with.

We can visually understand and work with many 3D spaces (though 3D mazes can of course be hard). We can hold them in our hands, turn them around, and use our visual system to help us. If you had to find the high-point looking out over a collection of sand dunes, you could move to a good vantage point (using your visual system and understanding of 3D spaces) and then just look. There’s no need to run an optimization algorithm to find high points, avoiding getting trapped in local maxima, etc.

But that’s not the case in a high-dimensional space. We can’t just look at them and solve problems visually. So we write awkward algorithms that often do exponentially increasing amounts of work.

If we can’t visually understand a high-dimensional space, is there some other kind of understanding that we could get?

If so, how could we prove that we understood the space?

I think the answer might be that there are difficult high-dimensional spaces that we could understand, and demonstrate that we understand them.

One way to demonstrate that you understand a 3D space is to solve puzzles in it, like finding high points, or navigating over or through it without crashing.

We can apply the same test to a high-dimensional space: build problems and see if they can be solved on the fly by the system that claims to understand the space.

One way to do that is the following.

Have a team of people who will each sit in front of a monitor showing them a 3D scene. They’ll each have a joystick that they can use to “fly” through the scene that they see. You take your data and give 3 dimensions to each of the people. You do this with some degree of dimensional overlap. Then you let the people try to solve a puzzle in the space, like finding a high point. Their collective navigation gives you a way to move through the high-dimensional space.

You’d have to allocate dimensions to people carefully, and you’d have to do something about incompatible decisions. But if you built something like this (e.g., with 2 people navigating through a 4D space), you’d have a distributed understanding of the high-dimensional space. No one person would have a visual understanding of the whole space, but collectively they would.

In a way it sounds expensive and like overkill. But I think it’s pretty easy to build and there’s enormous value to be had from doing better optimization in high-dimensional spaces.

All we need is a web server hooked up to a bunch of people working on Mechanical Turk. Customers upload their high-dimensional data, specify what they’re looking for, the data is split by dimension, and the humans do their 3D visual thing. If the humans are distributed and don’t know each other they also can’t collude to steal or take advantage of the data – because they each only see a small slice.

There’s a legitimate response that we already build systems like this. Consider the hundreds of people monitoring the space shuttle in a huge room, each in front of a monitor. Or even a pilot and co-pilot in a plane, jointly monitoring instruments (does a co-pilot do that? I don’t even know). Those are teams collectively understanding high-dimensional spaces. But they’re, in the majority of cases, not doing overlapping dimensional monitoring, and the spaces they’re working in are probably relatively smooth. It’s not a conscious effort to collectively monitor or understand a high-dimensional space. But the principle is the same, and you could argue that it’s a proof the idea would work – for sufficiently non-rugged spaces.

Apologies for errors in the above – I just dashed this off ahead of going to play real football in 3D. That’s a hard enough optimization problem for me.

  • http://jon.es terry

    Jim – thanks for the paper, I’m looking at it now. I was also thinking about financial visualization :-)

  • http://jon.es/ terry

    Jim – thanks for the paper, I’m looking at it now. I was also thinking about financial visualization :-)

  • http://jon.es terry

    Hi Jim

    I typed too quickly too. Nick pointed out that I made no sense saying half the space is just one step away in an n-dimensional hypercube. I meant that half the space is within n/2 steps. Also, the human concept of distance is highly misleading as the diameter of the hypercube is n, so you’re always very close to the best (and worst) solutions. And if you walk for a long time you never get very far, etc.

    Plus I realized later that in order for the idea to work, you in some sense need the problem to be decomposable – at least if the humans are doing dumb jobs. I.e., if the humans are doing something simple, you could replace them each with a simple program, so the system could only solve decomposable problems – in which case we didn’t need the humans at all. It gets more interesting if the humans really are doing processing that we don’t know how to do well – like using their visual systems to navigate or say something about the space that our best algorithms are bad at.

    The distributed hunt for Jim Gray’s yacht comes to mind.

  • http://jon.es/ terry

    Hi Jim

    I typed too quickly too. Nick pointed out that I made no sense saying half the space is just one step away in an n-dimensional hypercube. I meant that half the space is within n/2 steps. Also, the human concept of distance is highly misleading as the diameter of the hypercube is n, so you’re always very close to the best (and worst) solutions. And if you walk for a long time you never get very far, etc.

    Plus I realized later that in order for the idea to work, you in some sense need the problem to be decomposable – at least if the humans are doing dumb jobs. I.e., if the humans are doing something simple, you could replace them each with a simple program, so the system could only solve decomposable problems – in which case we didn’t need the humans at all. It gets more interesting if the humans really are doing processing that we don’t know how to do well – like using their visual systems to navigate or say something about the space that our best algorithms are bad at.

    The distributed hunt for Jim Gray’s yacht comes to mind.

  • Jim

    Wrote too fast and left out the url http://hci.ucsd.edu/hollan/worlds-within-worlds.pdf

  • Jim

    Wrote too fast and left out the url http://hci.ucsd.edu/hollan/worlds-within-worlds.pdf

  • Jim

    Hi Terry,

    Cliff Beshers did some interesting work on this issue many years ago. I put a really bad copy of the paper he and his advisor, Steve Feiner, wrote. When Cliff visited awhile back he gave me a video of the system that is much more interesting than the paper. At that time I pointed him to your fitness landscape work.

    Jim

  • Jim

    Hi Terry,

    Cliff Beshers did some interesting work on this issue many years ago. I put a really bad copy of the paper he and his advisor, Steve Feiner, wrote. When Cliff visited awhile back he gave me a video of the system that is much more interesting than the paper. At that time I pointed him to your fitness landscape work.

    Jim

  • a
  • a