The Flattening of Almost Everything #2: Information Retrieval

The WorldWideWeb is where Moore’s Law met Metcalfe’s Law. Information management – the way we find out what we want to know – went from hierarchical to flat in just a few years as a result. We now assume – usually correctly – that we can find any particular piece of data on the Web, from a railroad schedule in Estonia to a quote by an Argentine novelist, within minutes of wanting it. We also rely on the Web for cross-references (links) to interesting information related to whatever we originally searched for.

Back when I was at Microsoft, Lotus Notes, written by the brilliant Ray Ozzie, was the competitor which worried Bill Gates (and, therefore, the rest of us) the most. Companies were building information management applications in Notes. True, Notes ran under Windows, but the danger we saw was that Notes and not Windows would be the platform that developers wrote to. Many of the pundits were saying (hoping) this would happen.

There were several competing efforts under way in Redmond to build the Notes-killer. One of them was mine: Microsoft Exchange Server. Exchange was behind schedule for release when I took it over and slipped even further as we tried to shoehorn in features that would one-up or at least match the information-handling capabilities of Notes. Trouble was that Exchange was also the long-overdue replacement for DOS-based Microsoft Mail.

Another effort was Cairo (a future release of NT) championed by Jim Allchin of Banyan Vines fame. Here the information was managed at the operating system level rather than in the email server. The database guys had their own effort underway. “Ren and Stimpy” was the code name for Brian MacDonald’s brilliant concept in a personal information manager (PIM) which eventually became the Outlook client.

We all argued long and hard and as loudly as only Microsoft people can do about which of these was the correct solution, which should own the APIs used by Office for information management, and which ideas were brain dead. Bill kept the competition alive by not deciding between us. I think he wanted to see what emerged.

But every one of these solutions – including the bogeyman, Notes – was hierarchical. There were folders within folders within folders. Sure, there were keyword searches. And categories could be assigned. Different views could be produced. But we all assumed that most people would approach information through the categories they assigned the information to.

To put it mildly, we were all wrong!

The WorldWideWeb, the sea of information that we now can’t imagine living without, is flat! “Flat” isn’t even really a good metaphor – the WorldWideWeb is actually dimensionless. You can navigate directly from any page to any other page. Any page can point to any other page. And, although websites are nominally hierarchical, search engines and links point you directly to the page on the site that you are looking for.

The power of this horizontal approach to information doomed Notes to an increasingly irrelevant niche, sent Exchange back to its proper role as an email system, and Outlook to its role as an email/scheduling/tasks client.

People don’t think hierarchically – at least most people don’t. We think in terms of associations. Our dreams give this away as they hyperlink through experiences of the day and memories of the distant past. A conversation meanders horizontally from one topic to the next. “That reminds me of…” is the way we get from one place to another in our own brains. Some day we may understand this mechanically as an obvious consequence of the way neurons connect.

Hierarchies like Lotus Notes or the Dewey Decimal System were necessary when computing power was non-existent or very expensive. As computing power has become relentlessly cheaper thanks to Moore’s Law, hierarchies of information have become unnecessary. Cheap MIPS made graphical browsers, higher-bandwidth modems, smart lights in fiber, and search engines possible. All that we needed then was the WorldWideWeb so that almost all information became available to the galloping bots and hierarchies of information became obsolete. So long as Google or its competitors can index almost everything I might ever want to find, why should any arbitrary order be imposed on information? In fact, Metcalfe’s Law, which states that the value of a network is proportional to the square of the number of endpoints, may be an understatement when applied to a hyperlinked network where there can be value in multiple references between any pair of documents.
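
For readers who like to see the arithmetic, here is a rough sketch of that comparison. It is only an illustration under my own simplifying assumptions: the function names are made up for this post, a strict folder tree is counted as one parent link per item, and a "flat" web is counted as every page being able to point at every other page. It does not measure value, only how many relationships each structure can express.

```python
def tree_edges(n: int) -> int:
    """Parent-child links in a strict folder hierarchy of n items (assumed: one parent each)."""
    return max(n - 1, 0)


def possible_links(n: int) -> int:
    """Directed hyperlinks possible among n pages if any page can point to any other."""
    return n * (n - 1)


def metcalfe_value(n: int) -> int:
    """Metcalfe's Law as stated above: proportional to the square of the endpoints.

    The constant of proportionality is taken as 1 purely for illustration.
    """
    return n * n


if __name__ == "__main__":
    # Print the three counts side by side for a few network sizes.
    for n in (10, 1_000, 1_000_000):
        print(n, tree_edges(n), possible_links(n), metcalfe_value(n))
```

The hierarchy grows linearly with the number of documents while the possible links grow roughly with the square, which is the gap the paragraph above is pointing at. Counting more than one reference between the same pair of documents would push the link count higher still.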

Once we didn’t need hierarchies to organize our approach to information, they became an impediment. It is very hard for one person to figure out which node in which folder tree another person would have put a particular piece of information. A document may be relevant to one researcher for entirely different reasons than it is relevant to another researcher. The creator of a document doesn’t know all the ways the information in the document may be used.

The relationship between documents is actually dynamic, depending on the needs of the reader. Not incidentally, open tagging and hyperlinking are both ways to impose particular relationships on documents to meet the needs of some subset of readers. These relationships, themselves, can and do evolve constantly on the web.

The flattening of the information space is part of an accelerating and self-reinforcing trend of change. The flattening was enabled by the two great inventions of the Internet and personal computers. But, with information now much more readily available than it has ever been before, innovation becomes easier and change continues to accelerate.

See Search Bees for educational implications of this change.

Previously, I blogged on the flattening of organizational hierarchies. Related to that is the demise of vertical integration and implications for telco mergers.

The flattening of bureaucracies is a particularly satisfying special case of hierarchies being flattened. I blogged about some hopeful signs in India.