Big Business and the Semantic Web

It’s clear that the unofficial policy of both Microsoft and Google is “we don’t believe in the semantic web”. It may not be clear why. The answer is unsurprising when you give it some thought, though: Big Business. Semantic search holds out the hope of users being able to compose meaningful queries and get relevant results. The price is that someone somewhere has to write down the answers in a meaningful way.

Meaning is a tricky word to play with, but here I mean complex structured data designed to adequately describe some domain. In other words - someone has to write and populate an ontology for each domain that the users want to ask questions about. It’s painstaking, specialized work that not just anyone can do. Not even a computer scientist - whilst they may have the required analysis and design skills, they don’t have the domain knowledge or the data. Hence the pace of forward progress has been slow as those with the knowledge are unaware of the value of an ontology or the methods to produce and exploit it.

Compare this to the modus operandi of the big search companies. Without fail they all use some variant on full-text indexing. It should be fairly clear why as well - they require no understanding of the domain of a document, nor do their users get any guarantees of relevance in the result sets. Users aren’t even particularly bothered when they get spurious results. It just goes with the territory.

Companies that hope or expect to maintain a monopoly in the search space have to use a mechanism that provides broad coverage across any domain, even if that breadth is at the expense of accuracy or meaningfulness. Clearly, the semantic web and monolithic search engines are incompatible. Not surprising then that for the likes of Microsoft and Google the semantic web is not on their radar. They can’t do it. They haven’t got the skills, time, money or incentive to do it.

If the semantic web is to get much of a toehold in the world of search engines it is going to have to be as a confederation of small search engines produced by specialized groups that are formed and run by domain experts. In a few short years Wikipedia has come to rival the likes of Encyclopedia Britannica. The value of its crowd-sourced content is obvious. This amazing resource came about through the distributed efforts of thousands across the web, with no thought of profit. Likewise, it will be a democratized, decentralized, grass-roots movement that will yield up the meaningful information we all need to get a better web experience.

Andrew Matthews 2007-08-14

Dialogue & Discussion