Web and Tags

Ken Norton writes:

Tagging isn’t new; the web is full of tags. But they’re not in meta keywords, they’re in the links. The text of the links pointing to other web pages is simply the web publisher’s best effort to describe the page she’s linking to. And it turns out those links are some of the most valuable metadata we have to work with in search. And you know what? They’re subject to all of the flaws people say will doom tagging. Spammers lie. The spelling is atrocious. And there’s ambiguity everywhere. But given a huge population of links, you can begin to make sense of the madness. Why? Well, there are humans on both ends of the search rope. There’s a person searching, and there’s a person who’s written some content. The job of the search engine is to simply connect the two. Traditional software engineers, in their endless pursuit of the elimination of ambiguity, sometimes forget this. Search engineers embraced it.

Clay Shirky adds:

There are humans at both ends of the rope. It seems so simple, but technologies that can rely on this fact have a huge advantage, since the human brain is terrific at signal extraction in environments that consistently defeat machine strategies. This is one of the reasons that semweb-flavored approaches to metadata attempt to express data in an unambiguous format: if there is a machine at the other end of the rope, even slight ambiguities defeat the recipient’s interpretive capabilities. As the man said, time flies like an arrow, but fruit flies like a banana.

Once you have humans at both ends of the rope, though, even purely contextual tags that are unextractable from the tagged content itself, tags like cool and toread, become valuable. This is why attempts to improve tagging by making it less ambiguous are missing the point: the ambiguity allows for a huge reduction of both markup cost and conceptual brittleness.

Published by

Rajesh Jain

An entrepreneur based in Mumbai, India.