Bus. Std: Tomorrow’s Search Technologies

My latest Business Standard column:

Search has become a window to the world wide web of data. Today’s search is simplistic: type a few words in a box, get back zillions of results, and click on one or more of the results to see if we get what we are looking for. Think of today’s search as the DOS era: a good start, but not enough to unleash the real power of what can be. It took a decade to go from DOS to Windows. It has taken us almost as long to start imagining and working towards the next generation of search technologies. Here are some key ideas which will help define tomorrow’s search:

Integration between Desktop and Internet Search: One of the great inconsistencies of our computing world is that we can search the web much more easily than our own desktop computers. This is beginning to change with the emergence of desktop search utilities like Copernic, Blinkx and HotBot’s desktop search toolbar. What is still missing is seamless integration between our personal data (spread across emails, attachments, IM logs and documents) and the web.

Better Visualisation and Navigation Tools: Is a set of pages, each with 10-20 results, the best way to navigate the millions of search results that can show up? It is only now that ideas from information visualisation (for example, Groxis) are making their way into the display of search results. By clustering results and applying visualisation techniques, it should be possible to provide rich interfaces for navigation.

Real-time Search: The current model involves search engines crawling web pages every so often and including them in their search results. This means that at times it can be several days before new updates on a page show up in search engine results. There are two ways to address this problem: websites can notify search engines via a ping when they are updated (much like weblogs do to sites like weblogs.com), and search engines can dynamically query specific databases via web services APIs to deliver results (much like Google’s own API offers). A third approach has emerged with narrowly focused RSS- and XML-based search engines like Feedster, Technorati and PubSub.com.
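
To make the difference between periodic crawling and ping notification concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not how any real search engine or weblogs.com works: the CrawlScheduler class, its method names and the three-day default interval are all invented for illustration. Pages are normally recrawled on a fixed interval, but a ping moves a page to the front of the queue.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class CrawlScheduler:
    """Toy crawl scheduler (hypothetical): fixed-interval recrawls, plus pings."""
    interval: float = 3 * 24 * 3600              # assumed recrawl interval in seconds
    _queue: list = field(default_factory=list)   # min-heap of (due_time, url)
    _due: dict = field(default_factory=dict)     # url -> currently valid due time

    def schedule(self, url, now):
        """Normal case: the page is next crawled one full interval from now."""
        self._set_due(url, now + self.interval)

    def ping(self, url, now):
        """The site reports an update, so the page becomes due immediately."""
        self._set_due(url, now)

    def _set_due(self, url, due):
        self._due[url] = due
        heapq.heappush(self._queue, (due, url))

    def pop_due(self, now):
        """Return URLs due for crawling, ignoring superseded heap entries."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            due, url = heapq.heappop(self._queue)
            if self._due.get(url) == due:        # skip entries replaced by a later call
                ready.append(url)
                del self._due[url]
        return ready


scheduler = CrawlScheduler(interval=100)
scheduler.schedule("example.com/a", now=0)
scheduler.schedule("example.com/b", now=0)
scheduler.ping("example.com/b", now=50)          # b was updated and says so
early = scheduler.pop_due(now=50)                # only the pinged page is due
later = scheduler.pop_due(now=100)               # the other page waits its turn
```

The point of the sketch is only that a ping shortens the wait from "whenever the crawler comes round" to "now"; a real system would also need to verify pings and limit how often a site can send them.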

Searchstreams Analysis: What searchstreams offer, according to John Battelle of Searchblog, is the ability to capture and record your search history as well as the things you looked at, all in one package. Battelle, who is also writing a book on search, explains further: “What I really wish for, both to tell the story of my search, and to annotate my book, is the ability to take that searchstream and turn it into an object – a narrative thread of sorts, something I can hold and keep and refer to, a prop to aid in the telling and retelling of how I came to my answer. Tracks in the dust, so to speak, so others can follow and make their own, or follow mine and see (and question!) how I came to my conclusions. Imagine, I thought to myself, if instead of footnotes and citations, I could append searchstreams…” Next-generation infoware tools like Furl (recently acquired by Looksmart) and del.icio.us are already working to enable some of this. Searchstream analysis could also be used to add context and personalisation to search. Searchstreams also offer a foundation on which to construct Vannevar Bush’s vision of the Memex.

Multimedia Search: Our world of data has grown beyond text. Even though there is image search available on search engines, it is restricted to the text and keywords associated with the images. What is needed is the ability to look inside sound files, images and videos and allow search based on the content. This becomes especially important in the context of the growing ease of generating and publishing multimedia content via cellphones.

Mobile Devices: Talking of cellphones, it is going to become increasingly important to support a quality search experience on these devices. The current search results are best shown on a web browser. Yet, in countries like India, the number of mobile devices exceeds the number of computers. So, the focus for search needs to be on providing accurate results in the form of microcontent that can be sent to and displayed on mobile devices. Not only does this mean a different format, but it could also mean providing location-based information.

Local Search: Much of our life is spent in neighbourhoods. It is quite hard to find local information in the vicinity of where we live and work. It is estimated that a quarter of the searches performed on search engines have a local flavour to them. Being able to integrate the results with maps, directions and other neighbourhood-specific information can make search engines much more relevant. This is where search engines like Yahoo and Google will compete head-on with classifieds and yellow pages.

Vertical Search: Chris Sherman wrote on SearchEngineWatch that searchers are becoming more sophisticated, and are learning that general purpose search engines are not always the best choice for every type of search. Because vertical search engines have a narrower focus and a limited domain, they can offer specialized utilities for a richer experience. One example is GlobalSpec, which focuses on the engineering industry.

The state of Search is very much like the way the scientific world was in the seventeenth century until Isaac Newton came along and helped lay the foundation for the world ahead with his theories and inventions. A similar revolution is needed in the world of Search. Can we in India play a role, just as we did in some of the mathematical discoveries many centuries ago?

Data and Metadata

David Weinberger writes:

There used to be a difference between data and metadata. Data was the suitcase and metadata was the name tag on it. Data was the folder and metadata was its label. Data was the contents of the book and metadata was the Dewey Decimal number on its spine. But, in the Third Age of Order, everything is becoming metadata.

For example, imagine you’re at a large corporation doing a Third Order treatment of its digital library of research articles. Instead of (or, in addition to) designing a large, complex, hierarchical taxonomy, you focus on adding enough metadata to each article so that people will be able to sort and classify them any which way they want. If someone wants to find all the articles that talk about hydrocarbons written in Italian in 1965 and that have more than 30 footnotes, they’ll be able to. If someone wants to make a browsable hierarchy based not on topic but on gender or on the number of co-authors, they’ll be able to. You build enriched objects first so your users can forever after taxonomize the way they want to, instead of the way you think they’ll want to.

Now take a closer look at these information objects. They look like contents tagged with lots of metadata, but in fact they’re all metadata. If I’m looking for an article about hydrocarbons written by Barbara Rodriguez, then the article’s topic (“hydrocarbons”) and author’s name (“Rodriguez, Barbara”) are metadata, and the content is the data. But, I could just as well be trying to remember the name of the author who wrote an article that included the phrase “Hydrocarbons are the burros of the cosmos” sometime in the 1960s, in which case the content and date are metadata and the author’s name is the data. What’s data and what’s metadata depends on the person doing the asking.

So, in the Third Age of Order, all data is metadata. Contents are labels. Data is all surface and no insides. It’s all handles and no suitcase. It’s a folder whose content is just another label. It’s all sticker and no bumper.

Why does this matter? It changes the primary job of information architects. It makes stores of information more useful to users. It enables research that otherwise would be difficult, thus making our culture smarter overall. But, most interestingly (at least to me), this does the ol’ Einsteinian reverse flip to Aristotle. Aristotle assumed that of the 10 categories by which one could understand a thing, one must be primary: Where that thing fits into the tree of knowledge. So, you could say that Alcibiades is made of flesh or lived in Greece, but if you really want to understand him, you have to say that he is an animal of a particular kind. But, now that everything is metadata, no particular way of understanding something is any more inherently valuable than any other; it all depends on what you’re trying to do. The old framework of knowledge and authority are getting a pretty good shake.
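
Weinberger’s hydrocarbons example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three article records, their field names and the find helper are all made up here, not taken from any real library or from Weinberger’s piece. The same records answer both questions, and which field counts as “data” depends purely on what the query asks for.

```python
# Hypothetical article records; every value below is invented for illustration.
articles = [
    {"author": "Rodriguez, Barbara", "topic": "hydrocarbons", "language": "Italian",
     "year": 1965, "footnotes": 42,
     "text": "Hydrocarbons are the burros of the cosmos."},
    {"author": "Placeholder, Alex", "topic": "hydrocarbons", "language": "English",
     "year": 1965, "footnotes": 12,
     "text": "A short survey of hydrocarbon chemistry."},
    {"author": "Sample, Maria", "topic": "polymers", "language": "Italian",
     "year": 1971, "footnotes": 35,
     "text": "Notes on polymer synthesis."},
]


def find(records, **criteria):
    """Return records where every named field equals a value or passes a predicate."""
    def matches(record):
        for key, wanted in criteria.items():
            if key not in record:
                return False
            actual = record[key]
            ok = wanted(actual) if callable(wanted) else actual == wanted
            if not ok:
                return False
        return True
    return [record for record in records if matches(record)]


# Topic, language, year and footnote count act as metadata; the article is the data.
by_facets = find(articles, topic="hydrocarbons", language="Italian",
                 year=1965, footnotes=lambda n: n > 30)

# Now the content and date act as metadata, and the author's name is the data.
by_phrase = find(articles,
                 text=lambda t: "burros of the cosmos" in t,
                 year=lambda y: 1960 <= y <= 1969)
authors = [record["author"] for record in by_phrase]
```

Nothing in the records marks any field as special, which is the point: a hierarchy by topic, by language or by number of footnotes is just another call to find with different criteria.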

eCommerce Future

News.com writes about a panel discussion held recently to celebrate 10 years of Internet commerce:

The panelists offered an array of ideas about how e-commerce might evolve in positive ways during the next few years. Some of the ideas were new, and others have been discussed for years but have yet to take off. Most speakers agreed that the sales of music, movies, games and other digital products represent one of the most exciting and dynamic areas of e-commerce. Internet visionaries are also working on ratcheting up so-called personalization and localization technology to make Web sites anticipate a shopper’s every need wherever they happen to be.

Another holy grail is the prospect of luring consumers to shop over their cell phones–a big trend in Asian countries that hasn’t caught on as much in the United States.

Rosensweig and Bonnie predicted that Web logs and online communities such as Friendster would come to incorporate e-commerce features through “favorites” lists for music and games. The panelists agreed that online auctions and the migration of electronic transactions from proprietary Electronic Data Interchange networks to the Internet, will continue to grow and thrive.

Panelist Mary Meeker, an Internet stock analyst at Morgan Stanley, predicted that site outages would become more frequent during the next few years as the Web grows more complex. She also said the long-awaited rise of online “micropayments,” which are payments of only a few cents for goods and services bought online, is just around the corner.

OQO: PC in a Pocket

The New York Times writes:

Thanks to some of the very advances in miniaturization that make hand-held gadgets possible (bright indoor-outdoor screens, two-inch hard drives), these guys have devised the world’s smallest Windows XP computer: 4.9 by 3.4 inches and less than an inch thick. They pose an intriguing question: why would you buy a bunch of gadgets designed to liberate the data from your PC if you could just shove the entire PC into your pocket? It’s called the OQO.

OQO the company has big plans for OQO the computer. It claims to have generated wide interest in industries like insurance, field sales, public safety, manufacturing and health care. For example, doctors and nurses could call up patient records at home, on the road or, over a wireless network, anywhere in the hospital.

But if you can get over the lack of a CD drive, there’s a lot to be said for the OQO even for individuals. When your digital camera’s memory card gets full, no worries; just offload the photos to the PC in your purse or pocket and keep shooting. You don’t have to transfer your videos from your PC to one of those $500 video players for your train ride, because you’ll have the PC itself with you. And forget about printing out your MapQuest driving directions or your Travelocity travel itinerary from your PC. Why bother, when you can open the original electronic document at any time?

OQO’s claim that you could use the OQO as your sole computer is a tad far-fetched; its limited memory, speed and storage would probably put a crimp in your computing style. It’s not cheap, either, although it’s in line with laptop prices: $1,900 with Windows XP Home Edition installed, $2,000 for XP Professional. And the battery life is disappointing: about 2.5 hours per charge. At least the battery is removable, so you can swap in a fully charged spare.

Otherwise, though, OQO is the most elegant, versatile, solidly built miniature PC possible with current technology. Its creators have blown the concept of the digital hub to smithereens, and given whole new meaning to the term pocket PC.

Syncing and Mobility

Russell Beattie writes:

Syncing is *THE* most important piece of technology in the future of mobility. Voice is and will be the number one service, but after that it’s syncing.

I don’t care if you *never* use your mobile for internet data, you still want your address book backed up in case you lose your phone, right? That’s syncing. But then it goes from there to any piece of data you store on your phone. You want to not only back it up, but make sure it’s synced with the rest of your digital world. Calendar and PIM information is what Palm does best. Also apps (again Palm does it perfectly), then music files where the iPod shines. And then it goes on to every file you have on your mobile device. You want to make sure it’s the latest version, that if you change that data it’s reflected anywhere else you use that data and finally, that if in case you lose that data, it’s backed up somewhere you can get at it. It’s simple. I don’t care if this all happens over a USB Cable, A Bluetooth Connection or a Cellular Network. It just needs to happen, and seamlessly. If you have to think about syncing, it’s not syncing.

What are the benefits of syncing? Well, we’re seeing it now in the PodCasting meme, aren’t we? The simple act of grabbing the latest audio files and syncing it magically to your music player has created a whole new medium for people to broadcast audio. That’s huge. Right now it’s mostly focused on iPods, but that’s going to quickly change.

By making syncing “just work” Microsoft has enabled their Windows Mobile phones to become a player in this new Podcasting medium. Automagically. And today.

We’re now entering that next phase in the mobile revolution – where data services become a bigger and bigger portion of revenue stream to operators and developers like myself. It’s in this new world where syncing is going to play a huge part.
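
As a rough illustration of what Beattie means by keeping the latest version everywhere, here is a minimal sketch of a two-way merge in Python. It assumes a simple “last writer wins” rule, with each store mapping a key to a (modified_time, value) pair. That rule, the sync function and the sample address books are all assumptions made for this sketch, not a description of how Palm, the iPod or Windows Mobile actually sync, and it ignores deletions and conflicting edits.

```python
def sync(device, backup):
    """Merge two stores of key -> (modified_time, value); the newer entry wins."""
    merged = dict(backup)
    for key, entry in device.items():
        # Take the device's copy if the backup lacks the key or holds an older version.
        if key not in merged or entry[0] > merged[key][0]:
            merged[key] = entry
    return merged


# Hypothetical address books: timestamps are simple counters, numbers are fictional.
phone = {"alice": (2, "555-0100"), "bob": (1, "555-0101")}
desktop = {"alice": (1, "555-0000"), "carol": (3, "555-0102")}

merged = sync(phone, desktop)
phone, desktop = dict(merged), dict(merged)   # both sides now hold the same data
```

After the merge, the phone gains the entry it was missing, the desktop gets the newer number for alice, and either copy can serve as the backup if the other is lost.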

TECH TALK: Web 2.0 Conference: Observations (Part 4)

JD Lasica posted some quotable quotes from the conference:

Jeff Jarvis during a panel on RSS: “Big media should not be scared of giving away their content for free. What big media should be scared of is the death of the centralized marketplace.”

Someone quoting a new pearl of wisdom: “RSS is the ultimate opt-in.”

Esther Dyson: “The dimension that matters is time.”

Esther again: “The tagging matters” — ie, metadata will become increasingly important in the years ahead.

Jeff Jarvis again: “[Google] AdSense took the cooties off of home pages and blogs” as a place for advertising.

Jeremy Zawodny of Yahoo: “All we have to do is get publishers to adopt it [RSS]. The readers don’t have to know what’s under the hood.”

John Battelle: “The force of 1 million sites with 1,000 users is far larger than 100 sites with a million users. … The tail has an incredible amount of power.”

Quick factoid from a powerpoint slide: 20% of all searchers account for 68% of all searches.

“More cell phones are sold in four days than all the Apple computers in history.”

English is no longer the majority language of weblogs. It’s still the plurality language, but all the other languages combined now outnumber English in the blogosphere. [Technorati slide]

Dan Rosensweig: “The Web is the most selfish medium ever invented — people want what they want, where they want it, when they want it.”

Steven Levy of Newsweek summed up the conference:

Are you ready for the new Web? It’s getting ready for you. It turns out that bidding on eBay, gathering with Meetup and Googling on, um, Google are only the opening scenes in a play whose running time will top “Mahabharata.” While we’ve been happily browsing, buying and blogging, the tech set has been forging clever new tools and implementing powerful standards that boost the value of information stored on and generated by the Net. Things may look the same as the old Web, but under the hood there’s been some serious tinkering, and after years of hype among propeller-heads, some of the effects are finally arriving.

“Web 1.0 was making the Internet for people,” said Amazon.com’s Jeff Bezos at the conference. “Web 2.0 is making the Internet better for computers.” That doesn’t mean that the two-legged set is left out. To the contrary, the new Web is based on what’s called an “architecture of participation.” Successful Net ventures draw strength from the activity of users, both on their own sites and on the Web in general. (Examples: the communities of reviewers on Amazon, and the publicly derived reputations on eBay.) As sites share their information, all the Web takes on new value. (Example: Google lets people run searches from other Web sites.) The machines get into the act by knowing how to act together, as if the Internet were one giant computer.

In Web 2.0, news items, blog entries, financial results and images are no longer locked on virtual pages, but easily detachable. This can be done because info-nuggets are now routinely “tagged” in a way that computers can identify, access and transfer. Then the nuggets can be used wherever you want.

The first Web is now 10 years old and for many of us, has transformed the way we live and work. Let us see what the second Web delivers.
