Jon Udell looks ahead to how the “Google supercomputer” can transform Internet-scale software:
On the Google PC, you wouldn't need third-party add-ons to index and search your local files, e-mail, and instant messages. It would just happen. The voracious spider wouldn't stop there, though. The next piece of low-hanging fruit would be the Web pages you visit. These too would be stored, indexed, and made searchable. More ambitiously, the spider would record all your screen activity along with the underlying event streams. Even more ambitiously, it would record phone conversations, convert speech to text, and index that text. Although speech-to-text is a notoriously imperfect art, even imperfect results can support useful search.
Here are some of the ways the Google PC could exploit this data:
Bayesian categorization: My SpamBayes-enhanced e-mail program learns continuously about what I do and don't find interesting, and helps me organize messages accordingly. A systemwide agent that's always building categorized views of all your content would be a great way to burn idle CPU cycles.
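The learning SpamBayes does can be sketched as a tiny naive Bayes classifier: count the words seen in each category, then score new text by Laplace-smoothed word probabilities. The category names and training snippets below are purely illustrative, not drawn from SpamBayes itself.

```python
# Minimal naive Bayes text categorizer, in the SpamBayes spirit.
# All categories and training data here are hypothetical examples.
import math
from collections import Counter, defaultdict

class BayesCategorizer:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> document count

    def train(self, category, text):
        self.doc_counts[category] += 1
        self.word_counts[category].update(text.lower().split())

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(set().union(*self.word_counts.values()))
        best_cat, best_score = None, float("-inf")
        for cat, counts in self.word_counts.items():
            # log prior plus Laplace-smoothed log likelihood of each word
            score = math.log(self.doc_counts[cat] / total_docs)
            total_words = sum(counts.values())
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_cat, best_score = cat, score
        return best_cat

c = BayesCategorizer()
c.train("work", "quarterly report spreadsheet budget")
c.train("personal", "dinner plans weekend movie")
print(c.classify("budget spreadsheet for the report"))  # -> work
```

A background agent running something like this over every document, message, and page would indeed soak up idle cycles productively.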
Context reassembly: When writing a report, you're likely to refer to a spreadsheet, visit some Web pages, and engage in an IM chat. Using its indexed and searchable event stream, the system would restore this context when you later read or edit the document. Think browser history on steroids.
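One way to picture that event stream is as a timestamped log you can query by time window: find every edit to a document, then pull in whatever else happened nearby. The event schema and the window size here are assumptions for illustration.

```python
# Sketch of context reassembly over a recorded event stream.
# Event kinds and the 300-second window are hypothetical choices.
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: float   # seconds since epoch
    kind: str          # e.g. "doc_edit", "web_visit", "im_chat"
    detail: str

class EventStream:
    def __init__(self):
        self.events = []

    def record(self, ev):
        self.events.append(ev)

    def context_for(self, doc, window=300.0):
        """Return non-edit events within `window` seconds of any edit to `doc`."""
        edits = [e.timestamp for e in self.events
                 if e.kind == "doc_edit" and e.detail == doc]
        return [e for e in self.events
                if e.kind != "doc_edit"
                and any(abs(e.timestamp - t) <= window for t in edits)]

s = EventStream()
s.record(Event(1000.0, "doc_edit", "report.doc"))
s.record(Event(1010.0, "web_visit", "example.com/statistics"))
s.record(Event(5000.0, "im_chat", "unrelated chat"))
print([e.detail for e in s.context_for("report.doc")])  # -> ['example.com/statistics']
```

Reopening report.doc would then surface the Web pages and chats that surrounded its last editing session.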
Screen pops: When you receive an e-mail, IM, or phone call, the history of your interaction with that person would pop up on your screen. The message itself could be used to automatically refine the query.
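The refinement step is the interesting part: the incoming message itself becomes the query, ranking your past interactions with that person by shared vocabulary. A toy version, with made-up names and messages:

```python
# Sketch of a "screen pop": on an incoming message, surface past
# interactions with the sender, most relevant first. Data is illustrative.
def screen_pop(history, sender, message):
    """history: list of (sender, text) pairs. Returns the sender's past
    messages, ordered by word overlap with the incoming message."""
    words = set(message.lower().split())
    past = [text for s, text in history if s == sender]
    # rank by number of shared words, descending
    return sorted(past, key=lambda t: -len(words & set(t.lower().split())))

history = [
    ("alice", "draft of the quarterly report attached"),
    ("alice", "lunch on friday?"),
    ("bob", "server maintenance window tonight"),
]
print(screen_pop(history, "alice", "status of the quarterly report"))
# the report thread ranks first
```

Even this crude keyword overlap would put the relevant thread at the top of the pop-up before you pick up the call.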
With managed metadata, these things are easy to do, and that's a key motivation for Longhorn's WinFS storage system. But we don't have a lot of metadata now, and we won't have much anytime soon. So it's worth reflecting on what Google has accomplished by brute force. Instead of idly slacking most of the time, our PCs ought to be indexing, analyzing, correlating, and classifying.