Google’s Supercomputer

May 13th, 2004

Steven Levy adds to the discussion that has been happening recently about Google’s distributed computing platform:

As Rich Skrenta, CEO of news aggregator Topix.net, wrote in a widely discussed blog posting last month: “Google is a company that has built a single, very large custom computer.” He’s among those who believe that Google’s sweeping search technology works within a complicated infrastructure, so speedy and efficient in handling unimaginably huge chunks of information that it’s a single, massive entity in 100,000 interconnected pieces. Think of the super-duper computer in the Isaac Asimov short story that ends with the machine saying, “Let there be light.” Google’s aggregate device, organized by a sophisticated proprietary file system, holds all of the Web and performs so seamlessly that the whole shebang becomes what geeks call a “platform”: a reliable underpinning on which people can build further innovations. (The classic example of a platform is Windows, software that provides the foundation for applications like spreadsheets and digital jukeboxes.) “I always thought of the Internet as a big, decentralized operating system,” says Tim O’Reilly, CEO of O’Reilly Media. “Google made me realize that it could be hosted by one player.”

Why is this crucial for the future of Google’s business? Because the firm’s success depends on its ability to withstand challenges from prime competitors like Yahoo and Microsoft. Both have weapons in their arsenals that Google can’t easily match. Yahoo is blessed with a base of registered customers; this will enable it to deliver personalized search results. And Microsoft currently makes the desktop and the applications that make your computer its computer. Eventually, Bill Gates and his crew hope to build search functions into their ubiquitous creations, Windows and Office, and push Google out of the picture.

Though Google’s cofounders aren’t commenting (they’re in the pre-IPO “quiet period”), their actions suggest a strategy to combat this structural disadvantage: move people’s activities away from Microsoft’s computer and onto Planet Google’s mega-search machine. “Our goal is to search the world’s information and organize it,” cofounder Larry Page once told me, and why wouldn’t that mission involve personal information that’s not on the Web?

Adds John Battelle:

So, what if Google becomes an application server cum platform for business innovation? I mean, a service, a platform service, that any business can build upon? In other words, an ecologic potentiality – “Hey guys, over here at Google Business Services Inc. we’ve got the entire web in RAM and the ability to mirror your data across the web to any location in real time. We’ve got plug in services like search, email, social networking, and commerce clearing, not to mention a shitload of bandwidth and storage, cheap. So…what do you want to build today?”

If I had that opportunity, I’d take a percentage of revs or profits on the businesses that got built, rather than just service fees. It’s Google as incubator to Web 2.0.

Yahoo is already doing this, though for a fee and in the SMB market. So is MSN. The traces are laid. Both of them were also doing mail. But neither of them has more than 100,000 servers and the GFS.

InfoWorld has more:

Centralization and decentralization are the yin and yang of computing. Witness Microsoft, a company whose dedication to the personal computer seems radically at odds with the idea of the Google supercomputer. Microsoft’s IT operation takes justifiable pride in running only Windows software on x86 PCs. But I was fascinated to learn, on a recent visit, that its entire worldwide business operation is serviced by a single instance of SAP R/3.

So should we say that the computer is the network, or that the network is the computer? Both statements are true. A supercomputer, operating at global or merely enterprise scale, creates its own internal network of services. But supercomputers also federate with their peers and converse with their myriad clients to enact computation on a grander scale. There’s no single right architecture or topology. Within and across enterprises, we’ll deploy systems that embody all of these patterns.

Network theorists believe that all networks inevitably form hubs. The services fabric that enterprise architects are now weaving may sound egalitarian, but it’s not immune to this law. Google’s supercomputer or supernode gives it a leg up on the competition. Yours, however you define it, will too.

Says Jon Udell on his blog: “Echoes of the Google-as-supercomputer meme are everywhere lately. Sun’s new chief, Jonathan Schwartz, invoked it when we met with him recently. His take: Sure, Google runs its search and mail applications on Google, but it runs its business applications on Solaris. Coherent symmetric multiprocessing scales in a different direction, Schwartz said, and that’s where Solaris 10 is headed with its revamped and highly granular partitioning. The network is the computer. And the computer is the network. We live in interesting times!”

On a related note, News.com had an interview recently with Craig Silverstein, Google’s technology director and employee No. 1.

Steve Gillmor writes on how “Gmail would make a great container for an RSS information router”:

Gmail shifts the basis for organizing an in-box from metadata and hard-coded folders to interactive searches and virtual folders. You can attach multiple labels to messages and trigger rules that automatically apply those labels to similar incoming content. In addition, Brin has been talking to the Google development team about adding macro capabilities to run favorite searches.

Adding an API for macros would go a long way toward converting Gmail from a frontal attack on Yahoo and MSN mail offerings to a powerful enterprise platform. “We initially wanted to make sure we had something that was definitely better than all Web mail services,” Brin said. “And perhaps, just perhaps, it will also be good enough for a lot of people to use instead of a corporate mail service.”

By the time the Gmail beta period ends in three to six months, Brin and his team have promised to enable forwarding and POP3 access. However, more is required of a corporate mail service. Those capabilities must be extended to allow Gmail to provide disconnected operation and IDE for packaged applications. Even better would be a link between Gmail’s Conversation View, where threaded messages are collected and stacked together, and related RSS affinity groups.
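The label-and-virtual-folder model Gillmor describes can be illustrated with a small toy. This is only a sketch under stated assumptions: the `Message`, `Rule`, and `Inbox` names are invented for illustration and say nothing about how Gmail is actually built. It shows rules attaching several labels to an incoming message, and a "folder" being nothing more than a query over those labels.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    subject: str
    body: str
    labels: set = field(default_factory=set)  # a message can carry many labels


@dataclass
class Rule:
    """Hypothetical rule: apply `label` whenever `term` appears in a message."""
    term: str
    label: str

    def matches(self, msg):
        text = f"{msg.sender} {msg.subject} {msg.body}".lower()
        return self.term.lower() in text


class Inbox:
    """Toy inbox (an assumption for illustration, not Gmail's design)."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.messages = []

    def deliver(self, msg):
        # Every matching rule adds its label, so labels can overlap freely.
        for rule in self.rules:
            if rule.matches(msg):
                msg.labels.add(rule.label)
        self.messages.append(msg)

    def virtual_folder(self, label):
        # A "folder" is a query over labels rather than a storage location.
        return [m for m in self.messages if label in m.labels]

    def search(self, term):
        # Plain substring search over subject and body.
        term = term.lower()
        return [m for m in self.messages
                if term in f"{m.subject} {m.body}".lower()]
```

Because a folder here is just a saved query, the same message can show up under both "feeds" and "search" without being copied, which is the shift away from hard-coded folders that the excerpt points to.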

In fact,

Another add-on. Search Engine Watch has a nice formula on the emerging era of search engine personalisation:

Search is hot — and even hotter is the idea of search personalization. This is where by knowing more about you, a search engine may potentially provide better results because it understands your preferences.

To date, the only released service I know of that’s actually actively doing serious search personalization is Eurekster, which I wrote about back in January. It’s a social networking service that lets who you know influence what you see in search results.

Social networking services are also hot. Now do the math:

social network
+ search personalization
= super hot!

That equation means you can expect to hear more about social networking services combined with search. But be wary — this doesn’t mean that true search personalization is offered.
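To make "who you know influences what you see" concrete, here is a minimal sketch of social re-ranking. It is not Eurekster's algorithm, which the excerpt does not describe. The `rerank` function, the `friend_clicks` mapping, and the `boost` weight are all assumptions made up for illustration.

```python
def rerank(results, friend_clicks, boost=1.0):
    """Re-order search results using clicks from a user's contacts.

    results: list of (url, base_score) pairs from an ordinary search.
    friend_clicks: hypothetical dict mapping url -> clicks by the user's contacts.
    boost: assumed weight given to each friend click.
    """
    scored = [
        (url, score + boost * friend_clicks.get(url, 0))
        for url, score in results
    ]
    # Highest combined score first; sorted() is stable, so ties keep their order.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [url for url, _ in ranked]
```

With no friend clicks the original order is unchanged, which matches the caution in the excerpt: a social signal layered on top of search is not the same thing as true personalization.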

