eBay’s Grid

eWeek has an interview with Marty Abbott, eBay's senior vice president of technology, about its technology infrastructure:

A good way to think about it is that it’s one of the first examples of grid computing. It’s an array of systems, each of which has a service component that answers to another system: fault tolerance meant to allow for scale. As a matter of fact, we would have potential vendors and partners come in and try to sell us on the idea of grid computing and we’d say, “It sounds an awful lot like what we were doing. We didn’t know there was a name for it.”

We went from one huge back-end system and four or five very large search databases. Search used to update 6 to 12 hours after someone placed a bid or listed an item for sale. Today, updates usually take less than 90 seconds. The front end in October ’99 was a two-tiered system with [Microsoft Corp.] IIS [Internet Information Services] and ISAPI [Internet Server API]. The front ends were about 60 [Windows] NT servers. Fast-forward to today. We have 200 back-end databases, all of them in the 6- to 12-processor range, as opposed to having tens of processors before. Not all those are necessary to run the site. We have that many for disaster recovery purposes and for data replication.
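
The interview does not spell out how the 200 back-end databases divide up the load, but a common way to break one huge back end into many smaller databases is to route each request by key. The sketch below is a hypothetical illustration of that idea only, not eBay's actual scheme; the Oracle JDBC URLs and the modulo mapping are assumptions.

```java
import java.util.Arrays;
import java.util.List;

public class DatabaseRouter {
    private final List<String> jdbcUrls;   // one entry per logical back-end database

    public DatabaseRouter(List<String> jdbcUrls) {
        this.jdbcUrls = jdbcUrls;
    }

    /** Picks the database responsible for a given item (IDs assumed non-negative). */
    public String urlForItem(long itemId) {
        int shard = (int) (itemId % jdbcUrls.size());
        return jdbcUrls.get(shard);
    }

    public static void main(String[] args) {
        DatabaseRouter router = new DatabaseRouter(Arrays.asList(
                "jdbc:oracle:thin:@db00:1521:items",
                "jdbc:oracle:thin:@db01:1521:items",
                "jdbc:oracle:thin:@db02:1521:items"));
        System.out.println(router.urlForItem(123456789L));  // prints the URL of one shard
    }
}
```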

We have two data centers in Santa Clara County [Calif.], one data center in Sacramento [Calif.] and one in Denver. When you address eBay or make a request of eBay, you have an equal chance of hitting any of those four.

We’ve taken a unique approach with respect to our infrastructure. In a typical disaster recovery scenario, you have to have 200 percent of your capacity, 100 percent in one location and 100 percent in another location, which is cost-ineffective. We have three centers, each with 50 percent of the traffic, actually 55 percent, adding in some bursts.
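
As a rough sanity check of that trade-off, the snippet below plugs in the figures from the interview (three active centers, each provisioned at roughly 55 percent of total load) purely as illustrative inputs and compares them with the classic 200 percent disaster-recovery setup.

```java
public class CapacityCheck {
    public static void main(String[] args) {
        double load = 100.0;                 // total site load, in percent
        double traditionalDr = 2 * load;     // 100% primary + 100% idle standby

        int activeCenters = 3;               // figures quoted in the interview
        double perCenter = 55.0;             // each center's provisioned share, in percent
        double activeActive = activeCenters * perCenter;
        double afterOneFailure = (activeCenters - 1) * perCenter;

        System.out.println("Traditional DR capacity to provision: " + traditionalDr + "%");
        System.out.println("Active-active capacity to provision:  " + activeActive + "%");
        System.out.println("Left after losing one center: " + afterOneFailure
                + "% (still covers the " + load + "% load)");
    }
}
```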

We use Sun [Microsystems Inc.] systems, as we did before. We use Hitachi Data Systems [Corp.] storage on Brocade [Communications Systems Inc.] SANs [storage area networks] running Oracle [Corp.] databases, and partner with Microsoft for the [Web server] operating system. IBM provides front and middle tiers, and we use WebSphere as the application server running our J2EE code, the stuff that is eBay. The code has also been migrated from C++ to Java, for the most part. Eighty percent of the site runs with Java within WebSphere.
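
For a flavor of what "J2EE code running in WebSphere against Oracle" typically looks like, here is a minimal sketch of a data-access class that obtains a container-managed connection through JNDI. The resource name, table and query are hypothetical; this is not eBay's code.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.naming.InitialContext;
import javax.sql.DataSource;

public class ItemDao {
    /** Counts active listings using a DataSource the application server manages. */
    public int countActiveItems() throws Exception {
        InitialContext ctx = new InitialContext();
        // Connection pooling and failover are the app server's job, not the application's.
        DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/ItemsDB");
        try (Connection conn = ds.getConnection();
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT COUNT(*) FROM items WHERE status = 'ACTIVE'")) {
            rs.next();
            return rs.getInt(1);
        }
    }
}
```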

We believe the infrastructure we have today will allow us to scale nearly indefinitely. There are always little growth bumps, new things that we experience, and not a whole lot of folks from whom we can learn. But we rely on a few principles: scaling out rather than scaling up; disaggregating wherever possible; attempting to avoid state, because state is very costly and increases your failure rate; partnering with folks like Microsoft, IBM, Sun and Hitachi Data Systems, where they feel they have skin in the game and are actually helping us to build something; and investing in our people, along with commodity hardware and software. Applying those principles, we think we can go indefinitely.
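
To make the "avoid state" principle concrete, here is a minimal sketch of a servlet that keeps no per-user state in memory or in an HttpSession, so any front-end server in any data center can serve the next request. The class and parameter names are invented for illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ViewItemServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Everything needed to build the page arrives with the request;
        // durable data lives in the back-end databases, not on this server.
        String itemId = req.getParameter("item");
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        out.println("<html><body>Item " + itemId + "</body></html>");
        // Deliberately no req.getSession(): session affinity would pin a user to
        // one server and make a server or data-center failure far more costly.
    }
}
```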

We deliver the content for most countries from the U.S. The exceptions are Korea and China, which have their own platforms. In the other 28 countries, when you list an item for sale or when you attempt to bid on or buy an item, that comes back to the U.S. We distribute the content around the world through a content delivery network. We put most of the content that’s downloaded, except for the dynamic pieces, in a location near where you live. That’s about 95 percent of the activity, making the actions or requests that come back to eBay in the U.S. very lightweight. A page downloads in the U.K. in about the same time that it downloads in the U.S., thanks to our partner Akamai [Technologies Inc.], whose content delivery network resides in just about every country, including China.
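
A simple way to picture the split Abbott describes: static pieces of a page are addressed through a CDN hostname close to the user, while the lightweight dynamic requests go back to the U.S. origin. The hostnames and path rules below are assumptions for illustration only.

```java
public class ContentUrls {
    private static final String CDN_HOST = "https://pics.example-cdn.com";
    private static final String ORIGIN_HOST = "https://www.example.com";

    /** Decides whether a path is served from the CDN edge or the origin site. */
    public static String urlFor(String path) {
        boolean isStatic = path.endsWith(".gif") || path.endsWith(".jpg")
                || path.endsWith(".css") || path.endsWith(".js");
        return (isStatic ? CDN_HOST : ORIGIN_HOST) + path;
    }

    public static void main(String[] args) {
        System.out.println(urlFor("/images/logo.gif"));       // cached at the CDN edge
        System.out.println(urlFor("/placeBid?item=12345"));   // dynamic, back to the origin
    }
}
```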
