TECH TALK: CommPuting Grid: Grid Computing (Part 3)

Anurag Shankar compares the computing grid with a power grid, and then discusses it in the context of the Web that we are so familiar with:

Grid computing is a way to use many networked computers simultaneously to solve a single scientific or technical problem. Such problems typically require substantial amounts of CPU cycles (i.e., computing power) and/or produce or access massive amounts of data.

The word grid is borrowed from the power grid context. Just so we are clear and in sync, a power grid is a system that encompasses:
1. a physical hardware layer of
   a) a network of wires that run across the country and carry electricity, and
   b) a large number of power generation stations,
2. a power distribution system that detects overloads and underloads and diverts electricity accordingly to different parts of the country or shuts it off entirely in case of problems, and
3. a large number of users around the country that use electricity.
A computing grid is relatively similar, except electric power is replaced by compute power. When fully realized, the computing grid will consist of
1. a physical hardware layer of
   a) a network of optical fiber that runs across the world and carries data bits, and
   b) a large number of computers, data storage systems, communication systems, global positioning systems, live instruments, etc.,
2. a computing power distribution system that knows where compute power is available and diverts work accordingly to different compute resources around the world, or shuts a user or a resource off entirely if it finds security or other problems, and
3. a VERY LARGE number of users around the world that use computing and communications.
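
The distribution layer in item 2 can be pictured with a small sketch. This is only an illustration under stated assumptions, not how any real grid middleware is implemented: the names Resource, Dispatcher, submit and shut_off are hypothetical, "capacity" is an abstract unit of compute power, and the policy (send each job to the resource with the most free capacity) is just one simple choice.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A hypothetical compute resource on the grid."""
    name: str
    capacity: int          # abstract units of compute power
    load: int = 0          # units currently in use
    online: bool = True    # False once the resource has been shut off

    @property
    def free(self) -> int:
        # A resource that has been shut off offers no capacity at all.
        return self.capacity - self.load if self.online else 0


class Dispatcher:
    """Knows where compute power is available and diverts work accordingly."""

    def __init__(self, resources):
        self.resources = {r.name: r for r in resources}

    def submit(self, cost: int) -> str:
        """Place a job of the given cost on the resource with the most free capacity."""
        candidates = [r for r in self.resources.values() if r.free >= cost]
        if not candidates:
            raise RuntimeError("no resource has enough free capacity")
        best = max(candidates, key=lambda r: r.free)
        best.load += cost
        return best.name

    def shut_off(self, name: str) -> None:
        """Take a resource out of service, e.g. after a security problem."""
        self.resources[name].online = False
```

For example, with two resources of capacity 10 and 6, a first job of cost 5 goes to the larger one, a second job of cost 5 is diverted to the other, and once a resource is shut off it no longer receives work.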

Let me thus define grid as it pertains to computing very precisely as follows:

A computing grid is a collection of some or all of the following resources: computer networks (optical fiber, routers, switches), CPUs (PCs/Macs, other servers/computers), data storage systems, scientific/medical instruments (X-ray, CAT scanners, etc.) feeding live or accumulated data, sensor networks (for example, a thousand RFID tags placed in a rainforest to measure temperature, humidity, light exposure, etc. in very fine detail), visualization systems (PCs or fancier viz gear like virtual reality), data collections (scientific, demographic, medical, etc.) housed either in or out of databases, communication systems (cell phones, BlackBerry-like devices, etc.), global positioning systems, and the like. All the resources are connected at high speeds by a computer network and mostly used in parallel to solve a single problem. In addition, the resources are
a) made visible to a user via some sort of software that presents the resources holistically as a single, coherent entity, or
b) accessed via a web portal that uses the grid at the back end but which hides its complexity from the end user.
As you can see immediately, grid computing is really a superset of today’s World Wide Web (WWW). The similarity is fairly obvious, especially in scenario b) above. While all of the computers involved in the WWW are connected by a computer network, your web browser in a majority of cases connects ultimately to a single web server. Clearly, two computers connected over the network are not a terribly exciting grid. On the other hand, it is entirely possible that a WWW address (URL) you enter in your web browser involves many, many WWW servers, located either in the same room or across the continent or world from each other, working simultaneously and in parallel via the magic of software.
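
To make the last point concrete, here is a minimal sketch of one request being served by several servers in parallel. Everything in it is an assumption made for illustration: the three in-memory dictionaries in SHARDS stand in for back-end servers, query_shard stands in for a network call, and handle_request plays the part of the front end behind the URL. It is not a description of how any particular site is built.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical back ends: each "server" holds part of a larger collection,
# here a count of how often a phrase appears in its share of the documents.
SHARDS = [
    {"grid computing": 3, "power grid": 1},
    {"grid computing": 2, "web server": 5},
    {"power grid": 4},
]


def query_shard(shard, term):
    """Stand-in for a network call to one back-end server."""
    return shard.get(term, 0)


def handle_request(term):
    """Front end: fan the request out to every shard in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partial_counts = pool.map(lambda shard: query_shard(shard, term), SHARDS)
        return sum(partial_counts)
```

The user sees a single answer from a single address, while the work behind it was split across all three back ends and combined by the front end.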

Anurag discusses computing grids that we are in contact with almost daily as we browse the Web: Google and Akamai. In a sense, the grid is the future Internet. He added: I am assuming that CPU power and network bandwidth will soon be completely commodity and infinite in extent (in particular, computers will become essentially throwaway items every eighteen months or sooner, when a new generation of CPUs twice as fast as the one you are currently using comes out). The future is completely data-centric; i.e., it’s ALL ABOUT DATA: moving it, mining it, and delivering it on demand (as Google and Akamai are showing already).

Tomorrow: Recent Developments



Published by

Rajesh Jain

An Entrepreneur based in Mumbai, India.