Internet Archive’s Petabox

The Petabox is a large-scale data repository. It “is a machine designed to safely store and process one petabyte of information (a petabyte is a million gigabytes).” A paper by Bruce Baumgart and Matt Laue describes the architecture.
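
The "million gigabytes" figure in the quote holds if you assume decimal (SI) units. A minimal sketch of that arithmetic follows; the constant and function names are hypothetical and chosen only for illustration, not taken from the Petabox paper.

```python
# Decimal (SI) storage units, expressed in bytes.
# Assumption: the quoted figure uses SI prefixes, not binary ones.
GIGABYTE = 10**9
PETABYTE = 10**15

# Number of gigabytes in one petabyte under SI units.
GB_PER_PB = PETABYTE // GIGABYTE


def petabytes_to_gigabytes(petabytes):
    """Convert a petabyte count to gigabytes using decimal (SI) units."""
    return petabytes * GB_PER_PB


print(petabytes_to_gigabytes(1))  # 1000000
```

With binary units (2^50 bytes per pebibyte, 2^30 per gibibyte) the ratio would instead be 2^20 = 1,048,576, so the round "million" only applies to the decimal definition.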

