How to achieve 1 GByte/sec I/O throughput with commodity IDE disks

Jens Mache, Joshua Bower-Cooley, Jason Guchereau, Paul Thomas, and Matthew Wilkinson
Lewis & Clark College
Portland, OR 97219

The Problem

In order to compete with custom-made systems, PC clusters have to provide not only fast computation and communication, but also high-performance disk access. I/O performance can play a critical role in the completion times of many applications that transfer large amounts of data to and from secondary storage, for example simulations, computer graphics, file serving, data mining or visualization.

An I/O throughput of 1 GByte/sec was first achieved on ASCI Red with I/O hardware costing over one million dollars. We set out to achieve similar I/O performance on our PC cluster by harnessing the power of commodity IDE disks on remote nodes.

The Approach

Specifically, our goal is an I/O throughput of 1 GByte/sec on a PC cluster that (1) has as few as 32 nodes and (2) uses less than ten thousand dollars worth of I/O hardware. To reach this goal, each node must be able to access data at a rate of at least 32 MBytes/sec.

The novelty of our approach is (A) to use two commodity IDE disks (not SCSI disks) per node in a software RAID configuration and (B) to configure the parallel file system such that each node acts as both I/O node and compute node.

In our first experiment, we measured the local read and write performance of our two IDE drives (IBM 20GB ATA100 7200rpm costing $112 each), configured as a software RAID 0. Using the Bonnie disk benchmark, we measured up to 68.23 MBytes/sec.
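
To illustrate the kind of measurement involved, the sketch below is a minimal C program (not Bonnie itself) that writes and then re-reads a large file on the local RAID 0 array and reports sequential throughput; the mount point /mnt/raid0, the 1 MByte request size, and the 1 GByte file size are illustrative assumptions, not our benchmark parameters.

    /* Minimal sequential-throughput sketch (not Bonnie): write and re-read
     * a large file on the local software RAID 0 array, report MBytes/sec.
     * The mount point /mnt/raid0 is an assumed example path. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #include <sys/time.h>

    #define BLOCK   (1 << 20)   /* 1 MByte per request */
    #define NBLOCKS 1024        /* 1 GByte total; should exceed main memory
                                   so the read phase measures the disks     */

    static double seconds(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main(void)
    {
        char *buf = malloc(BLOCK);
        double t0, t_write, t_read;
        int fd, i;

        memset(buf, 'x', BLOCK);

        /* sequential write */
        fd = open("/mnt/raid0/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        t0 = seconds();
        for (i = 0; i < NBLOCKS; i++)
            write(fd, buf, BLOCK);
        fsync(fd);               /* flush to disk before stopping the clock */
        close(fd);
        t_write = seconds() - t0;

        /* sequential read */
        fd = open("/mnt/raid0/testfile", O_RDONLY);
        t0 = seconds();
        for (i = 0; i < NBLOCKS; i++)
            read(fd, buf, BLOCK);
        close(fd);
        t_read = seconds() - t0;

        printf("write: %.1f MBytes/sec  read: %.1f MBytes/sec\n",
               NBLOCKS / t_write, NBLOCKS / t_read);
        free(buf);
        return 0;
    }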

In our second experiment, we measured the performance of a concurrent read/write test program that sits on top of PVFS, an open-source parallel file system. Parallel file systems allow transparent access to disks on remote nodes. We configured each machine as both an I/O node and a compute node to make the best use of our limited number of nodes. Using MPI and the native PVFS API, I/O throughputs were well above 1 GByte/sec: we achieved up to 2007.199 MBytes/sec read throughput and 1698.896 MBytes/sec write throughput (with an appropriate file view and stripe size such that most disk accesses were local).
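
For illustration, the sketch below shows a concurrent throughput test of this kind written against the MPI-IO interface (the native PVFS API variant is analogous); each process writes and re-reads its own contiguous region of one shared PVFS file. The path /pvfs/testfile and the 64 MByte region per process are illustrative assumptions, not our actual test parameters.

    /* Sketch of a concurrent MPI-IO throughput test: every process writes
     * and re-reads its own contiguous region of one shared file on PVFS.
     * The path /pvfs/testfile is an assumed example. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <mpi.h>

    #define CHUNK (64 * 1024 * 1024)    /* bytes per process */

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_File fh;
        MPI_Offset offset;
        double t0, t_write, t_read;
        char *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        buf = malloc(CHUNK);
        memset(buf, 'x', CHUNK);
        offset = (MPI_Offset)rank * CHUNK;   /* disjoint region per process */

        MPI_File_open(MPI_COMM_WORLD, "/pvfs/testfile",
                      MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);

        /* concurrent write phase */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        MPI_File_write_at(fh, offset, buf, CHUNK, MPI_BYTE, MPI_STATUS_IGNORE);
        MPI_Barrier(MPI_COMM_WORLD);
        t_write = MPI_Wtime() - t0;

        /* concurrent read phase */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        MPI_File_read_at(fh, offset, buf, CHUNK, MPI_BYTE, MPI_STATUS_IGNORE);
        MPI_Barrier(MPI_COMM_WORLD);
        t_read = MPI_Wtime() - t0;

        MPI_File_close(&fh);

        if (rank == 0)
            printf("aggregate write: %.1f MBytes/sec  read: %.1f MBytes/sec\n",
                   nprocs * (CHUNK / 1048576.0) / t_write,
                   nprocs * (CHUNK / 1048576.0) / t_read);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Timing between barriers means the reported aggregate rate is determined by the slowest process, which is the appropriate figure for a concurrent test.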

In additional experiments, we measured the I/O performance of a ray tracing application and studied how I/O performance is sensitive to configuration and programming choices.

Our conclusions are as follows: harnessing commodity IDE disks with software RAID and a parallel file system in which each node acts as both I/O node and compute node yields well over 1 GByte/sec of aggregate I/O throughput on a 32-node cluster with less than ten thousand dollars worth of I/O hardware, and the achieved performance is sensitive to configuration and programming choices.

Impact, Importance, Interest, Audience

Interest in cluster computing is at an all-time high. While there is no I/O category in the top500 ranking (nor for SC awards) yet, I/O performance is receiving more and more attention ("the I/O bottleneck").

The impact of our work is
(A) showing how commodity IDE disks on remote nodes can be harnessed,
(B) reporting I/O performance sensitivities, and
(C) reporting extremely good price/performance (a factor of 100 better than the over-one-million-dollar I/O hardware of ASCI Red).
Thus, parallel I/O now seems affordable, even for small businesses and colleges.

Our sensitivity results are highly valuable
(1) as a source of performance recommendations for application development,
(2) as a guide to I/O benchmarking (which will play an important role in compiling the new "clusters @ top500" ranking), and
(3) as a guide to further improvement of parallel file systems.

Visual Presentation

First, we'll have a traditional color poster display (32"x40"), describing the problem, our approach (IDE disks in RAID configuration, PVFS with overlapped nodes), our experiments (graphs and tables) and our conclusions.

Second, we plan to show the performance of application and benchmark runs "on demand". (It only takes a laptop and an Internet connection for us to start programs on our cluster from Denver and get the performance results back.)