Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads

Osama Khan, Computer Science Department, Johns Hopkins University,
Randal Burns, Computer Science Department, Johns Hopkins University,
James S. Plank, EECS Department, University of Tennessee,
William Pierce, EECS Department, University of Tennessee,
Cheng Huang, Microsoft Research.

Appearing in FAST 2012: 10th USENIX Conference on File and Storage Technologies, San Jose, CA, February, 2012.

PDF of the paper.

Abstract

To reduce storage overhead, cloud file systems are transitioning from replication to erasure codes. This process has revealed new dimensions on which to evaluate the performance of different coding schemes: the amount of data used in recovery and when performing degraded reads. We present an algorithm that finds the optimal number of codeword symbols needed for recovery for any XOR-based erasure code and produces recovery schedules that use a minimum amount of data. We differentiate popular erasure codes based on this criterion and demonstrate that the differences improve I/O performance in practice for the large block sizes used in cloud file systems. Several cloud systems [Ford10,Calder11] have adopted Reed-Solomon (RS) codes, because of their generality and their ability to tolerate larger numbers of failures. We define a new class of rotated Reed-Solomon codes that perform degraded reads more efficiently than all known codes, but otherwise inherit the reliability and performance properties of Reed-Solomon codes.

Citation Information

Plain Text:

.inproceedings	kbp:12:rec
author		O. Khan and R. Burns and J. S. Plank and W. Pierce and C. Huang
title		Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads
booktitle       FAST-2012: 10th Usenix Conference on File and Storage Technologies
month           February
year            2012
address		San Jose
where           http://web.eecs.utk.edu/~jplank/plank/papers/FAST-2012.html

Bibtex:

@INPROCEEDINGS{kbp:12:rec,
        author = "O. Khan and R. Burns and J. S. Plank and W. Pierce and C. Huang",
        title = "Rethinking Erasure Codes for Cloud File Systems: Minimizing {I/O} for Recovery and Degraded Reads",
        booktitle = "FAST-2012: 10th Usenix Conference on File and Storage Technologies",
        month = "February",
        year = "2012",
        address = "San Jose",
        where = "http://web.eecs.utk.edu/~jplank/plank/papers/FAST-2012.html"
}