Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Network Storage Applications

James S. Plank and Lihao Xu

The 5th IEEE International Symposium on Network Computing and Applications (IEEE NCA06), Cambridge, MA, July, 2006.

Award Winner: Best Paper in Network Computing

PDF: http://web.eecs.utk.edu/~jplank/plank/papers/NCA-2006.pdf

NOTE: NCA's page limit is rather severe: 8 pages. As a result, the final paper is pretty much a hatchet job of the original submission. I would recommend reading the technical report version of this paper, because it includes accompanying tutorial material and is easier to read. Please cite this one, however. If this work gets journalized, I will put a link to that here.


Abstract

In the past few years, all manner of storage applications, ranging from disk array systems to distributed and wide-area systems, have started to grapple with the reality of tolerating multiple simultaneous failures of storage nodes. Unlike the single failure case, which is optimally handled with RAID Level-5 parity, the multiple failure case is more difficult because optimal general-purpose strategies are not yet known.

Erasure coding is the field of research that deals with these strategies, and this field has blossomed in recent years. Despite this research, the decades-old strategy of Reed-Solomon coding remains the only space-optimal (MDS) code for all but the smallest storage systems. The best performing implementations of Reed-Solomon coding employ a variant called Cauchy Reed-Solomon coding, developed in the mid-1990s [Blomer et al., 1995].

In this paper, we present an improvement to Cauchy Reed-Solomon coding that is based on optimizing the Cauchy distribution matrix. We detail an algorithm for generating good matrices and then evaluate the performance of encoding using all manner of Reed-Solomon coding, plus the best MDS codes from the literature. The improvements over the original Cauchy Reed-Solomon codes are as much as 83% in realistic scenarios, and average roughly 10% over all cases that we tested.
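
To make "optimizing the Cauchy distribution matrix" a little more concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It only builds a Cauchy matrix over a small Galois field and counts the ones in its bit-matrix expansion, which is the quantity usually taken to determine the number of XORs in Cauchy Reed-Solomon encoding. The field size (GF(2^4)), the primitive polynomial, and all function names are assumptions chosen for illustration.

```python
W = 4                # assumed word size: arithmetic is over GF(2^4)
PRIM_POLY = 0b10011  # x^4 + x + 1, a primitive polynomial for GF(2^4)


def gf_mul(a, b):
    """Multiply two elements of GF(2^W): shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & (1 << W):   # degree reached W, reduce by the polynomial
            a ^= PRIM_POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse by brute-force search (fine for tiny fields)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^W)")
    for c in range(1, 1 << W):
        if gf_mul(a, c) == 1:
            return c


def cauchy_matrix(xs, ys):
    """Entry (i, j) is 1 / (xs[i] + ys[j]); addition in GF(2^W) is XOR.

    xs and ys must be disjoint sets of distinct field elements.
    """
    return [[gf_inv(x ^ y) for y in ys] for x in xs]


def ones_in_bit_matrix(e):
    """Count ones in the W x W bit matrix of e.

    Column j of that matrix holds the bits of e * 2^j.
    """
    total = 0
    col = e
    for _ in range(W):
        total += bin(col).count("1")
        col = gf_mul(col, 2)
    return total


def total_ones(matrix):
    """Total ones over the bit-matrix expansion of every entry."""
    return sum(ones_in_bit_matrix(e) for row in matrix for e in row)


if __name__ == "__main__":
    # Two hypothetical choices of X and Y for 2 coding and 3 data devices.
    for xs, ys in (([0, 1], [2, 3, 4]), ([1, 2], [0, 3, 5])):
        m = cauchy_matrix(xs, ys)
        print(xs, ys, m, total_ones(m))
```

Different choices of X and Y generally give different totals, so one way to read the abstract is as a search for matrices with fewer ones. The procedure the paper actually uses to generate good matrices is described in the paper itself and is not reproduced here.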


Citation Information