January, 2013
http://web.eecs.utk.edu/~jplank/plank/papers/CS-13-704.html
The source code is available at http://web.eecs.utk.edu/~jplank/plank/papers/CS-13-704/sd_code.tar.
@TECHREPORT{p:13:ossd,
  author      = "J. S. Plank",
  title       = "Open Source Encoder and Decoder for {SD} Erasure Codes",
  institution = "University of Tennessee",
  month       = "January",
  year        = "2013",
  number      = "UT-CS-13-704",
  where       = "http://web.eecs.utk.edu/~jplank/plank/papers/CS-13-704.html"
}
The library itself is protected by the New BSD License.
The second program is generate_random_data_file.c, which generates a random input file for sd_code. These input files must be in a specific format, described below. One uses sd_code to encode the data in the input file.
The last program is erase_data.c. When you call it on the encoded file, it will erase data, so that sd_code may decode it.
Finally, there is a text file called All-Coefficients.txt. This file contains valid SD code coefficients for all values of n ≤ 24, m ≤ 3, s ≤ 3 and r ≤ 24. More explanation is given below.
Parameter            Description
---------            -----------
n                    The total number of disks
m                    The number of disks devoted to coding
s                    The number of extra blocks devoted to coding
r                    The number of blocks per stripe on each disk (rows)
w                    The symbol (word) size for the Galois Field
size                 The number of bytes in a block
a0, a1, ..., am+s-1  The m+s coding coefficients
All three programs in this directory create and manipulate files in the following format. The files contain n*r*size words, which are bytes in hexadecimal. The first size words define the bytes in row 0 of disk 0. The next size words define the bytes in row 0 of disk 1. And so on.
To be precise, word i (zero indexed) in a file is byte (i % size) in row (i/(size*n)) of disk (i/size) % n.
For example, the file example_6_2_2_4_data.txt contains data for an SD code with n=6, m=2, s=2, r=4, size=8. Thus, there are 24 blocks on 6 disks, four blocks per disk, eight bytes per block. This file is formatted with eight words per line, so each block is on a separate line:
UNIX> awk '{ printf "%s Block %2d: Disk %d, Row %d\n", $0, n, n%6, n/6; n++ }' < example_6_2_2_4_data.txt
6e c0 0a 91 d0 f2 69 ae Block  0: Disk 0, Row 0
ba b4 48 3b 0b 5c ea 66 Block  1: Disk 1, Row 0
83 81 c8 15 52 b9 29 7d Block  2: Disk 2, Row 0
3a 7d 5a d4 63 29 8b 1e Block  3: Disk 3, Row 0
00 00 00 00 00 00 00 00 Block  4: Disk 4, Row 0
00 00 00 00 00 00 00 00 Block  5: Disk 5, Row 0
bb 20 46 24 07 0d 8c 84 Block  6: Disk 0, Row 1
8b 44 8e e1 52 46 05 3a Block  7: Disk 1, Row 1
1b 63 2e 59 e0 7a 66 62 Block  8: Disk 2, Row 1
04 dc 5d 8b 18 4f fa ea Block  9: Disk 3, Row 1
00 00 00 00 00 00 00 00 Block 10: Disk 4, Row 1
00 00 00 00 00 00 00 00 Block 11: Disk 5, Row 1
4a 2e cb 69 6a ae 27 d7 Block 12: Disk 0, Row 2
9a 3b e3 19 92 01 ad 7e Block 13: Disk 1, Row 2
8d 86 08 30 d4 74 e4 c5 Block 14: Disk 2, Row 2
e5 75 d9 f5 40 37 00 67 Block 15: Disk 3, Row 2
00 00 00 00 00 00 00 00 Block 16: Disk 4, Row 2
00 00 00 00 00 00 00 00 Block 17: Disk 5, Row 2
cd 8f 6e a4 ed b9 4e 2d Block 18: Disk 0, Row 3
1b bd 9a a7 3f ee 76 34 Block 19: Disk 1, Row 3
00 00 00 00 00 00 00 00 Block 20: Disk 2, Row 3
00 00 00 00 00 00 00 00 Block 21: Disk 3, Row 3
00 00 00 00 00 00 00 00 Block 22: Disk 4, Row 3
00 00 00 00 00 00 00 00 Block 23: Disk 5, Row 3
UNIX>
generate_random_data_file n m s r blocksize seed
It then generates a random data file in the above format. The seed is a seed to srand48(). If set to -1, srand48() is seeded by time(0). The data file represents an unencoded SD codeword. Thus, all of the blocks on disks n-m through n-1 contain zeros. Additionally, of the remaining blocks, the s blocks with the highest numbers also contain zeros.
The file above (example_6_2_2_4_data.txt) was created with:
UNIX> generate_random_data_file 6 2 2 4 8 0 > example_6_2_2_4_data.txt

As such, the blocks on disks 4 and 5 are all zero, as are blocks 20 and 21.
sd_code n m s r w size IO(0|1) a0 a1 .. am+s-1 d0 .. dm-1 s0 .. ss-1
The parameters n, m, s, r, and size are all the same as generate_random_data_file. w is the symbol size (the word size of the Galois Field). If IO equals 1, then it will read a file in the format specified above. If IO equals 0, then it will generate a random stripe with the given parameters (which is better for timing). The ai parameters are the coding coefficients (more on these below). The di parameters specify erased disks, and the si parameters specify erased sectors.
sd_code erases the mr+s blocks specified by the di and si, and then decodes them using the SD code defined by the coding coefficients. It prints the entire stripe on standard output in the same format as generate_random_data_file.
For example, we can use sd_code to encode the data from example_6_2_2_4_data.txt by specifying that disks 4 and 5, plus blocks 20 and 21 have been erased. We use 1, 42, 26 and 61 as the coefficients in GF(2^8).
UNIX> sd_code 6 2 2 4 8 8 1 1 42 26 61 4 5 20 21 < example_6_2_2_4_data.txt > example_6_2_2_4_encoded.txt
Timer: 0.000060
UNIX> awk '{ printf "%s Block %2d: Disk %d, Row %d\n", $0, n, n%6, n/6; n++ }' < example_6_2_2_4_encoded.txt
6e c0 0a 91 d0 f2 69 ae Block  0: Disk 0, Row 0
ba b4 48 3b 0b 5c ea 66 Block  1: Disk 1, Row 0
83 81 c8 15 52 b9 29 7d Block  2: Disk 2, Row 0
3a 7d 5a d4 63 29 8b 1e Block  3: Disk 3, Row 0
f2 77 9e 08 c7 97 0a b2 Block  4: Disk 4, Row 0
9f ff 4e 63 2d a9 2b 19 Block  5: Disk 5, Row 0
bb 20 46 24 07 0d 8c 84 Block  6: Disk 0, Row 1
8b 44 8e e1 52 46 05 3a Block  7: Disk 1, Row 1
1b 63 2e 59 e0 7a 66 62 Block  8: Disk 2, Row 1
04 dc 5d 8b 18 4f fa ea Block  9: Disk 3, Row 1
06 5d c5 3f 32 b6 8d 9e Block 10: Disk 4, Row 1
29 86 7e 28 9f c8 98 a8 Block 11: Disk 5, Row 1
4a 2e cb 69 6a ae 27 d7 Block 12: Disk 0, Row 2
9a 3b e3 19 92 01 ad 7e Block 13: Disk 1, Row 2
8d 86 08 30 d4 74 e4 c5 Block 14: Disk 2, Row 2
e5 75 d9 f5 40 37 00 67 Block 15: Disk 3, Row 2
24 55 5c f1 99 dc ab 40 Block 16: Disk 4, Row 2
9c b3 a5 44 f5 30 c5 4b Block 17: Disk 5, Row 2
cd 8f 6e a4 ed b9 4e 2d Block 18: Disk 0, Row 3
1b bd 9a a7 3f ee 76 34 Block 19: Disk 1, Row 3
23 44 a9 cb 78 49 d9 11 Block 20: Disk 2, Row 3
36 b4 b8 c2 f9 14 71 ac Block 21: Disk 3, Row 3
7d 83 b2 ac 66 c6 6a 5b Block 22: Disk 4, Row 3
be 41 57 a6 35 cc fa ff Block 23: Disk 5, Row 3
UNIX>

You'll note that the original data blocks are unchanged. However, the coding blocks have now been calculated according to the SD code.
sd_code only supports w = 8, 16 or 32.
erase_data n m s r blocksize d0 .. dm-1 s0 .. ss-1
It reads a file in the proper format on standard input, and erases all of the blocks on disks d0 to dm-1, plus blocks s0 to ss-1. For example, the command below erases disks 0 and 2 plus blocks 1 and 10 from the encoded example in example_6_2_2_4_encoded.txt:
UNIX> erase_data 6 2 2 4 8 0 2 1 10 < example_6_2_2_4_encoded.txt > example_6_2_2_4_erased.txt
UNIX> awk '{ printf "%s Block %2d: Disk %d, Row %d\n", $0, n, n%6, n/6; n++ }' < example_6_2_2_4_erased.txt
00 00 00 00 00 00 00 00 Block  0: Disk 0, Row 0
00 00 00 00 00 00 00 00 Block  1: Disk 1, Row 0
00 00 00 00 00 00 00 00 Block  2: Disk 2, Row 0
3a 7d 5a d4 63 29 8b 1e Block  3: Disk 3, Row 0
f2 77 9e 08 c7 97 0a b2 Block  4: Disk 4, Row 0
9f ff 4e 63 2d a9 2b 19 Block  5: Disk 5, Row 0
00 00 00 00 00 00 00 00 Block  6: Disk 0, Row 1
8b 44 8e e1 52 46 05 3a Block  7: Disk 1, Row 1
00 00 00 00 00 00 00 00 Block  8: Disk 2, Row 1
04 dc 5d 8b 18 4f fa ea Block  9: Disk 3, Row 1
00 00 00 00 00 00 00 00 Block 10: Disk 4, Row 1
29 86 7e 28 9f c8 98 a8 Block 11: Disk 5, Row 1
00 00 00 00 00 00 00 00 Block 12: Disk 0, Row 2
9a 3b e3 19 92 01 ad 7e Block 13: Disk 1, Row 2
00 00 00 00 00 00 00 00 Block 14: Disk 2, Row 2
e5 75 d9 f5 40 37 00 67 Block 15: Disk 3, Row 2
24 55 5c f1 99 dc ab 40 Block 16: Disk 4, Row 2
9c b3 a5 44 f5 30 c5 4b Block 17: Disk 5, Row 2
00 00 00 00 00 00 00 00 Block 18: Disk 0, Row 3
1b bd 9a a7 3f ee 76 34 Block 19: Disk 1, Row 3
00 00 00 00 00 00 00 00 Block 20: Disk 2, Row 3
36 b4 b8 c2 f9 14 71 ac Block 21: Disk 3, Row 3
7d 83 b2 ac 66 c6 6a 5b Block 22: Disk 4, Row 3
be 41 57 a6 35 cc fa ff Block 23: Disk 5, Row 3
UNIX>

We can now use sd_code to decode from this scenario. The final diff statement will prove that decoding was successful!
UNIX> sd_code 6 2 2 4 8 8 1 1 42 26 61 0 2 1 10 < example_6_2_2_4_erased.txt > example_6_2_2_4_decoded.txt
Timer: 0.000047
UNIX> diff example_6_2_2_4_encoded.txt example_6_2_2_4_decoded.txt
UNIX>
UNIX> sd_code 6 2 2 4 8 1048576 0 1 42 26 61 4 5 20 21
Timer: 0.031875
UNIX>

Since there are 14 MB of data in this stripe (24 total blocks, 10 devoted to coding), the performance of coding is 14/0.031875 = 439 MB/s (this is on my 2.4 GHz Macbook Pro with iTunes, Safari, Photoshop and OpenOffice tooling along in the background).
As mentioned above, sd_code uses the alternate mapping of words to memory for GF(2^16) and GF(2^32). I would recommend using the slower, default versions of the Galois Fields when you go about tweaking this code for yourselves (just call gf_init_easy(&gfm, w) for all values of w). That, or read the "GF Complete" documentation on the alternate memory layouts.
For each set of parameters, if we found valid coefficients for w=8, those are in the file. Otherwise, if we found valid coefficients for w=16, those are in the file. Otherwise, we show the coefficients for w=32. Each line contains a different set of parameters in the following format:
UNIX> grep "^16 3 2 9" All-Coefficients.txt
16 3 2 9 16 1 3561 27596 44295 42660
UNIX>

That shows you that you must use GF(2^16), and the 5 coefficients are 1, 3561, 27596, 44295 and 42660.
Four related codes that have different fault-tolerance properties but address similar target areas are PMDS codes [Blaum-Hafner-Hetzler-2012], LRC codes [Huang-et-al-2012], Pyramid Codes [Huang-Chen-Li-2013], and Intradisk Redundancy [Dholakia-et-al-2008].
This material is based upon work supported by the National Science Foundation under grant CSR-1016636 and its REU supplements. In particular, Will, Adam and Andrew were all funded by REUs for their part in this work.
Finally, I would like to acknowledge the support of IBM's Faculty Award program, which granted me an award in December, 2012.