GFLIB - C Procedures for Galois Field Arithmetic and Reed-Solomon Coding

There is a new library for Galois Field arithmetic that is much faster and more general purpose than this one. It may be found at http://web.eecs.utk.edu/~jplank/plank/papers/CS-07-593/.

James S. Plank
Logistical Computing and Internetworking (LoCI) Laboratory
Department of Computer Science
University of Tennessee

June 4, 2003. $Revision: 1.2 $

Here is gflib_1.1.shar if you need an old version.

This site: http://web.eecs.utk.edu/~jplank/plank/gflib/.

This web site contains C procedures for limited Galois Field arithmetic and Reed-Solomon coding. To make best use of this code, please read the following tutorial and accompanying note:

James S. Plank, ``A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems,'' Software -- Practice & Experience, 27(9), September, 1997, pp. 995-1012.

James S. Plank and Ying Ding, ``Note: Correction to the 1997 Tutorial on Reed-Solomon Coding,'' Technical Report UT-CS-03-504, University of Tennessee, April, 2003.

This material is based upon work supported by the National Science Foundation under Grant No. 0204007. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

The Files

The code is available in the following two formats:

Tar file: gflib.tar
Shar file: gflib.shar

These contain the following files:

gflib.h: Header file for the C procedures.
gflib.c: Implementation file for the C procedures.
gf_mult.c: Simple program for multiplying two numbers in GF(2^8) or GF(2^16).
gf_div.c: Simple program for dividing two numbers in GF(2^8) or GF(2^16).
xor.c: Simple program for adding/subtracting two numbers in GF(2^8) or GF(2^16). Note, addition and subtraction in Galois Fields are both implemented with exclusive-or.
parity_test.c: A program to test the speed of parity operations.
rs_encode_file.c: A program that uses gflib.c to encode a file using Reed-Solomon coding.
rs_decode_file.c: A program that uses gflib.c to decode a file from the chunks created by rs_encode_file.
makefile: The makefile.
index.html: This file.

Limitations

The arithmetic and coding are described above as ``limited'' because they support only GF(2^8) and GF(2^16). GF(2^8) is much faster than GF(2^16): its internal data structures are smaller, which allows an optimization for multiplication and division. However, GF(2^8) limits the coding to 255 total chunks (data plus coding). This should be adequate for most usage scenarios; for larger numbers of chunks (up to 65535), use GF(2^16).
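To give a feel for why the smaller field is cheaper, here is a minimal sketch of table-driven arithmetic in GF(2^8). It is not taken from gflib.c: the names sketch_setup(), sketch_multiply() and sketch_divide() are hypothetical, and the primitive polynomial 0x11d is an assumption (it is a common choice for GF(2^8)). The tables here hold a few hundred ints; the same scheme in GF(2^16) needs tables on the order of 65536 entries each.

```c
#include <assert.h>

#define SK_W     8
#define SK_SIZE  (1 << SK_W)   /* 256 field elements */
#define SK_POLY  0x11d         /* assumed primitive polynomial, not read from gflib.c */

static int sk_log[SK_SIZE];        /* sk_log[x] = i such that 2^i == x */
static int sk_exp[2 * SK_SIZE];    /* sk_exp[i] = 2^i, stored twice to avoid a modulo */
static int sk_ready = 0;

static void sketch_setup(void)
{
    int x = 1;
    for (int i = 0; i < SK_SIZE - 1; i++) {
        sk_exp[i] = x;
        sk_exp[i + SK_SIZE - 1] = x;    /* powers repeat with period 255 */
        sk_log[x] = i;
        x <<= 1;                        /* multiply by 2 ... */
        if (x & SK_SIZE) x ^= SK_POLY;  /* ... and reduce modulo the polynomial */
    }
    sk_ready = 1;
}

int sketch_multiply(int a, int b)
{
    if (!sk_ready) sketch_setup();
    if (a == 0 || b == 0) return 0;
    return sk_exp[sk_log[a] + sk_log[b]];   /* add the logs */
}

int sketch_divide(int a, int b)
{
    if (!sk_ready) sketch_setup();
    assert(b != 0);                          /* division by zero is undefined */
    if (a == 0) return 0;
    return sk_exp[sk_log[a] - sk_log[b] + SK_SIZE - 1];  /* subtract the logs */
}
```

With these tables, sketch_divide(sketch_multiply(a, b), b) gives back a for any non-zero b in the range 1-255.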

Compilation

Compile the code with one of two arguments to make:

make w8 -- for GF(2^8).
make w16 -- for GF(2^16).

This will create the object file gflib.o, and the programs gf_mult, gf_div, xor, parity_test, rs_encode_file and rs_decode_file.

To use the procedures in gflib.c, include gflib.h and compile with gflib.o.

There should be no other dependencies in using this code.

Thread Safety

The only call that should be protected is gf_modar_setup. After that, there should be no race conditions in any of the calls.
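One way to protect the setup call in a POSIX threads program is pthread_once(). The sketch below is only an illustration: gf_setup_once() is a hypothetical wrapper, and stub_modar_setup() is a stand-in so that the example compiles without gflib. In a real program you would include gflib.h and pass gf_modar_setup instead of the stub.

```c
#include <pthread.h>

/* Stand-in for gf_modar_setup() so this sketch builds on its own.
   It only counts how many times it has been run. */
static int setup_calls = 0;
static void stub_modar_setup(void)
{
    setup_calls++;
}

static pthread_once_t gf_once = PTHREAD_ONCE_INIT;

/* Hypothetical wrapper: every thread calls this before its first gflib call.
   pthread_once() guarantees the setup routine runs exactly once. */
void gf_setup_once(void)
{
    pthread_once(&gf_once, stub_modar_setup);
}
```

Each thread calls gf_setup_once() before using any other procedure; after that, no further locking should be needed.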

The Procedures In gflib.c

Note, with the exception of gf_add_parity(), none of these routines check their input values. This is for speed. If input checks are desired, you will have to write them yourself.

void gf_modar_setup(): This sets up the internal data structures. It is called automatically by all the procedures below, and therefore does not need to be called explicitly. However, as it does do some processing (not much), and should be protected in a threaded system, it may be called explicitly.
int gf_single_multiply(int a, int b): This multiplies two numbers and returns the result. Note, as stated above, this does not check the input values to see if they are in the correct range (0-255 for GF(2^8) and 0-65535 for GF(2^16)).
int gf_single_divide(int a, int b): This divides a by b and returns the result.
void gf_add_parity(void *to_add, void *to_modify, int size): This calculates the parity of two memory regions, to_add and to_modify, both consisting of size bytes, and puts the result in to_modify.
Important: gf_add_parity() assumes that both to_add and to_modify are aligned on the same byte boundary with respect to longs. In other words, to_add%sizeof(long) must equal to_modify%sizeof(long). If both are pointers that have been allocated with malloc(), then they will be fine. If they are not aligned on the same byte boundary, gf_add_parity() will flag an error and exit the program. If gf_add_parity() does not flag an error, then you may assume that it worked correctly.
The reason that to_add and to_modify must be aligned to each other is that gf_add_parity() calls gf_fast_add_parity() below, and if they are not aligned, the parity operation will be brutally slow. Thus, it is not allowed.
void gf_fast_add_parity(void *to_add, void *to_modify, int size): This calculates the parity of two memory regions, to_add and to_modify, both consisting of size bytes, and puts the result in to_modify.
Important: gf_fast_add_parity() assumes that both to_add and to_modify are aligned on long boundaries, and that size is a multiple of sizeof(long). Otherwise, you won't get the results that you expect. The reason gf_fast_add_parity() is fast is that it performs exclusive-or on long words rather than bytes. To perform exclusive-or on bytes, use gf_add_parity(), which calls this routine where possible.
void gf_mult_region(void *region, int size, int factor): This multiplies every word (1 byte in GF(2^8), 2 bytes in GF(2^16)) in region by factor. Size defines the number of bytes in region. Region is overwritten. However, if factor is not zero, then you may restore the region by calling:
```
gf_mult_region(region, size, gf_single_divide(1, factor))
```
Cool, no?
int *gf_make_vandermonde(int rows, int cols): This allocates and returns a rows by cols Vandermonde matrix. You do not need to call this explicitly to perform Reed-Solomon coding, but in case you want to see a Vandermonde matrix, you can use this. Rows and cols must be less than 256 for GF(2^8) and 65536 for GF(2^16).
The matrix returned is a rows*cols array. You may access element (i,j) at matrix element i*cols+j.
int *gf_make_dispersal_matrix(int rows, int cols): This allocates and returns a rows by cols dispersal matrix for Reed-Solomon coding. This is the matrix B in the paper ``Note: Correction to the 1997 Tutorial on Reed-Solomon Coding.'' Rows and cols must be less than 256 for GF(2^8) and 65536 for GF(2^16).
The matrix returned is a rows*cols array. You may access element (i,j) at matrix element i*cols+j.
Condensed_Matrix *gf_condense_dispersal_matrix(int *disp, int *existing_rows, int rows, int cols): When you need to decode, you must delete rows of the dispersal matrix, according to which chunks you do not have. The resulting matrix may be used to recalculate the missing chunks. This procedure does the deleting. Disp is the original dispersal matrix, returned from gf_make_dispersal_matrix(int rows, int cols). Existing_rows is an array with rows elements, containing zeros and ones. Element i should contain one if you have chunk i. Element i should contain zero if you are missing chunk i.
This procedure allocates and returns a Condensed_Matrix, defined as follows:
```
typedef struct {
  int *condensed_matrix;   /* The n*n dispersal matrix with rows deleted */
  int *row_identities;     /* A nx1 vector of the original row identities of the cond_matrix */
} Condensed_Matrix;
```
Note, you always get a rows*rows matrix as a result of gf_condense_dispersal_matrix(), even if no rows need to be deleted. Row_identities tells you which rows are left in the condensed matrix.
int *gf_invert_matrix(int *mat, int rows): This inverts the square matrix mat. This is not destructive: the inverted matrix is allocated and returned.
int *gf_matrix_multiply(int *a, int *b, int rows): This is not a necessary routine, but it's helpful for ensuring that things work right. It multiplies two square matrices and returns the result. The following call should return the identity matrix:
```
   gf_matrix_multiply(gf_invert_matrix(mat, rows), mat, rows)
```
void gf_write_matrix(FILE *f, int *a, int rows, int cols): Save a matrix to a file.
int *gf_read_matrix(FILE *f, int *rows, int *cols): Read a matrix from a file.
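
To make the row deletion and the i*cols+j indexing concrete, here is a rough sketch of one way to condense a dispersal matrix. It is not the code in gflib.c. Sketch_Condensed and sketch_condense() are hypothetical names, and the selection rule (keep each surviving data row in place, and fill each missing data row with the next surviving coding row) is an assumption chosen to match the shape of the sample decoding output on this page. It assumes rows is the total number of chunks and cols is the number of data chunks.

```c
#include <stdlib.h>

/* Hypothetical stand-in for Condensed_Matrix. */
typedef struct {
    int *matrix;          /* cols*cols matrix, element (i,j) at i*cols+j */
    int *row_identities;  /* original row number of each kept row */
} Sketch_Condensed;

/* Returns NULL if allocation fails or fewer than cols rows survive. */
Sketch_Condensed *sketch_condense(const int *disp, const int *existing_rows,
                                  int rows, int cols)
{
    Sketch_Condensed *cm = malloc(sizeof *cm);
    if (cm == NULL) return NULL;
    cm->matrix = malloc(sizeof(int) * cols * cols);
    cm->row_identities = malloc(sizeof(int) * cols);
    if (cm->matrix == NULL || cm->row_identities == NULL) goto fail;

    int spare = cols;                 /* next coding row to try */
    for (int i = 0; i < cols; i++) {
        int src = i;                  /* keep data row i if we have it */
        if (!existing_rows[i]) {
            while (spare < rows && !existing_rows[spare]) spare++;
            if (spare == rows) goto fail;   /* not enough chunks left */
            src = spare++;            /* substitute a surviving coding row */
        }
        cm->row_identities[i] = src;
        for (int j = 0; j < cols; j++)
            cm->matrix[i * cols + j] = disp[src * cols + j];
    }
    return cm;

fail:
    free(cm->matrix);
    free(cm->row_identities);
    free(cm);
    return NULL;
}
```

With nine chunks, five of them data, and chunks 0, 2, 4 and 6 missing, this sketch keeps rows 5, 1, 7, 3 and 8, in that order.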

The Programs

Gf_mult, gf_div, and xor are completely straightforward.

Parity_test allocates two random regions and performs a specified number of parity operations on them. This allows you to test the speed of your system doing parity.
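
The kind of parity operation being timed can be sketched as follows. This illustrates the idea described for gf_add_parity() (check that the two regions share the same alignment, then exclusive-or a long word at a time); it is not the code in gflib.c. The name sketch_add_parity() is hypothetical, and unlike gf_add_parity() it returns -1 on misaligned regions rather than exiting.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR size bytes of to_add into to_modify.
   Returns 0 on success, -1 if the regions are not aligned alike. */
int sketch_add_parity(const void *to_add, void *to_modify, size_t size)
{
    const unsigned char *src = to_add;
    unsigned char *dst = to_modify;

    /* Both regions must sit at the same offset within a long. */
    if ((uintptr_t) src % sizeof(long) != (uintptr_t) dst % sizeof(long))
        return -1;

    /* Leading bytes, one at a time, until both pointers are long-aligned. */
    while (size > 0 && (uintptr_t) dst % sizeof(long) != 0) {
        *dst++ ^= *src++;
        size--;
    }

    /* Bulk of the region, one long word at a time. */
    const long *lsrc = (const long *) src;
    long *ldst = (long *) dst;
    while (size >= sizeof(long)) {
        *ldst++ ^= *lsrc++;
        size -= sizeof(long);
    }

    /* Trailing bytes, one at a time. */
    src = (const unsigned char *) lsrc;
    dst = (unsigned char *) ldst;
    while (size > 0) {
        *dst++ ^= *src++;
        size--;
    }
    return 0;
}
```

Because the operation is exclusive-or, applying it twice with the same to_add restores to_modify.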

Rs_encode_file and rs_decode_file are the two non-trivial programs. Rs_encode_file is called with the following arguments:

   rs_encode_file filename n m stem

It takes the file filename and breaks it up into n data chunks and m coding chunks. The size of the chunks will be padded so that the operations work correctly (chunk size times n is a multiple of the word size). It stores the chunks in the files stem-xxxx.rs, where xxxx is the chunk number. Chunks 0 through n-1 are the data chunks and chunks n through n+m-1 are the coding chunks. It also stores a file called stem-info.txt that contains the dispersal matrix plus some other information for the decoding. Note, you could recreate the dispersal matrix rather than read it in.
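
The naming scheme can be sketched in a few lines of C. The helper names below are hypothetical, and the four-digit zero-padded format is inferred from the file names in the sample session on this page (code-0000.rs and so on), not read from rs_encode_file.c.

```c
#include <stdio.h>

/* Build the file name for a chunk, e.g. stem "code", chunk 5 -> "code-0005.rs".
   Returns 0 on success, -1 if buf is too small. */
int sketch_chunk_name(char *buf, size_t buflen, const char *stem, int chunk)
{
    int len = snprintf(buf, buflen, "%s-%04d.rs", stem, chunk);
    return (len >= 0 && (size_t) len < buflen) ? 0 : -1;
}

/* Chunks 0..n-1 hold data; chunks n..n+m-1 hold coding information. */
int sketch_is_coding_chunk(int chunk, int n)
{
    return chunk >= n;
}
```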

Rs_decode_file is called with the following argument:

   rs_decode_file stem

As long as stem-info.txt and any n chunks exist, it will recreate the original file and write it to standard output.

So, try the following to make sure it works. Encode the file gf_mult.c into five data and four coding chunks:

UNIX> rs_encode_file gf_mult.c 5 4 code
Writing code-0000.rs ... Done
Writing code-0001.rs ... Done
Writing code-0002.rs ... Done
Writing code-0003.rs ... Done
Writing code-0004.rs ... Done
Calculating  code-0005.rs ... writing  ... Done
Calculating  code-0006.rs ... writing  ... Done
Calculating  code-0007.rs ... writing  ... Done
Calculating  code-0008.rs ... writing  ... Done

You'll see that 9 chunks and the info file are created:

UNIX>  ls -l code*
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0000.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0001.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0002.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0003.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0004.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0005.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0006.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0007.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0008.rs
-rw-r--r--    1 plank    staff         120 Jun  4 10:29 code-info.txt

Now, remove four of the chunks -- this removes three data and one coding chunk:

UNIX>  rm code-0000.rs code-0002.rs code-0004.rs code-0006.rs

And finally, decode the file. Note, a bunch of info is printed to standard error:

UNIX>  rs_decode_file code > tmp
Blocks to decode: 3
Inverting condensed dispersal matrix ... Done

Condensed matrix:

   7   7   6   6   1
   0   1   0   0   0
  15  14  14  15   1
   0   0   0   1   0
   2 125 149 253  22

Inverted matrix:

 182 143 122  22  84
   0   1   0   0   0
  27  35 215 186  84
   0   0   0   1   0
 126  71 179 223  84
Decoding block 0 ... writing ... Done
Writing block 1 from memory ... Done
Decoding block 2 ... writing ... Done
Writing block 3 from memory ... Done
Decoding block 4 ... writing ... Done

As you can see, tmp and gf_mult.c are identical:

UNIX>  diff tmp gf_mult.c

Caveats and subtleties:

rs_encode_file reads the entire file into memory, and holds n+1 chunks in memory -- the n data blocks, and each coding block as it is calculated. Obviously, there are other ways to do this.

Each coding block is the sum of each data block times a factor. In order to do this, rs_encode_file calls gf_mult_region and overwrites the data block with a factor times the data block. This factor is stored in the array factors. When a data block multiplied by factor f needs to be multiplied by factor g, the block is multiplied by g/f.

Similarly, rs_decode_file holds n+1 blocks in memory and uses a factors array in the same manner as rs_encode_file.
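
The factor bookkeeping can be sketched as follows. This is a self-contained illustration in GF(2^8), not the code from the programs: mul8(), div8(), mult_region8() and sketch_rescale() are hypothetical names standing in for gf_single_multiply(), gf_single_divide() and gf_mult_region(), the polynomial 0x11d is an assumption, and the sketch assumes every factor is non-zero (a block multiplied by zero cannot be rescaled back).

```c
#include <stddef.h>

#define SK_POLY 0x11d   /* assumed primitive polynomial for GF(2^8) */

/* Slow but simple shift-and-add multiplication in GF(2^8). */
static int mul8(int a, int b)
{
    int p = 0;
    while (b) {
        if (b & 1) p ^= a;
        a <<= 1;
        if (a & 0x100) a ^= SK_POLY;
        b >>= 1;
    }
    return p;
}

/* Brute-force division: find q with q*b == a. Requires b != 0. */
static int div8(int a, int b)
{
    for (int q = 0; q < 256; q++)
        if (mul8(q, b) == a) return q;
    return -1;   /* only reached when b == 0 and a != 0 */
}

/* Multiply every byte of the region by factor, in place. */
static void mult_region8(unsigned char *region, size_t size, int factor)
{
    for (size_t i = 0; i < size; i++)
        region[i] = (unsigned char) mul8(region[i], factor);
}

/* The block currently holds (*current) times the original data.
   Make it hold wanted times the original data by multiplying by wanted/current. */
void sketch_rescale(unsigned char *block, size_t size, int *current, int wanted)
{
    mult_region8(block, size, div8(wanted, *current));
    *current = wanted;
}
```

Starting with a factor of 1, rescaling a block to 7, then to 29, then back to 1 leaves the original data in place, so no second copy of the data block is needed.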
