There is a new library for Galois Field arithmetic that is much faster and more general purpose than this one. It may be found at

http://web.eecs.utk.edu/~jplank/plank/papers/CS-07-593/ GFLIB - C Procedures for Galois Field Arithmetic and Reed-Solomon Coding


GFLIB - C Procedures for Galois Field Arithmetic and Reed-Solomon Coding

James S. Plank
Logistical Computing and Internetworking (LoCI) Laboratory
Department of Computer Science
University of Tennessee

June 4, 2003. $Revision: 1.2 $

Here is gflib_1.1.shar if you need an old version.

This site: http://web.eecs.utk.edu/~jplank/plank/gflib/.

This web site contains C procedures for limited Galois Field arithmetic and Reed-Solomon coding. To make best use of this code, please read the following tutorial and accompanying note:

This material is based upon work supported by the National Science Foundation under Grant No. 0204007. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

The Files

The code is available in the following two formats:

  1. Tar file: gflib.tar
  2. Shar file: gflib.shar

These contain the following files:


Limitations

Above, it says ``limited'' Galois Field arithmetic and Reed-Solomon coding. This is because the arithmetic and coding are limited to GF(2^8) and GF(2^16). Using GF(2^8) is much faster than GF(2^16), since the internal data structures are smaller, and this allows for an optimization for multiplication/division. However, using GF(2^8) limits the coding to 255 total chunks (data plus coding). While this should be adequate for most usage scenarios, GF(2^16) may be employed for larger numbers of chunks (up to 65536).


Compilation

Compile the code with one of two arguments to make:

This will create the object file gflib.o, and the programs gf_mult, gf_div, xor, parity_test, rs_encode_file and rs_decode_file.

To use the procedures in gflib.c, include gflib.h and compile with gflib.o.

There should be no other dependencies in using this code.


Thread Safety

The only call that should be protected is gf_modar_setup. After that, there should be no race conditions in any of the calls.


The Procedures In gflib.c

Note, with the exception of gf_add_parity(), none of these routines check their input values. This is for speed. If input checks are desired, you will have to write them yourself.


The Programs

Gf_mult, gf_div, xor are completely straightforward.

Parity_test allocates two random regions and performs a specified number of parity operations on them. This allows you to test the speed of your system doing parity.

Rs_encode_file and rs_decode_file and the two non-trivial programs. Rs_encode_file is called with the following arguments:

   rs_encode_file filename n m stem
It takes the file filename and breaks it up into n data chunks and m coding chunks. The size of the chunks will be padded so that the operations work correctly (chunk size times n is a multiple of the word size). It stores the chunks in the files stem-xxxx.rs, where xxxx is the chunk number. Chunks 0 through n-1 are the data chunks and chunks n through n+m-1 are the coding chunks. It also stores a file called stem-info.txt that contains the dispersal matrix plus some other information for the decoding. Note, you could recreate the dispersal matrix rather than read it in.

Rs_decode_file is called with the following argument:

   rs_decode_file stem
As long as stem-info.txt and any n chunks exist, it will recreate the original file and write it to standard output.

So, try the following to make sure it works. Encode the file gf_mult.c into five data and four coding chunks:

UNIX> rs_encode_file gf_mult.c 5 4 code
Writing code-0000.rs ... Done
Writing code-0001.rs ... Done
Writing code-0002.rs ... Done
Writing code-0003.rs ... Done
Writing code-0004.rs ... Done
Calculating  code-0005.rs ... writing  ... Done
Calculating  code-0006.rs ... writing  ... Done
Calculating  code-0007.rs ... writing  ... Done
Calculating  code-0008.rs ... writing  ... Done
You'll see that 9 chunks and the info file are created:
UNIX>  ls -l code*
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0000.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0001.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0002.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0003.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0004.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0005.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0006.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0007.rs
-rw-r--r--    1 plank    staff          78 Jun  4 10:29 code-0008.rs
-rw-r--r--    1 plank    staff         120 Jun  4 10:29 code-info.txt
Now, remove four of the chunks -- this removes three data and one coding chunk:
UNIX>  rm code-0000.rs code-0002.rs code-0004.rs code-0006.rs
And finally, decode the file. Note, a bunch of info is printed to standard error:
UNIX>  rs_decode_file code > tmp
Blocks to decode: 3
Inverting condensed dispersal matrix ... Done

Condensed matrix:

   7   7   6   6   1
   0   1   0   0   0
  15  14  14  15   1
   0   0   0   1   0
   2 125 149 253  22

Inverted matrix:

 182 143 122  22  84
   0   1   0   0   0
  27  35 215 186  84
   0   0   0   1   0
 126  71 179 223  84
Decoding block 0 ... writing ... Done
Writing block 1 from memory ... Done
Decoding block 2 ... writing ... Done
Writing block 3 from memory ... Done
Decoding block 4 ... writing ... Done
As you can see, tmp and gf_mult.c are identical:
UNIX>  diff tmp gf_mult.c

Caveats and subtleties:

rs_encode_matrix reads the entire file into memory, and holds n+1 chunks in memory -- the n data blocks, and each coding block as it is calculated. Obviously, there are other ways to do this.

Each coding block is the sum of each data block times a factor. In order to do this, rs_encode_matrix calls gf_multiply_region and overwrites the data block with a factor times the data block. This factor is stored in the array factors. When a data block multiplied by factor f needs to be multiplied by factor g, the block is multiplied by g/f.

Similarly, rs_decode_matrix holds n+1 blocks in memory and uses a factors array in the same manner as rs_encode_matrix.