CS594 -- Notes and Papers
January 11, 1996
Class intro.
January 16, 1996 -- Uniprocessor Checkpointing
Papers to read:
- T. Tannenbaum and M. Litzkow, ``The Condor Distributed Processing
System'', Dr. Dobb's Journal, #225, Feb, 1995, pp. 40-48.
January 18, 1996 -- More Uniprocessor Checkpointing
- B. A. Kingsbury and J. T. Kline, ``Job and Process Recovery in a
UNIX-based Operating System'', Usenix Winter Technical Conference,
January, 1989, pp. 355-364.
- T. J. Killian, ``Processes as Files'', Usenix Summer Technical
Conference, June, 1984, pp. 203-207.
January 23, 1996 -- Class Cancelled
January 25, 1996 -- Incremental Checkpointing
- S. I. Feldman and C. B. Brown,
``IGOR: A System for Program Debugging via Reversible Execution'',
ACM SIGPLAN Notices, Workshop on Parallel and Distributed Debugging,
24(1), January, 1989, pp. 112-123.
January 30, 1996 -- Copy-on-write/Memory Exclusion
- K. Li, J. F. Naughton and J. S. Plank,
``Low-Latency, Concurrent Checkpointing for
Parallel Programs'', IEEE Transactions on Parallel and Distributed
Systems, 5(8), August, 1994, pp. 874-879.
February 1, 1996 -- Libckpt
- J. S. Plank, M. Beck, G. Kingsley and K. Li,
``Libckpt: Transparent Checkpointing under Unix'',
Conference Proceedings, Usenix Winter 1995 Technical Conference,
New Orleans, LA, January, 1995, pp. 213-223.
February 6, 1996 -- Class cancelled due to snow
February 8, 1996 -- Compiler Assisted Techniques for Checkpointing
- C-C. J. Li and W. K. Fuchs,
``CATCH -- Compiler-Assisted Techniques for Checkpointing'',
20th International Symposium on Fault Tolerant Computing,
1990, pp. 74-81.
February 13, 1996 -- Fast Incremental Checkpoint Compression
- J. S. Plank, J. Xu and R. H. B. Netzer,
``Compressed Differences: An Algorithm for Fast
Incremental Checkpointing'',
Submitted for publication,
University of Tennessee Technical Report CS-95-302, August, 1995.
February 15, 1996 -- Uniprocessor Tools for Fault-Tolerance
- M. Russinovich and Z. Segall,
``Fault-Tolerance for Off-The-Shelf Applications and Hardware'',
25th International Symposium on Fault Tolerant Computing,
1995, pp. 67-71.
- Y. Huang, C. Kintala and Y-M. Wang,
``Software Tools and Libraries for Fault-Tolerance'',
IEEE Technical Committee on Operating Systems and
Application Environments, Special Issue on Fault-Tolerance,
Winter, 1995.
February 20, 1996 -- Compiler-Assisted Memory Exclusion
- M. Beck, J. S. Plank and G. Kingsley,
``Compiler-Assisted Checkpointing'',
Technical Report CS-94-269, University of Tennessee, December, 1994.
February 22, 1996 -- No class
February 27, 1996 -- Discussion of Yuan Ma's talk
February 29, 1996 -- RAID
- M. Holland, G. A. Gibson and D. P. Siewiorek,
``Fast, On-Line Failure Recovery in Redundant Disk Arrays'',
23rd International Symposium on Fault Tolerant Computing,
1993, pp. 422-431.
March 5, 1996 -- Discussion of John Lin's talk
March 7, 1996 -- Disk Modelling
- C. Ruemmler and J. Wilkes,
``Modelling Disks'',
Technical Report HPL-93-68, Computer Systems Laboratory,
Hewlett Packard, July, 1993.
March 12, 1996 -- Discussion of Trent Jaeger's talk
March 14, 1996 -- More File Systems
- J. H. Hartman and J. K. Ousterhout,
``The Zebra Striped Network File System'',
Operating Systems Review, 27(5), (SOSP-13), December, 1993,
pp. 29-43.