Journal of Parallel and Distributed Computing, Vol. 61, No. 11, November, 2001, pp. 1570-1590.
Available via anonymous ftp to cs.utk.edu in pub/plank/papers/JPDC01.pdf and pub/plank/papers/JPDC01.ps.Z.
Matlab scripts for this work are here.
Keywords: Checkpointing, performance prediction, parameter selection, parallel computation, Markov chain, exponential failure and repair distributions.
author: J. S. Plank and M. G. Thomason
title: Processor Allocation and Checkpoint Interval Selection in Cluster Computing Systems
journal: Journal of Parallel and Distributed Computing
publisher: Academic Press
month: November
year: 2001
volume: 61
number: 11
pages: 1570-1590
where: http://www.idealibrary.com/links/toc/jpdc/61/11/0
       http://web.eecs.utk.edu/~jplank/plank/papers/JPDC01.html
@ARTICLE{pt:01:pac,
author = "J. S. Plank and M. G. Thomason",
title = "Processor Allocation and Checkpoint Interval Selection
in Cluster Computing Systems",
journal = "Journal of Parallel and Distributed Computing",
publisher = "Academic Press",
month = "November",
year = "2001",
volume = "61",
number = "11",
pages = "1570-1590",
where = "http://www.idealibrary.com/links/toc/jpdc/61/11/0
http://web.eecs.utk.edu/~jplank/plank/papers/JPDC01.html"
}