Stanimire Tomov

 
Visiting Scholar

DOE Nano, ITER-REF, MAGMA, BEAST, PEEKS, SLATE

Betancourt, F., Wong, K., Asemota, E., Marshall, Q., Nichols, D., Tomov, S. "openDIEL: A Parallel Workflow Engine and Data Analytics Framework," In Practice and Experience in Advanced Research Computing (PEARC ’19), ACM, Chicago, IL, July 28-August 1, 2019 [pdf] [bibtex]

Nichols, D., Wong, K., Tomov, S., Ng, L., Chen, S., Gessinger, A. "MagmaDNN: Accelerated Deep Learning Using MAGMA," In Practice and Experience in Advanced Research Computing (PEARC ’19), ACM, Chicago, IL, July 28-August 1, 2019 [pdf] [bibtex]

Wong, K., Tomov, S., Dongarra, J. "Hands-on Research and Training in High-Performance Data Sciences, Data Analytics, and Machine Learning for Emerging Environments," ISC High Performance 2019, "HPC Education and Training for Emerging Technologies" workshop, Springer International Publishing, Frankfurt, Germany, June 20, 2019 [pdf] [bibtex]

Nichols, D., Tomov, N.-S., Betancourt, F., Tomov, S., Wong, K., Dongarra, J. "MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing," ISC High Performance 2019, "Scalable Data Analytics in Scientific Computing" workshop, Springer International Publishing, Frankfurt, Germany, June 20, 2019 [pdf] [bibtex]

Abdelfattah, A., Tomov, S., Dongarra, J. "Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs," 33rd IEEE International Parallel and Distributed Processing Symposium (IPDPS), IEEE, Rio de Janeiro, Brazil, May 20-24, 2019 [bibtex]

Tomov, S., Haidar, A., Ayala, A., Schultz, D., Dongarra, J. "Design and Implementation for FFT-ECP on Distributed Accelerated Systems," ECP WBS 2.3.3.09 Milestone Report, Innovative Computing Laboratory, University of Tennessee, FFT-ECP ST-MS-10-1410, April 4, 2019 [pdf] [bibtex]

Tomov, S., Haidar, A., Schultz, D., Dongarra, J. "Evaluation and Design of FFT for Distributed Accelerated Systems," ECP WBS 2.3.3.09 Milestone Report, Innovative Computing Laboratory, University of Tennessee, FFT-ECP ST-MS-10-1216, October 1, 2018 [pdf] [bibtex]

Yamazaki, I., Tomov, S., Dongarra, J. "Sampling Algorithms to Update Truncated SVD," IEEE International Conference on Big Data, Boston, MA, December 11-14, 2017 [pdf] [bibtex]

Dongarra, J., Haidar, A., Hernandez, O., Tomov, S., Gorentla Venkata, M. "POMPEI: Programming with OpenMP4 for Exascale Investigations," University of Tennessee Computer Science Technical Report, UT-EECS-17-754, December 7, 2017 [pdf] [bibtex]

Haidar, A., Abdelfattah, A., Zounon, M., Tomov, S., Dongarra, J. "A Guide For Achieving High Performance With Very Small Matrices on GPU: A Case Study of Batched LU and Cholesky Factorizations," IEEE Transactions on Parallel and Distributed Systems, DOI: 10.1109/TPDS.2017.2783929, December, 2017 [bibtex]

Haidar, A., Wu, P., Tomov, S., Dongarra, J. "Investigating Half Precision Arithmetic to Accelerate Dense Linear System Solvers," ScalA17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ACM, Denver, Colorado, November 12-17, 2017 [pdf] [bibtex]

Gates, M., Tomov, S., Dongarra, J. "Accelerating the SVD Two Stage Bidiagonal Reduction and Divide and Conquer Using GPUs," Parallel Computing, 71, November, 2017 [bibtex]

Haidar, A., Jagode, H., YarKhan, A., Vaccaro, P., Tomov, S., Dongarra, J. "Power-aware Computing: Measurement, Control, and Performance Analysis for Intel Xeon Phi," 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), Best Paper Finalist, IEEE, Waltham, MA, September 12-14, 2017 [pdf] [bibtex]

Haidar, A., Kabir, K., Fayad, D., Tomov, S., Dongarra, J. "Out Of Memory SVD Solver for Big Data," 2017 IEEE High Performance Extreme Computing Conference (HPEC'17), IEEE, Waltham, MA, September 12-14, 2017 [pdf] [bibtex]

Kabir, K., Haidar, A., Tomov, S., Bouteiller, A., Dongarra, J. "A Framework for Out of Memory SVD Algorithms," ISC High Performance 2017, Springer International Publishing, Frankfurt, Germany, pp. 158-178, June 19-21, 2017 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Novel HPC Techniques to Batch Execution of Many Variable Size BLAS Computations on GPUs," International Conference on Supercomputing (ICS'17), ACM, Chicago, Illinois, pp. 1-10, June 14-16, 2017 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures," International Conference on Computational Science (ICCS'17), Zurich, Switzerland, pp. 606-615, June 12-14, 2017 [pdf] [bibtex]

Dong, T., Haidar, A., Tomov, S., Dongarra, J. "Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices," International Conference on Computational Science (ICCS'17), Zurich, Switzerland, pp. 1008-1018, June 12-14, 2017 [pdf] [bibtex]

Yamazaki, I., Nooshabadi, S., Tomov, S., Dongarra, J. "Structure-aware Linear Solver for Realtime Convex Optimization for Embedded Systems," IEEE Embedded Systems Letters, IEEE, Vol. PP, No. 99, May 2, 2017 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Fast Cholesky Factorization on GPUs for Batch and Native Modes in MAGMA," Journal of Computational Science, Elsevier, Vol. 20, 85-93, May, 2017 [bibtex]

Abdelfattah, A., Baboulin, M., Dobrev, V., Dongarra, J., Haidar, A., Karlin, I., Kolev, Tz., Masliah, I., Tomov, S. "Small Tensor Operations on Advanced Architectures for High-order Applications," University of Tennessee Computer Science Technical Report, UT-EECS-17-749, April 18, 2017 [pdf] [bibtex]

Haidar, A., Abdelfattah, A., Tomov, S., Dongarra, J. "High-performance Cholesky Factorization for GPU-only Execution," Proceedings of the General Purpose GPUs (GPGPU-10), ACM, Austin, TX, pp. 42-52, February 5, 2017 [pdf] [bibtex]

Baboulin, M., Dongarra, J., Remy, A., Tomov, S., Yamazaki, I. "Solving dense symmetric indefinite systems using GPUs," Concurrency and Computation: Practice and Experience, Special Issues on Parallel Processing and Applied Mathematics (PPAM'15), Vol. 29, Issue 9, 2017 [bibtex]

Lopez, M., Larrea, V., Joubert, W., Hernandez, O., Haidar, A., Tomov, S., Dongarra, J. "Evaluation of Directive-based Performance Portable Programming Models," International Journal of High Performance Computing and Networking (IJHPCN), (In Press), 2017 [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Fast Cholesky Factorization on GPUs for Batch and Native Modes in MAGMA," University of Tennessee Computer Science Technical Report, UT-EECS-16-748, December 28, 2016 [pdf] [bibtex]

Haidar, A., Abdelfattah, A., Tomov, S., Dongarra, J. "High-performance Cholesky factorization for GPU-only execution," University of Tennessee Computer Science Technical Report, UT-EECS-16-747, December 26, 2016 [pdf] [bibtex]

Lopez, M., Larrea, V., Joubert, W., Hernandez, O., Haidar, A., Tomov, S., Dongarra, J. "Towards Achieving Performance Portability Using Directives for Accelerators," The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Third Workshop on Accelerator Programming Using Directives (WACCPD), Salt Lake City, Utah, November 13-18, 2016 [pdf] [bibtex]

Haidar, A., Tomov, S., Arturov, K., Guney, M., Story, S., Dongarra, J. "LU, QR, and Cholesky Factorizations: Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi," IEEE High Performance Extreme Computing Conference (HPEC'16), Waltham, MA, September 13-15, 2016 [bibtex]

Haidar, A., Brock, B., Tomov, S., Guidry, M., Billings, J., Shyles, D., Dongarra, J. "Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations," 2016 IEEE High Performance Extreme Computing Conference (HPEC'16), September 13-15, 2016 [pdf] [bibtex]

Masliah, I., Abdelfattah, A., Haidar, A., Tomov, S., Baboulin, M., Falcou, J., Dongarra, J. "High-performance matrix-matrix multiplications of very small matrices," 22nd International European Conference on Parallel and Distributed Computing (Euro-Par'16), Grenoble, France, August 22-26, 2016 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Performance, Design, and Autotuning of Batched GEMM for GPUs," The International Supercomputing Conference (ISC High Performance 2016), Frankfurt, Germany, June 19-23, 2016 [pdf] [bibtex]

Abdelfattah, A., Baboulin, M., Dobrev, V., Dongarra, J., Earl, C., Falcou, J., Haidar, A., Karlin, I., Kolev, Tz., Masliah, I., Tomov, S. "High-Performance Tensor Contractions for GPUs," International Conference on Computational Science (ICCS'16), San Diego, California, U.S.A., June 6-8, 2016 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs," International Conference on Computational Science (ICCS'16), San Diego, California, U.S.A., June 6-8, 2016 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures," The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, IEEE, Chicago, IL, USA, May 27, 2016 [pdf] [bibtex]

Newburn, CJ., Bansal, G., Wood, M., Crivelli, L., Planas, J., Duran, A., Souza, P., Borges, L., Luszczek, P., Tomov, S., Dongarra, J., Anzt, H., Gates, M., Haidar, A., Jia, Y., Kabir, K., Yamazaki, I., Labarta, J. "Heterogeneous Streaming," The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, IEEE, Chicago, IL, USA, May 23, 2016 [pdf] [bibtex]

Abdelfattah, A., Haidar, A., Tomov, S., Dongarra, J. "Performance, Design, and Autotuning of Batched GEMM for GPUs," University of Tennessee Computer Science Technical Report, UT-EECS-16-739, February 1, 2016 [pdf] [bibtex]

Abdelfattah, A., Baboulin, M., Dobrev, V., Dongarra, J., Earl, C., Falcou, J., Haidar, A., Karlin, I., Kolev, Tz., Masliah, I., Tomov, S. "High-Performance Tensor Contractions for GPUs," University of Tennessee Computer Science Technical Report, UT-EECS-16-738, January 21, 2016 [pdf] [bibtex]

Yamazaki, I., Tomov, S., and Dongarra, J. "Non-GPU-resident Dense Symmetric Indefinite Factorization," Concurrency and Computation: Practice and Experience, 2016 [bibtex]

Haidar, A., Jia, Y., Luszczek, P., Tomov, S., YarKhan, A., Dongarra, J. "Weighted Dynamic Scheduling with Many Parallelism Grains for Offloading of Numerical Workloads to Multiple Varied Accelerators," Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA'15), ACM, New York, NY, USA, No. 5, November 16, 2015 [pdf] [bibtex]

Mary, T., Yamazaki, I., Kurzak, J., Luszczek, P., Tomov, S., Dongarra, J. "Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs," The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 15), Austin, TX, Nov. 15, 2015 [bibtex]

Yamazaki, I., Tomov, S., Kurzak, J., Dongarra, J., Barlow, J. "Mixed-precision Block Gram Schmidt Orthogonalization," 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Austin, TX, November, 2015 [bibtex]

Baboulin, M., Dongarra, J., Remy, A., Tomov, S., Yamazaki, I. "Dense Symmetric Indefinite Factorization on GPU accelerated architectures," International Conference on Parallel Processing and Applied Mathematics (PPAM), Krakow, Poland, Sep. 6-9, 2015 [bibtex]

Haidar, A., Luszczek, P., Tomov, S., Dongarra, J. "Batched Matrix Computations on Hardware Accelerators," EuroMPI/Asia 2015 Workshop, Bordeaux, France, September, 2015 [bibtex]

Haidar, A., Tomov, S., Luszczek, P., Dongarra, J. "MAGMA Embedded: Towards a Dense Linear Algebra Library for Energy Efficient Extreme Computing," 19th IEEE High Performance Extreme Computing Conference (HPEC 2015), Best Paper Award, IEEE, Waltham, MA, September, 2015 [pdf] [bibtex]

YarKhan, A., Haidar, A., Cao, C., Luszczek, P., Tomov, S., Dongarra, J. "Cholesky Across Accelerators," 17th IEEE International Conference on High Performance Computing and Communications (HPCC 2015), IEEE, Elizabeth, NJ, August, 2015 [bibtex]

Kabir, K., Haidar, A., Tomov, S., and Dongarra, J. "On the Design, Development, and Analysis of Optimized Matrix-Vector Multiplication Routines for Coprocessors," ISC High Performance 2015, Frankfurt, Germany, July 12-16, 2015 [pdf] [bibtex]

Haidar, A., Dong, T., Tomov, S., Luszczek, P., Dongarra, J. "Framework for Batched and GPU-resident Factorization Algorithms Applied to Block Householder Transformations," ISC HPC, Springer LNCS, Frankfurt, Germany, July 12-16, 2015 [pdf] [bibtex]

Kabir, K., Haidar, A., Tomov, S., and Dongarra, J. "Performance Analysis and Optimisation of Two-Sided Factorization Algorithms for Heterogeneous Platform," The International Conference on Computational Science (ICCS 2015), Reykjavík, Iceland, June 1-3, 2015 [pdf] [bibtex]

Kabir, K., Haidar, A., Tomov, S., and Dongarra, J. "Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architectures," The Spring Simulation Multi-Conference 2015 (SpringSim'15), Alexandria, VA, April 12-15, 2015 [pdf] [bibtex]

Haidar, A., Dong, T., Luszczek, P., Tomov, S., and Dongarra, J. "Batched matrix computations on hardware accelerators based on GPUs," International Journal of High Performance Computing Applications, Sage Publications, Inc., February 9, 2015 [bibtex]

Haidar, A., Dong, T., Luszczek, P., Tomov, S., and Dongarra, J. "Optimization for performance and energy for batched matrix computations on GPUs," GPGPU 2015 Proceedings of the 8th Workshop on General Purpose Processing using GPUs, ACM, San Francisco, CA, pp. 59-69, February 7, 2015 [bibtex]

Anzt, H., Tomov, S., Dongarra, J. "Energy efficiency and performance frontiers for sparse computations on GPU supercomputers," Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM '15), ACM, San Francisco, CA, February, 2015 [pdf] [bibtex]

Haidar, A., Dongarra, J., Kabir, K., Gates, M., Luszczek, P., Tomov, S., Jia, Y. "HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi," Scientific Computing, IOS Press, Vol. 23, No. 1, January, 2015 [pdf] [bibtex]

Yamazaki, I., Tomov, S., Dongarra, J. "Computing Low-rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and its Application to Solving a Hierarchically Semiseparable Linear System of Equations," Scientific Programming, 2015 [bibtex]

Yamazaki, I., Tomov, S., and Dongarra, J. "Mixed-Precision Cholesky QR Factorization and its Case Studies on Multicore CPU with Multiple GPUs," SIAM Journal on Scientific Computing, Vol. 37, No. 3, C307-C330, 2015 [bibtex]

Anzt, H., Sawyer, W., Tomov, S., Luszczek, P., Dongarra, J. "Acceleration of GPU-based Krylov solvers via Data Transfer Reduction," IJHPCA special issue for ASHES workshop, 2015 [bibtex]

Abalenkovs, M., Abdelfattah, A., Dongarra, J., Gates, M., Haidar, A., Kurzak, J., Luszczek, P., Tomov, S., Yamazaki, I., YarKhan, A. "Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems," Supercomputing frontiers and innovations, Vol. 2, No. 4, pp. 67-86, 2015 [pdf] [bibtex]

Yamazaki, I., Tomov, S., Dongarra, J. "Deflation Strategies to Improve the Convergence of Communication-Avoiding GMRES," 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, New Orleans, LA, Nov. 17, 2014 [pdf] [bibtex]

Haidar, A., Cao, C., Yamazaki, I., Dongarra, J., Gates, M., Luszczek, P., Tomov, S. "Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads Across Accelerators, Coprocessors, and Multicore Processors," Scala 2014, ACM, New Orleans, LA, November 17, 2014 [pdf] [bibtex]

Yamazaki, I., Rajamanickam, S., Boman, E., Hoemmen, M., Heroux, M., Tomov, S. "Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster," The International Conference for High Performance Computing, Networking, Storage and Analysis (SC), New Orleans, LA, November, 2014 [bibtex]

Anzt, H., Tomov, S., Dongarra, J. "Accelerating the LOBPCG method on GPUs using a blocked Sparse Matrix Vector Product," University of Tennessee Computer Science Technical Report, University of Tennessee, Knoxville, TN, UT-EECS-14-731, October 17, 2014 [pdf] [bibtex]

Dong, T., Haidar, A., Tomov, S., Dongarra, J. "A Fast Batched Cholesky Factorization on a GPU," 2014 International Conference on Parallel Processing (ICPP-2014), Minneapolis, MN, September, 2014 [pdf] [bibtex]

Dong, T., Haidar, A., Luszczek, P., Harris, J., Tomov, S., and Dongarra, J. "LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU," 16th IEEE International Conference on High Performance Computing and Communications (HPCC), Paris, France, pp. 157-161, August 20-22, 2014 [pdf] [bibtex]

Dongarra, J., Gates, M., Haidar, A., Kurzak, J., Luszczek, P., Tomov, S., Yamazaki, I. "Accelerating Numerical Dense Linear Algebra Calculations with GPUs," Numerical Calculations with GPUs, Volodymyr Kindratenko, ed., Springer International Publishing, pp. 3-28, July, 2014 [pdf] [bibtex]

Anzt, H., Lukarski, D., Tomov, S., Dongarra, J. "Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures," VECPAR 2014, Eugene, OR, June, 2014 [pdf] [bibtex]

Haidar, A., Luszczek, P., Tomov, S., Dongarra, J. "Heterogeneous Acceleration for Linear Algebra in Multi-Coprocessor Environments," VECPAR 2014, Eugene, OR, June, 2014 [pdf] [bibtex]

Dongarra, J., Haidar, A., Kurzak, J., Luszczek, P., Tomov, S., YarKhan, A. "Model-Driven One-Sided Factorizations on Multicore Accelerated Systems," International Journal on Supercomputing Frontiers and Innovations, Vol. 1, No. 1, June, 2014 [pdf] [bibtex]

Cao, C., Dongarra, J., Du, P., Gates, M., Luszczek, P., Tomov, S. "clMAGMA: High Performance Dense Linear Algebra with OpenCL," International Workshop on OpenCL, Bristol University, England, May 12-13, 2014 [pdf] [bibtex]

Anzt, H., Tomov, S., Luszczek, P., Yamazaki, I., Dongarra, J., Sawyer, W. "Optimizing Krylov Subspace Solvers on Graphics Processing Units," Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, IEEE, Phoenix, AZ, May, 2014 [pdf] [bibtex]

Donfack, S., Tomov, S., Dongarra, J. "Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs," Fourth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2014, IEEE, Phoenix, AZ, May, 2014 [pdf] [bibtex]

Dong, T., Haidar, A., Tomov, S., Dongarra, J. "Batched Cholesky Factorization on a GPU," VECPAR 2014 (Submitted), Eugene, OR, January, 2014 [bibtex]

Du, P., Luszczek, P., Tomov, S., Dongarra, J. "Soft Error Resilient QR Factorization for Hybrid System with GPGPU," Journal of Computational Science, Vassil Alexandrov, ed., Elsevier B.V., Vol. 4, No. 6, pp. 457-464, November, 2013 [pdf] [bibtex]

Dongarra, J., Gates, M., Haidar, A., Jia, Y., Kabir, K., Luszczek, P., Tomov, S. "Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi," PPAM 2013, Warsaw, Poland, September, 2013 [pdf] [bibtex]

Haidar, A., Tomov, S., Dongarra, J., Solca, R., Schulthess, T. "A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks," International Journal of High Performance Computing Applications, August, 2013 [pdf] [bibtex]

Anzt, H., Tomov, S., Dongarra, J., Heuveline, V. "A Block-Asynchronous Relaxation Method for Graphics Processing Units," Journal of Parallel and Distributed Computing, June, 2013 [pdf] [bibtex]

Haidar, A., Solca, R., Gates, M., Tomov, S., Schulthess, T., Dongarra, J. "Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations," International Supercomputing Conference ISC, Lecture Notes in Computer Science, Leipzig, Germany, Vol. 7905, pp. 67-80, June, 2013 [pdf] [bibtex]

Cao, C., Dongarra, J., Du, P., Gates, M., Luszczek, P., Tomov, S. "clMAGMA: High Performance Dense Linear Algebra with OpenCL," University of Tennessee Computer Science Technical Report (LAWN 275), UT-CS-13-706, March, 2013 [pdf] [bibtex]

Baboulin, M., Dongarra, J., Herrmann, J., Tomov, S. "Accelerating linear system solutions using randomization techniques," ACM Transactions on Mathematical Software (TOMS), Vol. 39, No 2, February, 2013 [bibtex]

Bosilca, G., Bouteiller, A., Danalis, A., Herault, T., Kurzak, J., Luszczek, P., Tomov, S., and J. Dongarra "Scalable Dense Linear Algebra on Heterogeneous Hardware," HPC: Transition Towards Exascale Processing, in the series Advances in Parallel Computing, IOS Press, 2013 [pdf] [bibtex]

Vetter, J., Glassbrook, R., Schwan, K., Yalamanchili, S., Horton, M., Gavrilovska, A., Slawinska, M., Meredith, J., Roth, P., Spafford, K., Tomov, S., Wynkoop, J. "Keeneland: Computational Science using Heterogeneous GPU Computing," Contemporary High Performance Computing: From Petascale Toward Exascale, Jeffrey Vetter, ed., Taylor and Francis, CRC Computational Science Series, Boca Raton, FL, Chapter 7, 2013 [pdf] [bibtex]

Solcà, R., Haidar, A., Tomov, S., Dongarra, J., Schulthess, T. "A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks," Supercomputing '12 (poster), Salt Lake City, Utah, November, 2012 [bibtex]

Agullo, E., Bosilca, G., Castagnède, C., Dongarra, J., Ltaief, H., Tomov, S. "Matrices Over Runtime Systems at Exascale," Supercomputing '12 (poster), Salt Lake City, Utah, November, 2012 [bibtex]

Dong, T., Kolev, T., Rieben, R., Dobrev, V., Tomov, S., Dongarra, J. "Acceleration of the BLAST Hydro Code on GPU," Supercomputing '12 (poster), Salt Lake City, Utah, November, 2012 [bibtex]

Donfack, S., Tomov, S., Dongarra, J. "Performance evaluation of LU factorization through hardware counter measurements," University of Tennessee Computer Science Technical Report, UT-CS-12-700, October, 2012 [pdf] [bibtex]

Anzt, H., Tomov, S., Dongarra, J., Heuveline, V. "A Block-Asynchronous Relaxation Method for Graphics Processing Units," Journal of Parallel and Distributed Computing (submitted), October, 2012 [pdf] [bibtex]

Du, P., Tomov, S., and Dongarra, J. "Providing GPU Capability to LU and QR within the ScaLAPACK Framework," University of Tennessee Computer Science Technical Report (also LAWN 272), UT-CS-12-699, September 12, 2012 [pdf] [bibtex]

Anzt, H., Tomov, S., Dongarra, J., Heuveline, V. "Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems," Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms (Best Paper), Rhodes Island, Greece, August, 2012 [pdf] [bibtex]

Du, P., Weber, R., Luszczek, P., Tomov, S., Peterson, G., Dongarra, J. "From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming," Parallel Computing, Vol. 38, No. 8, pp. 391-407, August, 2012 [bibtex]

Kasichayanula, K., Terpstra, D., Luszczek, P., Tomov, S., Moore, S., Peterson, G. "Power Aware Computing on GPUs," SAAHPC '12 (Best Paper Award), Argonne, IL, July 10-11, 2012 [pdf] [bibtex]

Yamazaki, I., Tomov, S., Dongarra, J. "One-sided dense matrix factorizations on a multicore with multiple GPU accelerators," The International Conference on Computational Science (ICCS), June 4, 2012 [bibtex]

Song, F., Tomov, S., Dongarra, J. "Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems," 26th ACM International Conference on Supercomputing (ICS 2012), ACM, San Servolo Island, Venice, Italy, June, 2012 [pdf] [bibtex]

Baboulin, M., Donfack, S., Dongarra, J., Grigori, L., Remy, A., Tomov, S. "A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines," Proc. of the International Conference on Computational Science (ICCS), 9, 17-26, June, 2012 [bibtex]

Anzt, H., Tomov, S., Gates, M., Dongarra, J., Heuveline, V. "Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems," ICCS 2012, Omaha, NE, June, 2012 [pdf] [bibtex]

Vomel, C., Tomov, S., Dongarra, J. "Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems," SIAM Journal on Scientific Computing, 34 (2), C70-C82, April 12, 2012 [bibtex]

Baboulin, M., Dongarra, J., Herrmann, J., Tomov, S. "Accelerating Linear System Solutions Using Randomization Techniques," ACM Transactions on Mathematical Software (accepted) (also LAWN 246), March, 2012 [pdf] [bibtex]

Dongarra, J., Kurzak, J., Luszczek, P., Tomov, S. "Dense Linear Algebra on Accelerated Multicore Hardware," High Performance Scientific Computing: Algorithms and Applications, Berry, M., et al., eds., Springer-Verlag, London, UK, 2012 [bibtex]

Kurzak, J., Luszczek, P., Tomov, S., Dongarra, J. "Preliminary Results of Autotuning GEMM Kernels for the NVIDIA Kepler Architecture," LAWN 267, 2012 [pdf] [bibtex]

Anzt, H., Tomov, S., Gates, M., Dongarra, J., Heuveline, V. "Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems," UT-CS-11-689, December 6, 2011 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Langou, J., Ltaief, H., Tomov, S. "LU Factorization for Accelerator-based Systems," IEEE/ACS AICCSA 2011, Sharm-El-Sheikh, Egypt, December, 2011 [pdf] [bibtex]

Anzt, H., Tomov, S., Dongarra, J., Heuveline, V. "A Block-Asynchronous Relaxation Method for Graphics Processing Units," University of Tennessee Computer Science Technical Report, UT-CS-11-687 / LAWN 258, November 30, 2011 [pdf] [bibtex]

Nath, R., Tomov, S., Dong, T., Dongarra, J. "Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs," ACM/IEEE Conference on Supercomputing (SC’11), Seattle, WA, November 12-18, 2011 [pdf] [bibtex]

Malony, A., Biersdorff, S., Shende, S., Jagode, H., Tomov, S., Juckeland, G., Dietrich, R., Poole, D., Lamb, C. "Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs," International Conference on Parallel Processing (ICPP'11), Taipei, Taiwan, September 13-16, 2011 [bibtex]

Baboulin, M., Dongarra, J., Herrmann, J., Tomov, S. "Accelerating Linear System Solutions Using Randomization Techniques," INRIA RR-7616 / LAWN #246 (presented at International AMMCS’11), Waterloo, Ontario, Canada, July 25-29, 2011 [bibtex]

Horton, M., Tomov, S., Dongarra, J. "A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures," Symposium for Application Accelerators in High Performance Computing (SAAHPC'11), Knoxville, TN, July 19-20, 2011 [pdf] [bibtex]

Du, P., Luszczek, P., Tomov, S., Dongarra, J. "Soft Error Resilient QR Factorization for Hybrid System," UT-CS-11-675 (also LAPACK Working Note #252), ICL-CS-11-675, July 1, 2011 [pdf] [bibtex]

Bosilca, G., Bouteiller, A., Herault, T., Lemarinier, P., Saengpatsa, N., Tomov, S., Dongarra, J. "Performance Portability of a GPU Enabled Factorization with the DAGuE Framework," IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 24, 2011 [pdf] [bibtex]

Song, F., Tomov, S., Dongarra, J. "Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures," University of Tennessee Computer Science Technical Report, UT-CS-11-668 (also LAWN 250), June 16, 2011 [pdf] [bibtex]

Bosilca, G., Bouteiller, A., Herault, T., Lemarinier, P., Saengpatsa, N., Tomov, S., Dongarra, J. "A Unified HPC Environment for Hybrid Manycore/GPU Distributed Systems," IEEE International Parallel and Distributed Processing Symposium (submitted), Anchorage, AK, May 16-20, 2011 [bibtex]

Kurzak, J., Tomov, S., Dongarra, J. "Autotuning GEMMs for Fermi," University of Tennessee Computer Science Technical Report, UT-CS-11-671, (also Lawn 245), April 18, 2011 [pdf] [bibtex]

Malony, A., Biersdorff, S., Shende, S., Jagode, H., Tomov, S., Juckeland, G., Dietrich, R., Poole, D., Lamb, C. "Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs," ICPP 2011 (submitted), Taipei, Taiwan, 2011 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "A Hybridization Methodology for High-Performance Linear Algebra Software for GPUs," in GPU Computing Gems, Jade Edition, Hwu, W., ed., Elsevier, 2, 473-484, 2011 [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., Tomov, S. "GPU Computing Gems, Jade Edition," ISBN: 9780123859631, Wen-mei W. Hwu eds. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 473-484 (Chapter 34), 2011 [bibtex]

Nath, R., Tomov, S., Dongarra, J. "BLAS for GPUs," in Scientific Computing with Multicore and Accelerators, Kurzak, J., Bader, D., Dongarra, J., eds., Chapman & Hall/CRC Computational Science, December 7, 2010 [pdf] [bibtex]

Tomov, S., Dongarra, J. "Dense Linear Algebra for Hybrid GPU-based Systems," in Scientific Computing with Multicore and Accelerators, Kurzak, J., Bader, D., Dongarra, J., eds., Chapman & Hall/CRC Computational Science, December 7, 2010 [bibtex]

Tomov, S., Faverge, M., Luszczek, P., Dongarra, J. "Using MAGMA with PGI Fortran," PGI Insider, November 15, 2010 [htm] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Faverge, M., Ltaief, H., Thibault, S., Tomov, S. "QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators," Proceedings of IPDPS 2011, Anchorage, AK, ICL-UT-10-04, October 1, 2010 [pdf] [bibtex]

Du, P., Luszczek, P., Tomov, S., Dongarra, J. "Mixed-Tool Performance Analysis on Hybrid Multicore Architectures," First International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2010), San Diego, CA, Sept. 13-16, 2010 [pdf] [bibtex]

Du, P., Weber, R., Luszczek, P., Tomov, S., Peterson, G., Dongarra, J. "From CUDA to OpenCL: Towards a Performance-portable Solution for Multiplatform GPU Programming," Parallel Computing (submitted), August, 2010 [bibtex]

Vomel, C., Tomov, S., Dongarra, J. "Divide & Conquer on Hybrid GPU-Accelerated Multicore Systems," SIAM Journal on Scientific Computing (submitted), August, 2010 [bibtex]

Nath, R., Tomov, S., Dongarra, J. "An Improved MAGMA GEMM for Fermi GPUs," University of Tennessee Computer Science Technical Report, UT-CS-10-655 (also LAPACK working note 227), July 29, 2010 [pdf] [bibtex]

Ltaief, H., Tomov, S., Nath, R., Du, P., Dongarra, J. "A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators," Proc. of VECPAR'10 (to appear), Berkeley, CA, June 22-25, 2010 [pdf] [bibtex]

Nath, R., Tomov, S., Dongarra, J. "Accelerating GPU Kernels for Dense Linear Algebra," Proc. of VECPAR'10, Berkeley, CA, June 22-25, 2010 [pdf] [bibtex]

Bernholc, J., Hodak, M., Lu, W., Moore, S., Tomov, S. "Scalability Study of a Quantum Simulation Code," PARA 2010, Reykjavik, Iceland, June 6-9, 2010 [bibtex]

Tomov, S., Lu, W., Bernholc, J., Moore, S., Dongarra, J. "Performance Evaluation for Petascale Quantum Simulation Tools," Proceedings of the Cray Users' Group Meeting, Atlanta, GA, May 4, 2010 [bibtex]

Ltaief, H., Tomov, S., Nath, R., Dongarra, J. "Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators," IEEE Transactions on Parallel and Distributed Systems (submitted), March 26, 2010 [pdf] [bibtex]

Tomov, S., Nath, R., Ltaief, H., Dongarra, J. "Dense Linear Algebra Solvers for Multicore with GPU Accelerators," Proc. of IPDPS'10, Atlanta, GA, January 15, 2010 [pdf] [bibtex]

Tomov, S., Nath, R., Dongarra, J. "Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing," Parallel Computing, vol. 36, number 12, pp. 645-654, June 19, 2010 [pdf] [bibtex]

Nath, R., Tomov, S., Dongarra, J. "An Improved MAGMA GEMM for Fermi GPUs," International Journal of High Performance Computing Applications, vol. 24, no. 4, pp. 511-515, November 18, 2010 [bibtex]

Tomov, S., Dongarra, J., Baboulin, M. "Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems," Parallel Computing, Vol. 36, Number 5-6, pp. 232-240, 2010 [pdf] [bibtex]

Agullo, E., Augonnet, C., Dongarra, J., Ltaief, H., Namyst, R., Thibault, S., and Tomov, S. "Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs," LAPACK Working Note 230, 2010 [pdf] [bibtex]

Du, P., Weber, R., Luszczek, P., Tomov, S., Peterson, G., Dongarra, J. "From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming," Parallel Computing (submitted), 2010 [bibtex]

Li, Y., Dongarra, J., Tomov, S. "A note on auto-tuning GEMM for GPUs," Proc. of ICCS'09, Baton Rouge, LA, UT-CS-09-635, May 25-27, 2009 [pdf] [bibtex]

Li, Y., Dongarra, J., Tomov, S. "A Note on Auto-tuning GEMM for GPUs," Computational Science – ICCS 2009, Proceedings of the 9th International Conference, Lecture Notes in Computer Science: Theoretical Computer Science and General Issues, Allen, G., Nabrzyski, J., Seidel, E., van Albada, G.D., Dongarra, J., Sloot, P.M.A. eds. Baton Rouge, LA, Parts I-II, Vols. 5544-5545, pp. 884-892, May 25-27, 2009 [bibtex]

Tomov, S., Dongarra, J. "Accelerating the Reduction to Upper Hessenberg Form Through Hybrid GPU-based Computing," University of Tennessee Computer Science Technical Report, UT-CS-09-642 (also LAPACK Working Note 219), May 24, 2009 [pdf] [bibtex]

Tomov, S., Lu, W., Bernholc, J., Moore, S., Dongarra, J. "Performance evaluation for petascale quantum simulation tools," Proceedings of CUG09, Atlanta, GA, May 4-7, 2009 [pdf] [bibtex]

Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P., Tomov, S. "Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects," Journal of Physics: Conference Series, Vol. 180, 2009 [pdf] [bibtex]

Canning, A., Dongarra, J., Langou, J., Marques, O., Tomov, S., Voemel, C., Wang, L.-W. "Interior State Computation of Nano Structures," PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 13-16, 2008 [pdf] [bibtex]

Baboulin, M., Tomov, S., Dongarra, J. "Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures," PARA 2008, 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing, Trondheim, Norway, May 13-16, 2008 [bibtex]

Baboulin, M., Dongarra, J., Tomov, S. "Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures," University of Tennessee Computer Science Technical Report, UT-CS-08-615 (also LAPACK Working Note 200), May 6, 2008 [pdf] [bibtex]

Dongarra, J., Moore, S., Peterson, G., Tomov, S., Allred, J., Natoli, V., Richie, D. "Exploring New Architectures in Accelerating CFD for Air Force Applications," Proceedings of the DoD HPCMP User Group Conference, Seattle, Washington, July 14-17, 2008 [pdf] [bibtex]

Tomov, S., Dongarra, J., Baboulin, M. "Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems," University of Tennessee Computer Science Technical Report, UT-CS-08-632 (also LAPACK Working Note 210), October 17, 2008 [pdf] [bibtex]

Vomel, C., Tomov, S., Marques, O., Canning, A., Wang, L.-W., Dongarra, J. "State-of-the-Art Eigensolvers for Electronic Structure Calculations of Large Scale Nano-Systems," Journal of Computational Physics, Vol. 227, Issue 15, pp. 7113-7124, July, 2008 [bibtex]

Buttari, A., Dongarra, J., Kurzak, J., Langou, J., Langou, J., Luszczek, P., Tomov, S. "Exploiting Mixed Precision Floating Point Hardware in Scientific Computations," in High Performance Computing and Grids in Action, Grandinetti, L. eds. IOS Press, Amsterdam, 2008 [pdf] [bibtex]

Buttari, A., Dongarra, J., Kurzak, J., Luszczek, P., Tomov, S. "Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy," ACM Transactions on Mathematical Software, Vol. 34, Number 4, pp. 17-22, 2008 [pdf] [bibtex]

Buttari, A., Dongarra, J., Kurzak, J., Langou, J., Langou, Jn., Luszczek, P., Tomov, S. "Exploiting Mixed Precision Floating Point Hardware in Scientific Computations," In High Performance Computing and Grids in Action (to appear), Lucio Grandinetti eds. IOS Press, Amsterdam, 2007 [pdf] [bibtex]

Vömel, C., Tomov, S., Wang, L-W., Marques, O., Dongarra, J. "The Use of Bulk States to Accelerate the Band Edge State Calculation of a Semiconductor Quantum Dot," Journal of Computational Physics, Volume 223, pp. 774-782, 2007 [pdf] [bibtex]

Demmel, J., Dongarra, J., Parlett, B., Kahan, W., Gu, M., Bindel, D., Hida, Y., Li, X., Marques, O., Riedy, E. J., Voemel, C., Langou, J., Luszczek, P., Kurzak, J., Buttari, A., Langou, J., Tomov, S. "Prospectus for the Next LAPACK and ScaLAPACK Libraries," PARA 2006, Umeå, Sweden, June, 2006 [pdf] [bibtex]

Canning, A., Dongarra, J., Langou, J., Marques, O., Tomov, S., Voemel, C., Wang, L-W. "Towards bulk based preconditioning for quantum dot computations," IEEE/ACM Proceedings of HPCNano SC06 (to appear), 2006 [pdf] [bibtex]

Canning, A., Dongarra, J., Langou, J., Marques, O., Tomov, S., Voemel, C., Wang, L-W. "Performance evaluation of eigensolvers in nano-structure computations," IEEE/ACM Proceedings of HPCNano SC06 (to appear), 2006 [pdf] [bibtex]

Voemel, C., Tomov, S., Wang, L-W., Marques, O., Dongarra, J. "The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot," Journal of Computational Physics (submitted), 2006 [pdf] [bibtex]

Zunger, A., Franceschetti, A., Bester, G., Jones, W. B., Kim, K., Graf, P. A., Wang, L-W., Canning, A., Marques, O., Voemel, C., Dongarra, J., Langou, J., Tomov, S. "Predicting the electronic properties of 3D, million-atom semiconductor nanostructure architectures," J. Phys.: Conf. Ser. 46, doi:10.1088/1742-6596/46/1/040, 292-298, 2006 [pdf] [bibtex]

Tomov, S., Langou, J., Dongarra, J., Canning, A., Wang, L-W. "Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of Nanostructure Architectures," International Journal of Computational Science and Engineering, Volume 2, Number 3/4, pp. 205-212, 2006 [pdf] [bibtex]

Tomov, S., Langou, J., Canning, A., Wang, L.-W., Dongarra, J. "Comparison of Nonlinear Conjugate-Gradient methods for computing the Electronic Properties of Nanostructure Architectures," Proceedings of 5th International Conference on Computational Science (ICCS), Sunderam, V.S., van Albada, G.D., Sloot, P.M.A., Dongarra, J. eds. Springer's Lecture Notes in Computer Science, Atlanta, GA, USA, Part III, pp. 317-325, May 22-25, 2005 [pdf] [bibtex]

Tomov, S., Langou, J., Canning, A., Wang, L-W., Dongarra, J. "Conjugate-Gradient Eigenvalue Solvers in Computing Electronic Properties of Nanostructure Architectures," International Journal of Computational Science and Engineering (to appear), June, 2005 [pdf] [bibtex]

Email
Phone 865-974-6317
Office Claxton 317

University of Tennessee
Computer Science Department
Innovative Computing Laboratory
1122 Volunteer Blvd, Claxton Building
Knoxville, Tennessee 37996-3450
Fax 865-974-8296