How to Make Lanczos-Montgomery Fast on Modern Supercomputers?
- ️Fri Jan 05 2024
Abstract
This paper deals with the performance analysis of the INM RAS implementation of the block Lanczos-Montgomery method that was used to accomplish the factorization of the RSA-232 number. Other steps of RSA-232 factorization carried out by CADO-NFS open source library are also mentioned. Further adaptation of parallel block Lanczos-Montgomery method for the modern supercomputers is discussed, including implementation of basic operations of the method on GPUs.
RSA-232 factorization was supported by the Moscow Center of Fundamental and Applied Mathematics at INM RAS (Agreement with the Ministry of Education and Science of the Russian Federation No. 075-15-2022-286), block Lanczos performance analysis and its improvements research was supported by the Russian Science Foundation project № 19-11-00338, https://rscf.ru/en/project/19-11-00338/.
References
Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 21(2), 120–126 (1978). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/359340.359342
Montgomery, P.L.: A block Lanczos algorithm for finding dependencies over GF(2). In: Guillou, L.C., Quisquater, J.-J. (eds.) EUROCRYPT 1995. LNCS, vol. 921, pp. 106–120. Springer, Heidelberg (1995). https://doi.org/10.1007/3-540-49264-X_9
Zamarashkin, N., Zheltkov, D.: Block Lanczos–montgomery method with reduced data exchanges. In: Voevodin, V., Sobolev, S. (eds.) RuSCDays 2016. CCIS, vol. 687, pp. 15–26. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-55669-7_2
Lenstra, A.K., Lenstra, H.W., Manasse, M.S., Pollard, J.M.: The number field sieve. In: Lenstra, A.K., Lenstra, H.W. (eds.) The development of the number field sieve. LNM, vol. 1554, pp. 11–42. Springer, Heidelberg (1993). https://doi.org/10.1007/BFb0091537
CADO-NFS Homepage. https://gitlab.inria.fr/cado-nfs/cado-nfs. Accessed 30 Apr 2023
Lanczos, C.: Solution of systems of linear equations by minimized iterations. J. Res. Natl. Bur. Stand. 49, 33–53 (1952). https://doi.org/10.6028/jres.049.006
Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49(6), 409–436 (1952). https://doi.org/10.6028/jres.049.044
Saad, Y.: Iterative methods for sparse linear systems. Soc. Ind. Appl. Math. (2003). https://doi.org/10.1137/1.9780898718003
Kleinjung, T., et al.: Factorization of a 768-Bit RSA modulus. In: Rabin, T. (ed.) CRYPTO 2010. LNCS, vol. 6223, pp. 333–350. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14623-7_18
Boudot, F., Gaudry, P., Guillevic, A., Heninger, N., Thomé, E., Zimmermann, P.: Comparing the difficulty of factorization and discrete logarithm: a 240-digit experiment. In: Micciancio, D., Ristenpart, T. (eds.) CRYPTO 2020. LNCS, vol. 12171, pp. 62–91. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-56880-1_3
Arlazarov, V.L., Dinitz, Y.A., Kronrod, M.A., Faradzhev, I.A.: On economical construction of the transitive closure of an oriented graph. Dokl. Akad. Nauk SSSR 194(3), 487–488 (1970)
Coppersmith, D.: Solving homogeneous linear equations over via block Wiedemann algorithm. Math. Comp. 62, 333–350 (1994). https://doi.org/10.1090/S0025-5718-1994-1192970-7
Albrecht M., Bard G., Hart W.: Algorithm 898: Efficient multiplication of dense matrices over GF(2). ACM Trans. Math. Softw. 37(1), Article 9, pp. 1–14 (2010). https://doi.org/10.1145/1644001.1644010
Zamarashkin, N.L., Zheltkov, D.A.: GPU acceleration of dense matrix and block operations for Lanczos method for systems over GF(2). Lobachevskii J. Math. 40, 1881–1991 (2019). https://doi.org/10.1134/S1995080219110337
Thomé, E.: A modified block Lanczos algorithm with fewer vectors. arXiv preprint 1604.02277 (2016). https://doi.org/10.48550/arXiv.1604.02277
Ghysels, P., Vanroose, W.: Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm. Parallel Comput. 40(7), 224–238 (2014). https://doi.org/10.1016/j.parco.2013.06.001
Author information
Authors and Affiliations
Marchuk Institute of Numerical Mathematics of the Russian Academy of Sciences, Moscow, Russia
Dmitry Zheltkov, Nikolai Zamarashkin & Sergey Matveev
Lomonosov Moscow State University, Moscow, Russia
Nikolai Zamarashkin & Sergey Matveev
Authors
- Dmitry Zheltkov
You can also search for this author in PubMed Google Scholar
- Nikolai Zamarashkin
You can also search for this author in PubMed Google Scholar
- Sergey Matveev
You can also search for this author in PubMed Google Scholar
Corresponding author
Correspondence to Dmitry Zheltkov .
Editor information
Editors and Affiliations
Research Computing Center (RCC), Moscow State University, Moscow, Russia
Vladimir Voevodin
Research Computing Center (RCC), Moscow State University, Moscow, Russia
Sergey Sobolev
Russian Academy of Sciences (RAS), Keldysh Institute of Applied Mathematics, Moscow, Russia
Mikhail Yakobovskiy
Russian Federal Nuclear Center, Sarov, Russia
Rashit Shagaliev
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zheltkov, D., Zamarashkin, N., Matveev, S. (2023). How to Make Lanczos-Montgomery Fast on Modern Supercomputers?. In: Voevodin, V., Sobolev, S., Yakobovskiy, M., Shagaliev, R. (eds) Supercomputing. RuSCDays 2023. Lecture Notes in Computer Science, vol 14388. Springer, Cham. https://doi.org/10.1007/978-3-031-49432-1_9
Download citation
DOI: https://doi.org/10.1007/978-3-031-49432-1_9
Published: 05 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49431-4
Online ISBN: 978-3-031-49432-1
eBook Packages: Computer ScienceComputer Science (R0)