We have repurposed Google tensor processing units (TPUs), application-specific chips developed for machine learning, into large-scale dense linear algebra supercomputers. The TPUs' fast intercore interconnects (ICIs), physically two-dimensional network topology, and high-bandwidth memory (HBM) permit distributed matrix multiplication algorithms to rapidly become computationally bound. In this regime, the matrix-multiply units (MXUs) dominate the runtime, yielding impressive scaling, performance, and raw size: Operating in float32 precision, a full 2,048-core pod of third-generation TPUs can multiply two matrices with linear size [Formula: see text] in about 2 min. Via curated algorithms emphasizing large, single-core matrix multiplications, other tasks in dense linear algebra can similarly scale. As examples, we present 1) QR decomposition; 2) resolution of linear systems; and 3) the computation of matrix functions by polynomial iteration, demonstrated by the matrix polar factorization.
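As a rough illustration of "matrix functions by polynomial iteration", the sketch below computes a polar factorization with the Newton-Schulz iteration, which uses only matrix multiplications. The abstract does not say which polynomial iteration the authors use, so Newton-Schulz is an assumption here, as are the function name `polar_newton_schulz`, the Frobenius-norm scaling, and the fixed iteration count. It runs on a single host with NumPy and says nothing about the distributed TPU implementation.

```python
import numpy as np

def polar_newton_schulz(a, num_iters=30):
    """Approximate a = u @ h (u orthogonal, h symmetric PSD) for square, full-rank a.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Dividing by the Frobenius norm puts every singular value in (0, 1],
    # inside the (0, sqrt(3)) interval where Newton-Schulz converges.
    x = a / np.linalg.norm(a)
    for _ in range(num_iters):
        # Polynomial update: only matrix products and additions.
        x = 1.5 * x - 0.5 * x @ (x.T @ x)
    u = x
    h = u.T @ a
    h = 0.5 * (h + h.T)  # symmetrize away rounding error
    return u, h

# Build a small test matrix with known, well-separated singular values.
rng = np.random.default_rng(0)
q1, _ = np.linalg.qr(rng.standard_normal((4, 4)))
q2, _ = np.linalg.qr(rng.standard_normal((4, 4)))
a = q1 @ np.diag([2.0, 1.5, 1.0, 0.5]) @ q2.T

u, h = polar_newton_schulz(a)
orth_err = np.linalg.norm(u.T @ u - np.eye(4))   # how far u is from orthogonal
recon_err = np.linalg.norm(u @ h - a)            # how well u @ h reproduces a
```

Because each step is dominated by two matrix products, an iteration of this kind maps naturally onto hardware whose throughput comes from matrix-multiply units. Ill-conditioned inputs would need more iterations or a different scaling than this sketch provides.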
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9388123 | PMC |
http://dx.doi.org/10.1073/pnas.2122762119 | DOI Listing |
Entropy (Basel)
December 2024
School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China.
Differential Computation Analysis (DCA) leverages memory traces to extract secret keys, bypassing countermeasures employed in white-box designs, such as encodings. Although researchers have made great efforts to enhance security against DCA, most solutions considerably decrease algorithmic efficiency. In our approach, the Feistel cipher SM4 is implemented by a series of table-lookup operations, and the input and output of each table are protected by affine transformations and nonlinear encodings generated randomly.
Nat Commun
January 2025
Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA.
The ability to perform mathematical computations using metastructures is an emergent paradigm that carries the potential of wave-based analog computing to the realm of near-speed-of-light, low-loss, compact devices. We theoretically introduce and experimentally verify the concept of a reconfigurable metastructure that performs analog complex mathematical computations using electromagnetic waves. Reconfigurable, RF-based components endow our device with the ability to perform stationary and non-stationary iterative algorithms.
Neural Netw
December 2024
Department of Earth Science and Engineering, Imperial College London, Prince Consort Road, London SW7 2BP, UK; Centre for AI-Physics Modelling, Imperial-X, White City Campus, Imperial College London, W12 7SL, UK.
Machine learning (ML) has benefited from both software and hardware advancements, leading to increasing interest in capitalising on ML throughout academia and industry. There have been efforts in the scientific computing community to leverage this development via implementing conventional partial differential equation (PDE) solvers with machine learning packages, most of which rely on structured spatial discretisation and fast convolution algorithms. However, unstructured meshes are favoured in problems with complex geometries.
Sci Rep
January 2025
Department of Mathematics, College of Science, King Khalid University, Abha, 61413, Saudi Arabia.
Algebraic structures play a vital role in securing important data. These structures are utilized to construct the non-linear components of block ciphers. Since constructing non-linear components through algebraic structures is crucial for the confusion aspects of encryption schemes, relying solely on these structures can result in limited key spaces.
Biomed Eng Lett
January 2025
NaviNetics, Inc, Rochester, MN USA.
Stereotactic systems have traditionally used Cartesian coordinates combined with linear algebraic mathematical models to navigate the brain. Previously, the development of a novel stereotactic system allowed for improved patient comfort, reduced size, and a simplified interface for surgeons. The system was designed with a work envelope and trajectory range optimized for deep brain stimulation applications only.