Large-scale distributed linear algebra with tensor processing units.

Proc Natl Acad Sci U S A

Sandbox Alphabet X, The Moonshot Factory, Mountain View, CA 94043.

Published: August 2022

We have repurposed Google tensor processing units (TPUs), application-specific chips developed for machine learning, into large-scale dense linear algebra supercomputers. The TPUs' fast intercore interconnects (ICIs), physically two-dimensional network topology, and high-bandwidth memory (HBM) permit distributed matrix multiplication algorithms to rapidly become computationally bound. In this regime, the matrix-multiply units (MXUs) dominate the runtime, yielding impressive scaling, performance, and raw size: Operating in float32 precision, a full 2,048-core pod of third-generation TPUs can multiply two matrices with linear size [Formula: see text] in about 2 min. Via curated algorithms emphasizing large, single-core matrix multiplications, other tasks in dense linear algebra can similarly scale. As examples, we present 1) QR decomposition; 2) resolution of linear systems; and 3) the computation of matrix functions by polynomial iteration, demonstrated by the matrix polar factorization.
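To illustrate the third example, here is a minimal single-machine sketch of a polar factorization computed by polynomial iteration. It is an assumption-laden illustration, not the authors' implementation: the abstract does not say which polynomial is used, so the sketch substitutes the textbook Newton-Schulz iteration, and the names `polar_newton_schulz` and `demo` are hypothetical. It uses NumPy rather than TPU code. The point it shows is that every step of the iteration is a matrix multiplication, which is the kind of workload the abstract describes as scaling well.

```python
import numpy as np


def polar_newton_schulz(a, tol=1e-12, max_iters=100):
    """Return (u, h) with a ~= u @ h, u orthogonal, h symmetric positive definite.

    Newton-Schulz iteration (an assumed stand-in for the paper's method):
        X <- 1.5 * X - 0.5 * X @ X.T @ X
    Assumes `a` is square, real, and nonsingular.
    """
    # The Frobenius norm bounds the spectral norm, so after scaling every
    # singular value lies in (0, 1], inside the iteration's convergence region.
    x = a / np.linalg.norm(a)
    eye = np.eye(a.shape[1])
    for _ in range(max_iters):
        gram = x.T @ x
        # Stop once the columns of x are orthonormal to within tol.
        if np.linalg.norm(gram - eye) < tol:
            break
        x = 1.5 * x - 0.5 * (x @ gram)
    h = x.T @ a
    h = 0.5 * (h + h.T)  # symmetrize away rounding error
    return x, h


def demo():
    """Build a test matrix with known singular values and factor it."""
    rng = np.random.default_rng(0)
    n = 6
    left, _ = np.linalg.qr(rng.standard_normal((n, n)))
    right, _ = np.linalg.qr(rng.standard_normal((n, n)))
    sigma = np.linspace(0.5, 2.0, n)
    a = left @ np.diag(sigma) @ right.T
    u, h = polar_newton_schulz(a)
    # For a = left @ diag(sigma) @ right.T the exact orthogonal factor
    # is left @ right.T, which gives a reference to compare against.
    return a, u, h, left @ right.T
```

Calling `demo()` returns the test matrix, the computed factors, and the exact orthogonal factor for comparison. The iteration converges slowly when the scaled matrix has very small singular values, so a production version would likely add rescaling or a different polynomial; that choice is left open here.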

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9388123
DOI: http://dx.doi.org/10.1073/pnas.2122762119

Similar Publications

Differential Computation Analysis (DCA) leverages memory traces to extract secret keys, bypassing countermeasures employed in white-box designs, such as encodings. Although researchers have made great efforts to enhance security against DCA, most solutions considerably decrease algorithmic efficiency. In our approach, the Feistel cipher SM4 is implemented by a series of table-lookup operations, and the input and output of each table are protected by affine transformations and nonlinear encodings generated randomly.

The ability to perform mathematical computations using metastructures is an emergent paradigm that carries the potential of wave-based analog computing to the realm of near-speed-of-light, low-loss, compact devices. We theoretically introduce and experimentally verify the concept of a reconfigurable metastructure that performs analog complex mathematical computations using electromagnetic waves. Reconfigurable, RF-based components endow our device with the ability to perform stationary and non-stationary iterative algorithms.

Implementing the discontinuous-Galerkin finite element method using graph neural networks with application to diffusion equations.

Neural Netw

December 2024

Department of Earth Science and Engineering, Imperial College London, Prince Consort Road, London SW7 2BP, UK; Centre for AI-Physics Modelling, Imperial-X, White City Campus, Imperial College London, W12 7SL, UK.

Machine learning (ML) has benefited from both software and hardware advancements, leading to increasing interest in capitalising on ML throughout academia and industry. There have been efforts in the scientific computing community to leverage this development via implementing conventional partial differential equation (PDE) solvers with machine learning packages, most of which rely on structured spatial discretisation and fast convolution algorithms. However, unstructured meshes are favoured in problems with complex geometries.

A combinatory approach of non-chain ring and henon map for image encryption application.

Sci Rep

January 2025

Department of Mathematics, College of Science, King Khalid University, Abha, 61413, Saudi Arabia.

Algebraic structures play a vital role in securing important data. These structures are utilized to construct the non-linear components of block ciphers. Since constructing non-linear components through algebraic structures is crucial for the confusion aspects of encryption schemes, relying solely on these structures can result in limited key spaces.

Stereotactic systems have traditionally used Cartesian coordinates combined with linear algebraic mathematical models to navigate the brain. Previously, the development of a novel stereotactic system allowed for improved patient comfort, reduced size, and a simplified interface for surgeons. The system was designed with a work envelope and trajectory range optimized for deep brain stimulation applications only.
