Models of artificial intelligence (AI) that have billions of parameters can achieve high accuracy across a range of tasks, but they exacerbate the poor energy efficiency of conventional general-purpose processors, such as graphics processing units or central processing units. Analog in-memory computing (analog-AI) can provide better energy efficiency by performing matrix-vector multiplications in parallel on 'memory tiles'. However, analog-AI has yet to demonstrate software-equivalent (SW) accuracy on models that require many such tiles and efficient communication of neural-network activations between the tiles. Here we present an analog-AI chip that combines 35 million phase-change memory devices across 34 tiles, massively parallel inter-tile communication and analog, low-power peripheral circuitry that can achieve up to 12.4 tera-operations per second per watt (TOPS/W) chip-sustained performance. We demonstrate fully end-to-end SW accuracy for a small keyword-spotting network and near-SW accuracy on the much larger MLPerf recurrent neural-network transducer (RNNT), with more than 45 million weights mapped onto more than 140 million phase-change memory devices across five chips.
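
The core operation described above, a matrix-vector multiplication carried out in parallel on a memory tile, can be emulated with a short sketch. The snippet below is a minimal NumPy illustration, assuming weights are stored as differential conductance pairs and activations are applied as read voltages; the encoding scheme, the g_max and v_read values, and the Gaussian noise model are illustrative assumptions and not details taken from the paper.

```python
# Illustrative sketch (not the paper's implementation) of how an analog
# in-memory tile computes y = W @ x in a single parallel read.
# Assumed scheme: each weight is a differential conductance pair
# (G_plus - G_minus); input activations become row voltages, and each
# column current sums the products via Ohm's and Kirchhoff's laws.
import numpy as np

rng = np.random.default_rng(0)

def program_tile(weights, g_max=25e-6):
    """Map a weight matrix onto two conductance arrays (assumed scheme).

    Positive weights go to G_plus, negative weights to G_minus,
    scaled so that |w| = 1 corresponds to g_max siemens.
    """
    scale = g_max / np.max(np.abs(weights))
    g_plus = np.clip(weights, 0, None) * scale
    g_minus = np.clip(-weights, 0, None) * scale
    return g_plus, g_minus, scale

def analog_mvm(g_plus, g_minus, scale, x, v_read=0.2, noise_std=0.0):
    """Emulate the parallel matrix-vector multiply on one tile.

    Activations x are encoded as read voltages; all column currents
    accumulate simultaneously, so the whole MVM costs one read cycle.
    Optional Gaussian conductance noise stands in for device non-idealities.
    """
    v = x * v_read                                  # activation -> voltage
    gp = g_plus + rng.normal(0.0, noise_std, g_plus.shape)
    gm = g_minus + rng.normal(0.0, noise_std, g_minus.shape)
    i_out = (gp - gm) @ v                           # column currents (amps)
    return i_out / (scale * v_read)                 # back to weight units

# Tiny example: a 4x3 weight matrix and one activation vector.
W = rng.standard_normal((4, 3))
x = rng.standard_normal(3)

gp, gm, s = program_tile(W)
print("digital    :", W @ x)
print("analog     :", analog_mvm(gp, gm, s, x))
print("with noise :", analog_mvm(gp, gm, s, x, noise_std=0.2e-6))
```

Because all column currents are accumulated in the same read cycle, the cost of the multiply-accumulate step does not grow with the number of rows read in parallel, which is the basis of the energy-efficiency advantage the abstract reports over general-purpose digital processors.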

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10447234
DOI: http://dx.doi.org/10.1038/s41586-023-06337-5
