Large language models (LLMs), with their remarkable generative capacities, have greatly impacted a range of fields, but they face scalability challenges due to their large parameter counts, which result in high costs for training and inference. The trend of increasing model sizes is exacerbating these challenges, particularly in terms of memory footprint, latency and energy consumption. Here we explore the deployment of 'mixture of experts' (MoE) networks, which use conditional computing to keep computational demands low despite having many parameters, on three-dimensional (3D) non-volatile memory (NVM)-based analog in-memory computing (AIMC) hardware.
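To make the conditional-computing idea concrete, here is a minimal NumPy sketch of top-k expert routing; the gate, expert count, and shapes are illustrative assumptions, not the hardware-mapped architecture described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, k=2):
    """Route input x to the top-k experts chosen by a learned gate.

    Only k experts run per input, so compute stays roughly constant
    even as total parameter count grows with the number of experts.
    """
    logits = x @ gate_w                      # gate scores, one per expert
    top_k = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[top_k] - logits[top_k].max())
    weights /= weights.sum()                 # softmax over selected experts
    return sum(w * experts[i](x) for w, i in zip(weights, top_k))

d, num_experts = 16, 8
# Each "expert" here is a tiny linear map; real MoE experts are full FFN blocks.
experts = [lambda x, W=rng.standard_normal((d, d)) * 0.1: x @ W
           for _ in range(num_experts)]
gate_w = rng.standard_normal((d, num_experts)) * 0.1
y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(y.shape)  # (16,)
```

The design point is that the inactive experts contribute no compute at all, which is what makes MoE layers attractive for parameter-dense but compute-constrained deployments such as AIMC tiles.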
Memristive technology has been rapidly emerging as a potential alternative to traditional CMOS technology, which is facing fundamental limitations in its development. Since oxide-based resistive switches were demonstrated as memristors in 2008, memristive devices have garnered significant attention due to their biomimetic memory properties, which promise to significantly reduce power consumption in computing applications. Here, we provide a comprehensive overview of recent advances in memristive technology, including memristive devices, theory, algorithms, architectures, and systems.
Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive, and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory devices organized in crossbar arrays could store the synaptic weights in their conductance states and perform the expensive weighted summations in place in a non-von Neumann manner.
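The in-place weighted summation follows directly from Ohm's and Kirchhoff's laws: each column current is I_j = Σ_i V_i G_ij. A minimal sketch, assuming a simple multiplicative read-noise model (the noise level is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(1)

def crossbar_matvec(voltages, conductances, read_noise=0.02):
    """Idealized in-place multiply-accumulate on a resistive crossbar.

    Each device passes current I = V * G (Ohm's law) and currents sum
    along each column wire (Kirchhoff's current law), so the weighted
    summation happens where the weights are stored.
    """
    # Per-read multiplicative fluctuation as a stand-in for device noise.
    g_eff = conductances * (1 + rng.normal(0, read_noise, conductances.shape))
    return voltages @ g_eff   # column currents = weighted sums

G = rng.uniform(0.0, 1.0, (4, 3))   # synaptic weights stored as conductances
v = rng.uniform(-1.0, 1.0, 4)       # input activations applied as voltages
print(crossbar_matvec(v, G))
```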
In-memory computing using resistive memory devices is a promising non-von Neumann approach for making energy-efficient deep learning inference hardware. However, due to device variability and noise, the network needs to be trained in a specific way so that transferring the digitally trained weights to the analog resistive memory devices will not result in significant loss of accuracy. Here, we introduce a methodology to train ResNet-type convolutional neural networks that results in no appreciable accuracy loss when transferring weights to phase-change memory (PCM) devices.
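A common ingredient of such noise-resilient training is injecting weight noise during the forward pass so that gradients are taken through the perturbation; the sketch below shows this for a single linear layer under an assumed Gaussian multiplicative noise model (the exact methodology in the work above may differ):

```python
import numpy as np

rng = np.random.default_rng(2)

def train_step(x, y, w, lr=0.1, noise_std=0.05):
    """One SGD step on a linear layer with multiplicative weight noise.

    The gradient is taken through the perturbed weights, pushing the
    solution toward a flat region that tolerates device variability.
    """
    noise = 1 + rng.normal(0, noise_std, w.shape)
    y_hat = x @ (w * noise)            # forward pass with noisy weights
    err = y_hat - y                    # gradient of 0.5 * ||y_hat - y||^2
    grad_w = np.outer(x, err) * noise  # chain rule through the noise factor
    return w - lr * grad_w

w = rng.standard_normal((8, 4)) * 0.1
for _ in range(200):
    x = rng.standard_normal(8)
    y = np.tanh(x[:4])                 # toy regression target
    w = train_step(x, y, w)
```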
Spiking neural networks (SNNs) are computational models inspired by the brain's ability to naturally encode and process information in the time domain. The added temporal dimension is believed to render them more computationally efficient than conventional artificial neural networks, though their full computational capabilities are yet to be explored. Recently, in-memory computing architectures based on non-volatile memory crossbar arrays have shown great promise for implementing parallel computations in artificial and spiking neural networks.
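A leaky integrate-and-fire (LIF) neuron is the standard minimal model of such time-domain encoding; the sketch below uses assumed parameter values (time constant, threshold) purely for illustration:

```python
import numpy as np

def lif_neuron(input_current, dt=1e-3, tau=20e-3, v_th=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron.

    The membrane potential leaks toward rest, integrates input current,
    and emits a spike whenever it crosses threshold, so information is
    carried by when spikes occur rather than by continuous activations.
    """
    v, spikes = 0.0, []
    for i_t in input_current:
        v += (dt / tau) * (-v + i_t)   # leaky integration
        if v >= v_th:
            spikes.append(1)
            v = v_reset                # reset after the spike
        else:
            spikes.append(0)
    return np.array(spikes)

t = np.arange(0.0, 0.2, 1e-3)
current = 1.5 * (t > 0.05)             # step input switched on at 50 ms
print(int(lif_neuron(current).sum()), "spikes")
```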
Neuromorphic computing has emerged as a promising avenue towards building the next generation of intelligent computing systems. It has been proposed that memristive devices, which exhibit history-dependent conductivity modulation, could efficiently represent the synaptic weights in artificial neural networks. However, precise modulation of the device conductance over a wide dynamic range, necessary to maintain high network accuracy, is proving to be challenging.
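One widely used workaround for imprecise conductance modulation is an iterative program-and-verify loop: apply a pulse, read back, and repeat. A minimal sketch under an assumed stochastic write model (gain, tolerance, and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def program_with_verify(g_target, g=0.0, gain=0.5, tol=0.01,
                        write_noise=0.1, max_pulses=50):
    """Program-and-verify: pulse toward the target, read the conductance
    back, and repeat until it lands within tolerance, compensating for
    stochastic, imprecise writes."""
    for _ in range(max_pulses):
        err = g_target - g              # verify step: read back and compare
        if abs(err) < tol:
            break
        # Each write lands only approximately where intended.
        g += gain * err * (1 + rng.normal(0, write_noise))
        g = float(np.clip(g, 0.0, 1.0))
    return g

print(program_with_verify(0.63))        # converges close to the target
```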
Neural-network training can be slow and energy intensive, owing to the need to transfer the weight data for the network between conventional digital memory chips and processor chips. Analogue non-volatile memory can accelerate the neural-network training algorithm known as backpropagation by performing parallelized multiply-accumulate operations in the analogue domain at the location of the weight data. However, the classification accuracies of such in situ training using non-volatile-memory hardware have generally been less than those of software-based training, owing to insufficient dynamic range and excessive weight-update asymmetry.
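The parallelism comes from applying the rank-1 backpropagation update lr·x·δᵀ to a whole array at once as programming pulses; the sketch below models the asymmetry problem by giving potentiation and depression different effective gains (the gain values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

def asymmetric_update(w, x, delta, lr=0.01, up_gain=1.0, down_gain=0.6):
    """Rank-1 backpropagation update applied with asymmetric step sizes.

    The outer product lr * x * delta^T can be applied in parallel across
    an array, but potentiation (up) and depression (down) typically move
    the device conductance by different amounts.
    """
    dw = lr * np.outer(x, delta)
    gain = np.where(dw >= 0, up_gain, down_gain)   # asymmetry model
    return w + gain * dw

w = np.zeros((4, 4))
for _ in range(2000):
    x, d = rng.standard_normal(4), rng.standard_normal(4)
    w = asymmetric_update(w, x, d)
print("mean weight drift:", w.mean())  # nonzero: asymmetry biases training
```

Even with zero-mean random updates, the weights drift upward because every positive step outweighs its negative counterpart, which is one way the accuracy gap versus software training arises.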
Dense crossbar arrays of non-volatile memory (NVM) can potentially enable massively parallel and highly energy-efficient neuromorphic computing systems. The key requirements for the NVM elements are continuous (analog-like) conductance tuning capability and switching symmetry with acceptable noise levels. However, most NVM devices show non-linear and asymmetric switching behaviors.
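Non-linear switching is often captured by a soft-bound model in which each identical pulse moves the conductance by a fraction of the remaining headroom; a minimal sketch with an assumed update fraction alpha:

```python
import numpy as np

def pulse_response(n_pulses, g=0.0, g_max=1.0, alpha=0.1):
    """Soft-bound conductance update: each identical potentiation pulse
    moves G by a fraction of the remaining headroom, so the step size
    shrinks as G approaches g_max and the response is non-linear."""
    trace = [g]
    for _ in range(n_pulses):
        g += alpha * (g_max - g)       # saturating potentiation
        trace.append(g)
    return np.array(trace)

g_t = pulse_response(30)
# Early pulses move G far more than late ones: non-linear and state-dependent.
print(round(g_t[1] - g_t[0], 3), "vs", round(g_t[30] - g_t[29], 3))
```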