Area-Efficient Mapping of Convolutional Neural Networks to Memristor Crossbars Using Sub-Image Partitioning.

Micromachines (Basel)

School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea.

Published: January 2023

Memristor crossbars can be very useful for realizing edge-intelligence hardware, because neural networks implemented with memristor crossbars can save significantly more computing energy and layout area than conventional CMOS (complementary metal-oxide-semiconductor) digital circuits. One of the important operations used in neural networks is convolution. To perform convolution with memristor crossbars, the full image should be partitioned into several sub-images. By doing so, each sub-image convolution can be mapped to small unit crossbars, whose size should be limited to 128 × 128 or 256 × 256 to avoid the line-resistance problem caused by large crossbars. In this paper, various convolution schemes with 3D, 2D, and 1D kernels are analyzed and compared in terms of the neural network's performance and the overlapping overhead. The neural-network simulation indicates that the 2D + 1D kernels can perform the sub-image convolution using a much smaller number of unit crossbars, with less rate loss, than the 3D kernels. When the CIFAR-10 dataset is tested, mapping the sub-image convolution of 2D + 1D kernels to crossbars shows that the number of unit crossbars can be reduced by almost 90% and 95% for 128 × 128 and 256 × 256 crossbars, respectively, compared with the 3D kernels. At the same time, the rate loss of the 2D + 1D kernels can be kept below 2%. To improve the neural network's performance further, the 2D + 1D kernels can be combined with 3D kernels in one neural network. When the normalized ratio of 2D + 1D layers is around 0.5, the neural network shows very little rate loss compared to a normalized ratio of zero, while the number of unit crossbars for the normalized ratio of 0.5 is reduced by half compared with that for a ratio of 0.
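As a rough illustration of why the 2D + 1D factorization saves unit crossbars, the sketch below counts the crossbar tiles needed to hold the weights of a single convolution layer. This is a simplified back-of-the-envelope model, not the paper's exact mapping: it ignores the overlapping overhead of sub-image partitioning and the whole-network totals behind the reported 90%/95% figures, and the function names, the tiling model, and the 3 × 3, 256-channel example layer are illustrative assumptions.

```python
import math

# Simplified tile-counting model (an assumption for illustration, not the
# paper's exact mapping): weights are unrolled into crossbar rows/columns
# and tiled over unit crossbars of size rows x cols.

def crossbars_3d(k, c_in, c_out, rows=128, cols=128):
    # A standard k x k x c_in kernel unrolls to k*k*c_in rows; each of the
    # c_out kernels occupies one column.
    return math.ceil(k * k * c_in / rows) * math.ceil(c_out / cols)

def crossbars_2d_plus_1d(k, c_in, c_out, rows=128, cols=128):
    # 2D (per-channel) stage: one k*k-row column per input channel.
    depthwise = math.ceil(k * k / rows) * math.ceil(c_in / cols)
    # 1D (1 x 1 x c_in) stage: c_in rows mixed into c_out output columns.
    pointwise = math.ceil(c_in / rows) * math.ceil(c_out / cols)
    return depthwise + pointwise

# Hypothetical example layer: 3 x 3 convolution, 256 input / 256 output channels.
for size in (128, 256):
    n3d = crossbars_3d(3, 256, 256, rows=size, cols=size)
    n2d1d = crossbars_2d_plus_1d(3, 256, 256, rows=size, cols=size)
    print(f"{size} x {size}: 3D = {n3d}, 2D+1D = {n2d1d}, "
          f"reduction = {1 - n2d1d / n3d:.0%}")
```

Even this crude per-layer count shows why the factorization helps: the unrolled 3D kernels must span k²·c_in crossbar rows, whereas the 2D stage never needs more than k² rows and the 1D stage only c_in; the paper's 90% and 95% figures come from the full network mapping, including the sub-image overlapping overhead.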


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9959389
DOI: http://dx.doi.org/10.3390/mi14020309

Publication Analysis

Top Keywords

memristor crossbars (16); unit crossbars (16); neural network's (16); normalized ratio (16); neural networks (12); sub-image convolution (12); network's performance (12); number unit (12); rate loss (12); crossbars (11)

Similar Publications

Neuromorphic circuits and devices have been introduced over the last decades as elements of a key strategy for developing new paradigms of computation, inspired by the intent to mimic elementary neuron structure and biological mechanisms and to overcome the energy- and time-consumption bottlenecks of digital computing (DC) technologies. Although the term "neuromorphic" is in common use, its meaning is often misunderstood and indistinctly associated with many different technologies, based on both conventional and unconventional electronic components and architectures. Here, an overview of the different technological strategies used for developing neuromorphic computing systems is proposed, with insight into the neuromorphic features they implement and a special focus on the technological strategies and patents that exploit unconventional computing paradigms.


Defect-Tolerant Memristor Crossbar Circuits for Local Learning Neural Networks.

Nanomaterials (Basel)

January 2025

School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea.

Local learning algorithms, such as Equilibrium Propagation (EP), have emerged as alternatives to global learning methods like backpropagation for training neural networks. EP offers the potential for more energy-efficient hardware implementation by utilizing only local neuron information for weight updates. However, the practical implementation of EP using memristor-based circuits faces significant challenges due to the immature fabrication processes of memristors, which result in defects and variability issues.
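For readers unfamiliar with EP, its locality can be illustrated by the two-phase contrastive update of Scellier and Bengio (2017): each weight changes in proportion to the difference between the products of its two neurons' activations at the nudged-phase and free-phase equilibria. The sketch below is a minimal, generic illustration of that rule, not the circuit-level scheme studied in the publication above; the tanh nonlinearity, the learning rate, and the toy dimensions are assumptions.

```python
import numpy as np

rho = np.tanh  # assumed activation function for this illustration

def ep_weight_update(s_free, s_nudged, beta, lr=0.01):
    # Equilibrium Propagation update for the weights between two layers:
    # delta_W ~ (1/beta) * [rho(pre_n) rho(post_n)^T - rho(pre_f) rho(post_f)^T].
    # Only the local pre-/post-synaptic equilibrium activations are used;
    # no error signal is backpropagated.
    pre_f, post_f = s_free      # equilibrium states after the free phase
    pre_n, post_n = s_nudged    # equilibrium states after the nudged (beta) phase
    grad = (np.outer(rho(pre_n), rho(post_n)) -
            np.outer(rho(pre_f), rho(post_f))) / beta
    return lr * grad

# Toy usage: a hypothetical 4-neuron -> 3-neuron connection with random states.
rng = np.random.default_rng(0)
s_free = (rng.normal(size=4), rng.normal(size=3))
s_nudged = (s_free[0] + 0.05 * rng.normal(size=4),
            s_free[1] + 0.05 * rng.normal(size=3))
print(ep_weight_update(s_free, s_nudged, beta=0.5).shape)  # (4, 3)
```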


Analog In-memory Computing (IMC) has demonstrated energy-efficient, low-latency implementation of convolution and fully connected layers in deep neural networks (DNNs) by using physics to compute in parallel resistive memory arrays. However, recurrent neural networks (RNNs), which are widely used for speech recognition and natural language processing, have seen only limited success with this approach. This can be attributed to the significant time and energy penalties incurred in implementing the nonlinear activation functions that are abundant in such models.


We demonstrate low-energy, forming- and compliance-free operation of a resistive memory obtained by the partial oxidation of a two-dimensional layered van der Waals semiconductor: hafnium disulfide (HfS2). Semiconductor-oxide heterostructures are achieved by low-temperature (<300 °C) thermal oxidation of HfS2 under dry conditions, carefully controlling the process parameters. The resulting HfOS/HfS2 heterostructures are integrated between metal contacts, forming vertical crossbar devices.


IMPACT: In-Memory ComPuting Architecture based on Y-FlAsh Technology for Coalesced Tsetlin machine inference.

Philos Trans A Math Phys Eng Sci

January 2025

Microsystems Group, School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.

The increasing demand for processing large volumes of data for machine learning (ML) models has pushed data bandwidth requirements beyond the capability of the traditional von Neumann architecture. In-memory computing (IMC) has recently emerged as a promising solution to address this gap by enabling distributed data storage and processing at the micro-architectural level, significantly reducing both latency and energy. In this article, we present In-Memory comPuting architecture based on Y-FlAsh technology for Coalesced Tsetlin machine inference (IMPACT), underpinned by a cutting-edge memory device, Y-Flash, fabricated on a 180 nm complementary metal oxide semiconductor (CMOS) process.

