Dropout is a mechanism for preventing deep neural networks from overfitting and for improving their generalization. Random dropout is the simplest method: nodes are randomly dropped at each step of the training phase, which may reduce network accuracy. In dynamic dropout, the importance of each node and its impact on network performance are calculated, and the important nodes are excluded from dropout. The problem is that node importance is not calculated consistently: a node may be judged less important and dropped in one training epoch, on one batch of data, yet be an important node in the next epoch. Moreover, calculating the importance of each unit in every training step is costly. In the proposed method, the importance of each node is calculated once, using random forest and Jensen-Shannon divergence. Then, in the forward propagation steps, the importance of the nodes is propagated and used in the dropout mechanism. The method is evaluated and compared with some previously proposed dropout approaches using two different deep neural network architectures on the MNIST, NORB, CIFAR10, CIFAR100, SVHN, and ImageNet datasets. The results suggest that the proposed method achieves better accuracy with fewer nodes and better generalizability. The evaluations also show that the approach has complexity comparable to that of other approaches and a low convergence time compared with state-of-the-art methods.
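
The abstract does not say how the random forest and the Jensen-Shannon divergence are combined, nor how an importance score is turned into a drop probability. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation. It shows (a) the standard Jensen-Shannon divergence between two discrete distributions and (b) a hypothetical importance-aware dropout in which a precomputed per-node importance vector lowers the drop probability of important nodes. The function names, the `base_rate` parameter, and the linear mapping from importance to keep probability are all assumptions made for illustration.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, range [0, 1]) of two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a == 0 contribute nothing.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def importance_dropout(activations, importance, base_rate=0.5, rng=None):
    """Hypothetical importance-aware dropout (an assumption, not the paper's rule).

    activations: array of shape (batch, nodes)
    importance:  array of shape (nodes,), computed once before training
    base_rate:   drop probability applied to the least important node (< 1)
    """
    rng = np.random.default_rng() if rng is None else rng
    importance = np.asarray(importance, dtype=float)

    # Rescale importance to [0, 1]; equal scores fall back to plain dropout.
    span = importance.max() - importance.min()
    if span > 0:
        scaled = (importance - importance.min()) / span
    else:
        scaled = np.zeros_like(importance)

    # Assumed mapping: the most important node is always kept,
    # the least important node is kept with probability 1 - base_rate.
    keep_prob = 1.0 - base_rate * (1.0 - scaled)

    mask = rng.random(activations.shape) < keep_prob
    # Inverted-dropout scaling keeps the expected activation unchanged.
    return activations * mask / keep_prob
```

Because the importance vector is an input here, it is computed once and reused at every forward pass, which mirrors the cost argument in the abstract; how the scores themselves are derived is left open.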

Source
http://dx.doi.org/10.1016/j.neunet.2023.05.044

Similar Publications

Rapidly detecting hydrogen leaks is critical for the safe large-scale implementation of hydrogen technologies. However, to date, no technically viable sensor solution exists that meets the corresponding response time targets under technically relevant conditions. Here, we demonstrate how a tailored long short-term transformer ensemble model for accelerated sensing (LEMAS) speeds up the response of an optical plasmonic hydrogen sensor by up to a factor of 40 and eliminates its intrinsic pressure dependence in an environment emulating the inert gas encapsulation of large-scale hydrogen installations by accurately predicting its response value to a hydrogen concentration change before it is physically reached by the sensor hardware.

Current neural network models of primate vision focus on replicating overall levels of behavioral accuracy, often neglecting perceptual decisions' rich, dynamic nature. Here, we introduce a novel computational framework to model the dynamics of human behavioral choices by learning to align the temporal dynamics of a recurrent neural network (RNN) to human reaction times (RTs). We describe an approximation that allows us to constrain the number of time steps an RNN takes to solve a task with human RTs.

Giant cell arteritis (GCA), a systemic vasculitis affecting large and medium-sized arteries, poses significant diagnostic and management challenges, particularly in preventing irreversible complications like vision loss. Recent advancements in artificial intelligence (AI) technologies, including machine learning (ML) and deep learning (DL), offer promising solutions to enhance diagnostic accuracy and optimize treatment strategies for GCA. This systematic review, conducted according to the PRISMA 2020 guidelines, synthesizes existing literature on AI applications in GCA care, with a focus on diagnostic accuracy, treatment outcomes, and predictive modeling.

Machine learning approaches, including deep learning models, have shown promising performance in the automatic detection of Parkinson's disease. These approaches rely on different types of data, with voice recordings being the most used because data acquisition is convenient and non-invasive. Our group has successfully developed a novel approach that uses a convolutional neural network with transfer learning to analyze spectrogram images of the sustained vowel /a/ to identify people with Parkinson's disease.
