Deep learning involves a difficult nonconvex optimization problem with a large number of weights between any two adjacent layers of a deep structure. To handle large data sets or complicated networks, distributed training is needed, but computing the function value, gradient, and Hessian is expensive. In particular, the communication and synchronization costs may become a bottleneck. In this letter, we focus on situations where the model is stored in a distributed manner and propose a novel distributed Newton method for training deep neural networks. Using variable-wise and feature-wise data partitions, together with some careful designs, we are able to explicitly use the Jacobian matrix for matrix-vector products in the Newton method. Several techniques are incorporated to reduce the running time as well as memory consumption. First, to reduce the communication cost, we propose a diagonalization method such that an approximate Newton direction can be obtained without communication between machines. Second, we consider subsampled Gauss-Newton matrices to reduce the running time as well as the communication cost. Third, to reduce the synchronization cost, we terminate the process of finding an approximate Newton direction even if some nodes have not finished their tasks. Implementation issues in distributed environments are investigated in detail. Experiments demonstrate that the proposed method is effective for the distributed training of deep neural networks. Compared with stochastic gradient methods, it is more robust and may give better test accuracy.
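
As a rough illustration of two of the ideas named above (subsampled Gauss-Newton matrix-vector products and a block-diagonal approximation), the following single-machine NumPy sketch may help. It is not the authors' implementation. It assumes a squared loss, so that the Gauss-Newton matrix reduces to J^T J / n plus a damping term, and it stands in for "machines" with column blocks of an explicitly stored Jacobian. All function names, sizes, and the damping value are hypothetical.

```python
import numpy as np


def gauss_newton_matvec(J, v, damping):
    """Return (J^T J / n + damping * I) v without forming the matrix.

    Assumption: squared loss, so the loss Hessian w.r.t. outputs is I.
    """
    n = J.shape[0]
    return J.T @ (J @ v) / n + damping * v


def block_diagonal_matvec(J, v, blocks, damping):
    """Same product, keeping only the diagonal blocks of J^T J.

    Each block touches only its own columns of J, so in a distributed
    setting each partition could compute its part without communication.
    """
    n = J.shape[0]
    out = np.empty_like(v)
    for idx in blocks:
        J_b = J[:, idx]
        out[idx] = J_b.T @ (J_b @ v[idx]) / n + damping * v[idx]
    return out


def conjugate_gradient(matvec, b, max_iter=50, tol=1e-10):
    """Solve matvec(x) = b for a symmetric positive definite operator."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


# Toy data: 200 samples, 6 weights (hypothetical sizes).
rng = np.random.default_rng(0)
J = rng.standard_normal((200, 6))      # explicit Jacobian of outputs w.r.t. weights
residual = rng.standard_normal(200)    # model output minus target
g = J.T @ residual / J.shape[0]        # gradient of the squared loss (up to a constant)
damping = 1e-2

# Subsampled Gauss-Newton: use only a subset of rows for the matrix-vector products.
S = rng.choice(J.shape[0], size=50, replace=False)
J_S = J[S]

# Approximate Newton direction from the subsampled Gauss-Newton system.
d_full = conjugate_gradient(lambda v: gauss_newton_matvec(J_S, v, damping), -g)

# Block-diagonal variant: two weight partitions solved independently.
blocks = [np.arange(0, 3), np.arange(3, 6)]
d_block = conjugate_gradient(
    lambda v: block_diagonal_matvec(J_S, v, blocks, damping), -g
)
```

Because the damped operator is positive definite in both cases, each direction is a descent direction for the toy objective. The block-diagonal direction generally differs from the full one, which is the price paid for avoiding cross-partition products.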

Source
http://dx.doi.org/10.1162/neco_a_01088
