Deep learning involves a difficult nonconvex optimization problem with a large number of weights between any two adjacent layers of a deep structure. To handle large data sets or complicated networks, distributed training is needed, but calculating the function value, gradient, and Hessian is expensive. In particular, the communication and synchronization costs may become a bottleneck. In this letter, we focus on situations where the model is stored in a distributed manner and propose a novel distributed Newton method for training deep neural networks. Through variable and feature-wise data partitions and some careful design, we are able to explicitly use the Jacobian matrix for matrix-vector products in the Newton method. Several techniques are incorporated to reduce the running time as well as memory consumption. First, to reduce the communication cost, we propose a diagonalization method such that an approximate Newton direction can be obtained without communication between machines. Second, we consider subsampled Gauss-Newton matrices for reducing the running time as well as the communication cost. Third, to reduce the synchronization cost, we terminate the process of finding an approximate Newton direction even if some nodes have not finished their tasks. Implementation issues in distributed environments are thoroughly investigated. Experiments demonstrate that the proposed method is effective for the distributed training of deep neural networks. Compared with stochastic gradient methods, it is more robust and may give better test accuracy.
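The first two techniques in the abstract can be illustrated with a small single-process sketch. It is not the paper's implementation: it assumes a squared loss (so the Gauss-Newton matrix reduces to a scaled J^T J), adds a damping term `lam`, and simulates the "machines" as index blocks of the weight vector. All function and parameter names are hypothetical.

```python
import numpy as np

def conjugate_gradient(matvec, b, max_iter=50, tol=1e-10):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0
    p = r.copy()          # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def subsampled_gn_direction(J, grad, sample_idx, lam=1e-3):
    """Approximate Newton direction from a subsampled Gauss-Newton matrix.

    Assumption: squared loss, so G = J_S^T J_S / |S| + lam * I, where J_S
    holds the Jacobian rows of the sampled instances. G is never formed;
    only the products J_S v and J_S^T u are used.
    """
    Js = J[sample_idx]

    def matvec(v):
        return Js.T @ (Js @ v) / len(sample_idx) + lam * v

    return conjugate_gradient(matvec, -grad)

def block_diagonal_gn_direction(J, grad, sample_idx, blocks, lam=1e-3):
    """Diagonalization sketch: each block of weights (one simulated machine)
    solves its own sub-system, ignoring the off-diagonal blocks of G, so the
    solves need no exchange of vectors between blocks."""
    d = np.zeros_like(grad)
    for idx in blocks:
        d[idx] = subsampled_gn_direction(J[:, idx], grad[idx], sample_idx, lam)
    return d
```

With positive damping each diagonal block is positive definite, so the block-diagonal direction is still a descent direction even though it ignores the coupling between blocks.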
DOI: http://dx.doi.org/10.1162/neco_a_01088
NPJ Digit Med
January 2025
Neurofibromatosis Type 1 Center and Laboratory for Neurofibromatosis Type 1 Research, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China.
Deep-learning models have shown promise in differentiating between benign and malignant lesions. Previous studies have primarily focused on specific anatomical regions, overlooking tumors that occur throughout the body against highly heterogeneous whole-body backgrounds. Using neurofibromatosis type 1 (NF1) as an example, this study developed highly accurate MRI-based deep-learning models for the early automated screening of malignant peripheral nerve sheath tumors (MPNSTs) against complex whole-body backgrounds.
Sci Rep
January 2025
Department of Electrical Electronical Engineering, Yaşar University, Bornova, İzmir, Turkey.
We aimed to build a robust classifier for the MGMT methylation status of glioblastoma in multiparametric MRI. Our approach centered on multi-habitat deep image descriptors. A subset of the BRATS 2021 MGMT methylation dataset containing both MGMT class labels and segmentation masks was used.
Sci Rep
January 2025
Department of Biomedical Engineering, School of Life Science and Technology, Changchun University of Science and Technology, Changchun, 130022, China.
Cervical cell classification can determine the degree of cellular abnormality and the pathological condition, helping doctors detect the risk of cervical cancer at an early stage and improving the cure and survival rates of cervical cancer patients. To address the low accuracy of cervical cell classification, a deep convolutional neural network, A2SDNet121, is proposed. A2SDNet121 takes DenseNet121 as the backbone network.
Sci Rep
January 2025
Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, No.119 South Fourth Ring West Road, Fengtai District 100070, Beijing, China.
Deep vein thrombosis (DVT) in patients undergoing endoscopic endonasal surgery remains underexplored, despite its potential impact on postoperative recovery. This study aimed to develop and validate a predictive nomogram for assessing the risk of lower-limb DVT in such patients without chemoprophylaxis. A retrospective analysis was conducted on 935 patients with postoperative lower-limb vein ultrasonography.
Am J Orthod Dentofacial Orthop
February 2025
Department of Orthodontics, Faculty of Dentistry, Çanakkale Onsekiz Mart University, Çanakkale, Turkey.
Introduction: This study aimed to assess the precision of an open-source, clinician-trained, and user-friendly convolutional neural network-based model for automatically segmenting the mandible.
Methods: A total of 55 cone-beam computed tomography scans that met the inclusion criteria were collected and divided into test and training groups. The MONAI (Medical Open Network for Artificial Intelligence) Label active learning tool extension was used to train the automatic model.