An empirical study of large-scale data-driven full waveform inversion.

Peng Jin, Yinan Feng, Shihang Feng, Hanchen Wang, Yinpeng Chen, Benjamin Consolvo, Zicheng Liu, Youzuo Lin

Sci Rep

School of Data Science and Society, The University of North Carolina at Chapel Hill, Chapel Hill, USA.

Published: August 2024

The study explores how big data influences deep learning models specifically focused on full waveform inversion (FWI) problems.
It utilizes the OPENFWI dataset, combining 10 subsets that contain 470,000 pairs of seismic data and velocity maps for training and evaluation.
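Results show average improvements of 13.03% in MAE and 7.19% in MSE when training on the combined dataset, with larger average gains of up to 28.60% in the leave-one-out generalization test. The study also finds that model capacity must scale with data size to get the most out of the additional data.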

This paper investigates the impact of big data on deep learning models to help solve the full waveform inversion (FWI) problem. While it is well known that big data can boost the performance of deep learning models in many tasks, its effectiveness has not been validated for FWI. To address this gap, we present an empirical study that investigates how deep learning models in FWI behave when trained on OPENFWI, a collection of large-scale, multi-structural, synthetic datasets published recently. In particular, we train and evaluate the FWI models on a combination of 10 2D subsets in OPENFWI that contain 470 K pairs of seismic data and velocity maps in total. Our experiments demonstrate that training on the combined dataset yields an average improvement of 13.03% in MAE, 7.19% in MSE and 1.87% in SSIM compared to each split dataset, and an average improvement of 28.60%, 21.55% and 8.22% in the leave-one-out generalization test. We further demonstrate that model capacity needs to scale in accordance with data size for optimal improvement, where our largest model yields an average improvement of 20.06%, 13.39% and 0.72% compared to the smallest one.
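As a rough illustration of the quantities reported above, the sketch below computes MAE and MSE between predicted and reference velocity maps, expresses a change in an error metric as a percentage improvement, and enumerates leave-one-out splits over a list of subsets. The function names, the toy arrays and the placeholder subset names are assumptions made for illustration; the abstract does not state exactly how the percentages were computed, so this is one plausible reading rather than the authors' procedure. SSIM is omitted because it is a higher-is-better score, for which the sign convention of the improvement would flip.

```python
import numpy as np


def mae(pred, target):
    """Mean absolute error between two velocity maps."""
    return float(np.mean(np.abs(pred - target)))


def mse(pred, target):
    """Mean squared error between two velocity maps."""
    return float(np.mean((pred - target) ** 2))


def relative_improvement(baseline_error, new_error):
    """Percent reduction of an error metric (lower is better) versus a baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


def leave_one_out_splits(subsets):
    """Yield (training_subsets, held_out_subset), holding out one subset at a time."""
    for i, held_out in enumerate(subsets):
        yield subsets[:i] + subsets[i + 1:], held_out


# Toy data (hypothetical): a 2x2 "velocity map" and two sets of predictions.
target = np.array([[1.0, 2.0], [3.0, 4.0]])
baseline_pred = target + 0.5    # stands in for a model trained on a single subset
combined_pred = target + 0.25   # stands in for a model trained on the combined data

mae_gain = relative_improvement(mae(baseline_pred, target), mae(combined_pred, target))
mse_gain = relative_improvement(mse(baseline_pred, target), mse(combined_pred, target))

# Placeholder subset names; the real OPENFWI subset names are not given in the abstract.
subset_names = [f"subset_{k}" for k in range(10)]
splits = list(leave_one_out_splits(subset_names))
```

With ten subsets, `leave_one_out_splits` yields ten train/test configurations, each training on nine subsets and evaluating on the one left out, which matches the shape of the generalization test described in the abstract.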

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11358280
DOI: http://dx.doi.org/10.1038/s41598-024-68573-7
