ConBGAT: a novel model combining convolutional neural networks, transformer and graph attention network for information extraction from scanned image.

PeerJ Comput Sci

Data Science Laboratory/Data Science Department/Faculty of Information Technology, Industrial University of Ho Chi Minh City, Ho Chi Minh, Vietnam.

Published: November 2024

AI Article Synopsis

  • Extracting information from scanned images is vital, but traditional methods often fail due to inadequate use of image and text features.
  • The new model, ConBGAT, combines convolutional neural networks (CNNs), Transformers, and graph attention networks to improve extraction efficiency and accuracy.
  • Extensive testing shows that ConBGAT outperforms existing methods and sets a new standard for scanned image information extraction.

Article Abstract

Extracting information from scanned images is a critical task with far-reaching practical implications. Traditional methods often fall short because they do not fully exploit both image and text features, which reduces accuracy and efficiency. In this study, we introduce ConBGAT, a model that integrates convolutional neural networks (CNNs), Transformers, and graph attention networks to address these shortcomings. Our approach constructs graphs from the text regions within an image, using Optical Character Recognition (OCR) to detect and recognize the characters. By combining image features extracted by CNNs with text features from Distilled Bidirectional Encoder Representations from Transformers (DistilBERT), the model obtains a comprehensive and efficient data representation. Testing on real-world datasets shows that ConBGAT significantly outperforms existing methods across multiple evaluation metrics. This advance improves accuracy and sets a new benchmark for information extraction from scanned images.
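
The pipeline described in the abstract (a graph over OCR text regions, fused image and text features per region, then graph attention) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how edges are chosen or how features are fused, so the k-nearest-neighbour edge rule, feature concatenation, the single simplified attention step without a learned projection matrix, and all function names (`build_knn_graph`, `fuse_features`, `attention_layer`) are assumptions made for illustration. The short feature lists stand in for real CNN and DistilBERT outputs.

```python
import math


def box_center(box):
    """Center (x, y) of a text box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def build_knn_graph(boxes, k=2):
    """Assumed edge rule: link each text box to itself and its k nearest boxes."""
    centers = [box_center(b) for b in boxes]
    neighbors = {}
    for i, ci in enumerate(centers):
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(centers) if j != i
        )
        neighbors[i] = [i] + [j for _, j in dists[:k]]
    return neighbors


def fuse_features(image_feats, text_feats):
    """Assumed fusion: concatenate per-node image and text feature vectors."""
    return [img + txt for img, txt in zip(image_feats, text_feats)]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def attention_layer(features, neighbors, attn_vec):
    """One simplified graph-attention step (no learned projection matrix).

    The score for edge (i, j) is LeakyReLU(attn_vec . [h_i || h_j]); scores are
    softmax-normalized over the neighbors of i and used to average h_j.
    """
    out = []
    for i, h_i in enumerate(features):
        scores = []
        for j in neighbors[i]:
            pair = h_i + features[j]  # list concatenation [h_i || h_j]
            scores.append(leaky_relu(sum(a * p for a, p in zip(attn_vec, pair))))
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        out.append([
            sum(alpha * features[j][d] for alpha, j in zip(alphas, neighbors[i]))
            for d in range(len(h_i))
        ])
    return out


# Toy usage: three OCR boxes with placeholder 1-d image and text features.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (0, 100, 10, 110)]
graph = build_knn_graph(boxes, k=1)
nodes = fuse_features([[1.0], [3.0], [5.0]], [[0.0], [2.0], [4.0]])
updated = attention_layer(nodes, graph, attn_vec=[0.1, 0.0, 0.2, 0.0])
print(graph)
print(updated)
```

In a real system the placeholder vectors would come from a CNN applied to each cropped text region and from DistilBERT applied to the recognized text, and the attention vector would be learned together with a projection matrix rather than fixed by hand.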

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622835
DOI: http://dx.doi.org/10.7717/peerj-cs.2536
