Captioning is the process of generating a textual description of an image. Captioning concepts distinguish two kinds of objects: foreground objects and background objects. Previous captioning research has usually focused on foreground objects; in contrast, generating captions for geological images of rocks depends more on the background of the image. This study proposes an image-captioning model constructed from a convolutional neural network (CNN), long short-term memory (LSTM), and word2vec, which produces a dense output of 256 units and generates words from the image. To make the output grammatical, the sequence of predicted words was reconstructed into a sentence by the beam search algorithm with K = 3. The pre-trained baseline model VGG16 and our proposed CNN-A, CNN-B, CNN-C, and CNN-D models were evaluated with N-gram BLEU scores. The BLEU-1 scores for these models were 0.5515, 0.6463, 0.7012, 0.7620, and 0.5620, respectively. The BLEU-2 scores were 0.6048, 0.6507, 0.7083, 0.8756, and 0.6578, respectively. The BLEU-3 scores were 0.6414, 0.6892, 0.7312, 0.8861, and 0.7307, respectively. Finally, the BLEU-4 scores were 0.6526, 0.6504, 0.7345, 0.8250, and 0.7537, respectively. Our CNN-C model outperformed the other models, including the baseline, on all four BLEU metrics. Future challenges in geological captioning include geological sentence structure, geological phrases, and constructing words with a geological tagger.
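The beam search step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything below is an assumption made for illustration: the function `beam_search`, the `<start>` and `<end>` tokens, and the hand-written probability table `TOY_MODEL`, which stands in for the trained CNN-LSTM decoder.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical next-word probabilities keyed on the last word only.
# A real captioning model would condition on image features and the
# full word history; this table exists purely for illustration.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<start>": {"dark": 0.5, "grey": 0.4, "the": 0.1},
    "dark": {"sandstone": 0.4, "shale": 0.3, "<end>": 0.3},
    "grey": {"shale": 0.9, "<end>": 0.1},
    "the": {"<end>": 1.0},
    "sandstone": {"<end>": 1.0},
    "shale": {"<end>": 1.0},
}


def toy_next_word_probs(seq: List[str]) -> Dict[str, float]:
    """Return the (assumed) distribution over the next word."""
    return TOY_MODEL[seq[-1]]


def beam_search(
    next_word_probs: Callable[[List[str]], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 10,
    start_token: str = "<start>",
    end_token: str = "<end>",
) -> List[str]:
    """Keep the `beam_width` most probable partial captions at each step."""
    # Each beam entry is (cumulative log-probability, token sequence).
    beams: List[Tuple[float, List[str]]] = [(0.0, [start_token])]
    for _ in range(max_len):
        candidates: List[Tuple[float, List[str]]] = []
        for score, seq in beams:
            if seq[-1] == end_token:
                # Finished captions are carried over unchanged.
                candidates.append((score, seq))
                continue
            for word, prob in next_word_probs(seq).items():
                if prob > 0.0:
                    candidates.append((score + math.log(prob), seq + [word]))
        # Prune to the K best candidates by cumulative log-probability.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for _, seq in beams):
            break
    best_sequence = beams[0][1]
    return [w for w in best_sequence if w not in (start_token, end_token)]


caption_k3 = beam_search(toy_next_word_probs, beam_width=3)
caption_k1 = beam_search(toy_next_word_probs, beam_width=1)
print(" ".join(caption_k3))
print(" ".join(caption_k1))
```

With this toy table, a beam width of 1 (greedy decoding) commits to the locally most probable first word, while a width of 3 keeps alternatives alive long enough to find a caption with higher overall probability. This is the general motivation for beam search, not a claim about the behaviour of the models in the study.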
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9693370 | PMC |
| http://dx.doi.org/10.3390/jimaging8110294 | DOI Listing |
Sci Data
January 2025
Shanghai Artificial Intelligence Research Institute Co., Ltd., Shanghai, 200240, China.
Academic data processing is crucial in scientometrics and bibliometrics, for tasks such as research trend analysis and citation recommendation. Existing datasets in this domain have predominantly concentrated on textual data, overlooking the importance of visual elements. To bridge this gap, we introduce a multidisciplinary multimodal aligned dataset (MMAD) specifically designed for academic data processing.
JAMA Cardiol
January 2025
Department of Emergency Medicine, Rush University Medical Center, Chicago, Illinois.
Importance: Lung ultrasound (LUS) aids in the diagnosis of patients with dyspnea, including those with cardiogenic pulmonary edema, but requires technical proficiency for image acquisition. Previous research has demonstrated the effectiveness of artificial intelligence (AI) in guiding novice users to acquire high-quality cardiac ultrasound images, suggesting its potential for broader use in LUS.
Objective: To evaluate the ability of AI to guide acquisition of diagnostic-quality LUS images by trained health care professionals (THCPs).
J Environ Manage
February 2025
Department of Plant Biology and Ecology, University of Seville, Avda. Reina Mercedes S/n, Apartado de Correos, 1095, 41012, Sevilla, Spain.
Urban environments are usually polluted by anthropogenic activities like traffic, a major source of potentially toxic elements (PTEs), and ornamental plant species may reduce contamination by trapping traffic-related air pollutants in their leaves. The purpose of this study was to test the pollutant-trapping capacity of four species commonly used in green areas of Seville city (SW Spain), to inform the choice of species in urban green planning. The composition of particulate matter (PM) obtained from foliar surfaces (sPM) and included in wax (wPM) was determined by EDX-SEM analysis in samples from different city locations.
Sensors (Basel)
December 2024
Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea.
Generating accurate and contextually rich captions for images and videos is essential for various applications, from assistive technology to content recommendation. However, challenges such as maintaining temporal coherence in videos, reducing noise in large-scale datasets, and enabling real-time captioning remain significant. We introduce MIRA-CAP (Memory-Integrated Retrieval-Augmented Captioning), a novel framework designed to address these issues through three core innovations: a cross-modal memory bank, adaptive dataset pruning, and a streaming decoder.
Intern Emerg Med
January 2025
Emergency Department, National Institute of Medical Sciences and Nutrition Salvador Zubiran, Avenida Vasco de Quiróga No. 15, Colonia Belisario Domínguez Sección XVI, Alcaldía Tlalpan, CP 14080, Mexico City, Mexico.
The COVID-19 pandemic provided an ideal scenario for studying the care of the elderly population. We implemented a tool named the Geriatric Measure (GM) tool to determine severity and the need for hospitalization. The objective of the study is to evaluate whether the results of a brief Geriatric Measure tool are associated with mortality and other outcomes among older adults with COVID-19 treated in the emergency department. This was a retrospective observational cohort study.