The paper introduces a testing framework for the evaluation and validation of text line segmentation algorithms. Text line segmentation represents the key action for correct optical character recognition. Many of the tests for the evaluation of text line segmentation algorithms deal with text databases as reference templates. Because of the mismatch, the reliable testing framework is required. Hence, a new approach to a comprehensive experimental framework for the evaluation of text line segmentation algorithms is proposed. It consists of synthetic multi-like text samples and real handwritten text as well. Although the tests are mutually independent, the results are cross-linked. The proposed method can be used for different types of scripts and languages. Furthermore, two different procedures for the evaluation of algorithm efficiency based on the obtained error type classification are proposed. The first is based on the segmentation line error description, while the second one incorporates well-known signal detection theory. Each of them has different capabilities and convenience, but they can be used as supplements to make the evaluation process efficient. Overall the proposed procedure based on the segmentation line error description has some advantages, characterized by five measures that describe measurement procedures.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3231474PMC
http://dx.doi.org/10.3390/s110908782DOI Listing

Publication Analysis

Top Keywords

text segmentation
20
segmentation algorithms
16
evaluation text
12
approach comprehensive
8
text
8
testing framework
8
framework evaluation
8
based segmentation
8
segmentation error
8
error description
8

Similar Publications

Recent advances of artificial intelligence (AI) in retinal imaging found its application in two major categories: discriminative and generative AI. For discriminative tasks, conventional convolutional neural networks (CNNs) are still major AI techniques. Vision transformers (ViT), inspired by the transformer architecture in natural language processing, has emerged as useful techniques for discriminating retinal images.

View Article and Find Full Text PDF

Post-traumatic stress and major depressive disorders are associated with "overgeneral" autobiographical memory, or impaired recall of specific life events. Interpersonal trauma exposure, a risk factor for both conditions, may influence how symptomatic trauma-exposed (TE) individuals segment everyday events. The ability to parse experience into units (event segmentation) supports memory.

View Article and Find Full Text PDF

Deep learning methods have been widely used for various glioma predictions. However, they are usually task-specific, segmentation-dependent and lack of interpretable biomarkers. How to accurately predict the glioma histological grade and molecular subtypes at the same time and provide reliable imaging biomarkers is still challenging.

View Article and Find Full Text PDF

Large language models in cancer: potentials, risks, and safeguards.

BJR Artif Intell

January 2025

Department of Machine Learning, Moffitt Cancer Center, Tampa, FL, Moffitt Cancer Center and Research Institute, Tampa, FL 33612, United States.

This review examines the use of large language models (LLMs) in cancer, analysing articles sourced from PubMed, Embase, and Ovid Medline, published between 2017 and 2024. Our search strategy included terms related to LLMs, cancer research, risks, safeguards, and ethical issues, focusing on studies that utilized text-based data. 59 articles were included in the review, categorized into 3 segments: quantitative studies on LLMs, chatbot-focused studies, and qualitative discussions on LLMs on cancer.

View Article and Find Full Text PDF

TExCNN: Leveraging Pre-Trained Models to Predict Gene Expression from Genomic Sequences.

Genes (Basel)

December 2024

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China.

Background/objectives: Understanding the relationship between DNA sequences and gene expression levels is of significant biological importance. Recent advancements have demonstrated the ability of deep learning to predict gene expression levels directly from genomic data. However, traditional methods are limited by basic word encoding techniques, which fail to capture the inherent features and patterns of DNA sequences.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!