Multilevel linguistic features have been proposed for discourse analysis, but they have seen few applications in readability models, and few such models have been validated. Most traditional readability formulae are based on generalized linear models (GLMs; e.g., discriminant analysis and multiple regression), which must satisfy statistical assumptions about the data and incorporate all observations into formula construction without first pruning outliers. Readability formulae of this kind tend to yield low text classification accuracy, whereas a support vector machine (SVM), a machine learning method, can improve classification performance. The present study constructed readability models by integrating multilevel linguistic features with an SVM, which is better suited to text classification. Taking Chinese as an example, this study developed 31 linguistic features at the word, semantic, syntactic, and cohesion levels as predictor variables, with the grade levels of texts as the criterion variable. The study compared four types of readability models, combining unilevel or multilevel linguistic features with GLMs or an SVM. The results indicate that adopting a multilevel approach in readability analysis provides a better representation of the complexities of both texts and the reading comprehension process.
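To make the multilevel idea concrete, here is a minimal sketch of extracting one illustrative feature per level from a text. The four features below (average word length, type-token ratio, mean sentence length, and adjacent-sentence word overlap) are simplified stand-ins chosen for this example, not the 31 features the study actually used, and the sketch works on English rather than Chinese text.

```python
import re

def multilevel_features(text):
    """Toy multilevel feature vector: one illustrative proxy feature
    per level (word, semantic, syntactic, cohesion). These are NOT
    the study's 31 features, just a sketch of the approach."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Word level: average word length in characters.
    avg_word_len = sum(len(w) for w in words) / len(words)
    # Semantic level: type-token ratio as a lexical-diversity proxy.
    ttr = len(set(words)) / len(words)
    # Syntactic level: mean sentence length in words.
    mean_sent_len = len(words) / len(sentences)
    # Cohesion level: mean Jaccard word overlap between adjacent sentences.
    overlaps = []
    for a, b in zip(sentences, sentences[1:]):
        wa = set(re.findall(r"[A-Za-z']+", a.lower()))
        wb = set(re.findall(r"[A-Za-z']+", b.lower()))
        overlaps.append(len(wa & wb) / max(len(wa | wb), 1))
    cohesion = sum(overlaps) / len(overlaps) if overlaps else 0.0
    return [avg_word_len, ttr, mean_sent_len, cohesion]

vec = multilevel_features("The cat sat. The cat ran. Dogs bark loudly.")
print(vec)
```

In the study's setup, vectors like this (one per text) would be paired with grade-level labels and passed to an SVM classifier, rather than to a GLM-based regression formula.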
DOI: http://dx.doi.org/10.3758/s13428-014-0459-x
Ear Hear
January 2025
San Francisco Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, California, USA.
Objectives: Cochlear implant (CI) user functional outcomes are challenging to predict because of the variability in individual anatomy, neural health, CI device characteristics, and linguistic and listening experience. Machine learning (ML) techniques are uniquely poised for this predictive challenge because they can analyze nonlinear interactions using large amounts of multidimensional data. The objective of this article is to systematically review the literature regarding ML models that predict functional CI outcomes, defined as sound perception and production.
Behav Res Methods
January 2025
CogNosco Lab, Department of Psychology and Cognitive Sciences, University of Trento, Trento, Italy.
We introduce EmoAtlas, a computational library that extracts emotions and syntactic/semantic word associations from texts. EmoAtlas combines interpretable artificial intelligence (AI) for syntactic parsing in 18 languages with psychologically validated lexicons for detecting the eight emotions in Plutchik's theory. We show that EmoAtlas can match or surpass transformer-based natural language processing techniques such as BERT or large language models like ChatGPT 3.
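The lexicon-based side of this approach can be sketched in a few lines: look each word up in an emotion lexicon and count hits per Plutchik emotion. The tiny lexicon below is made up for illustration, not EmoAtlas's validated lexicons, and a real tagger would also handle negation, inflection, and syntactic context.

```python
# Minimal lexicon-based emotion profiler in the spirit of EmoAtlas.
# The word -> emotion lexicon here is a toy illustration only.
PLUTCHIK = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]

LEXICON = {  # hypothetical entries for demonstration
    "happy": "joy", "love": "joy", "friend": "trust",
    "afraid": "fear", "shock": "surprise", "cry": "sadness",
    "rotten": "disgust", "furious": "anger", "hope": "anticipation",
}

def emotion_profile(text):
    """Count lexicon hits per Plutchik emotion in a text."""
    counts = {e: 0 for e in PLUTCHIK}
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in LEXICON:
            counts[LEXICON[word]] += 1
    return counts

profile = emotion_profile("I am happy to see my friend, though I was afraid.")
print(profile)
```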
Sensors (Basel)
January 2025
SHCCIG Yubei Coal Industry Co., Ltd., Xi'an 710900, China.
The coal mining industry in Northern Shaanxi is robust, and the local dialect, known as "Shapu" and characterized by a distinct Northern Shaanxi accent, is in prevalent use. This study addresses the practical need for speech recognition in this dialect. We propose an end-to-end speech recognition model for the Northern Shaanxi dialect built on the Conformer architecture.
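For readers unfamiliar with the Conformer, its block interleaves feed-forward, self-attention, and convolution sublayers in a "macaron" pattern, with half-step residual weights on the two feed-forward modules. The sketch below shows only that sublayer wiring; the sublayers themselves are toy stand-ins, not trained neural modules, and the abstract does not describe this paper's specific configuration.

```python
# Structural sketch of a Conformer block's sublayer ordering.
# The sublayers are stubs (toy functions on a list of floats);
# only the residual wiring mirrors the architecture.

def ffn(x):        # stand-in for the feed-forward module
    return [0.1 * v for v in x]

def self_attn(x):  # stand-in for multi-head self-attention
    mean = sum(x) / len(x)
    return [mean for _ in x]

def conv(x):       # stand-in for the convolution module
    return [0.1 * v for v in x]

def layernorm(x):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + 1e-5) ** 0.5 for v in x]

def conformer_block(x):
    x = [a + 0.5 * b for a, b in zip(x, ffn(x))]   # half-step FFN
    x = [a + b for a, b in zip(x, self_attn(x))]   # self-attention
    x = [a + b for a, b in zip(x, conv(x))]        # convolution
    x = [a + 0.5 * b for a, b in zip(x, ffn(x))]   # half-step FFN
    return layernorm(x)                            # final layer norm

out = conformer_block([1.0, 2.0, 3.0])
print(out)
```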
Radiother Oncol
January 2025
Department of Radiation Oncology, Stanford University, Stanford, CA, United States.
Background And Purpose: Radiation therapy (RT) is highly effective, but its success depends on accurate, manual target delineation, which is time-consuming, labor-intensive, and prone to variability. Despite AI advancements in auto-contouring normal tissues, accurate RT target volume delineation remains challenging. This study presents Radformer, a novel visual language model that integrates text-rich clinical data with medical imaging for accurate automated RT target volume delineation.
Brief Bioinform
November 2024
Suzhou Key Lab of Multi-modal Data Fusion and Intelligent Healthcare, No. 1188 Wuzhong Avenue, Wuzhong District, Suzhou 215004, China.
The automatic and accurate extraction of diverse biomedical relations from the literature forms a core element of medical knowledge graphs, which are indispensable for healthcare artificial intelligence. Currently, fine-tuning pre-trained language models (PLMs) with various neural networks stacked on top is a common framework for end-to-end biomedical relation extraction (RE). Nevertheless, sequence-based PLMs, to a certain extent, fail to fully exploit the connections between semantics and the topological features those connections form.
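One simple way to surface such topological features, sketched here with made-up data, is to build an entity co-occurrence graph over sentences and attach graph statistics (node degrees, direct linkage) to each candidate relation pair alongside the PLM's sequence representation. The entities and sentences below are hypothetical, and this is not the paper's actual method.

```python
from itertools import combinations
from collections import defaultdict

# Toy sentences with pre-tagged biomedical entities (hypothetical data).
sentences = [
    {"entities": ["aspirin", "COX-1"]},
    {"entities": ["aspirin", "COX-2"]},
    {"entities": ["COX-1", "inflammation"]},
]

# Build an entity co-occurrence graph: one edge per co-mentioned pair.
adjacency = defaultdict(set)
for sent in sentences:
    for a, b in combinations(sent["entities"], 2):
        adjacency[a].add(b)
        adjacency[b].add(a)

def pair_topology(a, b):
    """Graph-derived features for a candidate relation pair: each
    entity's degree plus whether they are directly co-mentioned."""
    return {
        "deg_a": len(adjacency[a]),
        "deg_b": len(adjacency[b]),
        "linked": b in adjacency[a],
    }

feats = pair_topology("aspirin", "COX-1")
print(feats)
```

In a full RE pipeline these graph features would be concatenated with, or fused into, the PLM's contextual embeddings before the relation classifier.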