One of the fundamental questions about human language is whether all languages are equally complex. Here, we approach this question from an information-theoretic perspective. We present a large-scale quantitative cross-linguistic analysis of written language by training a language model on more than 6500 different documents drawn from 41 multilingual text collections. Together, these collections consist of ~3.5 billion words or ~9.0 billion characters and cover 2069 different languages that are spoken as a native language by more than 90% of the world's population. We statistically infer the entropy of each language model as an index of what we call average prediction complexity. We compare complexity rankings across corpora and show that a language that tends to be more complex than another language in one corpus also tends to be more complex in another corpus. In addition, we show that speaker population size predicts entropy. We argue that both results constitute evidence against the equi-complexity hypothesis from an information-theoretic perspective.
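To make the two quantities in the abstract concrete, the sketch below estimates an entropy rate in bits per character and checks whether two corpora rank languages in the same order. It is only an illustration under stated assumptions: the abstract does not specify the language model, so a small add-alpha smoothed character n-gram model stands in for it here. The function names `bits_per_char` and `spearman`, and the parameters `order` and `alpha`, are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def bits_per_char(train: str, test: str, order: int = 2, alpha: float = 0.1) -> float:
    """Cross-entropy of `test` in bits per character under a character
    n-gram model fitted on `train` (assumed stand-in, not the paper's model).
    `test` must be non-empty."""
    vocab = set(train) | set(test)

    # Count which character follows each context of length `order`.
    context_counts = defaultdict(Counter)
    padded = " " * order + train
    for i in range(order, len(padded)):
        context_counts[padded[i - order:i]][padded[i]] += 1

    # Sum the surprisal, -log2 p(char | context), over the held-out text.
    padded_test = " " * order + test
    total_bits = 0.0
    for i in range(order, len(padded_test)):
        counts = context_counts.get(padded_test[i - order:i], Counter())
        # Add-alpha smoothing keeps unseen characters at non-zero probability.
        prob = (counts[padded_test[i]] + alpha) / (
            sum(counts.values()) + alpha * len(vocab)
        )
        total_bits -= math.log2(prob)
    return total_bits / len(test)


def _ranks(values):
    """Rank positions (1 = smallest). Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = float(rank)
    return ranks


def spearman(x, y):
    """Spearman rank correlation for two equally long, tie-free sequences."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

Given one entropy estimate per language in each of two corpora, a value of `spearman` close to 1 would correspond to the abstract's observation that a language that is more complex in one corpus tends to be more complex in another. A real analysis would need tie handling and a far stronger language model than this toy one.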
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10505229 | PMC |
http://dx.doi.org/10.1038/s41598-023-42327-3 | DOI Listing |