Quantifying patterns of punctuation in modern Chinese prose.

Chaos

Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, ul. Radzikowskiego 152, 31-342 Kraków, Poland.

Published: February 2025

Recent research shows that punctuation patterns in texts exhibit universal features across languages. Analysis of Western classical literature reveals that the distribution of spaces between punctuation marks aligns with a discrete Weibull distribution, typically used in survival analysis. By extending this analysis to Chinese literature represented here by three notable contemporary works, it is shown that Zipf's law applies to Chinese texts similarly to Western texts, where punctuation patterns also improve adherence to the law. Additionally, the distance distribution between punctuation marks in Chinese texts follows the Weibull model, though larger spacing is less frequent than in English translations. Sentence-ending punctuation, representing sentence length, diverges more from this pattern, reflecting greater flexibility in sentence length. This variability supports the formation of complex, multifractal sentence structures, particularly evident in Gao Xingjian's Soul Mountain. These findings demonstrate that both Chinese and Western texts share universal punctuation and word distribution patterns, underscoring their broad applicability across languages.
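As an illustration of the Zipf analysis mentioned in the abstract, the following minimal sketch builds a rank-frequency table in which punctuation marks can be counted as tokens alongside words. It is not the authors' code: the function names, the set of marks in `PUNCT`, and the assumption that the text is already segmented into whitespace-separated words (raw Chinese would first need a word segmenter) are all illustrative choices.

```python
import re
from collections import Counter

# Assumption: an illustrative set of Western and Chinese punctuation marks.
PUNCT = ".,;:!?。，；：！？、"


def tokenize(text, keep_punctuation=True):
    """Split pre-segmented text into lowercase tokens.

    Words are runs of word characters; each mark in PUNCT is its own token.
    """
    pattern = r"\w+|[" + re.escape(PUNCT) + "]"
    tokens = re.findall(pattern, text.lower())
    if keep_punctuation:
        return tokens
    return [t for t in tokens if t not in PUNCT]


def rank_frequency(tokens):
    """Return (rank, token, count) triples sorted by descending count."""
    counts = Counter(tokens).most_common()
    return [(rank, tok, n) for rank, (tok, n) in enumerate(counts, start=1)]


if __name__ == "__main__":
    sample = "The cat, the dog, the end."
    for rank, tok, n in rank_frequency(tokenize(sample)):
        print(rank, repr(tok), n)
```

Plotting count against rank on log-log axes, once with `keep_punctuation=True` and once with `False`, is one way to compare how closely each variant follows a power law.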

Source
http://dx.doi.org/10.1063/5.0248520

Publication Analysis

Top Keywords

punctuation patterns: 8
punctuation marks: 8
chinese texts: 8
western texts: 8
sentence length: 8
punctuation: 7
chinese: 5
texts: 5
quantifying patterns: 4
patterns punctuation: 4

Similar Publications

Punctuation Patterns in Finnegans Wake by James Joyce Are Largely Translation-Invariant.

Entropy (Basel)

February 2025

Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, 31-342 Kraków, Poland.

The complexity characteristics of texts written in natural languages are significantly related to the rules of punctuation. In particular, the distances between punctuation marks measured by the number of words quite universally follow the family of Weibull distributions known from survival analyses. However, the values of two parameters marking specific forms of these distributions distinguish specific languages.
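The measurement described here can be sketched as follows. The code counts the words between consecutive punctuation marks and evaluates a two-parameter discrete Weibull probability mass function. The parameterization P(k) = (1-p)^((k-1)^β) - (1-p)^(k^β) is one common form of the discrete Weibull distribution and is assumed here; the function names and the handling of edge cases are illustrative, not taken from the paper.

```python
def punctuation_distances(tokens, marks):
    """Count the words between consecutive punctuation marks.

    The words before the first mark count as one distance; words after the
    last mark are dropped, and adjacent marks (distance 0) are skipped.
    """
    distances, run = [], 0
    for tok in tokens:
        if tok in marks:
            if run > 0:
                distances.append(run)
            run = 0
        else:
            run += 1
    return distances


def discrete_weibull_pmf(k, p, beta):
    """Assumed form: P(k) = (1-p)^((k-1)^beta) - (1-p)^(k^beta), k = 1, 2, ..."""
    if k < 1:
        return 0.0
    q = 1.0 - p
    return q ** ((k - 1) ** beta) - q ** (k ** beta)


if __name__ == "__main__":
    tokens = ["a", "b", ",", "c", ".", "d"]
    print(punctuation_distances(tokens, {",", "."}))
    print([round(discrete_weibull_pmf(k, 0.3, 1.2), 4) for k in range(1, 6)])
```

With `beta = 1` this form reduces to the geometric distribution with success probability `p`; other values of `beta` change how quickly long distances become rare.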

Dyslexia-dysorthographia is a neurodevelopmental disorder in which the symptoms appear during the person's development (generally around the age of 7 or 8) and persist throughout life. The study of this written language disorder mainly focuses on children, principally in the clinical, cognitive science and neuroscience fields. The importance and originality of this study are that it investigates the impact of dyslexia-dysorthographia on written production in young adults (students) with dyslexia, from a psycholinguistic perspective.

Background: Body condition score (BCS) is a subjective tool and the deposition of subcutaneous fat differs from the deposition of abdominal fat.

Objective: The aim of this study was to evaluate by multivariate regression models and ultrasonography the amount of fat accumulated in different areas of the abdominal cavity in Jersey non-pregnant non-lactating cows and its association with their BCS.

Methods: From a commercial farm, 21 non-pregnant non-lactating Jersey cows were selected at random.

Multifractal Hopscotch in Hopscotch by Julio Cortázar.

Entropy (Basel)

August 2024

Complex Systems Theory Department, Institute of Nuclear Physics, Polish Academy of Sciences, 31-342 Kraków, Poland.

Punctuation is the main factor introducing correlations in natural language written texts and it crucially impacts their overall effectiveness, expressiveness, and readability. Punctuation marks at the end of sentences are of particular importance as their distribution can determine various complexity features of written natural language. Here, the sentence length variability (SLV) time series representing Hopscotch by Julio Cortázar are subjected to quantitative analysis with an attempt to identify their distribution type, long-memory effects, and potential multiscale patterns.
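A sentence length series of the kind analysed here could be built roughly as below. This is a simplified sketch under stated assumptions: sentence length is measured in words, the set of sentence-ending marks is supplied by the caller, and the sample autocorrelation stands in as a crude first look at memory effects. It is not the multifractal analysis used in the paper, and all names are hypothetical.

```python
def sentence_lengths(tokens, enders, ignore=frozenset()):
    """Return the number of words in each sentence, in order of appearance.

    `enders` are sentence-ending marks; tokens in `ignore` (for example
    commas) are not counted as words. A trailing unfinished sentence is dropped.
    """
    lengths, run = [], 0
    for tok in tokens:
        if tok in enders:
            if run > 0:
                lengths.append(run)
            run = 0
        elif tok not in ignore:
            run += 1
    return lengths


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given positive lag."""
    n = len(series)
    if n == 0 or lag >= n:
        return 0.0
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


if __name__ == "__main__":
    tokens = ["a", "b", ",", "c", ".", "d", "e", "!"]
    series = sentence_lengths(tokens, {".", "!"}, ignore={","})
    print(series, autocorrelation(series, 1))
```

An autocorrelation that decays slowly with increasing lag would hint at long memory in the series; confirming it would need a dedicated method such as detrended fluctuation analysis.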
