This work describes the development of a list of monolingual word alignments taken from parallel Russian simplification data. Such word lists can be used in lexical simplification tasks such as rule-based simplification applications and lexically constrained decoding for neural machine translation models. Moreover, they constitute a valuable source of information for developing educational materials for teaching Russian as a second/foreign language. In this work, a word list was compiled automatically and post-edited by human experts. The resulting list contains 1409 word pairs in which each "complex" word has an equivalent "simpler" (shorter, more frequent, modern, international) synonym. We studied the contents of the word list by comparing the frequencies of the words in the pairs and their levels in the special CEFR-graded vocabulary lists for learners of Russian as a foreign language. The evaluation demonstrated that lexical simplification by means of single-word synonym replacement does not occur often in the adapted texts. The resulting list also illustrates the peculiarities of the lexical simplification task for L2 learners, such as the choice of a less frequent but international word.
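As a rough illustration of the rule-based use case mentioned above, the following minimal sketch replaces "complex" words with their "simpler" counterparts by dictionary lookup. The two word pairs, the `PAIRS` name, and the `simplify` function are hypothetical and are not taken from the published list; the sketch also assumes matching on exact surface forms, whereas a practical system for Russian would likely need lemmatization and re-inflection.

```python
import re

# Hypothetical complex -> simple pairs, for illustration only
# (not taken from the 1409-pair list described in the abstract).
PAIRS = {
    "приобрести": "купить",
    "осуществить": "сделать",
}

# \w matches Unicode letters (including Cyrillic) for str patterns in Python 3.
WORD_RE = re.compile(r"\w+")


def simplify(text, pairs):
    """Replace each word found in `pairs` with its simpler synonym.

    Matching is done on the lowercased surface form, so inflected
    forms that are not listed in `pairs` are left unchanged.
    """
    def swap(match):
        word = match.group(0)
        replacement = pairs.get(word.lower())
        if replacement is None:
            return word
        # Keep a sentence-initial capital if the original word had one.
        return replacement.capitalize() if word[0].isupper() else replacement

    return WORD_RE.sub(swap, text)


if __name__ == "__main__":
    print(simplify("Я хочу приобрести книгу.", PAIRS))
```

Exact-match lookup keeps the sketch short; it deliberately ignores context, so it cannot tell when a replacement would change the meaning of the sentence.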

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9510348 (PMC)
http://dx.doi.org/10.3389/frai.2022.984759 (DOI)

Similar Publications

Article Synopsis
  • This study explores how digital platforms affect user behavior in terms of polarization, misinformation, and news consumption by analyzing 34 years' worth of online comments.
  • Researchers examined about 300 million comments from eight different platforms to assess the complexity and changes in language use over time.
  • Findings indicate a general trend of shorter, less rich comments across platforms, while users still introduce new vocabulary consistently, suggesting changes in language are more due to social influences than just platform effects.

Infant-directed speech (IDS) is known to be characterised by phonetic and prosodic cues along with reduced vocabulary and syntax compared to adult-directed speech (ADS). However, there is considerable variation between mothers in the degree of lexical and syntactic reduction of their IDS. The present study aims to investigate the correspondences of the inter-individual variation of maternal IDS at 6 and 18 months with infants' language development at 18 months.

We assessed phonological and apraxic impairments in Hindi persons with aphasia (PwA) and compared them to Italian PwA reported in previous studies. Overall, we found strong similarities. Phonological errors were present across production tasks (repetition, reading and naming), most errors were non-lexical and, among those, a majority involved individual phonemes.

Features of lexical complexity: insights from L1 and L2 speakers.

Front Artif Intell

November 2023

School of Computing, George Mason University, Fairfax, VA, United States.

We discover sizable differences between the lexical complexity assignments of first language (L1) and second language (L2) English speakers. The complexity assignments of 940 shared tokens without context were extracted from three lexical complexity prediction (LCP) datasets and compared: the CompLex dataset, the Word Complexity Lexicon, and the CEFR-J wordlist. It was found that word frequency, length, syllable count, familiarity, and prevalence, as well as the number of derivations, had a greater effect on perceived lexical complexity for L2 English speakers than they did for L1 English speakers.

MeaningBERT: assessing meaning preservation between sentences.

Front Artif Intell

September 2023

Group for Research in Artificial Intelligence of Laval University, Department of Computer Science and Software Engineering, Université Laval, Québec, QC, Canada.

In the field of automatic text simplification, assessing whether or not the meaning of the original text has been preserved during simplification is of paramount importance. Metrics relying on n-gram overlap assessment may struggle to deal with simplifications which replace complex phrases with their simpler paraphrases. Current evaluation metrics for meaning preservation based on large language models (LLMs), such as BERTScore in machine translation or QuestEval in summarization, have been proposed.
