Child-directed print corpora enable systematic psycholinguistic investigations, but this research infrastructure is not available in many understudied languages. Moreover, researchers of understudied languages depend on manual tagging because precise automated parsers are not yet available. One plausible way forward is to limit the intensive work to a small corpus. However, with little systematic enquiry into approaches to corpus construction, it is unclear how robust a small corpus can be made. The current study examines the potential of a non-sequential sampling protocol for small corpus development (NSP-SCD) through a cross-corpora and within-corpus analysis. A corpus comprising 17,584 words was developed by applying the protocol to a larger corpus of 150,595 words from children's books for 3-to-10-year-olds. Although the larger corpus by definition contains more unique words and unique orthographic units, the selectively sampled small corpus approximated the larger corpus for lexical and orthographic diversity and was equivalent for orthographic representation and word length. Psycholinguistic complexity increased by book level and varied by part of speech. Finally, in a robustness check of lexical diversity, the non-sequentially sampled small corpus was more efficient than a same-sized corpus constructed by simply using all sentences from a few books (402 books vs. seven books). If a small corpus must be used, non-sequential sampling from books stratified by book level makes the corpus statistics better approximate those found in larger corpora. Overall, the protocol shows promise as a tool to advance the science of child language acquisition in understudied languages.
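The abstract does not spell out the steps of the protocol, so the following is only a toy sketch of the general idea it describes: drawing sentences non-sequentially from books stratified by book level, and comparing lexical diversity against a same-sized corpus built from all sentences of a few books. The function names, the `books` data layout, the round-robin draw over levels, and the use of a simple count of unique words as the diversity measure are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict


def sample_non_sequential(books, target_words, seed=0):
    """Hypothetical sketch: pool sentences by book level, shuffle each pool,
    then draw one sentence per level in turn until the word target is met.

    books: list of dicts like {"level": int, "sentences": [str, ...]}
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for book in books:
        pools[book["level"]].extend(book["sentences"])
    for pool in pools.values():
        rng.shuffle(pool)

    levels = sorted(pools)
    sample, n_words = [], 0
    while n_words < target_words and any(pools[lv] for lv in levels):
        for lv in levels:
            if pools[lv] and n_words < target_words:
                sentence = pools[lv].pop()
                sample.append(sentence)
                n_words += len(sentence.split())
    return sample


def sample_sequential(books, target_words):
    """Baseline: take every sentence from the first books, in order,
    until the word target is met."""
    sample, n_words = [], 0
    for book in books:
        for sentence in book["sentences"]:
            if n_words >= target_words:
                return sample
            sample.append(sentence)
            n_words += len(sentence.split())
    return sample


def type_count(sentences):
    """Number of unique (lower-cased, whitespace-split) word forms."""
    return len({w.lower() for s in sentences for w in s.split()})
```

Under these assumptions, calling both samplers with the same `target_words` on the same list of books and comparing `type_count` on the two results gives a rough analogue of the robustness check reported above. A real analysis would need language-appropriate tokenization and the orthographic-unit measures the study uses.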
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11133114 | PMC |
| http://dx.doi.org/10.3758/s13428-024-02339-x | DOI Listing |