Publications by authors named "Robert Forkel"

Speech consists of a continuous stream of acoustic signals, yet humans can segment words and other constituents from each other with astonishing precision. The acoustic properties that support this process are not well understood and remain understudied for the vast majority of the world's languages, in particular regarding their potential variation. Here we report cross-linguistic evidence for the lengthening of word-initial consonants across a typologically diverse sample of 51 languages.


Wurm & Hattori's Language Atlas of the Pacific Area describes the geographic speaker areas of the languages and language varieties spoken in the Pacific. Thanks to the efforts of the Electronic Cultural Atlas Initiative, this monumental piece of work has been available in digital form for over 15 years. But lacking proper identification of language varieties, this digitized data was largely unusable for today's research methods.


Every human has a body. Yet, languages differ in how they divide the body into parts to name them. While universal naming strategies exist, there is also variation in the vocabularies of body parts across languages.


Language comparison requires user-friendly tools that facilitate the standardization of linguistic data. We present two resources built on the basis of a standardized cross-linguistic format and show how the data is curated and extended. The first resource, the Concepticon, is a reference catalog for standardized concepts from linguistic research.


Although the peopling of Remote Oceania is well-documented as a general process of eastward migrations from Island Southeast Asia and Near Oceania toward the archipelagos of Remote Oceania, the origin and the development of Polynesian societies in the Western Pacific (Polynesian Outliers), far away from the Polynesian triangle, remain unclear. Here, we present a large-scale geochemical sourcing study of stone artifacts excavated from archeological sites in central Vanuatu, the Solomon Islands, and the Caroline Islands and provide unambiguous evidence of multiple long-distance voyages, with exotic stone materials being transported up to 2500 kilometers from their source. Our results emphasize high mobility in the Western Pacific during the last millennium CE and offer insights on the scale and timing of contacts between the Polynesian Outliers, their neighbors in the Western Pacific, and societies of Western Polynesia.


While global patterns of human genetic diversity are increasingly well characterized, the diversity of human languages remains less systematically described. Here, we outline the Grambank database. With over 400,000 data points and 2400 languages, Grambank is the largest comparative grammatical database available.


Human history is written in both our genes and our languages. The extent to which our biological and linguistic histories are congruent has been the subject of considerable debate, with clear examples of both matches and mismatches. To disentangle the patterns of demographic and cultural transmission, we need a global systematic assessment of matches and mismatches.


Although lexical borrowing is an important aspect of language evolution, there have been few attempts to automate the identification of borrowings in lexical datasets. Moreover, none of the solutions which have been proposed so far identify borrowings across multiple languages. This study proposes a new method for the task and tests it on a newly compiled large comparative dataset of 48 South-East Asian languages from Southern China.


Psychologists and linguists collect various data on word and concept properties. In psychology, scholars have accumulated norms and ratings for a large number of words in languages with many speakers. In linguistics, scholars have accumulated cross-linguistic information about the relations between words and concepts.


Compositional analyses have long been used to determine the geological sources of artefacts. Geochemical "fingerprinting" of artefacts and sources is the most effective way to reconstruct strategies of raw material and artefact procurement, exchange or interaction systems, and mobility patterns during prehistory. The efficacy and popularity of geochemical sourcing has led to many projects using various analytical techniques to produce independent datasets.


Advances in computer-assisted methods have greatly reshaped linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven questions can now be investigated. Such advances, however, place high demands on the rigor with which datasets are prepared and curated.


Many human languages have words for emotions such as "anger" and "fear," yet it is not clear whether these emotions have similar meanings across languages, or why their meanings might vary. We estimate emotion semantics across a sample of 2474 spoken languages using "colexification", a phenomenon in which languages name semantically related concepts with the same word. Analyses show significant variation in networks of emotion concept colexification, which is predicted by the geographic proximity of language families.
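To make the notion of colexification concrete, here is a minimal Python sketch that counts, for each pair of concepts, how many languages use one word form for both. The word list, language names, and word forms are invented for illustration, and the function `colexification_counts` is a hypothetical helper; this is not the study's actual pipeline or data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy word list (assumption): (language, concept, word form).
WORDLIST = [
    ("lang_a", "ANGER", "kiru"),
    ("lang_a", "ENVY", "kiru"),
    ("lang_a", "FEAR", "mosa"),
    ("lang_b", "ANGER", "tal"),
    ("lang_b", "FEAR", "tal"),
    ("lang_b", "ENVY", "nim"),
    ("lang_c", "ANGER", "obe"),
    ("lang_c", "ENVY", "obe"),
    ("lang_c", "FEAR", "ruk"),
]


def colexification_counts(wordlist):
    """Count, per concept pair, the languages that express both with one form."""
    # Group the concepts expressed by each word form within each language.
    concepts_by_form = defaultdict(set)
    for language, concept, form in wordlist:
        concepts_by_form[(language, form)].add(concept)

    # A form covering two or more concepts colexifies every pair among them.
    languages_by_pair = defaultdict(set)
    for (language, _form), concepts in concepts_by_form.items():
        for pair in combinations(sorted(concepts), 2):
            languages_by_pair[pair].add(language)

    return {pair: len(langs) for pair, langs in languages_by_pair.items()}


counts = colexification_counts(WORDLIST)
for (concept_1, concept_2), n_languages in sorted(counts.items()):
    print(concept_1, concept_2, n_languages)
```

The resulting counts can be read as edge weights in a concept network, where two concepts are linked more strongly the more languages colexify them.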


The amount of available digital data for the languages of the world is constantly increasing. Unfortunately, most of the digital data are provided in a large variety of formats and therefore not amenable for comparison and re-use. The Cross-Linguistic Data Formats initiative proposes new standards for two basic types of data in historical and typological language comparison (word lists, structural datasets) and a framework to incorporate more data types (e.


We present a new open source software tool called BEASTling, designed to simplify the preparation of Bayesian phylogenetic analyses of linguistic data using the BEAST 2 platform. BEASTling transforms comparatively short and human-readable configuration files into the XML files used by BEAST to specify analyses. By taking advantage of Creative Commons-licensed data from the Glottolog language catalog, BEASTling allows the user to conveniently filter datasets using names for recognised language families, to impose monophyly constraints so that inferred language trees are backward compatible with Glottolog classifications, or to assign geographic location data to languages for phylogeographic analyses.
