K-SPAN: A lexical database of Korean surface phonetic forms and phonological neighborhood density statistics.

Behav Res Methods

Department of Linguistics and Cognitive Science, Hankuk University of Foreign Studies, Mohyeon, Yongin, Gyeonggi, 17035, South Korea.

Published: October 2017

This article presents K-SPAN (Korean Surface Phonetics and Neighborhoods), a database of surface phonetic forms and several measures of phonological neighborhood density for 63,836 Korean words. Publicly available Korean corpora currently provide only orthographic representations in Hangeul, which is a limitation because phonetic forms in Korean cannot be reliably predicted from orthographic forms. We describe the method used to derive the surface phonetic forms from a publicly available orthographic corpus of Korean, and report several statistics calculated from this database: segment unigram frequencies, which are compared with previously reported results, and segment-based and syllable-based neighborhood density statistics for three types of representation. These are an "orthographic" form, which is a quasi-phonological representation; a "conservative" form, which maintains all known contrasts; and a "modern" form, which represents the pronunciation of contemporary Seoul Korean. The representations are rendered in an ASCII-encoded scheme, which allows users to query the corpus without having to read Korean orthography and permits the calculation of a wide range of phonological measures.
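
To make the two kinds of statistic named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of a phonological neighbor (a word reachable by substituting, deleting, or adding exactly one unit); the abstract itself does not state which definition K-SPAN uses. The toy lexicon, the segment symbols, and the function names `unigram_frequencies` and `neighborhood_density` are all hypothetical and are not taken from the database.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: each word is a tuple of ASCII segment symbols.
# These forms are illustrative only and are NOT taken from K-SPAN.
LEXICON = [
    ("p", "a", "l"),
    ("m", "a", "l"),
    ("t", "a", "l"),
    ("p", "a", "m"),
    ("p", "a"),
    ("s", "a", "l", "m"),
]


def unigram_frequencies(lexicon):
    """Relative type frequency of each unit across all words in the lexicon."""
    counts = Counter(unit for word in lexicon for unit in word)
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()}


def neighborhood_density(lexicon):
    """Number of one-edit neighbors per word (assumed neighbor definition).

    Two distinct words are neighbors if one can be turned into the other by
    a single substitution, deletion, or addition of a unit.
    """
    words = set(lexicon)
    neighbors = {w: set() for w in words}

    # Substitution: words that match after masking one position share a key.
    masked = defaultdict(set)
    for w in words:
        for i in range(len(w)):
            masked[w[:i] + (None,) + w[i + 1:]].add(w)
    for group in masked.values():
        for w in group:
            neighbors[w] |= group - {w}

    # Deletion/addition: dropping one unit from w yields another listed word.
    for w in words:
        for i in range(len(w)):
            shorter = w[:i] + w[i + 1:]
            if shorter in words:
                neighbors[w].add(shorter)
                neighbors[shorter].add(w)

    return {w: len(n) for w, n in neighbors.items()}


freqs = unigram_frequencies(LEXICON)
density = neighborhood_density(LEXICON)
print(density[("p", "a", "l")])       # 4 neighbors: mal, tal, pam, pa
print(density[("s", "a", "l", "m")])  # 0 neighbors in this toy lexicon
```

Because both functions treat a word as a tuple of opaque units, the same sketch would yield a syllable-based count if each tuple element were a syllable rather than a segment. The masked-key index avoids comparing every pair of words, which matters for a lexicon on the scale of tens of thousands of entries.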

Source
http://dx.doi.org/10.3758/s13428-016-0836-8

