Reading Akkadian cuneiform using natural language processing.

PLoS One

Jacob M. Alkow Department of Archaeology and Ancient Near Eastern Civilizations, Tel Aviv University, Tel Aviv, Israel.

Published: December 2020

In this paper we present a new method for automatic transliteration and segmentation of Unicode cuneiform glyphs using Natural Language Processing (NLP) techniques. Cuneiform is one of the earliest known writing systems in the world, documenting millennia of human civilization in the ancient Near East. Hundreds of thousands of cuneiform texts were found in the nineteenth and twentieth centuries CE, most of them written in Akkadian. However, tens of thousands of texts are still to be published. We use models based on machine learning algorithms, such as recurrent neural networks (RNNs), which reach an accuracy of up to 97% in automatically transliterating standard Unicode cuneiform glyphs and segmenting them into words. Our method and results therefore form a major step towards a human-machine interface for creating digitized editions. Our code, Akkademia, is publicly available via a web application, a Python package, and a GitHub repository.
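The task described in the abstract can be pictured as labelling each glyph with a reading and a word-boundary flag. The sketch below is only an illustration of that framing, not the paper's method and not Akkademia's API: it swaps the RNN for a simple most-frequent-label baseline. The training pairs and word boundaries are invented toy data, and the names `fit_baseline` and `transliterate` are hypothetical.

```python
from collections import Counter, defaultdict

# Cuneiform code points for the signs A, AN and BA.
A, AN, BA = "\U00012000", "\U0001202D", "\U00012040"

# Invented toy data (NOT from the paper's corpus). Each item is
# (glyph, reading, ends_word); ends_word marks a word boundary after the sign.
TRAIN = [
    [(A, "a", False), (AN, "an", True)],
    [(BA, "ba", False), (AN, "an", True)],
    [(A, "a", False), (BA, "ba", True)],
    [(A, "a", False), (BA, "ba", True)],
]


def fit_baseline(sentences):
    """Map each glyph to its most frequent (reading, ends_word) label."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for glyph, reading, ends_word in sentence:
            counts[glyph][(reading, ends_word)] += 1
    return {glyph: c.most_common(1)[0][0] for glyph, c in counts.items()}


def transliterate(glyphs, model):
    """Join readings with '-' inside a word and with ' ' between words."""
    words, current = [], []
    for glyph in glyphs:
        # Unseen signs get the placeholder reading "x" and close the word.
        reading, ends_word = model.get(glyph, ("x", True))
        current.append(reading)
        if ends_word:
            words.append("-".join(current))
            current = []
    if current:  # flush a trailing word that never saw a boundary flag
        words.append("-".join(current))
    return " ".join(words)


model = fit_baseline(TRAIN)
print(transliterate(A + BA + A + AN, model))
```

A context-free baseline like this always assigns the same reading to a given sign. A sequence model such as an RNN is a natural fit for the task because the right reading and word boundary can depend on the neighbouring signs.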

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592802
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0240511
