Evaluating prose style transfer with the Bible.

R Soc Open Sci

Department of Computer Science, Dartmouth College Hanover, NH 03755, USA.

Published: October 2018

In the prose style transfer task, a system is given input text and a target prose style and produces output that preserves the meaning of the input while altering its style. Such systems require parallel data for evaluation and usually rely on parallel data for training as well. Currently, few corpora for this task are publicly available. In this work, we identify a high-quality source of aligned, stylistically distinct text in different versions of the Bible. We provide a standardized split of the public domain versions in our corpus into training, development and testing data. The corpus is highly parallel because it includes many Bible versions, and sentences are aligned by the chapter and verse numbers present in every version. In addition to the corpus, we present the results, as measured by the BLEU and PINC metrics, of several models trained on our data, which can serve as baselines for future research. While we present these data as a style transfer corpus, we believe that it is of unmatched quality and may be useful for other natural language tasks as well.
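
The two mechanisms the abstract relies on, verse-keyed alignment and the PINC metric, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the (book, chapter, verse) key layout, and the whitespace tokenization are assumptions made for illustration. PINC is written here in its commonly cited form: the fraction of candidate n-grams that do not appear in the source, averaged over n-gram orders 1 to 4.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical key layout: (book, chapter, verse). The abstract only says
# that chapter and verse numbers make alignment possible.
VerseKey = Tuple[str, int, int]


def align_by_verse(
    source: Dict[VerseKey, str], target: Dict[VerseKey, str]
) -> List[Tuple[str, str]]:
    """Pair up verses that exist in both versions, in sorted key order."""
    shared = sorted(source.keys() & target.keys())
    return [(source[key], target[key]) for key in shared]


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source: str, candidate: str, max_n: int = 4) -> float:
    """Share of candidate n-grams absent from the source, averaged over n.

    Higher values mean the candidate reuses less of the source wording.
    Orders for which the candidate has no n-grams are skipped.
    """
    src_tokens = source.lower().split()   # naive whitespace tokenization
    cand_tokens = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand_tokens, n)
        if not cand_ngrams:
            continue
        overlap = len(cand_ngrams & ngrams(src_tokens, n))
        scores.append(1.0 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Toy strings, not quotations from any particular Bible version.
    version_a = {("Gen", 1, 1): "the old wording here", ("Gen", 1, 2): "only in a"}
    version_b = {("Gen", 1, 1): "a new wording here"}
    for src, tgt in align_by_verse(version_a, version_b):
        print(src, "|", tgt, "| PINC:", round(pinc(src, tgt), 3))
```

PINC only measures how much the output departs from the input wording, so it is normally read alongside BLEU, which measures agreement with a reference in the target style.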


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6227951
DOI: http://dx.doi.org/10.1098/rsos.171920


