Collecting parallel sentences from nonparallel data is a long-standing natural language processing research problem. In particular, parallel training sentences are crucial for the quality of machine translation systems. While many existing methods have shown encouraging results, they cannot learn the varying alignment weights of words in parallel sentences. To address this issue, we propose a novel parallel hierarchical attention neural network that encodes monolingual and bilingual sentences and constructs a classifier to extract parallel sentences. In particular, our attention mechanism can learn different alignment weights for the words in parallel sentences. Experimental results show that our model obtains state-of-the-art performance on the English-French, English-German, and English-Chinese datasets of the BUCC 2017 shared task on parallel sentence extraction.
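The abstract does not give architectural details, so the following is only a minimal sketch of the general idea: encode each sentence of a candidate pair, pool the word states with a learned attention distribution (the "alignment weights"), and feed the pooled vectors to a binary classifier that decides whether the pair is parallel. The GRU encoders, layer sizes, and the concatenation of the two sentence vectors are illustrative assumptions, not the authors' exact model.

```python
# Hedged sketch of an attention-pooled parallel sentence classifier (PyTorch).
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Weighted sum of word states; one learned weight per word."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.context = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, states):                               # (batch, seq, hidden)
        scores = self.context(torch.tanh(self.proj(states)))  # (batch, seq, 1)
        weights = torch.softmax(scores, dim=1)                 # alignment weights
        return (weights * states).sum(dim=1)                   # (batch, hidden)


class ParallelSentenceClassifier(nn.Module):
    """Encode source and target sentences, pool with attention, classify the pair."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hidden_dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.src_enc = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.tgt_enc = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.src_pool = AttentionPooling(2 * hidden_dim)
        self.tgt_pool = AttentionPooling(2 * hidden_dim)
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2),                          # parallel vs. non-parallel
        )

    def forward(self, src_ids, tgt_ids):                       # token ids: (batch, seq)
        src_states, _ = self.src_enc(self.src_emb(src_ids))
        tgt_states, _ = self.tgt_enc(self.tgt_emb(tgt_ids))
        src_vec = self.src_pool(src_states)
        tgt_vec = self.tgt_pool(tgt_states)
        return self.classifier(torch.cat([src_vec, tgt_vec], dim=-1))


# Usage: score a toy batch of candidate sentence pairs.
model = ParallelSentenceClassifier(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (4, 12))
tgt = torch.randint(0, 1000, (4, 15))
logits = model(src, tgt)                                       # (4, 2)
```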


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7482026
DOI: http://dx.doi.org/10.1155/2020/8823906

Publication Analysis

Top Keywords

parallel sentences: 20
sentences: 8
sentences nonparallel: 8
parallel: 8
parallel hierarchical: 8
hierarchical attention: 8
learn alignment: 8
alignment weights: 8
weights parallel: 8
extracting parallel: 4

