Purpose: Many Natural Language Processing (NLP) methods achieve better performance when the input text is preprocessed to remove extraneous content. A technique known as text segmentation can facilitate this step by isolating key sections of a document. Given that transformer models, such as Bidirectional Encoder Representations from Transformers (BERT), have demonstrated state-of-the-art performance on many NLP tasks, it is desirable to leverage such models for segmentation.
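One common way to cast segmentation with a transformer is as sentence-level boundary classification. The sketch below illustrates this framing; it is not the method evaluated here. The model name "bert-base-uncased", the two-label scheme (1 = sentence starts a new section), and the segment() helper are illustrative assumptions, and the classification head is untrained as written, so a real system would first fine-tune it on boundary-labeled data.

```python
# Minimal sketch, assuming segmentation is framed as sentence-level
# boundary classification with a BERT encoder (Hugging Face transformers).
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=2 is an assumed scheme: 1 = sentence begins a new section,
# 0 = sentence continues the current section. The head is randomly
# initialized here; fine-tuning on labeled boundaries is required in practice.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

def segment(sentences):
    """Group consecutive sentences, opening a new segment whenever the
    classifier predicts a section boundary."""
    segments, current = [], []
    for sentence in sentences:
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        if logits.argmax(dim=-1).item() == 1 and current:
            segments.append(current)
            current = []
        current.append(sentence)
    if current:
        segments.append(current)
    return segments
```

Downstream NLP methods would then operate only on the segments identified as key sections, discarding the rest of the document.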