Classifying scientific publications according to Field-of-Science taxonomies is of crucial importance, powering a wealth of relevant applications including Search Engines, Tools for Scientific Literature, Recommendation Systems, and Science Monitoring. Furthermore, it allows funders, publishers, scholars, companies, and other stakeholders to organize scientific literature more effectively, calculate impact indicators along Science Impact pathways, and identify emerging topics that can also facilitate Science, Technology, and Innovation policy-making. As a result, classification schemes for scientific publications underpin a large area of research evaluation, and several such schemes are currently in use.
Background And Objectives: Systematic reviews form the basis of evidence-based medicine, but are expensive and time-consuming to produce. To address this burden, we have developed a literature identification system (Pythia) that combines the query formulation and citation screening steps.
Methods: Pythia incorporates a set of natural-language questions with machine-learning algorithms to rank all PubMed citations based on relevance, returning the 100 top-ranked citations for human screening.
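The "rank everything, keep the top 100" step can be illustrated with a minimal sketch. The abstract does not describe Pythia's actual models, so the scorer below is a placeholder assumption: a simple token-overlap measure between the questions and each citation's text. The names `tokenize`, `overlap_score`, and `top_k_citations` are hypothetical and are not taken from Pythia.

```python
import heapq
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(questions, citation_text):
    """Placeholder relevance score (an assumption, not Pythia's model):
    the fraction of question tokens that also occur in the citation."""
    question_tokens = set().union(*(tokenize(q) for q in questions))
    if not question_tokens:
        return 0.0
    return len(question_tokens & tokenize(citation_text)) / len(question_tokens)


def top_k_citations(questions, citations, k=100):
    """Score every citation and return the k best as (id, score) pairs,
    highest score first. `citations` maps a citation id to its text."""
    scored = (
        (cid, overlap_score(questions, text)) for cid, text in citations.items()
    )
    # heapq.nlargest avoids sorting the full collection when k is small
    # relative to the number of citations.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

In a real system the placeholder scorer would be swapped for a trained relevance model; the selection step with `heapq.nlargest` stays the same, and the returned list is what would be handed to human screeners.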
In this paper, we describe a hierarchical bi-directional attention-based Recurrent Neural Network (RNN) as a reusable sequence encoder architecture, which is used as both sentence encoder and document encoder for document classification. The sequence encoder is composed of two bi-directional RNNs equipped with an attention mechanism that identifies and captures the most important elements in a document (words or sentences), followed by a dense layer for the classification task. Our approach exploits the hierarchical nature of documents: a document is a sequence of sentences, and each sentence is a sequence of words.
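The architecture described above can be sketched in a few lines of dependency-free Python. This is a forward-pass illustration only, under stated assumptions: the abstract does not name the RNN cell type or the exact attention form, so the sketch uses a plain tanh (Elman) cell and additive attention with a learned context vector, and all weights are random and untrained. The names `AttentiveBiRNN`, `classify_document`, `rand_matrix`, and `matvec` are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class AttentiveBiRNN:
    """Reusable sequence encoder: bi-directional tanh RNN plus attention pooling.
    The cell type is an assumption; the source only says "RNN"."""

    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        # One (input, recurrent) weight pair per direction.
        self.fwd = (rand_matrix(hid_dim, in_dim, rng), rand_matrix(hid_dim, hid_dim, rng))
        self.bwd = (rand_matrix(hid_dim, in_dim, rng), rand_matrix(hid_dim, hid_dim, rng))
        self.W_att = rand_matrix(2 * hid_dim, 2 * hid_dim, rng)
        self.u_att = [rng.uniform(-0.1, 0.1) for _ in range(2 * hid_dim)]

    def _run(self, seq, W_in, W_rec):
        h = [0.0] * self.hid_dim
        states = []
        for x in seq:
            pre = [a + b for a, b in zip(matvec(W_in, x), matvec(W_rec, h))]
            h = [math.tanh(p) for p in pre]
            states.append(h)
        return states

    def encode(self, seq):
        """Return (pooled vector of size 2*hid_dim, attention weights)."""
        forward = self._run(seq, *self.fwd)
        backward = self._run(seq[::-1], *self.bwd)[::-1]
        states = [f + b for f, b in zip(forward, backward)]  # concatenate directions
        # Additive attention: score each position against a context vector.
        scores = [
            sum(u * math.tanh(v) for u, v in zip(self.u_att, matvec(self.W_att, s)))
            for s in states
        ]
        weights = softmax(scores)
        pooled = [
            sum(w * s[i] for w, s in zip(weights, states))
            for i in range(2 * self.hid_dim)
        ]
        return pooled, weights


def classify_document(doc, word_encoder, sentence_encoder, W_out):
    """doc: list of sentences, each a list of word vectors.
    Words are pooled into sentence vectors, sentences into a document
    vector, and a dense layer plus softmax yields class probabilities."""
    sentence_vectors = [word_encoder.encode(sentence)[0] for sentence in doc]
    doc_vector, sentence_weights = sentence_encoder.encode(sentence_vectors)
    return softmax(matvec(W_out, doc_vector)), sentence_weights
```

The same class serves as both the word-level and the sentence-level encoder, which is the reuse the abstract emphasizes; the only constraint is that the sentence encoder's input size equals twice the word encoder's hidden size. The attention weights returned at each level indicate which words or sentences contributed most to the pooled representation.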