Although Named Entity Recognition (NER) has been essential in identifying critical elements of unstructured content, generic NER tools remain limited in recognizing entities specific to a domain, such as drug use and public health. For such high-impact areas, accurately capturing relevant entities at a more granular level is critical, as this information influences real-world processes. However, training NER models for a specific domain without handcrafted features requires an extensive amount of labeled data, which is expensive in human effort and time. In this study, we employ distant supervision using a domain-specific ontology to reduce the need for human labor, and we train models that incorporate domain-specific (e.g., drug use) external knowledge to recognize domain-specific entities. We capture entities related to drug use and their trends in government epidemiology reports, with an improvement of 8% in F1-score.
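As a rough illustration of the distant supervision idea mentioned in the abstract, the sketch below labels tokens automatically by matching them against ontology terms, producing BIO tags that could serve as noisy training data. It is a minimal sketch, not the study's pipeline: the `ONTOLOGY` entries, the entity types `DRUG` and `ROUTE`, and the function name `distant_label` are all hypothetical, and the greedy longest-match strategy is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical mini-ontology (an assumption, not from the study):
# lowercased token sequence -> entity type.
ONTOLOGY: Dict[Tuple[str, ...], str] = {
    ("heroin",): "DRUG",
    ("fentanyl",): "DRUG",
    ("crack", "cocaine"): "DRUG",
    ("intravenous", "injection"): "ROUTE",
}


def distant_label(
    tokens: List[str],
    ontology: Dict[Tuple[str, ...], str] = ONTOLOGY,
) -> List[str]:
    """Assign BIO tags by greedy longest match against ontology terms."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Longest ontology term, in tokens; 0 if the ontology is empty.
    max_len = max((len(term) for term in ontology), default=0)

    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            if span in ontology:
                entity_type = ontology[span]
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{entity_type}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


if __name__ == "__main__":
    sentence = "Reports show crack cocaine and Fentanyl use rising".split()
    for token, label in zip(sentence, distant_label(sentence)):
        print(f"{token}\t{label}")
```

Labels produced this way are noisy (exact matching misses spelling variants and cannot resolve ambiguous terms), which is why they are usually treated as weak supervision for a trained model rather than as final output.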
DOI: http://dx.doi.org/10.3233/SHTI220048