Publications by authors named "John DeNero"

Objective: We develop natural language processing (NLP) methods capable of accurately classifying tumor attributes from pathology reports given minimal labeled examples. Our hierarchical cancer to cancer transfer (HCTC) and zero-shot string similarity (ZSS) methods are designed to exploit shared information between cancers and auxiliary class features, respectively, to boost performance using enriched annotations which give both location-based information and document level labels for each pathology report.

Materials And Methods: Our data consists of 250 pathology reports each for kidney, colon, and lung cancer from 2002 to 2019 from a single institution (UCSF).

View Article and Find Full Text PDF

Objective: We aim to build an accurate machine learning-based system for classifying tumor attributes from cancer pathology reports in the presence of a small amount of annotated data, motivated by the expensive and time-consuming nature of pathology report annotation. An enriched labeling scheme that includes the location of relevant information along with the final label is used along with a corresponding hierarchical method for classifying reports that leverages these enriched annotations.

Materials And Methods: Our data consists of 250 colon cancer and 250 kidney cancer pathology reports from 2002 to 2019 at the University of California, San Francisco.

View Article and Find Full Text PDF

Objective: Cancer is a leading cause of death, but much of the diagnostic information is stored as unstructured data in pathology reports. We aim to improve uncertainty estimates of machine learning-based pathology parsers and evaluate performance in low data settings.

Materials And Methods: Our data comes from the Urologic Outcomes Database at UCSF which includes 3232 annotated prostate cancer pathology reports from 2001 to 2018.

View Article and Find Full Text PDF

Background: Individuals are increasingly turning to search engines like Google to obtain health information and access resources. Analysis of Google search queries offers a novel approach, which is part of the methodological toolkit for infodemiology or infoveillance researchers, to understanding population health concerns and needs in real time or near-real time. While searches predominantly have been examined with the Google Trends website tool, newer application programming interfaces (APIs) are now available to academics to draw a richer landscape of searches.

View Article and Find Full Text PDF