Severity: Warning
Message: file_get_contents(https://...@pubfacts.com&api_key=b8daa3ad693db53b1410957c26c9a51b4908&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests
Filename: helpers/my_audit_helper.php
Line Number: 176
Backtrace:
File: /var/www/html/application/helpers/my_audit_helper.php
Line: 176
Function: file_get_contents
File: /var/www/html/application/helpers/my_audit_helper.php
Line: 250
Function: simplexml_load_file_from_url
File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3122
Function: getPubMedXML
File: /var/www/html/application/controllers/Detail.php
Line: 575
Function: pubMedSearch_Global
File: /var/www/html/application/controllers/Detail.php
Line: 489
Function: pubMedGetRelatedKeyword
File: /var/www/html/index.php
Line: 316
Function: require_once
Machine learning (ML) and Natural Language Processing (NLP) have achieved remarkable success in many fields and have brought new opportunities and high expectation in the analyses of medical data, of which the most common type is the massive free-text electronic medical records (EMR). However, the free EMR texts are lacking consistent standards, rich of private information, and limited in availability. Also, it is often hard to have a balanced number of samples for the types of diseases under study. These problems hinder the development of ML and NLP methods for EMR data analysis. To tackle these problems, we developed a model called Medical Text Generative Adversarial Network or mtGAN, to generate synthetic EMR text. It is based on the GAN framework and is trained by the REINFORCE algorithm. It takes disease tags as inputs and generates synthetic texts as EMRs for the corresponding diseases. We evaluate the model from micro-level, macro-level and application-level on a Chinese EMR text dataset. The results show that the method has a good capacity to fit real data and can generate realistic and diverse EMR samples. This provides a novel way to avoid potential leakage of patient privacy while still supply sufficient well-controlled cohort data for developing downstream ML and NLP methods.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1109/TCBB.2019.2948985 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!