Background: There is a growing need to build medical big data from electronic health records collected at different hospitals. Errors inevitably occur, and ways to correct them should be explored.
Methods: Electronic health records of 9,197,817 patients and 53,081,148 visits, totaling about 500 million records for 2006-2016, were transmitted from eight hospitals into an integrated database.
Background: Identification of genes with ascending or descending monotonic expression patterns over time or stages of stem cells is an important issue in time-series microarray data analysis. We propose a method named Monotonic Feature Selector (MFSelector) based on a concept of total discriminating error (DEtotal) to identify monotonic genes. MFSelector considers various time stages in stage order (i.
We previously presented YM500, which is an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from 468 human smRNA-seq datasets. Here in this updated YM500v2 database (http://ngs.ym.
Exome sequencing (exome-seq) has aided in the discovery of a huge number of mutations in cancers, yet challenges remain in converting oncogenomics data into information that is interpretable and accessible for clinical care. We constructed DriverDB (http://ngs.ym.
MicroRNAs (miRNAs) are small RNAs ∼22 nt in length that are involved in the regulation of a variety of physiological and pathological processes. Advances in high-throughput small RNA sequencing (smRNA-seq), one of the next-generation sequencing applications, have reshaped the miRNA research landscape. In this study, we established an integrative database, the YM500 (http://ngs.