Background: Synthesis of data from published human genetic association studies is a critical step in the translation of human genome discoveries into health applications. Although genetic association studies account for a substantial proportion of the abstracts in PubMed, identifying them with standard queries is not always accurate or efficient. Further automating the literature-screening process can reduce the burden of a labor-intensive and time-consuming traditional literature search. The Support Vector Machine (SVM), a well-established machine learning technique, has been successful in classifying text, including biomedical literature. The GAPscreener, a free SVM-based software tool, can be used to assist in screening PubMed abstracts for human genetic association studies.

Results: The data source for this research was the HuGE Navigator, formerly known as the HuGE Pub Lit database. Weighted SVM feature selection based on a keyword list obtained by the two-way z score method demonstrated the best screening performance, achieving 97.5% recall, 98.3% specificity and 31.9% precision in performance testing. Compared with the traditional screening process based on a complex PubMed query, the SVM tool reduced the number of abstracts requiring individual review by the database curator by about 90%. The tool also identified 47 articles that the traditional literature screening process missed during the 4-week test period. As an example, we examined the literature on genetic associations with preterm birth: compared with the traditional, manual process, the GAPscreener both reduced effort and improved accuracy.
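The combination of very high recall and specificity with a much lower precision is typical when relevant abstracts are rare among those screened. The following is a minimal sketch, not part of GAPscreener, of how the three metrics are computed from confusion-matrix counts and how the share of relevant abstracts can be back-calculated from them. The function names and the example counts are hypothetical, and the calculation assumes all three reported figures come from the same test set.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (recall, specificity, precision) from confusion-matrix counts."""
    recall = tp / (tp + fn)          # share of relevant abstracts retrieved
    specificity = tn / (tn + fp)     # share of irrelevant abstracts rejected
    precision = tp / (tp + fp)       # share of retrieved abstracts that are relevant
    return recall, specificity, precision


def implied_prevalence(recall, specificity, precision):
    """Fraction of relevant abstracts implied by the three metrics.

    Rearranges precision = r*p / (r*p + (1 - s)*(1 - p)) to solve for p.
    """
    false_positive_rate = 1.0 - specificity
    numerator = precision * false_positive_rate
    denominator = recall * (1.0 - precision) + numerator
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts chosen only to mimic the reported rates.
    r, s, p = screening_metrics(tp=975, fp=2074, tn=119926, fn=25)
    print(f"recall={r:.3f} specificity={s:.3f} precision={p:.3f}")

    # Figures reported in the abstract: 97.5% recall, 98.3% specificity, 31.9% precision.
    prevalence = implied_prevalence(0.975, 0.983, 0.319)
    print(f"implied share of relevant abstracts: {prevalence:.2%}")
```

Under these assumptions, the reported figures imply that under 1% of screened abstracts were relevant, which is why even a 1.7% false-positive rate yields more false positives than true positives.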

Conclusion: GAPscreener is the first free SVM-based application available for screening the human genetic association literature in PubMed with high recall and specificity. The user-friendly graphical user interface makes this a practical, stand-alone application. The software can be downloaded at no charge.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2387176
DOI: http://dx.doi.org/10.1186/1471-2105-9-205

Publication Analysis

Top Keywords (with counts)

genetic association: 20
human genetic: 16
screening human: 8
association literature: 8
literature pubmed: 8
support vector: 8
vector machine: 8
association studies: 8
traditional literature: 8
gapscreener free: 8

Similar Publications

Therapeutic strategy for efficiently targeting cancer cells needs an in-depth understanding of the cellular and molecular interplay in the tumor microenvironment (TME). TME comprises heterogeneous cells clustered together to translate tumor initiation, migration, and proliferation. The TME mainly comprises proliferating tumor cells, stromal cells, blood vessels, lymphatic vessels, cancer-associated fibroblasts (CAFs), extracellular matrix (ECM), and cancer stem cells (CSC).

WDR74-Mediated Ribosome Biogenesis and Proteome Dynamics During Mouse Preimplantation Development.

Genes Cells

January 2025

Advanced Biological Information Research Division, INAMORI Frontier Research Center, Kyushu University, Fukuoka, Japan.

Preimplantation embryonic development is orchestrated by dynamic changes in the proteome and transcriptome, regulated by mechanisms such as maternal-to-zygotic transition. Here, we employed label-free quantitative proteomics to comprehensively analyze proteome dynamics from germinal vesicle oocytes to blastocysts in mouse embryos. We identified 3490 proteins, including 715 consistently detected across all stages, revealing stage-specific changes in proteins associated with translation, protein modification, and mitochondrial metabolism.

Comparative transcriptome and metabolome analysis of sweet potato (Ipomoea batatas (L.) Lam.) tuber development.

Front Plant Sci

January 2025

Institute of Food Crops, Hainan Academy of Agricultural Sciences/Hainan Key Laboratory of Crop Genetics and Breeding, Haikou, China.

Introduction: Sweet potato is an important food, feed and industrial raw material, and its tubers are rich in starch, carotenoids and anthocyanins.

Methods: To elucidate the gene expression regulation and metabolic characteristics during the development of sweet potato tubers, transcriptomic and metabolomic analyses were performed on the tubers of three different sweet potato varieties at three developmental stages (70, 100, and 130 days (d)).

Results: RNA-seq analysis revealed that 16,303 differentially expressed genes (DEGs) were divided into 12 clusters according to their expression patterns, and the pathways of each cluster were annotated.

Cold stress during the seedling stage significantly threatens rice (Oryza sativa L.) production, specifically in temperate climates. This study aimed to identify quantitative trait loci (QTLs) associated with cold tolerance at the seedling stage.
