Background: Rapid advances in deep neural network models have significantly improved the ability to extract features from microbial sequence data, which is critical for addressing biological challenges. However, the scarcity and complexity of labeled microbial data pose substantial difficulties for supervised learning approaches. To address these issues, we propose DNASimCLR, an unsupervised framework designed for efficient feature extraction from gene sequence data.
Results: DNASimCLR combines convolutional neural networks with SimCLR, a contrastive learning framework, to extract intricate features from diverse microbial gene sequences. Pre-training was conducted on two classic large-scale unlabelled datasets encompassing metagenomes and viral gene sequences. Subsequent classification tasks were performed by fine-tuning the pretrained model. Our experiments demonstrate that DNASimCLR is at least comparable to state-of-the-art techniques for gene sequence classification. Among convolutional neural network-based approaches, DNASimCLR surpasses the latest existing methods, including state-of-the-art CNN-based feature extraction techniques. Furthermore, the model exhibits superior performance across diverse tasks in analyzing biological sequence data, showcasing its robust adaptability.
Conclusions: DNASimCLR represents a robust and database-agnostic solution for gene sequence classification. Its versatility allows it to perform well in scenarios involving novel or previously unseen gene sequences, making it a valuable tool for diverse applications in genomics.
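As an illustration of the contrastive objective that SimCLR-style frameworks rely on, the following is a minimal sketch of the NT-Xent (normalized temperature-scaled cross-entropy) loss in plain Python. It is not the authors' implementation: the function name `nt_xent_loss`, the `temperature` default, and the toy embeddings are assumptions made for illustration, and the abstract does not state which augmentations, encoder, or hyperparameters DNASimCLR uses. Each sequence is assumed to have been encoded twice (two augmented views), and the loss rewards embeddings of the same sequence for being more similar to each other than to embeddings of other sequences in the batch.

```python
import math


def _normalize(vector):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings.

    views_a[i] and views_b[i] are embeddings of two augmented views of the
    same sequence. The temperature value is an illustrative assumption.
    """
    if not views_a or len(views_a) != len(views_b):
        raise ValueError("views_a and views_b must be non-empty and equal in length")

    n = len(views_a)
    z = [_normalize(v) for v in list(views_a) + list(views_b)]

    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same sequence
        # Cosine similarity (vectors are unit length) scaled by the temperature.
        sims = [
            sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Log-sum-exp over every other embedding in the batch (k != i).
        others = [s for k, s in enumerate(sims) if k != i]
        peak = max(others)
        log_denominator = peak + math.log(sum(math.exp(s - peak) for s in others))
        total += log_denominator - sims[positive]
    return total / (2 * n)


# Toy usage: correctly paired views should score a lower loss than mismatched ones.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

In a real pipeline the embeddings would come from the encoder's projection head and the loss would be computed with an autodiff library so that it can be minimized during pre-training; the pure-Python version here only shows the arithmetic.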
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11476100 | PMC |
| http://dx.doi.org/10.1186/s12859-024-05955-8 | DOI Listing |
Plant Physiol
January 2025
Beijing Key Laboratory of Development and Quality Control of Ornamental Crops, Department of Ornamental Horticulture, China Agricultural University, Beijing 100193, China.
Trichomes play a crucial role in plant resistance to abiotic and biotic stresses, and their development and characteristics vary across different species. This study demonstrates that trichomes of Lilium pumilum exhibit synchronized growth during flower bud differentiation and enhance the plant's adaptability to UV-B radiation and aphid infection. We identified LpNAC48, a NAC family transcription factor (TF), that interacted with the B-box (BBX) family TF LpBBX28, during trichome formation in L.
Phytopathology
January 2025
University of Florida, Microbiology & Cell Science, Cancer/Genetics Research Complex 302, 2033 Mowry Road, Gainesville, Florida, United States, 32610;
Sorghum bicolor (L.) Moench is the fifth most important cereal crop and is expected to gain prominence due to its versatility, low input requirements, and tolerance to hot and dry conditions. In warm and humid environments, the productivity of sorghum is severely limited by the hemibiotrophic fungal pathogen Colletotrichum sublineola, the causal agent of anthracnose.
JCO Precis Oncol
January 2025
Department of Urology, Kyoto University School of Medicine, Kyoto, Japan.
Purpose: Circulating tumor DNA (ctDNA) analysis is an alternative to tissue biopsy for genotyping in various cancers. We aimed to establish a plasma ctDNA sequencing assay, then evaluate its clinical utility in advanced urothelial cancer (UC).
Materials And Methods: This study included 82 patients with muscle-invasive or metastatic UC.
Inflamm Bowel Dis
January 2025
Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, Box 1498, New York, NY 10029, USA.
Background: Clonal hematopoiesis of indeterminate potential (CHIP) is the presence of somatic mutations in myeloid and lymphoid malignancy genes in the blood cells of individuals without a hematologic malignancy. Inflammation is hypothesized to be a key mediator in the progression of CHIP to hematologic malignancy and patients with CHIP have a high prevalence of inflammatory diseases. This study aimed to identify the prevalence and characteristics of CHIP in patients with inflammatory bowel disease (IBD).
Proc Natl Acad Sci U S A
January 2025
State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100080, China.
Various mature tissue-resident cells exhibit progenitor characteristics following injury. However, the existence of endogenous stem cells with multiple lineage potentials in the adult spinal cord remains a compelling area of research. In this study, we present a cross-species investigation that extends from development to injury.