This paper describes a method for compressing molecular biology databases, which are characterized by an increasing proportion of data derived from genome projects. The performance of our tool was tested on various data files of the EMBL nucleotide sequence database. The best compression ratios were achieved on EST (Expressed Sequence Tag) data, which are typically derived from large-scale sequencing projects. The compression of sequence database updates was tested in combination with the common Unix compression program 'compress'. On average, our tool improved the efficiency of 'compress' by 16%.
DOI: http://dx.doi.org/10.1093/bioinformatics/11.2.219
PLoS One
January 2025
Department of Reproductive Medicine, Guangzhou Women and Children's Medical Center Liuzhou Hospital, Liuzhou, Guangxi, China.
Endometrial cancer (UCEC) is the most prevalent gynecological malignancy in high-income countries, and its incidence is rising globally. Although early-stage UCEC can be treated with surgery, advanced cases have a poor prognosis, highlighting the need for effective molecular biomarkers to improve diagnosis and prognosis. In this study, we analyzed mRNA and miRNA sequencing data from UCEC tissues and adjacent non-cancerous tissues from the TCGA database.
PLoS One
January 2025
Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, Catanzaro, Italy.
Plant viruses pose a significant threat to global agriculture and require efficient tools for their timely detection. We present AutoPVPrimer, an innovative pipeline that integrates artificial intelligence (AI) and machine learning to accelerate the development of plant virus primers. The pipeline uses Biopython to automatically retrieve different genomic sequences from the NCBI database to increase the robustness of the subsequent primer design.
Plant Dis
January 2025
Guizhou University, Jiaxiu South Street, Huaxi District, Guiyang, China, 550025;
Passion fruit (Passiflora edulis) is a commercially important crop known for its nutritional value, high antioxidant content, and use in beverages and desserts. Gulupa bacilliform virus A (GBVA), tentatively named Badnavirus in the family Caulimoviridae, is a cryptic circular double-stranded DNA (dsDNA, ≈6,951 bp) virus recently reported in Colombia with asymptomatic infection of passion fruit (Sepúlveda et al. 2022).
Plant Dis
January 2025
Wenshan University, Wenshan Sanqi Institute of Science and Technology, Wenshan, Yunnan, China;
The roots of Salvia yunnanensis, an herbaceous perennial widely distributed in Southwest China, are often used as a substitute for S. miltiorrhiza, a highly valued plant in traditional Chinese medicine (Wu et al. 2014).
Nucleic Acids Res
January 2025
Department of Biochemistry, Kyushu University Graduate School of Medical Sciences, Maidashi 3-1-1, Higashi-Ku, Fukuoka 812-8582, Japan.
An enzyme with strong single-stranded DNA (ssDNA) ligation activity would be advantageous for many molecular biology applications. However, currently available enzymes exhibit only limited activity. Here, we identified an enzyme with strong ssDNA ligation activity upon searching the databases for proteins homologous to TS2126 RNA ligase, the known enzyme with the highest yet limited ssDNA ligation activity.