Background: DNA sequences are central to a wide array of biological research. Large sequence databases such as GenBank make it possible to use DNA sequences in large-scale analyses. However, many GenBank sequence records contain more than one gene or represent portions of genomes. Inconsistent gene annotation, together with the numerous synonyms under which a single gene may be listed, makes it difficult to extract large numbers of subsequences for comparative analysis across taxa. At present, there is no easy way to extract portions of many GenBank accessions based on annotations when gene names vary extensively.
Results: The R package allows users to extract sequences based on GenBank annotations through the ACNUC retrieval system, given search terms consisting of gene synonyms and accession numbers. It extracts the subsequences of interest and writes them to a FASTA file for use in downstream research.
Conclusion: The FASTA files of extracted subsequences and the accession tables generated by the package allow users to quickly find and extract subsequences from GenBank accessions. These sequences can then be incorporated into various analyses, such as the construction of phylogenies to test a wide range of ecological and evolutionary hypotheses.
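The synonym-based extraction described in the Results can be illustrated with a minimal Python sketch. It is not the package's implementation (the package is written in R and queries GenBank through ACNUC); it only shows the general idea under stated assumptions. The synonym table, the record layout, the function names, and the accession identifiers are all hypothetical, and the records are hard-coded rather than retrieved from any database.

```python
from io import StringIO

# Hypothetical synonym table: canonical locus name -> annotation spellings
# that should be treated as the same gene (illustrative, not exhaustive).
SYNONYMS = {
    "COI": {"coi", "cox1", "co1", "cytochrome oxidase subunit i"},
    "CYTB": {"cytb", "cob", "cytochrome b"},
}


def canonical_name(annotation):
    """Map an annotation string to its canonical locus name, or None if unknown."""
    key = annotation.strip().lower()
    for locus, names in SYNONYMS.items():
        if key in names:
            return locus
    return None


def extract_subsequences(records):
    """Group annotated subsequences by canonical locus name.

    Each record is a dict with an accession, a sequence string, and a list of
    (annotation, start, end) features using 1-based inclusive coordinates.
    """
    found = {}
    for rec in records:
        for annotation, start, end in rec["features"]:
            locus = canonical_name(annotation)
            if locus is None:
                continue  # annotation is not one of the requested genes
            subsequence = rec["sequence"][start - 1:end]
            found.setdefault(locus, []).append((rec["accession"], subsequence))
    return found


def write_fasta(entries, handle):
    """Write (accession, sequence) pairs to a file-like object in FASTA format."""
    for accession, sequence in entries:
        handle.write(f">{accession}\n{sequence}\n")


# Made-up records standing in for annotated GenBank accessions.
records = [
    {"accession": "ACC0001", "sequence": "AAAACCCCGGGGTTTT",
     "features": [("COX1", 1, 8), ("cob", 9, 16)]},
    {"accession": "ACC0002", "sequence": "TTTTGGGGCCCCAAAA",
     "features": [("CO1", 5, 12), ("tRNA-Phe", 1, 4)]},
]

found = extract_subsequences(records)
buffer = StringIO()
write_fasta(found["COI"], buffer)
coi_fasta = buffer.getvalue()
```

Here the two records annotate the same gene as "COX1" and "CO1"; both resolve to the single locus "COI", so their subsequences end up in one FASTA output, while the unrequested "tRNA-Phe" feature is skipped.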
Link | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6034590 | PMC |
http://dx.doi.org/10.7717/peerj.5179 | DOI Listing |