Motivation: DNA repeats are a common feature of most genomic sequences. Their de novo identification remains difficult, even though it is a crucial step in genomic analysis and oligonucleotide design. Several efficient algorithms based on word counting are available, but words that are too short reduce specificity, while long words reduce sensitivity, particularly for degenerate repeats.
Results: The Repeat Analysis Program (RAP) is based on a new word-counting algorithm optimized for high-resolution repeat identification using gapped words. Many different overlapping gapped words can be counted at the same genomic position, producing a better signal than a single ungapped word. This results in better specificity in two respects: low-frequency detection, where sequences repeated only once can be identified, and detection of highly divergent repeats, where most intron sequences receive a generally high score.
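The general idea of counting gapped words can be illustrated with a minimal Python sketch. This is not the RAP implementation, whose details are not given here; the mask notation (`1` = position compared, `0` = position ignored), the function names, and the scoring rule (occurrences minus one, summed over masks and assigned to the word's start position) are all assumptions made for illustration.

```python
from collections import Counter


def gapped_word(seq, start, mask):
    """Return the letters of seq under the '1' positions of mask, starting at start."""
    return "".join(seq[start + i] for i, m in enumerate(mask) if m == "1")


def count_gapped_words(seq, masks):
    """Count every gapped word in seq, keyed by (mask, word)."""
    counts = Counter()
    for mask in masks:
        for start in range(len(seq) - len(mask) + 1):
            counts[(mask, gapped_word(seq, start, mask))] += 1
    return counts


def position_scores(seq, masks):
    """Hypothetical score: for each start position, sum over masks the number
    of *other* occurrences of the gapped word that starts there."""
    counts = count_gapped_words(seq, masks)
    scores = [0] * len(seq)
    for mask in masks:
        for start in range(len(seq) - len(mask) + 1):
            word = gapped_word(seq, start, mask)
            scores[start] += counts[(mask, word)] - 1
    return scores


# Two copies of a 5-base repeat that differ at their middle base
# (ACGTA at position 0, ACCTA at position 7).
seq = "ACGTAGGACCTA"
ungapped = position_scores(seq, ["11111"])  # contiguous 5-mers
gapped = position_scores(seq, ["11011"])    # middle base ignored
```

In this toy sequence the contiguous 5-mer mask sees every word only once, whereas the gapped mask matches both copies because the mismatching middle base falls in the gap. Passing several masks at once lets multiple overlapping gapped words contribute to the score at the same position.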
Availability: The program is freely available to non-profit organizations on request to the authors.
Contact: giorgio.valle@unipd.it
Supplementary Information: The program has been tested on the Caenorhabditis elegans genome using word lengths of 12, 14 and 16 bases. The full analysis has been implemented in the UCSC Genome Browser and is accessible at http://genome.cribi.unipd.it.
DOI: http://dx.doi.org/10.1093/bioinformatics/bti039