An algorithm is described for generating a long sequence written in a four-letter alphabet from its constituent k-tuple words, in the minimal number of separate, randomly defined fragments of the starting sequence. It is primarily intended for use in the sequencing by hybridization (SBH) process, a potential method for sequencing human genomic DNA (Drmanac et al., Genomics 4, pp. 114-128, 1989). The algorithm is based on previously defined rules and informative entities of the linear sequence. It requires neither knowledge of the number of occurrences of a given k-tuple in the sequence fragments nor information on which k-tuple words lie at the ends of a fragment. It operates on mixed sets of k-tuples of various lengths. The concept of the algorithm allows it to operate on k-tuple sets containing false positive and false negative k-tuples. The content of false k-tuples primarily affects the completeness of the generated sequence, and affects its correctness only in specific cases. The algorithm can be used to optimize SBH parameters in simulation experiments, as well as to generate sequence in real SBH experiments on genomic DNA.
DOI: http://dx.doi.org/10.1080/07391102.1991.10507867
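To make the idea of rebuilding a sequence from k-tuple presence data more concrete, here is a minimal sketch in Python. It is not the algorithm of the paper, whose rules are not given in the abstract. It is a simplified stand-in that assumes a single word length k and joins two k-tuples only when their (k-1)-letter overlap is unambiguous. The function names `k_tuples` and `assemble_unambiguous` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def k_tuples(seq, k):
    """Return the set of k-tuples present in seq (presence only, no counts)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assemble_unambiguous(words):
    """Merge k-tuples into maximal stretches joined by unique (k-1)-letter overlaps.

    Illustrative assumption: all words share one length k. A link w -> v is
    accepted only when w is the sole word ending in the overlap and v is the
    sole word starting with it, so no occurrence counts or end markers are used.
    """
    words = sorted(set(words))
    by_prefix = defaultdict(list)  # (k-1)-letter prefix -> words starting with it
    by_suffix = defaultdict(list)  # (k-1)-letter suffix -> words ending with it
    for w in words:
        by_prefix[w[:-1]].append(w)
        by_suffix[w[1:]].append(w)

    def unique_link(overlap):
        """Return True if exactly one word ends and one word starts with overlap."""
        return (len(by_suffix.get(overlap, [])) == 1
                and len(by_prefix.get(overlap, [])) == 1)

    # Words with no unambiguous predecessor begin a stretch.
    starts = [w for w in words if not unique_link(w[:-1])]

    fragments = []
    visited = set()
    # The second pass over all words picks up pure cycles that have no start.
    for start in starts + words:
        if start in visited:
            continue
        visited.add(start)
        fragment, current = start, start
        while unique_link(current[1:]):
            nxt = by_prefix[current[1:]][0]
            if nxt in visited:
                break
            fragment += nxt[-1]  # extend by the one new letter
            visited.add(nxt)
            current = nxt
        fragments.append(fragment)
    return fragments
```

Under these assumptions, `assemble_unambiguous(k_tuples("ACGTTGCA", 3))` returns the single fragment `ACGTTGCA`, while the 3-tuples of `ACGTACGA`, which contain a repeated `ACG`, give two fragments because the overlap `CG` can be continued in two ways. In this sketch a missing (false negative) k-tuple simply breaks a chain and increases the number of fragments, which is in line with the abstract's remark that false k-tuples mainly affect completeness.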