Motivation: Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays, and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling array data is complicated by the presence of non-unique sequences on the array, which increase the overall noise in the data and may lead to false positive results due to cross-hybridization. The ability to create custom microarrays using maskless array synthesis has led us to consider ways to optimize array design characteristics to improve data quality and analysis. We have identified a number of design parameters to be optimized, including uniqueness of the probe sequences within the whole genome, melting temperature and self-hybridization potential.

Results: We introduce the uniqueness score, U, a novel quality measure for oligonucleotide probes, and present a method to compute it quickly. We show that U is equivalent to the number of shortest unique substrings in the probe and describe an efficient greedy algorithm to design mammalian whole genome tiling arrays using probes that maximize U. Using the mouse genome, we demonstrate how several optimizations influence the tiling array design characteristics. With a sensible set of parameters, our designs cover 78% of the mouse genome, including many regions previously considered 'untilable' due to the presence of repetitive sequence. Finally, we compare our whole genome tiling array designs with commercially available designs.
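
To make the idea of counting shortest unique substrings concrete, the following is a minimal Python sketch on a toy sequence. It is an illustration under stated assumptions, not the authors' implementation: the function names (`shortest_unique_lengths`, `uniqueness_score`, `greedy_tiling`), the `window` and `min_u` parameters, and the brute-force substring counting are all hypothetical choices made here. It assumes U is the number of genome positions inside the probe whose shortest unique substring fits entirely within the probe, and it only considers one strand. A whole-genome implementation would need an index structure rather than the counting used below.

```python
from collections import Counter


def shortest_unique_lengths(genome, max_len):
    """For each start i, return the length of the shortest substring
    starting at i that occurs exactly once in genome, or None if no
    such substring of length <= max_len exists (brute force)."""
    n = len(genome)
    result = [None] * n
    unresolved = set(range(n))
    for length in range(1, max_len + 1):
        # Count every substring of this length, overlaps included.
        counts = Counter(genome[i:i + length] for i in range(n - length + 1))
        for i in list(unresolved):
            if i + length > n:
                unresolved.discard(i)  # substring would run off the end
            elif counts[genome[i:i + length]] == 1:
                result[i] = length
                unresolved.discard(i)
    return result


def uniqueness_score(sus_len, start, probe_len):
    """Assumed definition of U: number of shortest unique substrings
    that lie completely inside the probe [start, start + probe_len)."""
    end = start + probe_len
    return sum(
        1
        for i in range(start, end)
        if sus_len[i] is not None and i + sus_len[i] <= end
    )


def greedy_tiling(genome, probe_len, window, min_u=1):
    """Hypothetical greedy pass: among the next `window` candidate
    starts, place the probe with the highest U, then continue after it.
    Windows whose best probe scores below min_u are skipped."""
    sus_len = shortest_unique_lengths(genome, probe_len)
    last_start = len(genome) - probe_len
    probes = []
    pos = 0
    while pos <= last_start:
        candidates = range(pos, min(pos + window, last_start + 1))
        best = max(candidates, key=lambda s: uniqueness_score(sus_len, s, probe_len))
        u = uniqueness_score(sus_len, best, probe_len)
        if u >= min_u:
            probes.append((best, u))
            pos = best + probe_len
        else:
            pos += window
    return probes
```

For the toy sequence `"AAAC"` with probes of length 3, the probe starting at position 0 (`AAA`) contains one shortest unique substring, while the probe starting at position 1 (`AAC`) contains three (`AAC`, `AC`, `C`), so `greedy_tiling("AAAC", 3, window=2)` selects the second probe.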

Availability: Source code is available under an open source license from http://www.ebi.ac.uk/~graef/arraydesign/.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5892713
DOI: http://dx.doi.org/10.1093/bioinformatics/btm200
