Tiling arrays enable large-scale exploration of the genome because their probes cover the whole genome at very high density, with up to 2,000,000 probes. The biological questions usually addressed are either the difference in expression between two conditions or the detection of transcribed regions. In this work, we propose to address both questions simultaneously as an unsupervised classification problem by modeling the joint distribution of the two conditions. In contrast to previous methods, we account for all available information on the probes as well as biological knowledge such as annotation and spatial dependence between probes. Since probes are not biologically relevant units, we propose a classification rule for non-connected regions covered by several probes. Applications to transcriptomic and ChIP-chip data of Arabidopsis thaliana obtained with a NimbleGen tiling array highlight the importance of precise modeling and of region classification. The "TAHMMAnnot" package is implemented in R and C and is freely available from CRAN.
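To make the idea of a region-level classification rule concrete, the following is a minimal sketch of one way per-probe class probabilities could be combined into a single call for a region covered by several probes. It is not the rule implemented in TAHMMAnnot: the function name `classify_region`, the class labels, and the simplifying assumption that probes contribute independently (which ignores the spatial dependence the abstract mentions) are all hypothetical and chosen only for illustration.

```python
import math


def classify_region(probe_posteriors, labels):
    """Combine per-probe posterior probabilities into one region-level call.

    Assumption (illustrative only): probes are treated as independent, so the
    score of a class for the region is the sum of the log posteriors of its
    probes. The scores are then renormalised to sum to one.
    """
    if not probe_posteriors:
        raise ValueError("a region must contain at least one probe")

    eps = 1e-12  # guards against log(0) for probes with a zero posterior
    scores = [0.0] * len(labels)
    for posterior in probe_posteriors:
        if len(posterior) != len(labels):
            raise ValueError("each probe needs one probability per class")
        for k, p in enumerate(posterior):
            scores[k] += math.log(max(p, eps))

    # Log-sum-exp normalisation keeps the computation stable for long regions.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    region_posterior = [w / total for w in weights]

    best = max(range(len(labels)), key=lambda k: region_posterior[k])
    return labels[best], region_posterior


# Hypothetical example: three probes in one region, three made-up classes.
labels = ["class_a", "class_b", "class_c"]
probes = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
region_label, region_posterior = classify_region(probes, labels)
```

In this toy input the first probe alone would favour `class_a`, but the region as a whole is assigned to `class_b`, which illustrates why classifying regions rather than individual probes can change the conclusion.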
DOI: http://dx.doi.org/10.2202/1544-6115.1692