Publications by Andrew B Clegg

Uncovering the molecular machinery of the human spindle--an integration of wet and dry systems biology.

Ana M Rojas, Anna Santamaria, Rainer Malik, Thomas Skøt Jensen, Roman Körner, Andrew B Clegg

PLoS One

August 2012

The mitotic spindle is an essential molecular machine involved in cell division, whose composition has been studied extensively by detailed cellular biology, high-throughput proteomics, and RNA interference experiments. However, because of its dynamic organization and complex regulation it is difficult to obtain a complete description of its molecular composition. We have implemented an integrated computational approach to characterize novel human spindle components and have analysed in detail the individual candidates predicted to be spindle proteins, as well as the network of predicted relations connecting known and putative spindle proteins.

Extending CATH: increasing coverage of the protein structure universe and linking structure with function.

Alison L Cuff, Ian Sillitoe, Tony Lewis, Andrew B Clegg, Robert Rentzsch

Nucleic Acids Res

January 2011

CATH version 3.3 (class, architecture, topology, homology) contains 128,688 domains, 2386 homologous superfamilies and 1233 fold groups, and reflects a major focus on classifying structural genomics (SG) structures and transmembrane proteins, both of which are likely to add structural novelty to the database and therefore increase the coverage of protein fold space within CATH. For CATH version 3.

Finding the "dark matter" in human and yeast protein network prediction and modelling.

Juan A G Ranea, Ian Morilla, Jon G Lees, Adam J Reid, Corin Yeats, Andrew B Clegg

PLoS Comput Biol

September 2010

Accurate modelling of biological systems requires a deeper and more complete knowledge about the molecular components and their functional associations than we currently have. Traditionally, new knowledge on protein associations generated by experiments has played a central role in systems modelling, in contrast to generally less trusted bio-computational predictions. However, we will not achieve realistic modelling of complex molecular systems if the current experimental designs lead to biased screenings of real protein networks and leave large, functionally important areas poorly characterised.

CODA: accurate detection of functional associations between proteins in eukaryotic genomes using domain fusion.

Adam J Reid, Juan A G Ranea, Andrew B Clegg, Christine A Orengo

PLoS One

June 2010

Background: In order to understand how biological systems function it is necessary to determine the interactions and associations between proteins. Gene fusion prediction is one approach to detection of such functional relationships. Its use is however known to be problematic in higher eukaryotic genomes due to the presence of large homologous domain families.

The EMBRACE web service collection.

Steve Pettifer, Jon Ison, Matús Kalas, Dave Thorne, Philip McDermott, Andrew B Clegg

Nucleic Acids Res

July 2010

The EMBRACE (European Model for Bioinformatics Research and Community Education) web service collection is the culmination of a 5-year project that set out to investigate issues involved in developing and deploying web services for use in the life sciences. The project concluded that in order for web services to achieve widespread adoption, standards must be defined for the choice of web service technology, for semantically annotating both service function and the data exchanged, and a mechanism for discovering services must be provided. Building on this, the project developed: EDAM, an ontology for describing life science web services; BioXSD, a schema for exchanging data between services; and a centralized registry (http://www.

An integrated approach to the interpretation of single amino acid polymorphisms within the framework of CATH and Gene3D.

Jose M G Izarzugaza, Anja Baresic, Lisa E M McMillan, Corin Yeats, Andrew B Clegg

BMC Bioinformatics

August 2009

Background: The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. To better understand the mechanisms behind known mutant phenotypes, and to predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized.

A realistic assessment of methods for extracting gene/protein interactions from free text.

Renata Kabiljo, Andrew B Clegg, Adrian J Shepherd

BMC Bioinformatics

July 2009

Background: The automated extraction of gene and/or protein interactions from the literature is one of the most important targets of biomedical text mining research. In this paper we present a realistic evaluation of gene/protein interaction mining relevant to potential non-specialist users. Hence we have specifically avoided methods that are complex to install or require reimplementation, and we coupled our chosen extraction methods with a state-of-the-art biomedical named entity tagger.

Text mining.

Andrew B Clegg, Adrian J Shepherd

Methods Mol Biol

October 2008

One of the fastest-growing fields in bioinformatics is text mining: the application of natural language processing techniques to problems of knowledge management and discovery, using large collections of biological or biomedical text such as MEDLINE. The techniques used in text mining range from the very simple (e.g.

Benchmarking natural-language parsers for biological applications using dependency graphs.

Andrew B Clegg, Adrian J Shepherd

BMC Bioinformatics

January 2007

Background: Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria.
