Publications by authors named "Ian Castleden"

In eukaryotic organisms, subcellular protein location is critical in defining protein function and understanding sub-functionalization of gene families. Some proteins have defined locations, whereas others have low specificity targeting and complex accumulation patterns. There is no single approach that can be considered entirely adequate for defining the in vivo location of all proteins.

The increased diversity and scale of published biological data have led to a growing appreciation for the applications of machine learning and statistical methodologies to gain new insights. Key to achieving this aim is solving the Relationship Extraction problem, which specifies the semantic interaction between two or more biological entities in a published study. Here, we employed two deep neural network natural language processing (NLP) methods: the continuous bag of words (CBOW) and the bi-directional long short-term memory (bi-LSTM).

Agriculture faces increasing demand for yield, higher plant-derived protein content and diversity while facing pressure to achieve sustainability. Although the genomes of many of the important crops have been sequenced, the subcellular locations of most of the encoded proteins remain unknown or are only predicted. Protein subcellular location is crucial in determining protein function and accumulation patterns in plants, and is critical for targeted improvements in yield and resilience.

The RNA-binding pentatricopeptide repeat (PPR) family comprises hundreds to thousands of genes in most plants, but only a few dozen in algae, indicating massive gene expansions during land plant evolution. The nature and timing of these expansions have not been well defined due to the sparse sequence data available from early-diverging land plant lineages. In this study, we exploit the comprehensive OneKP datasets of over 1000 transcriptomes from diverse plants and algae toward establishing a clear picture of the evolution of this massive gene family, focusing on the proteins typically associated with RNA editing, which show the most spectacular variation in numbers and domain composition across the plant kingdom.

Queens of social insects make all mate-choice decisions on a single day, except in honeybees whose queens can conduct mating flights for several days even when already inseminated by a number of drones. Honeybees therefore appear to have a unique, evolutionarily derived form of sexual conflict: a queen's decision to pursue risky additional mating flights is driven by later-life fitness gains from genetically more diverse worker-offspring but reduces paternity shares of the drones she already mated with. We used artificial insemination, RNA-sequencing and electroretinography to show that seminal fluid induces a decline in queen vision by perturbing the phototransduction pathway within 24-48 hr.

Measuring changes in protein or organelle abundance in the cell is an essential but challenging aspect of cell biology. Frequently used methods for determining organelle abundance typically rely on detection of very few marker proteins, so are unsatisfactory. In silico estimates of protein abundances from publicly available protein spectra can provide useful standard abundance values but contain only data from tissue proteomes, and are not coupled to organelle localization data.

We applied 15N labeling approaches to leaves of the Arabidopsis rosette to characterize their protein degradation rate and understand its determinants. Progressively labeling new peptides with 15N and measuring the decrease in the abundance of >60,000 existing peptides over time allowed us to define the degradation rate of 1228 proteins in vivo. We show that Arabidopsis protein half-lives vary from several hours to several months based on the exponential constant of the decay rate for each protein.
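
The link between an exponential decay constant and a half-life can be illustrated with a minimal sketch. It assumes simple first-order decay, which the abstract implies but does not spell out. The function names and the example rate constant are hypothetical and are not taken from the study.

```python
import math


def half_life_from_rate(k_per_day: float) -> float:
    """Half-life in days for first-order decay with rate constant k (per day).

    Assumes abundance follows A(t) = A0 * exp(-k * t), so t_half = ln(2) / k.
    """
    if k_per_day <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_per_day


def fraction_remaining(k_per_day: float, t_days: float) -> float:
    """Fraction of the original protein pool still present after t days."""
    return math.exp(-k_per_day * t_days)


if __name__ == "__main__":
    k = 0.1  # hypothetical degradation rate constant, per day
    t_half = half_life_from_rate(k)
    print(f"half-life: {t_half:.2f} days")
    print(f"fraction left at the half-life: {fraction_remaining(k, t_half):.2f}")
```

Under this assumption a tenfold difference in the decay constant gives a tenfold difference in half-life, which is how a range of constants can span half-lives from hours to months.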

The SUBcellular location database for Arabidopsis proteins (SUBA4, http://suba.live) is a comprehensive collection of manually curated published data sets of large-scale subcellular proteomics, fluorescent protein visualization, protein-protein interaction (PPI) as well as subcellular targeting calls from 22 prediction programs. SUBA4 contains an additional 35 568 localizations totalling more than 60 000 experimental protein location claims as well as 37 new suborganellar localization categories.

The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30-40 amino acid motifs that form an extended binding surface capable of sequence-specific recognition of RNA strands. Almost all of them are post-translationally targeted to plastids and mitochondria, where they play important roles in post-transcriptional processes including splicing, RNA editing and the initiation of translation.

Barley, wheat, rice and maize provide the bulk of human nutrition and have extensive industrial use as agricultural products. The genomes of these crops each contain >40,000 genes encoding proteins; however, the major genome databases for these species lack annotation information of protein subcellular location for >80% of these gene products. We address this gap by constructing a compendium of crop protein subcellular locations called crop Proteins with Annotated Locations (cropPAL).

Sub-functionalization during the expansion of gene families in eukaryotes has occurred in part through specific subcellular localization of different family members. To better understand this process in plants, compiled records of large-scale proteomic and fluorescent protein localization datasets can be explored and bioinformatic predictions for protein localization can be used to predict the gaps in experimental data. This process can be followed by targeted experiments to test predictions.

Motivation: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory.

In reverse genetic knockout (KO) studies that aim to assign function to specific genes, confirming the reduction in abundance of the encoded protein will often strengthen the link between genotype and phenotype. However, measuring specific protein abundance is particularly difficult in plant research, where only a limited number of antibodies are available. This problem is compounded when studying gene families or different proteins derived from the same gene (isoforms), as many antibodies cross-react with more than one protein.

Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or 'expressology', thus using and combining available information from two pre-eminent plant models.

Fluorescent protein (FP) tagging approaches are widely used to determine the subcellular location of plant proteins. Here we give a brief overview of FP approaches, highlight potential technical problems, and discuss what to consider when designing FP/protein fusion constructs and performing transformation assays. We analyze published FP tagging data sets along with data from proteomics studies collated in SUBA3, a subcellular location database for Arabidopsis proteins, and assess the reliability of these data sets by comparing them.

The provision of precise metadata is an important but largely underrated challenge for modern science [Nature 2009, 461, 145]. We describe here a dictionary methods language, dREL, that has been designed to enable complex data relationships to be expressed as formulaic scripts in data dictionaries written in DDLm [Spadaccini and Hall J. Chem.

Background: Arabidopsis thaliana is clearly established as the model plant species. Given the ever-growing demand for food, there is a need to translate the knowledge learned in Arabidopsis to agronomically important species, such as rice (Oryza sativa). To gain a comparative insight into the similarities and differences in how organs are built and how plants respond to stress, the transcriptomes of Arabidopsis and rice were compared at the level of gene orthology and functional categorisation.

Proteomics has become a critical tool in the functional understanding of plant processes at the molecular level. Proteomics-based studies have also contributed to the ever-expanding array of data in modern biology, with many generating Web portals and online resources that contain incrementally expanding and updated information. Many of these resources reflect specialist research areas with significant and novel information that is not currently captured by centralized repositories.

Plant mitochondria play central roles in cellular energy production, metabolism and stress responses. Recent phosphoproteomic studies in mammalian and yeast mitochondria have presented evidence indicating that protein phosphorylation is a likely regulatory mechanism across a broad range of important mitochondrial processes. This study investigated protein phosphorylation in purified mitochondria from cell suspensions of the model plant Arabidopsis thaliana using affinity enrichment and proteomic tools.

Peroxisomes play key roles in energy metabolism, cell signaling, and plant development. A better understanding of these important functions will be achieved with a more complete definition of the peroxisome proteome. The isolation of peroxisomes and their separation from mitochondria and other major membrane systems have been significant challenges in the Arabidopsis (Arabidopsis thaliana) model system.
