Background: Pulsed-field gel electrophoresis (PFGE) is currently the method most widely and routinely used by the Centers for Disease Control and Prevention (CDC) and state health laboratories in the United States for Salmonella surveillance and outbreak tracking. The major drawbacks of commercially available PFGE analysis programs are their difficulty in handling large datasets and their limited range of analysis tools. New analytical tools for PFGE data mining are needed to make full use of the valuable data in large surveillance databases.

Results: In this study, a software package was developed that implements five bioinformatics approaches for the analysis and visualization of PFGE fingerprints: PFGE band standardization, Salmonella serotype prediction, hierarchical cluster analysis, distance matrix analysis, and two-way hierarchical cluster analysis. PFGE band standardization makes cross-group analysis of large datasets possible. The Salmonella serotype prediction approach allows users to predict the serotypes of Salmonella isolates from their PFGE patterns. The hierarchical cluster analysis approach can be used to clarify subtypes and phylogenetic relationships among groups of PFGE patterns. The distance matrix and two-way hierarchical cluster analysis tools allow users to directly visualize the similarities and dissimilarities between any two individual patterns, as well as the inter- and intra-serotype relationships of two or more serotypes. They also summarize the overall relationships between user-selected serotypes and the band markers that distinguish those serotypes. The functionality of these tools was illustrated on PFGE fingerprinting data from CDC's PulseNet.
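The distance matrix and hierarchical cluster analysis steps described above can be illustrated with a minimal sketch. This is not the package's implementation: the abstract does not state which similarity measure or linkage rule is used, so the sketch assumes the Dice coefficient on binary band-presence vectors and average-linkage (UPGMA) clustering, both common choices for PFGE fingerprints. The isolate names and band vectors are hypothetical.

```python
from itertools import combinations


def dice_distance(a, b):
    """Return 1 - Dice similarity of two binary band-presence vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 0.0 if total == 0 else 1.0 - 2.0 * shared / total


def distance_matrix(patterns):
    """Map every ordered pair of isolate names to its Dice distance."""
    return {
        (p, q): dice_distance(patterns[p], patterns[q])
        for p in patterns
        for q in patterns
    }


def upgma(patterns):
    """Average-linkage agglomerative clustering.

    Returns the merges in order as (cluster_1, cluster_2, distance),
    where each cluster is a tuple of isolate names.
    """
    dist = distance_matrix(patterns)

    def avg(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[(p, q)] for p in c1 for q in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in patterns]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((c1, c2, avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical standardized patterns: 1 = band present at that position.
patterns = {
    "iso_A": [1, 1, 0, 1, 0, 1],
    "iso_B": [1, 1, 0, 1, 0, 0],
    "iso_C": [0, 0, 1, 0, 1, 1],
    "iso_D": [0, 0, 1, 0, 1, 0],
}

for left, right, d in upgma(patterns):
    print(left, right, round(d, 3))
```

The sketch presumes that band standardization has already mapped every gel onto the same set of band positions, which is what makes vectors from different groups comparable. The merge list is enough to draw a dendrogram; a production tool would more likely rely on an established clustering library.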

Conclusions: The bioinformatics approaches in the software package developed in this study were integrated with the PFGE database to enhance data mining of PFGE fingerprints. Fast and accurate prediction makes it possible to obtain Salmonella serotype information before conventional serological methods are pursued. Bioinformatics tools that distinguish PFGE markers and serotype-specific patterns will enhance PFGE data retrieval, interpretation, and serotype identification, and will likely accelerate source tracking to identify the Salmonella isolates implicated in foodborne diseases.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3851133
DOI: http://dx.doi.org/10.1186/1471-2105-14-S14-S15

Publication Analysis

Top Keywords

hierarchical cluster: 16
cluster analysis: 16
pfge: 13
data mining: 12
salmonella serotype: 12
analysis: 10
analysis tools: 8
pfge data: 8
software package: 8
package developed: 8

Similar Publications

Background: The study aims to explore the epidemiologic information related to severe periodontitis in China.

Methods: We analyzed data from the Global Burden of Disease (GBD) 2021 study to delineate the incidence, prevalence, and disability-adjusted life years (DALYs) attributable to severe periodontitis in China, stratified by age and gender. A range of analytical methods, including comparative analysis, trend analysis, decomposition analysis, hierarchical cluster analysis, health inequality analysis, and predictive modeling, were employed to provide a comprehensive evaluation of the disease burden.

Terrestrial ecosystems have vital impacts on soil carbon sequestration, but under disturbances from anthropogenic activities, the typical indicator combinations of SOC distribution in coastal areas remain unclear. On the basis of surface soil sampling and calculations of related eco-environmental indices in the Yellow River Delta (YRD), we performed geostatistical analysis combined with Spearman's correlation analysis, principal component analysis (PCA), and hierarchical clustering analysis (HCA) to explore the spatial heterogeneity of soil organic carbon (SOC) and influential spatiotemporal factors. Overall, the results revealed that in the seaward direction of the Yellow River, the SOC concentration decreased from west to east, with a low mean value of 5.

Testing for a difference in means of a single feature after clustering.

Biostatistics

December 2024

Department of Statistics, University of British Columbia, 3182 Earth Sciences Building, 2207 Main Mall, Vancouver, BC V6T 1Z4, Canada.

For many applications, it is critical to interpret and validate groups of observations obtained via clustering. A common interpretation and validation approach involves testing differences in feature means between observations in two estimated clusters. In this setting, classical hypothesis tests lead to an inflated Type I error rate.

In this study, 70 Salvia nemorosa L. populations collected from 54 habitats in West Azerbaijan province, Iran, were screened by analyzing their phytochemical content, antioxidant activity, and UHPLC-HRMS profiles. The aerial parts of the plants were analyzed for total phenolic content (TPC), total flavonoid content (TFC), total tannin content (TTC), ascorbic acid content (AAC), chlorophylls (Cla and Clb), total carotenoid content (TCC), β-carotene, antioxidant activity (by DPPH and FRAP assays), and 40 polyphenolic compounds by UHPLC-HRMS (phenolic acids, flavonoids, and fatty acyl glycosides).

Wealth inequality is one of the most profound challenges confronting society today. However, an important issue in addressing inequality lies in formalizing the diversity of individual perspectives regarding what constitutes a fair distribution of resources. We tackle this topic by simulating wealth inequality through the allocation of bonus endowments in both Dictator Game (DG) and Ultimatum Game (UG) settings and capturing distributive decisions.
