Background: Dengue is a common vector-borne disease in tropical countries caused by the dengue virus (DENV). Infection can produce symptoms such as fever, headache, nausea, vomiting, and muscle pain. It can also progress to severe, life-threatening conditions such as hemorrhagic fever and dengue shock syndrome. The factors that lead hosts to develop severe infections are multiple and not fully understood. However, it is hypothesized that different viral genome signatures may partially contribute to the disease outcome. A deeper analysis of DENV genetic information may therefore bring new clues about genetic markers linked to severe illness.

Method: Pattern recognition in very long protein sequences is a challenge. To overcome this difficulty, we map protein chains onto matrix data structures that reveal patterns and allow us to classify dengue proteins associated with severe illness outcomes in human hosts. Our analysis uses co-occurrence of amino acids to build the matrices and Random Forests to classify them. We then interpret the classification model using SHAP Values to identify which amino acid co-occurrences increase the likelihood of severe outcomes.
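The mapping from a protein chain to a matrix can be illustrated with a minimal sketch. The abstract does not define co-occurrence precisely, so this example assumes it means counting how often one residue is followed by another at a fixed distance (`offset`, a hypothetical parameter). The function names and the example fragment are illustrative and are not taken from the paper.

```python
# Assumption: "co-occurrence" = ordered residue pairs at a fixed distance.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def cooccurrence_matrix(sequence, offset=1):
    """Return a 20x20 matrix where cell [a][b] counts how often
    residue b appears `offset` positions after residue a."""
    size = len(AMINO_ACIDS)
    matrix = [[0] * size for _ in range(size)]
    for a, b in zip(sequence, sequence[offset:]):
        # Skip ambiguous or non-standard symbols such as 'X'.
        if a in INDEX and b in INDEX:
            matrix[INDEX[a]][INDEX[b]] += 1
    return matrix


def flatten(matrix):
    """Turn the matrix into a fixed-length feature vector (400 values)."""
    return [count for row in matrix for count in row]


example = "MRCIGISNRD"  # hypothetical short fragment, for illustration only
features = flatten(cooccurrence_matrix(example))
```

A vector like `features` has the same length for every sequence regardless of protein length, so it could be passed to a tree-based classifier such as a Random Forest, and per-feature attributions (for example SHAP values) would then map back to specific amino acid pairs.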

Results: We trained ten binary classifiers, one for each dengue virus protein sequence. We assessed classifier performance through five metrics: PR-AUC, ROC-AUC, F1-score, Precision and Recall. Protein E obtained the highest score on all metrics, reported with a 95% confidence interval. We also compared the means of the classification metrics using the Tukey HSD statistical test. In four of five metrics, protein E was statistically different from proteins M, NS1, NS2A, NS2B, NS3, NS4A, NS4B and NS5, suggesting that markers in E have a greater chance of being associated with severe dengue. Furthermore, the amino acid co-occurrence matrix highlights pairs of amino acids within Domain 1 of the E protein that may be associated with the classification result.
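Three of the five metrics named above follow directly from the counts of true positives, false positives and false negatives. The sketch below shows those standard definitions for a binary classifier. The function name and the toy labels are hypothetical, and it is not the evaluation code used in the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute Precision, Recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy labels (1 = severe outcome, 0 = non-severe), for illustration only.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

PR-AUC and ROC-AUC additionally require the classifier's predicted scores rather than hard labels, so they are not covered by this sketch.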

Conclusion: We show the co-occurrence patterns of amino acids present in the protein sequences that most correlate with severe dengue. This evidence, used by the classification model and verified by statistical tests, mainly associates the E protein with the severe outcome of dengue in human hosts. In addition, we present information suggesting that patterns associated with such severe cases are found mostly in Domain 1 of protein E. Altogether, our results may aid in developing new treatments and inform debate on new theories regarding dengue infection in human hosts.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8858567
DOI: http://dx.doi.org/10.1186/s12859-022-04597-y
