Publications by authors named "Xia-Yu Xia"

Because of the limited effectiveness of prevailing phylogenetic methods when applied to highly divergent protein sequences, the phylogenetic analysis problem remains challenging. Here, we propose a sequence-based evolutionary distance algorithm termed sequence distance (SD), which innovatively incorporates site-to-site correlation within protein sequences into the distance estimation. In protein superfamilies, SD can effectively distinguish evolutionary relationships both within and between protein families, producing phylogenetic trees that closely align with those based on structural information, even with sequence identity less than 20%.


The heat-tolerance mechanisms of (hyper)thermophilic proteins provide a unique opportunity to investigate the unsolved protein folding problem. In an attempt to determine whether the interval between residues in sequence might play a role in determining thermostability, we constructed a sequence interval-dependent value function to calculate the residue pair frequency. Additionally, we identified a new sequence arrangement pattern, where like-charged residues tend to be adjacently assembled, while unlike-charged residues are distributed over longer intervals, using statistical analysis of a large sequence database.
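The interval-dependent pair statistics described above can be illustrated with a small sketch. It only counts residue pairs separated by a fixed sequence interval `k` and reports how many charged pairs are like-charged; it is not the authors' value function, whose exact form is not given here. The function names, the `CHARGE` table (K/R positive, D/E negative), and the toy sequence are assumptions made for illustration.

```python
from collections import Counter

# Assumed charge assignment for illustration: Lys/Arg positive, Asp/Glu negative.
CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}


def pair_counts_at_interval(sequences, k):
    """Count ordered residue pairs (a, b) where b lies k positions after a."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[(seq[i], seq[i + k])] += 1
    return counts


def pair_frequencies(sequences, k):
    """Normalize the pair counts at interval k into frequencies."""
    counts = pair_counts_at_interval(sequences, k)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def like_charged_fraction(sequences, k):
    """Fraction of charged-charged pairs at interval k that carry the same sign.

    Returns None when no charged-charged pair exists at that interval.
    """
    like = unlike = 0
    for (a, b), n in pair_counts_at_interval(sequences, k).items():
        if a in CHARGE and b in CHARGE:
            if CHARGE[a] == CHARGE[b]:
                like += n
            else:
                unlike += n
    total = like + unlike
    return like / total if total else None


toy = ["KKDE"]  # hypothetical toy sequence, not real data
adjacent = like_charged_fraction(toy, 1)
spaced = like_charged_fraction(toy, 2)
```

Comparing `like_charged_fraction` across increasing `k` on a real sequence set is one simple way to look for the kind of pattern the abstract reports, with like-charged residues enriched at short intervals.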


In the early stages of infection, Human Immunodeficiency Virus Type 1 (HIV-1) generally selects CCR5 as the primary coreceptor for entering the host cell. As infection progresses, the virus evolves and may switch its coreceptor to CXCR4. Accurate determination of coreceptor usage and identification of the key mutational patterns associated with the tropism switch are essential for selecting appropriate therapies and for understanding the mechanism of coreceptor change.

Article Synopsis
  • The new SCBD assay leverages the clustering of viral sequences from transmitted/founder strains to improve accuracy in identifying incident infections by analyzing early-stage intra-cluster diversity.
  • Testing with a dataset of 398 incident and 163 chronic cases showed the SCBD method has high sensitivity (99.5%) and specificity (98.8%), indicating its potential as a valuable tool for identifying recent HIV-1 infections, especially in resource-limited settings.
Article Synopsis
  • The prediction of protein domain structural classes from primary sequences is challenging due to the gap between sequencing and structural genomics.
  • Traditional predictors have struggled, with a top accuracy of just 74.1% using conventional sequence features on a standard dataset (25PDB).
  • A new multiple linear regression (MLR) model was developed that converts a 440-dimensional sequence feature into a 4-dimensional structural vector, achieving an accuracy of 83.1% in tests on a large dataset.
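The mapping from a 440-dimensional feature to a 4-dimensional vector can be sketched as an ordinary least-squares fit. This is a minimal illustration, not the published model: it assumes the 4-dimensional target is a one-hot encoding of four structural classes, takes the class with the largest predicted component as the prediction, and uses synthetic random data in place of real sequence features. The helper names `fit_mlr` and `predict_class` are hypothetical.

```python
import numpy as np


def fit_mlr(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W (a bias column is appended to X)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W


def predict_class(X, W):
    """Map features to the 4-dimensional output and pick the largest component."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xb @ W, axis=1)


# Synthetic stand-in data (assumption): class-dependent means plus Gaussian noise.
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 600, 440, 4
labels = rng.integers(0, n_classes, size=n_samples)
centers = rng.normal(size=(n_classes, n_features))
X = centers[labels] + rng.normal(size=(n_samples, n_features))
Y = np.eye(n_classes)[labels]  # one-hot 4-dimensional targets (assumed encoding)

W = fit_mlr(X, Y)          # shape (441, 4): 440 feature weights plus a bias row
pred = predict_class(X, W)  # predicted class index per sample
```

The accuracy on this synthetic training set says nothing about the reported 83.1%; a real evaluation would use held-out domains, for example via cross-validation.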

A double mutant cycle (DMC) approach was employed to estimate the effect of temperature on the contribution of two highly conserved salt bridges to protein stability in the hyperthermophilic protein Ssh10b. The coupling free energies were 2.4 +/- 0.
