Publications by authors named "Yi-Kuo Yu"

Having fast, accurate, and broad-spectrum methods for the identification of microorganisms is of paramount importance to public health, research, and safety. Bottom-up mass spectrometer-based proteomics has emerged as an effective tool for the accurate identification of microorganisms from microbial isolates. However, one major hurdle that limits the deployment of this tool for routine clinical diagnosis, and for other areas of research such as culturomics, is the instrument time the mass spectrometer needs to analyze a single sample, which can be ∼1 h per sample on the mass spectrometers presently used in most institutes.

In recent years, several deep learning-based methods have been proposed for predicting peptide fragment intensities. This study aims to provide a comprehensive assessment of six such methods, namely Prosit, DeepMass:Prism, pDeep3, AlphaPeptDeep, Prosit Transformer, and the method proposed by Guan et al. To this end, we evaluated the accuracy of the predicted intensity profiles for close to 1.

Although many user-friendly workflows exist for identifying peptides and proteins in mass-spectrometry-based proteomics, there is a need for easy-to-use, fast, and accurate workflows for identifying microorganisms and antimicrobial resistance proteins and for estimating biomass. Identification of microorganisms is a computationally demanding task that requires querying thousands of MS/MS spectra against a database containing thousands to tens of thousands of microorganisms. Existing software cannot handle such a task in a time-efficient manner, taking hours to process a single MS/MS experiment.

Motivation: The PAM50 signature/method is widely used for intrinsic subtyping of breast cancer samples. However, depending on the number and composition of the samples included in a cohort, the method may assign different subtypes to the same sample. This lack of robustness is mainly due to the fact that PAM50 subtracts a reference profile, which is computed using all samples in the cohort, from each sample before classification.
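The cohort dependence described above can be illustrated with a toy sketch. This is not the PAM50 implementation: it assumes a single gene, takes the reference to be the cohort mean, and uses a hypothetical sign-based rule (`toy_call`) in place of the real classifier. All numbers are made up for illustration.

```python
from statistics import mean


def center_against_cohort(sample_value, cohort_values):
    """Subtract a cohort-wide reference (assumed here to be the mean)."""
    return sample_value - mean(cohort_values)


def toy_call(centered_value):
    """Hypothetical one-gene rule: label by the sign of the centered value."""
    return "high" if centered_value > 0 else "low"


# The same sample (expression value 5.0) placed in two different cohorts.
sample = 5.0
cohort_a = [5.0, 3.0, 1.0]  # sample is the highest value here
cohort_b = [5.0, 7.0, 9.0]  # sample is the lowest value here

centered_a = center_against_cohort(sample, cohort_a)
centered_b = center_against_cohort(sample, cohort_b)

call_a = toy_call(centered_a)
call_b = toy_call(centered_b)

print(centered_a, call_a)
print(centered_b, call_b)
```

Because the reference is recomputed from whichever samples are present, the same raw measurement ends up on opposite sides of the decision boundary in the two cohorts.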

Motivation: The precise diagnosis of the major subtypes of non-small-cell lung cancer, lung adenocarcinoma and lung squamous cell carcinoma, is of practical importance as some treatments are subtype-specific. However, in some cases diagnosis via the commonly used method, that is, staining the specimen using immunohistochemical markers, may be challenging. Hence, having a computational method that complements the diagnosis is desirable.

Fast and accurate identifications of pathogenic bacteria along with their associated antibiotic resistance proteins are of paramount importance for patient treatments and public health. To meet this goal from the mass spectrometry aspect, we have augmented the previously published Microorganism Classification and Identification (MiCId) workflow for this capability. To evaluate the performance of this augmented workflow, we have used MS/MS datafiles from samples of 10 antibiotic resistance bacterial strains belonging to three different species: , , and .

Electrostatic interactions among colloidal particles are often described using the venerable (two-particle) Derjaguin-Landau-Verwey-Overbeek (DLVO) approximation and its various modifications. However, until the recent development of a many-body theory exact at the Debye-Hückel level (Yu in Phys Rev E 102:052404, 2020), it was difficult to assess the errors of such approximations and impossible to assess the role of many-body effects. By applying the exact Debye-Hückel level theory, we quantify the errors inherent to DLVO and the additional errors associated with replacing many-particle interactions by the sum of pairwise interactions (even when the latter are calculated exactly).

Bloodstream infections (BSIs), the presence of microorganisms in blood, are potentially serious conditions that can quickly develop into sepsis and life-threatening situations. When assessing proper treatment, rapid diagnosis is the key; besides clinical judgement performed by attending physicians, supporting microbiological tests typically are performed, often requiring microbial isolation and culturing steps, which increases the time required for confirming positive cases of BSI. The additional waiting time forces physicians to prescribe broad-spectrum antibiotics and empirically based treatments, before determining the precise cause of the disease.

Simple light isotope metabolic labeling (SLIM labeling) is an innovative method to quantify variations in the proteome based on an original labeling strategy. Heterotrophic cells grown in U-[¹²C] as the sole source of carbon synthesize U-[¹²C]-amino acids, which are incorporated into proteins, giving rise to U-[¹²C]-proteins. This results in a large increase in the intensity of the monoisotope ion of peptides and proteins, thus allowing higher identification scores and protein sequence coverage in mass spectrometry experiments.

The unequivocal role of electrostatic forces in biological (and colloidal) systems underscores the importance of attaining accurate and rapid calculations of electrostatic forces if one wishes to faithfully simulate the electrostatic aspect of a biological system. This paper makes significant progress toward this goal as it rigorously incorporates ionic screening at the Debye-Hückel level for an electrolyte system containing dielectric spheres of finite radii. We investigated this system earlier, without mobile ions, via a surface charge method.

Rapid and accurate identification of microorganisms and estimation of their biomasses are of extreme importance to public health. Mass spectrometry has become an important technique for these purposes. Previously we published a workflow named Microorganism Classification and Identification (MiCId v.

Given the crucial role of electrostatic forces in biological systems, accurate and rapid calculations of electrostatic forces are imperative in faithfully simulating biological systems. More than a decade ago, we proposed a surface charge method, applied it to a system of an arbitrary number of charged dielectric spheres, and obtained an exact solution for arbitrary configuration of the spheres. The precision depends only on the number of terms kept in a series expansion, and can therefore be controlled at will.

Reprogramming of somatic cells to induced pluripotent stem cells, by overexpressing certain factors referred to as the reprogramming factors, can revolutionize regenerative medicine. To provide a coherent description of induced pluripotency from the gene regulation perspective, we use 35 microarray datasets to construct a reprogramming gene regulatory network. Comprising 276 nodes and 4471 links, the resulting network is, to the best of our knowledge, the largest gene regulatory network constructed for human fibroblast reprogramming and it is the only one built using a large number of experimental datasets.

Mass spectrometry-based proteomics starts with identifications of peptides and proteins, which provide the bases for forming the next-level hypotheses whose "validations" are often employed for forming even higher level hypotheses and so forth. Scientifically meaningful conclusions are thus attainable only if the number of falsely identified peptides/proteins is accurately controlled. For this reason, RAId has continued to be developed over the past decade.

Efficiency of reprogramming of human cells into induced pluripotent stem cells (iPSCs) has remained low. We report that individual adult human CD49f long-term hematopoietic stem cells (LT-HSCs) can be reprogrammed into iPSCs at close to 50% efficiency using Sendai virus transduction. This exquisite sensitivity to reprogramming is specific to LT-HSCs, since it progressively decreases in committed progenitors.

Thermal expansion of H₂O and D₂O ice Ih with relative resolution of 1 ppb is reported. A large transition in the thermal expansion coefficient at 101 K in H₂O moves to 125 K in D₂O, revealing one of the largest-known isotope effects. Rotational oscillatory modes that couple poorly to phonons, i.

Quantum spin chains with composite spins have been used to approximate conventional chains with higher spins. For instance, a spin 1 (or 3/2) chain was sometimes approximated by a chain with two (or three) spin 1/2's per site. However, little examination has been given as to whether this approximation, effectively assuming the first Hund rule per site, is valid and why.

Rapid and accurate identification and classification of microorganisms are of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of sequenced microbes complicates correct microbial identification, even in a simple sample, because of the large number of candidates present.

Objective: RAId is a software package that has been actively developed for the past 10 years for computationally and visually analyzing MS/MS data. Founded on rigorous statistical methods, RAId's core program computes accurate E-values for peptides and proteins identified during database searches. Making this robust tool readily accessible for the proteomics community by developing a graphical user interface (GUI) is our main goal here.

Parametrizing a curved surface with flat triangles in electrostatics problems creates a diverging electric field. One way to avoid this is to have curved areal elements. However, charge density integration over curved patches appears difficult.

A previously developed classical model of electrostatic interactions, based on a formalism of dielectric spheres, which has been found to have surprising accuracy for S state atoms, is extended by allowing higher-order moments of the intrinsic charge distribution. Two methods to introduce the charge distribution (point moments at the center vs surface charge) are shown to be equivalent and are compared with another common model for polarizable atoms that utilizes polarizable point dipoles. Unlike the polarizable point dipole model, the polarizable spheres models do not suffer from a divergence at small separation of atoms and are easily generalized to higher multipoles.

A simple and easy to implement method for improving the convergence of a power series is presented. We observe that the most obvious or analytically convenient point about which to make a series expansion is not always the most computationally efficient. Series convergence can be dramatically improved by choosing the center of the series expansion to be at or near the average value at which the series is to be evaluated.
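A minimal sketch of this idea, using the exponential function as a stand-in (the choice of function, interval, and number of terms are assumptions made here for illustration, not taken from the paper). It compares a truncated Taylor series expanded about 0 with one expanded about the mean of the evaluation points, keeping the same number of terms in both.

```python
import math


def exp_taylor(x, center, n_terms):
    """Truncated Taylor series of exp(x) about `center`, keeping n_terms terms.

    Uses exp(x) = exp(center) * sum_k (x - center)^k / k!.
    """
    dx = x - center
    term = 1.0  # (dx^k / k!) for k = 0
    total = 0.0
    for k in range(n_terms):
        total += term
        term *= dx / (k + 1)
    return math.exp(center) * total


def max_rel_error(points, center, n_terms):
    """Largest relative error of the truncated series over the given points."""
    return max(
        abs(exp_taylor(x, center, n_terms) - math.exp(x)) / math.exp(x)
        for x in points
    )


# Evaluation points on [2, 4]; their mean is used as the alternative center.
points = [2.0 + 0.25 * i for i in range(9)]
mean_point = sum(points) / len(points)

n_terms = 8
err_origin = max_rel_error(points, 0.0, n_terms)
err_mean = max_rel_error(points, mean_point, n_terms)

print("center 0.0:", err_origin)
print("center mean:", err_mean)
```

With the same number of terms, the expansion about the mean keeps every evaluation point within distance 1 of the center, whereas the expansion about 0 must reach out to distance 4, so its truncation error is much larger.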

Motivation: There is a growing trend for biomedical researchers to extract evidence and draw conclusions from mass spectrometry based proteomics experiments, the cornerstone of which is peptide identification. Inaccurate assignments of peptide identification confidence thus may have far-reaching and adverse consequences. Although some peptide identification methods report accurate statistics, they have been limited to certain types of scoring function.

We calculate the polarization portion of electrostatic interactions at the atomic scale using quantum mechanical methods such as density functional theories (DFT) and the coupled cluster approach, and using classical methods such as a surface charge method and a polarizable force field. The agreement among various methods is investigated. Using the coupled cluster method CCSD(T) with large basis sets as the reference, we find that for systems comprising two to six atoms and ions in S-states the classical surface charge method performs much better than commonly used DFT methods with moderate basis sets such as B3LYP/6-31G(d,p).

In recent years, several methods have been proposed to assign pairwise mechanism-based similarity scores to human diseases. Despite their differences in approach and performance, these methods work in a somewhat similar manner: first a set of biomolecules (genes, proteins, chemicals, etc.) is associated with each disease, and then a measure is defined to calculate the similarity between the sets assigned to a pair of diseases.
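The two-step pattern described above can be sketched as follows. The Jaccard index is used here only as one simple, generic set-similarity measure; it is an assumption for illustration and not necessarily the measure used by any of the methods discussed. The disease names and biomolecule identifiers are hypothetical placeholders.

```python
from itertools import combinations


def jaccard(set_a, set_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Step 1: associate a set of biomolecules with each disease (placeholder data).
disease_sets = {
    "disease_1": {"g1", "g2", "g3", "g4"},
    "disease_2": {"g3", "g4", "g5"},
    "disease_3": {"g6"},
}

# Step 2: score every pair of diseases by the similarity of their sets.
scores = {
    (d1, d2): jaccard(disease_sets[d1], disease_sets[d2])
    for d1, d2 in combinations(sorted(disease_sets), 2)
}

for pair, score in scores.items():
    print(pair, score)
```

In this toy data, `disease_1` and `disease_2` share two of five distinct identifiers, while `disease_3` shares none with either, so only the first pair receives a nonzero score.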
