Publications by authors named "Jiyun Zhou"

The interaction between microbes and drugs encompasses the sourcing of pharmaceutical compounds, microbial drug degradation, the development of , and the impact of on host drug metabolism and immune modulation. These interactions significantly impact drug efficacy and the evolution of drug resistance. In this study, we propose a novel predictive model, termed GCGACNN.

View Article and Find Full Text PDF

DNA methylation (DNAm), a crucial epigenetic mark, plays a key role in gene regulation, mammalian development, and various human diseases. Single-cell technologies enable the profiling of DNAm states at cytosines within the DNA sequence of individual cells, but they often suffer from limited coverage of CpG sites. In this study, we introduce scMeFormer, a transformer-based deep learning model designed to impute DNAm states for each CpG site in single cells.

View Article and Find Full Text PDF

DNA methylation (DNAm) is essential for brain development and function and potentially mediates the effects of genetic risk variants underlying brain disorders. We present INTERACT, a transformer-based deep learning model to predict regulatory variants impacting DNAm levels in specific brain cell types, leveraging existing single-nucleus DNAm data from the human brain. We show that INTERACT accurately predicts cell type-specific DNAm profiles, achieving an average area under the Receiver Operating Characteristic curve of 0.

View Article and Find Full Text PDF

Background: is an anaerobic Gram-negative bacterium that can lead to opportunistic infections, including Lemierre's syndrome and less common presentations of metastatic diseases. However, liver abscesses infected by in clinical settings are rarely reported, particularly in people with normal immune function.

Case Presentation: A 35-year-old Chinese man was admitted with hyperthermia and abdominal pain that had persisted for three days.

View Article and Find Full Text PDF

There is growing evidence for the role of DNA methylation (DNAm) quantitative trait loci (mQTLs) in the genetics of complex traits, including psychiatric disorders. However, due to extensive linkage disequilibrium (LD) of the genome, it is challenging to identify causal genetic variations that drive DNAm levels by population-based genetic association studies. This limits the utility of mQTLs for fine-mapping risk loci underlying psychiatric disorders identified by genome-wide association studies (GWAS).

View Article and Find Full Text PDF

Transcription factor binding sites (TFBSs) play an important role in gene expression regulation. Many computational methods for TFBS prediction need sufficient labeled data. However, many transcription factors (TFs) lack labeled data in cell types.

View Article and Find Full Text PDF

Motivation: The prediction of transcription factor binding sites (TFBSs) is crucial for gene expression analysis. Supervised learning approaches for TFBS predictions require large amounts of labeled data. However, many TFs of certain cell types either do not have sufficient labeled data or do not have any labeled data.

View Article and Find Full Text PDF

Most proposed methods for TF-binding site (TFBS) predictions only use low order dependencies for predictions due to the lack of efficient methods to extract higher order dependencies. In this work, we first propose a novel method to extract higher order dependencies by applying CNN on histone modification features. We then propose a novel TFBS prediction method, referred to as CNN_TF, by incorporating low order and higher order dependencies.

View Article and Find Full Text PDF

Most past works for DNA-binding residue prediction did not consider the relationships between residues. In this paper, we propose a novel approach for DNA-binding residue prediction, referred to as EL_LSTM, which includes two main components. The first component is the Long Short-Term Memory (LSTM), which learns pairwise relationships between residues through a bi-gram model and then learns feature vectors for all residues.

View Article and Find Full Text PDF

Background: Protein secondary structure is the three dimensional form of local segments of proteins and its prediction is an important problem in protein tertiary structure prediction. Developing computational approaches for protein secondary structure prediction is becoming increasingly urgent.

Results: We present a novel deep learning based model, referred to as CNNH_PSS, by using multi-scale CNN with highway.

View Article and Find Full Text PDF

Background: Prediction of DNA-binding residue is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues.

Results: In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues.

View Article and Find Full Text PDF

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space.

View Article and Find Full Text PDF

Background: DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Identification of DNA-binding proteins is one of the major challenges in the field of genome annotation. There have been several computational methods proposed in the literature to deal with the DNA-binding protein identification.

View Article and Find Full Text PDF

Identification of DNA-binding proteins is an important problem in biomedical research as DNA-binding proteins are crucial for various cellular processes. Currently, the machine learning methods achieve the-state-of-the-art performance with different features. A key step to improve the performance of these methods is to find a suitable representation of proteins.

View Article and Find Full Text PDF

DNA-binding proteins are crucial for various cellular processes and hence have become an important target for both basic research and drug development. With the avalanche of protein sequences generated in the postgenomic age, it is highly desired to establish an automated method for rapidly and accurately identifying DNA-binding proteins based on their sequence information alone. Owing to the fact that all biological species have developed beginning from a very limited number of ancestral species, it is important to take into account the evolutionary information in developing such a high-throughput tool.

View Article and Find Full Text PDF

Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurate and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predictor, called "iDNA-Prot|dis", was established by incorporating the amino acid distance-pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) vector.

View Article and Find Full Text PDF

DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotide, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve predicting performance.

View Article and Find Full Text PDF