Publications by authors named "Philippe Fournier-Viger"

Article Synopsis
  • Genomic data is expanding rapidly, leading to challenges in analyzing and classifying sequences, especially for identifying dangerous viruses that could cause pandemics.
  • Current classification models often overlook the importance of nucleotide and amino acid sequencing, which are crucial for understanding the structure and function of genomes.
  • The new method, GenoAnaCla, employs sequential pattern mining to analyze RNA virus genome sequences, utilizing eight classifiers and achieving a 3.18% improvement in accuracy compared to existing techniques.

Heat shock proteins (HSPs) from different families and sub-types play a vital role in the folding and unfolding of proteins, in maintaining cellular health, and in preventing serious disorders. Previous computational methods for HSP classification have yielded promising performance. However, most of the existing methods rely heavily on amino acid composition features and still face challenges related to interpretability and accuracy.


This paper presents a dataset comprising 700 video sequences encoded in the two most popular video formats (codecs) of today, H.264 and H.265 (HEVC).


Genome sequence analysis and classification play critical roles in properly understanding an organism's main characteristics, functionalities, and changing (evolving) nature. However, the rapid expansion of genomic data makes genome sequence analysis and classification challenging because of the high computational requirements and the difficulty of properly managing and understanding the data. Recently proposed models have yielded promising results for genome sequence classification.


This paper presents a novel framework, called PSAC-PDB, for analyzing and classifying protein structures from the Protein Data Bank (PDB). PSAC-PDB first finds, analyzes, and identifies protein structures in PDB that are similar to a protein structure of interest using a protein structure comparison tool. Second, the amino acid (AA) sequences of the identified protein structures (obtained from PDB), their aligned amino acids (AAA) and aligned secondary structure elements (ASSE) (obtained by structural alignment), and frequent AA (FAA) patterns (discovered by sequential pattern mining) are used for the reliable detection/classification of protein structures.


An obvious defect of extreme learning machine (ELM) is that its prediction performance is sensitive to the random initialization of input-layer weights and hidden-layer biases. To make ELM insensitive to random initialization, GPRELM adopts the simple and effective strategy of integrating Gaussian process regression into ELM. However, there is a serious overfitting problem in kernel-based GPRELM.


High occupancy pattern mining has recently been studied as an improved method for frequent pattern mining. It considers the proportion that each pattern occupies in the transactions where the pattern occurs. The results of high occupancy pattern mining can be employed in automated control systems to make decisions.
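
To illustrate the idea of occupancy, here is a minimal Python sketch. It assumes that the occupancy of a pattern is the arithmetic mean of |pattern| / |transaction| over the transactions that contain the pattern; published definitions vary (some use a harmonic mean or a weighted form), so the function `occupancy` and the toy data are illustrative assumptions rather than the method of the paper.

```python
from typing import FrozenSet, List


def occupancy(pattern: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Mean share of each supporting transaction that the pattern occupies.

    Assumed definition (for illustration only): average of
    len(pattern) / len(transaction) over transactions containing the pattern.
    """
    # Keep only non-empty transactions that contain every item of the pattern.
    supporting = [t for t in transactions if t and pattern <= t]
    if not supporting:
        return 0.0
    return sum(len(pattern) / len(t) for t in supporting) / len(supporting)


# Hypothetical toy database of three transactions.
transactions = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c", "d"}),
    frozenset({"c", "d"}),
]
pattern = frozenset({"a", "b"})

# Frequency (support) counts how many transactions contain the pattern.
support = sum(pattern <= t for t in transactions)
# Occupancy additionally asks how much of those transactions the pattern covers.
occ = occupancy(pattern, transactions)
print(support, occ)
```

In this toy example the pattern fills the first transaction entirely and half of the second, so a frequency-only measure would treat both occurrences alike while occupancy distinguishes them.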


Online learning is playing an increasingly important role in education. Massive open online course (MOOC) platforms are among the most important tools in online learning, and record historical learning data from an extremely large number of learners. To enhance the learning experience, a promising approach is to apply sequential pattern mining (SPM) to discover useful knowledge in these data.
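
As a rough illustration of what SPM computes, the sketch below counts the support of a sequential pattern, that is, the number of sequences in which the pattern's items appear in the same order (not necessarily contiguously). The event names and the helper functions `is_subsequence` and `support` are hypothetical and are not taken from the paper; a real SPM algorithm would also enumerate candidate patterns rather than check a single one.

```python
from typing import Iterable, List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Iterable[str]) -> bool:
    """Return True if the pattern's items occur in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so each match must come after the previous one.
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], database: List[List[str]]) -> int:
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


# Hypothetical learner sessions, each an ordered list of activity types.
sessions = [
    ["video", "quiz", "forum", "exam"],
    ["video", "forum", "quiz"],
    ["quiz", "video", "exam"],
]

video_then_quiz = support(["video", "quiz"], sessions)
video_then_exam = support(["video", "exam"], sessions)
print(video_then_quiz, video_then_exam)
```

A pattern is usually called frequent when its support reaches a user-chosen minimum threshold.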


The genome of the novel coronavirus (COVID-19) disease was first sequenced in January 2020, approximately a month after its emergence in Wuhan, capital of Hubei province, China. COVID-19 genome sequencing is critical to understanding the virus's behavior, its origin, and how fast it mutates, and to developing drugs/vaccines and effective preventive strategies. This paper investigates the use of artificial intelligence techniques to learn interesting information from COVID-19 genome sequences.


High-utility sequential pattern (HUSP) mining is an emerging topic in the field of knowledge discovery in databases. It consists of discovering subsequences that have a high utility (importance) in a set of sequences; these subsequences are referred to as HUSPs. HUSP mining has many real-life applications, such as market basket analysis, e-commerce recommendations, click-stream analysis, and route planning.
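
The notion of utility can be sketched in Python under simplifying assumptions: each event holds a single item with a purchase quantity, each item has a unit profit, and the utility of a pattern in a sequence is taken as the maximum over all of its occurrences (a common convention, assumed here). The names `pattern_utility_in_sequence`, `pattern_utility`, and the toy data are hypothetical; real HUSP databases allow several items per event.

```python
from typing import Dict, List, Optional, Sequence, Tuple

QSequence = List[Tuple[str, int]]  # ordered (item, quantity) events


def pattern_utility_in_sequence(
    pattern: Sequence[str], sequence: QSequence, profit: Dict[str, int]
) -> Optional[int]:
    """Maximum utility over all occurrences of the pattern, or None if absent."""
    # best[k] = highest utility found so far for matching the first k pattern items.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for item, qty in sequence:
        # Go from the last position down so one event extends at most one prefix.
        for k in range(len(pattern), 0, -1):
            prev = best[k - 1]
            if pattern[k - 1] == item and prev is not None:
                candidate = prev + qty * profit[item]
                if best[k] is None or candidate > best[k]:
                    best[k] = candidate
    return best[-1]


def pattern_utility(
    pattern: Sequence[str], database: List[QSequence], profit: Dict[str, int]
) -> int:
    """Sum of the pattern's utility over the sequences that contain it."""
    utilities = (pattern_utility_in_sequence(pattern, s, profit) for s in database)
    return sum(u for u in utilities if u is not None)


# Hypothetical unit profits and quantitative sequences.
profit = {"a": 2, "b": 5, "c": 1}
database = [
    [("a", 1), ("b", 2), ("a", 3), ("b", 1)],
    [("b", 1), ("a", 2), ("c", 4)],
    [("a", 2), ("c", 1), ("b", 1)],
]

total = pattern_utility(["a", "b"], database, profit)
min_util = 20  # assumed user-defined threshold
print(total, total >= min_util)
```

Under these assumptions, a pattern is reported as a HUSP when its total utility reaches the minimum utility threshold.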

Article Synopsis
  • Privacy-preserving data mining is crucial for safeguarding sensitive information while still allowing for meaningful data analysis, but it poses significant challenges as an NP-hard problem.
  • Many existing evolutionary algorithms address this issue but tend to focus on single-objective functions with set weight values, limiting their effectiveness.
  • The paper introduces a new multi-objective particle swarm optimization method (CMPSO) that uses a density clustering approach and adapts to user preferences; extensive testing on two datasets shows that it performs better at hiding sensitive information.

Mining useful patterns from varied types of databases is an important research topic with many real-life applications. Most studies have considered frequency as the sole interestingness measure for identifying high-quality patterns. However, each object is different in nature.


High-utility sequential pattern mining (HUSPM) has become an important issue in the field of data mining. Several HUSPM algorithms have been designed to mine high-utility sequential patterns (HUSPs). They have been applied in several real-life situations, such as consumer behavior analysis and event detection in sensor networks.
