Publications by authors named "Polikar R"

Motivation: This study examines the query performance of the NBC++ (Incremental Naive Bayes Classifier) program for variations in canonicality, k-mer size, databases, and input sample data size. We demonstrate that both NBC++ and Kraken2 are influenced by database depth, with macro measures improving as depth increases. However, fully capturing the diversity of life, especially viruses, remains a challenge.
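As a rough illustration of what "canonicality" and k-mer size mean in this setting, the sketch below counts k-mers while optionally collapsing each k-mer with its reverse complement. It is a generic, hedged illustration of the preprocessing idea, not NBC++'s actual implementation.

```python
# Minimal canonical k-mer counting sketch (illustrative; not NBC++ code).
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, rc)

def count_kmers(seq: str, k: int, use_canonical: bool = True) -> Counter:
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i : i + k]
        counts[canonical(kmer) if use_canonical else kmer] += 1
    return counts

print(count_kmers("ACGTACGT", k=3))
```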

A major challenge for clustering algorithms is to balance the trade-off between homogeneity, i.e., the degree to which an individual cluster includes only related sequences, and completeness, i.e., the degree to which related sequences are kept in a single cluster rather than broken up into multiple clusters. Most algorithms are conservative in grouping sequences: remote homologs may fail to be clustered together and instead form unnecessarily distinct clusters.
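This trade-off is easy to see with scikit-learn's standard clustering metrics; the toy labels below are illustrative, not data from the study.

```python
# Over-splitting maximizes homogeneity; over-merging maximizes completeness.
from sklearn.metrics import homogeneity_score, completeness_score

truth = [0, 0, 0, 1, 1, 1]   # two families of related sequences

split = [0, 1, 2, 3, 4, 5]   # every cluster is pure, but families are broken apart
print(homogeneity_score(truth, split),    # high homogeneity
      completeness_score(truth, split))   # low completeness

merged = [0, 0, 0, 0, 0, 0]  # families kept together, but unrelated sequences mixed
print(homogeneity_score(truth, merged),   # low homogeneity
      completeness_score(truth, merged))  # high completeness
```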

Objective: To determine how well machine learning algorithms can classify mild cognitive impairment (MCI) subtypes and Alzheimer's disease (AD) using features obtained from the digital Clock Drawing Test (dCDT).

Methods: dCDT protocols were administered to 163 patients diagnosed with AD (n = 59), amnestic MCI (aMCI; n = 26), combined mixed/dysexecutive MCI (mixed/dys MCI; n = 43), and patients without MCI (non-MCI; n = 35) using standard clock drawing command and copy procedures, that is, draw the face of the clock, put in all of the numbers, and set the hands for "10 after 11." A digital pen and custom software recorded patients' drawings.

Feature subset selection can be used to sieve through large volumes of data and discover the most informative subset of variables for a particular learning problem. Yet, due to memory and other resource constraints (e.g.

Many machine learning applications are now associated with very large data sets whose sizes were almost unimaginable just a short time ago. As a result, many current algorithms cannot handle, or do not scale to, today's extremely large volumes of data. Fortunately, not all features that make up a typical data set carry information that is relevant or useful for prediction, and identifying and removing such irrelevant features can significantly reduce the total data size.
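One widely used way to identify and drop such irrelevant features is to rank them by a simple relevance score and keep only the top-ranked ones; the hedged sketch below does this with mutual information on synthetic data (the paper's own methods may differ).

```python
# Keep only the 10 features most informative about the label (illustrative).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = make_classification(n_samples=500, n_features=100,
                           n_informative=5, random_state=0)
X_small = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)
print(X.shape, "->", X_small.shape)   # (500, 100) -> (500, 10)
```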

Introduction: The dynamic range of cerebrospinal fluid (CSF) amyloid β (Aβ) measurement does not parallel cognitive changes in Alzheimer's disease (AD) and cognitively normal (CN) subjects across different studies. Therefore, identifying novel proteins to characterize symptomatic AD samples is important.

Methods: Proteins were profiled using a multianalyte platform by Rules Based Medicine (MAP-RBM).

Recent advances in machine learning, specifically in deep learning with neural networks, have made a profound impact on fields such as natural language processing, image classification, and language modeling; however, the feasibility and potential benefits of these approaches for metagenomic data analysis have been largely under-explored. Deep learning exploits many layers of learning nonlinear feature representations, typically in an unsupervised fashion, and recent results have shown outstanding generalization performance on previously unseen data. Furthermore, some deep learning methods can also represent the structure in a data set.
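A common instance of such unsupervised, layered feature learning is the autoencoder; the minimal sketch below (placeholder sizes and random data, not the authors' model) learns a compressed nonlinear representation by reconstructing its input.

```python
# Tiny autoencoder: learn 8 nonlinear features from 64-dimensional inputs.
import torch
import torch.nn as nn

X = torch.rand(256, 64)              # e.g., 64 composition features per read

model = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(),    # encoder
    nn.Linear(32, 8),  nn.ReLU(),    # 8-dimensional learned representation
    nn.Linear(8, 32),  nn.ReLU(),    # decoder
    nn.Linear(32, 64),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(50):                  # train to reconstruct the input
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), X)
    loss.backward()
    opt.step()
print(f"reconstruction loss: {loss.item():.4f}")
```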

Introduction: Reductions of cerebrospinal fluid (CSF) amyloid-beta (Aβ42) and elevated phosphorylated-tau (p-Tau) reflect in vivo Alzheimer's disease (AD) pathology and show utility in predicting conversion from mild cognitive impairment (MCI) to dementia. We investigated the P50 event-related potential component as a noninvasive biomarker of AD pathology in non-demented elderly.

Methods: Thirty-six MCI patients were stratified into amyloid positive (MCI-AD, n=17) and negative (MCI-Other, n=19) groups using CSF levels of Aβ42.

Selection of the most informative features, those that lead to a small loss on future data, is arguably one of the most important steps in classification, data analysis, and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining if a feature is statistically relevant.
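One simple way to cast this as a hypothesis test, in the spirit of (though not necessarily identical to) the approach in the brief, is to count how often a base selector picks a feature across repeated runs and compare that count against a binomial "selected by chance" null:

```python
# Binomial relevance test sketch: is a feature selected more often than chance?
from scipy.stats import binom

n_runs, k_selected, n_features = 100, 10, 200
p0 = k_selected / n_features     # chance of selection under the null

def is_relevant(times_selected: int, alpha: float = 0.01) -> bool:
    p_value = binom.sf(times_selected - 1, n_runs, p0)  # P(X >= count)
    return p_value < alpha

print(is_relevant(5))    # around chance level -> False
print(is_relevant(40))   # far above chance    -> True
```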

An increasing number of real-world applications are associated with streaming data drawn from drifting and nonstationary distributions that change over time. These applications demand new algorithms that can learn and adapt to such changes, also known as concept drift. Proper characterization of such data with existing approaches typically requires a substantial number of labeled instances, which may be difficult, expensive, or even impractical to obtain.
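As a toy illustration of the phenomenon (not any particular algorithm), monitoring a statistic over a sliding window makes an abrupt drift visible:

```python
# Crude drift signal: the window mean jumps when the distribution changes.
import numpy as np

rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(0, 1, 500),   # old concept
                         rng.normal(3, 1, 500)])  # new concept at t = 500

window, baseline = 50, None
for t in range(0, len(stream), window):
    m = stream[t : t + window].mean()
    if baseline is not None and abs(m - baseline) > 1.0:
        print(f"possible drift near t={t}")
    baseline = m   # adapt the reference to the most recent window
```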

The objective of this research was to assess the utility of a simple near infrared spectroscopy (NIRS) technology for objective assessment of the hemodynamic response to acute pain. For this exploration, we used functional near infrared spectroscopy (fNIRS) to measure the hemodynamic response on the forehead during three trials of a cold pressor test (CPT) in 20 adults. To measure hemodynamic changes at the superficial tissues as well as the intracranial tissues, two configurations of 'far' and 'near' source-detector separations were used.

Due to the enormous size of the solution space for sequential ordering problems, non-exhaustive heuristic techniques have been the focus of many research efforts, particularly in the field of operations research. In this paper, we outline an ecologically motivated problem in which environmental samples have been obtained along a gradient (e.g.

As life expectancy increases, particularly in the developed world, so does the prevalence of Alzheimer's Disease (AD). AD is a neurodegenerative disorder characterized by neurofibrillary plaques and tangles in the brain that lead to neuronal death and dementia. Early diagnosis of AD is still a major unresolved health concern: several biomarkers are being investigated, among which the electroencephalogram (EEG) provides the only option for electrophysiological information.

We introduce an ensemble-of-classifiers-based approach for incremental learning of concept drift, characterized by nonstationary environments (NSEs), where the underlying data distributions change over time. The proposed algorithm, named Learn(++).NSE, learns from consecutive batches of data without making any assumptions on the nature or rate of drift; it can learn from environments that experience constant or variable rates of drift, addition or deletion of concept classes, as well as cyclical drift.
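The following is a deliberately simplified sketch of that strategy: one classifier per batch, member weights recomputed from each member's error on the newest batch, and prediction by weighted majority vote. It omits Learn(++).NSE's sigmoid-based averaging of errors over time, so read it as the general idea rather than the algorithm itself.

```python
# Simplified batch-incremental ensemble (not the full Learn(++).NSE algorithm).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class SimpleNSE:
    def __init__(self):
        self.members, self.weights = [], []

    def partial_fit(self, X, y):
        self.members.append(DecisionTreeClassifier(max_depth=3).fit(X, y))
        # re-weight every member by its log-odds of error on the current batch
        self.weights = []
        for m in self.members:
            err = np.clip(np.mean(m.predict(X) != y), 1e-3, 0.499)
            self.weights.append(np.log((1 - err) / err))

    def predict(self, X):
        # weighted majority vote over all ensemble members
        preds = np.array([m.predict(X) for m in self.members])
        classes = np.unique(preds)
        scores = np.zeros((len(classes), X.shape[0]))
        for p, w in zip(preds, self.weights):
            for ci, c in enumerate(classes):
                scores[ci] += w * (p == c)
        return classes[scores.argmax(axis=0)]
```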

Analysis of DNA sequences isolated directly from the environment, known as metagenomics, produces a large quantity of genome fragments that need to be classified into specific taxa. Most composition-based classification methods use all features instead of a subset of features that may maximize classifier accuracy. We show that feature selection methods can boost performance of taxonomic classifiers.

High-throughput sequencing technologies enable metagenome profiling, the simultaneous sequencing of multiple microbial species present within an environmental sample. Since metagenomic data includes sequence fragments ("reads") from organisms that are absent from any database, new algorithms must be developed for the identification and annotation of novel sequence fragments. Homology-based techniques have been modified to detect novel species and genera, but composition-based methods have not been adapted.

The alarmingly increasing prevalence of Alzheimer's disease (AD) due to the aging population in developing countries, combined with the lack of standardized and conclusive diagnostic procedures, makes early diagnosis of AD a major public health concern. While no current medical treatment exists to stop or reverse this disease, recent dementia-specific pharmacological advances can slow its progression, making early diagnosis all the more important. Several noninvasive biomarkers have been proposed, including P300-based EEG analysis, MRI volumetric analysis, and PET-based metabolic activity analysis, as alternatives to neuropsychological evaluation, the current gold standard of diagnosis.

Traditionally, studies in microbial genomics have focused on single genomes from cultured species, thereby limiting them to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analyses have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have shed a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain due to the massive size and complexity of metagenomic sequence data.

A significant proportion of patients with heart failure have a normal ventricular ejection fraction on echocardiographic examination. Previously called diastolic heart failure, this condition is nowadays referred to as heart failure with normal ejection fraction (HFNEF) or HF with preserved ejection fraction. The European Society of Cardiology, recognizing the importance of this type of heart failure, recently issued new definition criteria for it.

As the average life expectancy increases, particularly in developing countries, the prevalence of neurodegenerative diseases has also increased. This trend is especially alarming for Alzheimer's disease (AD), as there is no cure to stop or reverse its effects. Recent pharmacological advances can, however, slow the progression of AD, but only if it is diagnosed at an early stage.

The prevalence of Alzheimer's disease (AD) is rising alarmingly as the average age of our population increases. There is no treatment to halt or slow the pathology responsible for AD; however, new drugs show promise in reducing the rate of progression. At the same time, the efficacy of these new medications critically depends on our ability to diagnose AD at the earliest stage.

We have previously introduced an incremental learning algorithm, Learn(++), which learns novel information from consecutive data sets by generating an ensemble of classifiers with each data set, and combining them by weighted majority voting. However, Learn(++) suffers from an inherent "outvoting" problem when asked to learn a new class ω_new introduced by a subsequent data set, as earlier classifiers not trained on this class are guaranteed to misclassify ω_new instances. The collective votes of earlier classifiers for an inevitably incorrect decision then outweigh the votes of the new classifiers' correct decision on ω_new instances, until there are enough new classifiers to counteract the unfair outvoting.
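The arithmetic of the problem is easy to reproduce; in this toy weighted-vote tally, ten earlier members that must vote incorrectly on an ω_new instance overwhelm two correct new members:

```python
# Ten old members (never saw the new class) outvote two new, correct members.
old_votes = [("old_class", 1.0)] * 10   # guaranteed wrong on the new class
new_votes = [("new_class", 1.0)] * 2    # trained on the new class, correct

tally = {}
for label, weight in old_votes + new_votes:
    tally[label] = tally.get(label, 0.0) + weight

print(max(tally, key=tally.get))        # -> old_class: the ensemble is outvoted
```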

Objective: To determine whether automated classifiers can correctly identify target categorization responses from averaged event-related potentials (ERPs), and to identify appropriate features and classification models for computer-assisted investigation of attentional processes.

Methods: ERPs were recorded during a target categorization task. Automated classification of average target ERPs versus average non-target ERPs was performed by extracting different combinations of features from the P300 and N200 components, which were used to train six classifiers: Euclidean classifier (EC), Mahalanobis discriminant (MD), quadratic classifier (QC), Fisher linear discriminant (FLD), multi-layer perceptron neural network (MLP) and support vector machine (SVM).
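Below is a hedged sketch of this kind of comparison using scikit-learn analogues for four of the six models (nearest centroid for the Euclidean classifier, LDA for the Fisher discriminant, plus MLP and SVM); the features are random placeholders standing in for the P300/N200 measures, not the study's data.

```python
# Compare classifier analogues with cross-validation on placeholder features.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))       # stand-ins for P300/N200 amplitudes/latencies
y = rng.integers(0, 2, size=80)    # target vs. non-target average ERPs

for name, clf in [("EC", NearestCentroid()),
                  ("FLD", LinearDiscriminantAnalysis()),
                  ("MLP", MLPClassifier(max_iter=2000, random_state=0)),
                  ("SVM", SVC())]:
    print(name, cross_val_score(clf, X, y, cv=5).mean().round(2))
```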

It has been widely accepted that classification accuracy can be improved by combining the outputs of multiple classifiers. However, how to combine multiple classifiers with various (potentially conflicting) decisions is still an open problem. A rich collection of classifier combination procedures, many of which are heuristic in nature, has been developed for this goal.
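Even the two simplest rules can disagree, which is part of why the problem remains open; the made-up posteriors below make hard majority voting and the soft sum rule choose different classes:

```python
# Majority vote vs. sum rule on the same (made-up) classifier outputs.
import numpy as np

# posteriors from three classifiers for one sample over classes {A, B}
posteriors = np.array([[0.60, 0.40],   # classifier 1: weakly prefers A
                       [0.55, 0.45],   # classifier 2: weakly prefers A
                       [0.10, 0.90]])  # classifier 3: strongly prefers B

hard = np.bincount(posteriors.argmax(axis=1)).argmax()  # majority vote -> A
soft = posteriors.sum(axis=0).argmax()                  # sum rule      -> B
print("majority vote:", "AB"[hard], "| sum rule:", "AB"[soft])
```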
