Protein sequences hold a wealth of experimental information that has yet to be fully exploited for identifying protein homologues. The published literature shows that dynamic programming, heuristic, and HMM profile-based alignment techniques, as well as alignment-free techniques, do not directly use the ordered profile of a protein's physicochemical properties to identify its homologues. These works also lack crucial benchmarking or validation, without which their incorporation into search engines may appear questionable. To address this, the present work offers a fixed-dimensional numerical representation of protein sequences that extends the concept of the periodicity count value of nucleotide types (2017), so that Euclidean distance can serve as a direct similarity measure between two proteins. Rather than benchmarking against BLAST and PSI-BLAST only, the new similarity measure was also compared with Needleman-Wunsch and Smith-Waterman. To strengthen the comparison, this work introduces, for the first time, two novel benchmarking methods, one based on the correlation of "similarity scores" and the other on the "proximity of ranked outputs from a standard sequence alignment method", each computed between all possible pairs of search techniques, including the new one presented in this paper. The novel numerical representation of a protein is found to be able to reduce the computational complexity of protein sequence search to O(log(n)). It may also enable various other similarity-based operations, such as clustering, phylogenetic analysis, and classification of proteins on the basis of the properties used to build the representation.
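
The general idea of comparing proteins through a fixed-dimensional vector and Euclidean distance can be illustrated with a minimal sketch. The abstract does not specify how the authors' representation is built, so the example below substitutes a plain 20-dimensional amino-acid composition vector as a stand-in; the function names `composition_vector`, `euclidean_distance`, and `rank_by_distance` are hypothetical and are not taken from the paper.

```python
import math

# The 20 standard amino acids; each one gets a fixed position in the vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Map a protein sequence of any length to a 20-dimensional vector.

    ASSUMPTION: residue fractions are a stand-in for the paper's
    physicochemical representation, which the abstract does not define.
    """
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.dist(u, v)


def rank_by_distance(query, database):
    """Return (name, distance) pairs sorted from most to least similar.

    `database` is a dict mapping a name to a protein sequence string.
    """
    q = composition_vector(query)
    scored = [
        (name, euclidean_distance(q, composition_vector(seq)))
        for name, seq in database.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Toy sequences for illustration only, not real proteins.
    toy_db = {"same_composition": "EDCA", "all_tryptophan": "WWWW"}
    for name, dist in rank_by_distance("ACDE", toy_db):
        print(name, round(dist, 4))
```

Because every sequence maps to a vector of the same length, any two proteins can be compared directly without alignment. The ranking shown here is a linear scan over the database; it does not demonstrate the O(log(n)) search reported in the abstract, which would need an indexing scheme that the abstract does not describe.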

Source
http://dx.doi.org/10.1007/s12539-020-00380-w

