A PHP Error was encountered

Severity: Warning

Message: file_get_contents(https://...@pubfacts.com&api_key=b8daa3ad693db53b1410957c26c9a51b4908&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests

Filename: helpers/my_audit_helper.php

Line Number: 176

Backtrace:

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 176
Function: file_get_contents

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 250
Function: simplexml_load_file_from_url

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3122
Function: getPubMedXML

File: /var/www/html/application/controllers/Detail.php
Line: 575
Function: pubMedSearch_Global

File: /var/www/html/application/controllers/Detail.php
Line: 489
Function: pubMedGetRelatedKeyword

File: /var/www/html/index.php
Line: 316
Function: require_once

Extended many-item similarity indices for sets of nucleotide and protein sequences. | LitMetric

Extended many-item similarity indices for sets of nucleotide and protein sequences.

Comput Struct Biotechnol J

Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117 Budapest, Hungary.

Published: June 2021

Quantification of similarities between protein sequences or DNA/RNA strands is a (sub-)task that is ubiquitously present in bioinformatics workflows, and is usually accomplished by pairwise comparisons of sequences, utilizing simple ( percent identity) or more intricate concepts ( substitution scoring matrices). Complex tasks (such as clustering) rely on a large number of pairwise comparisons under the hood, instead of a direct quantification of set similarities. Based on our recently introduced framework that enables multiple comparisons of binary molecular fingerprints (, direct calculation of the similarity of fingerprint sets), here we introduce novel symmetric similarity indices for analogous calculations on sets of character sequences with more than two () possible items ( DNA/RNA sequences with  = 4, or protein sequences with  = 20). The features of these new indices are studied in detail with analysis of variance (ANOVA), and demonstrated with three case studies of protein/DNA sequences with varying degrees of similarity (or evolutionary proximity). The Python code for the extended many-item similarity indices is publicly available at: https://github.com/ramirandaq/tn_Comparisons.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8253954PMC
http://dx.doi.org/10.1016/j.csbj.2021.06.021DOI Listing

Publication Analysis

Top Keywords

similarity indices
12
protein sequences
12
extended many-item
8
many-item similarity
8
pairwise comparisons
8
sequences
7
similarity
5
indices
4
indices sets
4
sets nucleotide
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!