Motivation: Compared to eukaryotes, prokaryote genomes are more diverse through different mechanisms, including a higher mutation rate and horizontal gene transfer. Therefore, using a linear representative reference can cause a reference bias. Graph-based pangenome methods have been developed to tackle this problem. However, comparisons in DNA space are still challenging due to this high diversity. In contrast, amino acid sequences have higher similarity due to evolutionary constraints, whereby a single amino acid may be encoded by several synonymous codons. Coding regions cover the majority of the genome in prokaryotes. Thus, panproteomes present an attractive alternative leveraging the higher sequence similarity while not losing much of the genome in non-coding regions.
Results: We present PanPA, a method that takes a set of multiple sequence alignments of protein sequences, indexes them, and builds a graph for each multiple sequence alignment. In the querying step, it can align DNA or amino acid sequences back to these graphs. We first showcase that PanPA generates correct alignments on a panproteome from 1350 . To demonstrate that panproteomes allow comparisons at longer phylogenetic distances, we compare DNA and protein alignments from 1073 assemblies against a reference genome, a pangenome, and a panproteome using BWA, GraphAligner, and PanPA, respectively, with PanPA aligning around 22% more sequences. We also aligned a DNA short-read whole-genome sequencing (WGS) sample from against the reference with BWA and the panproteome with PanPA, where PanPA was able to find alignments for 68% of the reads compared to 5% with BWA.
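The idea of building one graph per multiple sequence alignment can be illustrated with a minimal sketch. This is not PanPA's implementation: the function name `msa_to_graph`, the node encoding as `(column, residue)` pairs, and the toy alignment are all assumptions made for illustration, and the actual tool may construct and compact its graphs differently.

```python
from collections import defaultdict


def msa_to_graph(msa):
    """Build a simple column-wise graph from an MSA (hypothetical helper).

    `msa` is a list of equal-length strings using '-' for gaps.
    Each distinct (column, residue) pair becomes a node, and an edge links
    consecutive non-gap residues of the same sequence.
    """
    if len({len(row) for row in msa}) > 1:
        raise ValueError("all MSA rows must have the same length")

    nodes = set()
    edges = defaultdict(set)
    for row in msa:
        prev = None
        for col, residue in enumerate(row):
            if residue == "-":
                continue  # gaps contribute no node; the path skips the column
            node = (col, residue)
            nodes.add(node)
            if prev is not None:
                edges[prev].add(node)
            prev = node
    return nodes, dict(edges)


# Toy protein alignment (made up for illustration only).
EXAMPLE_MSA = ["MKT-A", "MRTLA", "MKTLA"]
nodes, edges = msa_to_graph(EXAMPLE_MSA)

for source in sorted(edges):
    print(source, "->", sorted(edges[source]))
```

In this toy graph, sequences that share a residue in a column share a node, so variation appears as branches (K or R in the second column) and the gap appears as an edge that skips a column.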
Availability and Implementation: PanPA is available at https://github.com/fawaz-dabbaghieh/PanPA.
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10748787
DOI: http://dx.doi.org/10.1093/bioadv/vbad167