LLM4THP: a computing tool to identify tumor homing peptides by molecular and sequence representation of large language model based on two-layer ensemble model strategy.

AI Article Synopsis

  • Tumor homing peptides (THPs) specifically target tumor cells, offering a promising method for cancer treatment and detection, but traditional detection methods are slow and costly.
  • To address these challenges, LLM4THP is introduced as a novel computational tool that leverages large language models (LLMs) to efficiently detect THPs using encoded peptide sequences and various sequence characteristics.
  • LLM4THP employs an ensemble learning strategy, combining multiple machine learning techniques for improved THP identification, and demonstrates superior performance metrics compared to current methods, with the source code available online.
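As an illustration of the kind of sequence characteristic referred to above, the sketch below computes amino acid composition (AAC), the fraction of each of the 20 standard residues in a peptide. This is a minimal sketch and not the authors' code: the function name `amino_acid_composition` and the example peptide string are assumptions chosen for illustration.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard residue in `sequence`.

    Hypothetical helper for illustration; non-standard characters are ignored.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Always emit all 20 entries so every peptide maps to a fixed-length vector.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Arbitrary example peptide string (an assumption, not taken from the dataset).
aac = amino_acid_composition("CGNKRTRGC")
print(aac["C"], aac["R"], aac["A"])
```

Each peptide maps to a 20-dimensional vector regardless of its length, which is what lets such descriptors be concatenated with other fixed-size features.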

Article Abstract

Tumor homing peptides (THPs) have a distinctive capacity to specifically attach to tumor cells, providing a promising approach for targeted cancer treatment and detection. Although THPs have the potential for significant impact, their detection by conventional methods is both time-consuming and expensive. To tackle this issue, we present LLM4THP, an innovative computational approach that utilizes large language models (LLMs) to quickly and effectively detect THPs. LLM4THP utilizes two protein LLMs, ESM2 and Prot_T5_XL_UniRef50, to encode peptide sequences. This allows for the capture of complex patterns and relationships within the peptide data. In addition, we utilize inherent sequence characteristics such as Amino Acid Composition (AAC), Pseudo Amino Acid Composition (PAAC), Amphiphilic Pseudo Amino Acid Composition (APAAC), and Composition, Transition, and Distribution (CTD) to improve the representation of peptides. The RDKitDescriptors feature representation approach transforms peptide sequences into molecular objects and computes chemical characteristics, resulting in enhanced THP identification. The LLM4THP ensemble strategy incorporates various features into a two-layer learning architecture. The first layer consists of LightGBM, XGBoost, Random Forest, and Extremely Randomized Trees, which generate a set of meta results. The second layer utilizes Logistic Regression to further refine the identification of sequences as either THP or non-THP. LLM4THP exhibits exceptional performance compared to the most advanced methods, showcasing enhancements in accuracy, Matthews correlation coefficient, F1 score, area under the curve, and average precision. The source code and dataset can be accessed at the following URL: https://github.com/abcair/LLM4THP.
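The two-layer architecture described in the abstract corresponds to what is commonly called stacking. The following is a rough sketch of that idea using scikit-learn only, under several assumptions: the data are synthetic (not the paper's dataset), `GradientBoostingClassifier` is swapped in as a stand-in for LightGBM and XGBoost to keep the example dependency-free, and all hyperparameters are arbitrary. It is not the authors' implementation, which is available at the repository linked above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for peptide feature vectors (assumption: NOT the paper's data).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# First layer: tree-based learners. GradientBoostingClassifier is a stand-in
# for LightGBM/XGBoost, which are separate packages.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
]

# Second layer: logistic regression trained on the first layer's
# out-of-fold class probabilities (the "meta results").
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

pred = stack.predict(X_test)          # hard labels, e.g. 1 = THP, 0 = non-THP
proba = stack.predict_proba(X_test)   # per-class probabilities
print(pred[:10])
```

Training the second layer on cross-validated (out-of-fold) predictions, as `cv=5` does here, keeps the meta-learner from simply memorizing first-layer outputs on data those models have already seen.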

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11480143
DOI: http://dx.doi.org/10.1007/s00726-024-03422-5

Publication Analysis

Top Keywords

  • amino acid: 12
  • acid composition: 12
  • tumor homing: 8
  • homing peptides: 8
  • large language: 8
  • peptide sequences: 8
  • pseudo amino: 8
  • llm4thp: 5
  • llm4thp computing: 4
  • computing tool: 4