Comparative analysis of BERT and FastText representations on crowdfunding campaign success prediction.

PeerJ Comput Sci

Software Engineering Department, Kocaeli University, Kocaeli, Marmara, Turkey.

Published: September 2024

Crowdfunding has become a popular financing method, attracting investors, businesses, and entrepreneurs. However, many campaigns fail to secure funding, making it crucial to reduce participation risks using artificial intelligence (AI). This study investigates the effectiveness of advanced AI techniques in predicting the success of crowdfunding campaigns on Kickstarter by analyzing campaign blurbs. We compare the performance of two widely used text representation models, bidirectional encoder representations from transformers (BERT) and FastText, in conjunction with long short-term memory (LSTM) and gradient boosting machine (GBM) classifiers. Our analysis involves preprocessing campaign blurbs, extracting features using BERT and FastText, and evaluating the predictive performance of these features with LSTM and GBM models. Across all experiments, BERT representations significantly outperform FastText, with the highest accuracy of 0.745 achieved using a fine-tuned BERT model combined with LSTM. These findings highlight the importance of using deep contextual embeddings and the benefits of fine-tuning pre-trained models for domain-specific applications. The results are benchmarked against existing methods, demonstrating the superiority of our approach. This study provides valuable insights for improving predictive models in the crowdfunding domain, offering practical implications for campaign creators and investors.
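The abstract names three stages (preprocessing blurbs, extracting features, and scoring predictions by accuracy) without giving their details. The following is only a minimal sketch of what the first and last stages, plus the subword units FastText relies on, might look like. The function names, the tokenisation rule, the n-gram range, and the sample blurb are all assumptions made for illustration; none of them comes from the study, and the sketch does not train BERT, FastText, LSTM, or GBM models.

```python
import re


def preprocess(blurb: str) -> list[str]:
    """Lowercase a blurb and keep only alphanumeric word tokens.

    Assumption: the study's actual preprocessing steps are not stated
    in the abstract; this is a generic stand-in.
    """
    return re.findall(r"[a-z0-9]+", blurb.lower())


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """Character n-grams of a word wrapped in '<' and '>' boundary markers.

    FastText composes word vectors from subword units of this kind;
    the 3..6 range here is an illustrative choice.
    """
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of campaigns whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical blurb and labels, not taken from the study's dataset.
tokens = preprocess("A Smart, Solar-Powered Backpack!")
print(tokens)
print(char_ngrams("where", 3, 3))
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

In a real pipeline the token or subword sequences would be mapped to FastText or BERT vectors and passed to the classifier; the accuracy function corresponds to the metric behind the reported 0.745 figure, assuming a binary success/failure label.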

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11419673
DOI: http://dx.doi.org/10.7717/peerj-cs.2316

Similar Publications

The widespread circulation of fake news makes low-quality information harder to manage and calls for effective detection strategies. This study addresses this critical issue by advancing fake news detection in Arabic and overcoming limitations in existing approaches. Several deep learning models are employed for text classification: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), EfficientNetB4, Inception, Xception, ResNet, ConvLSTM, and a novel voting ensemble framework combining CNN and LSTM.

Multi-label text classification via secondary use of large clinical real-world data sets.

Sci Rep

November 2024

Research Unit for Digital Surgery, Division of Plastic, Aesthetic and Reconstructive Surgery, Department of Surgery, Medical University of Graz, Auenbruggerplatz 29/4, 8036, Graz, Austria.

Article Synopsis
  • Procedural coding is a significant challenge for clinicians, but advancements in natural language processing (NLP) can help create applications that ease their administrative tasks by predicting procedure codes from operative notes.
  • This study focused on adapting existing German medical BERT models to perform multi-label classification for predicting procedure codes, using approximately 350,000 coded surgery notes to train various models, including support vector machines and fastText.
  • The results indicated that support vector machines generally outperformed other models for longer reports, while fastText showed comparable performance to BERT models and was easier to explain, highlighting the potential for effective coding support applications.

Social media has emerged as a dominant platform where individuals freely share opinions and communicate globally. Its easy accessibility gives it a significant role in disseminating news worldwide. However, the growing use of these platforms carries a severe risk of misleading people.

Automatic authorship identification is a challenging task that has been the focus of extensive research in natural language processing. Despite the progress made in attributing authorship, the lack of corpora in under-resourced languages impedes the advancement and evaluation of current methods. To address this gap, we investigate the problem of authorship attribution in Albanian.

Fine-Tuning Large Language Models to Enhance Programmatic Assessment in Graduate Medical Education.

J Educ Perioper Med

September 2024

The following authors are in both the Department of Anesthesiology, Uniformed Services University, Bethesda, MD, and Department of Anesthesiology and Pain Medicine, Naval Medical Center Portsmouth, Portsmouth, VA: Gregory J. Booth is an Associate Professor at Uniformed Services University and Program Director, Anesthesiology Residency at Naval Medical Center Portsmouth; Mike Mynes and Elizabeth Slama are Assistant Professors at Uniformed Services University and Staff Anesthesiologists at Naval Medical Center Portsmouth; Jeffrey Moore is an Assistant Professor at Uniformed Services University and Program Director, Pain Medicine Fellowship, and Associate Designated Institutional Official at Naval Medical Center Portsmouth. Thomas Hauert is an Anesthesiology Resident Physician at Naval Medical Center Portsmouth, Portsmouth, VA. Ashton Goldman is an Associate Professor at Uniformed Services University, Bethesda, MD, and a Staff Orthopedic Surgeon at the Department of Orthopedic Surgery and Sports Medicine at Naval Medical Center Portsmouth, Portsmouth, VA. John Hodgson is an Associate Professor and Program Director, Anesthesiology Residency at University of South Florida, Tampa, FL.

Background: Natural language processing is a collection of techniques designed to empower computer systems to comprehend and/or produce human language. The purpose of this investigation was to train several large language models (LLMs) to explore the tradeoff between model complexity and performance while classifying narrative feedback on trainees into the Accreditation Council for Graduate Medical Education subcompetencies. We hypothesized that classification accuracy would increase with model complexity.
