The Spectral Underpinning of word2vec.

Front Appl Math Stat

Department of Mathematics, University of Washington, Seattle, WA, United States.

Published: December 2020

Word2vec, introduced by Mikolov et al., is a word embedding method that is widely used in natural language processing. Despite its success and frequent use, a strong theoretical justification is still lacking. The main contribution of our paper is a rigorous analysis of the highly nonlinear functional of word2vec. Our results suggest that word2vec may be primarily driven by an underlying spectral method. This insight may open the door to obtaining provable guarantees for word2vec. We support these findings with numerical simulations. One fascinating open question is whether the nonlinear properties of word2vec that are not captured by the spectral method are beneficial and, if so, by what mechanism.
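To make the phrase "spectral method" concrete, the following is a minimal sketch of one generic spectral embedding: an eigendecomposition of a positive pointwise mutual information (PPMI) matrix built from word co-occurrence counts. This is an illustration only. The abstract does not specify the matrix or the spectral problem the paper analyzes, so the PPMI choice, the window size, and the function names `cooccurrence_matrix`, `ppmi`, and `spectral_embedding` are all assumptions made here, not the authors' construction.

```python
import numpy as np


def cooccurrence_matrix(tokens, window=2):
    """Count symmetric co-occurrences within `window` positions (assumed setup)."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for pos, word in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                counts[index[word], index[tokens[ctx]]] += 1.0
    return vocab, counts


def ppmi(counts):
    """Positive pointwise mutual information of a co-occurrence count matrix."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    expected = row @ col / total
    # log(0) yields -inf for unseen pairs; those entries are reset to 0 below.
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts / expected)
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)


def spectral_embedding(matrix, dim):
    """Embed rows using the `dim` eigenpairs of largest absolute eigenvalue.

    Assumes `matrix` is symmetric, which holds for the symmetric window above.
    """
    eigvals, eigvecs = np.linalg.eigh(matrix)
    order = np.argsort(-np.abs(eigvals))[:dim]
    return eigvecs[:, order] * np.sqrt(np.abs(eigvals[order]))


# Toy usage on a hypothetical corpus (not data from the paper).
tokens = "the cat sat on the mat the dog sat on the rug".split()
vocab, counts = cooccurrence_matrix(tokens, window=2)
embedding = spectral_embedding(ppmi(counts), dim=2)
for word, vector in zip(vocab, embedding):
    print(word, np.round(vector, 3))
```

Unlike word2vec training, this embedding comes from a single linear-algebra step with no iterative optimization, which is the kind of structure that makes provable guarantees more tractable.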

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425479
DOI: http://dx.doi.org/10.3389/fams.2020.593406

Similar Publications

The increasing development of technology has led to the increase of digital data in various fields, such as medication-related texts. Sentiment Analysis (SA) in medication is essential to give clinicians insights into patients' feedback about the treatment procedure. Therefore, this study intends to develop Artificial Intelligence (AI) models to predict patients' sentiments.

Angiogenesis is imperative for bone regeneration, yet conventional cytokine therapies have been constrained by prohibitive costs and safety concerns. A safer and more efficient therapeutic alternative is urgently needed. Herein, utilizing the methodologies of Deep Learning (DL) and Natural Language Processing (NLP), we proposed a paradigm algorithm that amalgamates with a variant, , to deftly discern potential pro-angiogenic peptides from intrinsically disordered regions (IDRs) of 262 related proteins, which are fertile grounds for developing safer and highly promising bioactive peptides.

Objective: To detect and classify features of stigmatizing and biased language in intensive care electronic health records (EHRs) using natural language processing techniques.

Materials And Methods: We first created a lexicon and regular expression lists from literature-driven stem words for linguistic features of stigmatizing patient labels, doubt markers, and scare quotes within EHRs. The lexicon was further extended using Word2Vec and GPT 3.

Deepstack-ACE: A deep stacking-based ensemble learning framework for the accelerated discovery of ACE inhibitory peptides.

Methods

December 2024

Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand.

Article Synopsis
  • Identifying ACE inhibitory peptides is essential for understanding the renin-angiotensin system and drug development.
  • A novel deep learning framework called Deepstack-ACE uses advanced techniques like word2vec and multiple deep learning models to enhance the identification of these peptides.
  • The model outperformed existing methods, achieving high accuracy scores, and is supported by a user-friendly web server for wider access.

The diffusion of high-level risk perception caused by public health emergencies seriously threatens public mental health and social stability. Much scholarly attention has focused on traditional epidemic models or has simply combined content and social attributes, overlooking differences in individual characteristics among the public. This paper proposes an SSEIIIR model of risk perception diffusion that subdivides both susceptible and infectious people.
