Every information retrieval (IR) model embeds in its scoring function a form of term frequency (TF) quantification. The contribution of the term frequency is determined by the properties of the chosen TF quantification function and by its TF normalization. The former defines how independent the occurrences of multiple terms are, while the latter mitigates the a priori probability of a high term frequency in a document (an estimate usually based on document length). New test collections from different domains (e.g., medical, legal) give evidence that not only document length but also document verboseness should be explicitly considered. We therefore propose and investigate a systematic combination of document verboseness and length. To justify the combination theoretically, we show the duality between document verboseness and length. In addition, we investigate the duality between verboseness and other components of IR models. We test these new TF normalizations on four suitable test collections, across a well-defined spectrum of TF quantifications. Finally, based on the theoretical and experimental observations, we show how the two components of the new normalization, document verboseness and length, interact with each other. Our experiments demonstrate that the new models never underperform existing models and sometimes yield statistically significantly better results, at no additional computational cost.
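To make the two ingredients concrete, the following is a minimal Python sketch of a BM25-style saturating TF quantification with a pivoted normalization applied to document length and, separately, to document verboseness. Several parts are assumptions and not taken from the abstract: verboseness is assumed here to be the document length divided by the number of distinct terms, the product in `combined_norm` is only one hypothetical way to combine the two normalizations, and all function names and parameter defaults (`k1`, `b_len`, `b_verb`) are illustrative. It should not be read as the formula proposed in the paper.

```python
def doc_stats(tokens):
    """Return (length, verboseness) for a tokenized document.

    Assumption: verboseness = length / number of distinct terms.
    """
    length = len(tokens)
    distinct = len(set(tokens))
    return length, length / distinct


def pivoted_norm(value, average, b):
    """Pivoted normalization: equals 1 when value equals the collection average."""
    return (1.0 - b) + b * (value / average)


def combined_norm(length, avg_length, verboseness, avg_verboseness,
                  b_len=0.75, b_verb=0.5):
    """Hypothetical combination: product of the two pivoted normalizations."""
    return (pivoted_norm(length, avg_length, b_len)
            * pivoted_norm(verboseness, avg_verboseness, b_verb))


def bm25_tf(tf, norm, k1=1.2):
    """BM25-style saturating TF quantification with a normalization factor."""
    return tf / (tf + k1 * norm)


# Two toy documents of equal length but different verboseness.
docs = [["a", "b", "a", "a"], ["a", "b", "c", "d"]]
stats = [doc_stats(d) for d in docs]
avg_len = sum(s[0] for s in stats) / len(stats)
avg_verb = sum(s[1] for s in stats) / len(stats)

length, verb = stats[0]
tf_a = docs[0].count("a")

# Length-only normalization versus the combined normalization.
score_len_only = bm25_tf(tf_a, pivoted_norm(length, avg_len, 0.75))
score_combined = bm25_tf(tf_a, combined_norm(length, avg_len, verb, avg_verb))
print(score_len_only, score_combined)
```

In this toy case both documents have the same length, so length normalization alone cannot tell them apart. Under the assumed definition, the more repetitive first document has a higher verboseness, and the combined normalization lowers its TF contribution.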

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6208902
DOI: http://dx.doi.org/10.1007/s10791-018-9334-1
