Kurdish News Dataset Headlines (KNDH) through multiclass classification.

Data Brief

Faculty of Engineering & Computer Science, Qaiwan International University, Sulaymaniyah, Kurdistan Region-Iraq.

Published: June 2023

The rapid growth of technology has massively increased the amount of available text data. This data can be mined and used for numerous natural language processing (NLP) tasks, particularly text classification. A core part of text classification is collecting the data needed to train a good predictive model. This paper presents the Kurdish News Dataset Headlines (KNDH), collected for text classification. The dataset consists of 50,000 news headlines distributed equally among five classes (Social, Sport, Health, Economic, and Technology), with 10,000 headlines per class. The share of headlines drawn from each channel differs, while the number of samples is equal for each category. The headlines were collected from 34 distinct channels: 8 channels for economics, 14 for health, 18 for science, 15 for social, and 5 for sport. The dataset is processed using the Kurdish Language Processing Toolkit (KLPT) for tokenization, spell-checking, stemming, and preprocessing.
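The class balance described above can be checked with a short script. The following is a minimal sketch, not the authors' code: the label strings, the `is_balanced` helper, and the synthetic label list are assumptions made for illustration, and the real dataset's file layout and label spelling may differ.

```python
from collections import Counter

# Class names as listed in the abstract; the exact label strings used in
# the released dataset are an assumption.
CLASSES = ["Social", "Sport", "Health", "Economic", "Technology"]


def is_balanced(labels, expected_per_class):
    """Return True if every expected class appears exactly expected_per_class times
    and no unexpected class is present."""
    counts = Counter(labels)
    return set(counts) == set(CLASSES) and all(
        counts[name] == expected_per_class for name in CLASSES
    )


# Synthetic stand-in for the label column: 10,000 labels per class.
labels = [name for name in CLASSES for _ in range(10_000)]

print(len(labels))                  # total number of headlines
print(is_balanced(labels, 10_000))  # balance check for the five classes
```

With the real data, `labels` would be replaced by the label column read from the dataset files.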

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10147969
DOI: http://dx.doi.org/10.1016/j.dib.2023.109120
