ANAD: Arabic news article dataset.

Data Brief

Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha'il, Ha'il, 81481, Saudi Arabia.

Published: October 2023

In this paper, we present a Modern Standard Arabic dataset built from Arabic news articles collected over a one-year period, from 01/01/2021 to 12/31/2021. In total, over 500,000 articles were collected from 12 Arabic news websites, selected to cover a variety of topics, including sports, economy, local news, politics, tech, tourism, entertainment, cars, health, and art. The dataset enables data scientists to explore and experiment in the field of natural language processing, and it can also be used to develop machine learning and deep learning models that classify articles by topic. The dataset is available for download at https://github.com/alaybaa/ArabicArticlesDataset/tree/main.
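
As an illustration of the topic-classification use case mentioned in the abstract, the following is a minimal sketch of a bag-of-words naive Bayes classifier using only the Python standard library. It is not the authors' method. The abstract does not describe the dataset's file layout, so the sketch trains on a few made-up in-memory sentences; the names `tokenize`, `train_naive_bayes`, `predict`, and `TOY_SAMPLES`, and the topic labels `sports` and `economy`, are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization only. Real Arabic text would likely need
    # normalization (diacritics, letter variants), which is omitted here.
    return text.split()


def train_naive_bayes(samples):
    """Count documents per topic and words per topic from (text, topic) pairs."""
    topic_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, topic in samples:
        topic_counts[topic] += 1
        for tok in tokenize(text):
            word_counts[topic][tok] += 1
            vocab.add(tok)
    return {"topics": topic_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the topic with the highest log-probability for the text."""
    total_docs = sum(model["topics"].values())
    vocab_size = len(model["vocab"])
    best_topic, best_score = None, -math.inf
    for topic, n_docs in model["topics"].items():
        counts = model["words"][topic]
        total_words = sum(counts.values())
        # Log prior of the topic.
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out a topic.
            score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


# Made-up toy sentences, NOT taken from the dataset.
TOY_SAMPLES = [
    ("فاز الفريق في المباراة", "sports"),
    ("سجل اللاعب هدفا في المباراة", "sports"),
    ("ارتفعت أسعار النفط في الأسواق", "economy"),
    ("انخفضت أسعار الأسهم في الأسواق", "economy"),
]

model = train_naive_bayes(TOY_SAMPLES)
print(predict(model, "سجل الفريق هدفا"))
```

With the real dataset, the toy list would be replaced by (article text, topic) pairs loaded from the downloaded files, and a held-out split would be needed to measure accuracy.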

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10415830
DOI: http://dx.doi.org/10.1016/j.dib.2023.109460
