Transtech: development of a novel translator for Roman Urdu to English.

Hafsa Masroor, Muhammad Saeed, Maryam Feroz, Kamran Ahsan, Khawar Islam

Heliyon

Department of Computer Science, Federal Urdu University of Arts, Science and Technology, Karachi, Pakistan.

Published: May 2019

Advances in machine and language translation open up new fields and research opportunities, while Natural Language Processing and Computational Linguistics deal with communication between natural languages and their interaction. The objective of this research is to develop and test a novel approach to translating Roman Urdu into English. The model is built in three stages, each responsible for its own task. A self-maintained corpus, along with its corresponding tag-set, is used for tokenization. Syntactic structure is handled by an Urdu POS tagger based on grammatical rules. We prepared the grammatical structures of different sentences for Roman Urdu to English translation. Since Urdu in Roman script can be written in numerous ways, our grammatical structures are designed to cover as many writing variants as possible and to produce the best possible English translation. Given an input sentence in Roman Urdu, the system returned the best possible English translation. In comparison with Google Translator, Transtech performed better and gave more accurate results.
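
The three stages described in the abstract (tokenization against a tagged corpus, rule-based POS tagging, and translation through grammatical structures) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon entries, the tag names, the single reordering rule, and the function names `tokenize`, `tag`, and `translate` are all assumptions made for illustration.

```python
# Toy lexicon: Roman Urdu token -> (POS tag, English gloss).
# Every entry is an illustrative assumption, not the paper's corpus or tag-set.
# Spelling variants of the same word map to the same entry.
LEXICON = {
    "main": ("PRON", "I"),
    "mein": ("PRON", "I"),
    "wo": ("PRON", "he"),
    "woh": ("PRON", "he"),
    "chai": ("NOUN", "tea"),
    "khana": ("NOUN", "food"),
    "peeta": ("VERB", "drink"),
    "khata": ("VERB", "eat"),
    "hoon": ("AUX", "am"),
    "hun": ("AUX", "am"),
    "hai": ("AUX", "is"),
}


def tokenize(sentence):
    """Stage 1: split a Roman Urdu sentence into lowercase tokens."""
    return sentence.lower().split()


def tag(tokens):
    """Stage 2: look up each token; unknown tokens get the tag UNK."""
    return [(tok, *LEXICON.get(tok, ("UNK", tok))) for tok in tokens]


def translate(sentence):
    """Stage 3: apply one grammatical structure, else gloss word by word."""
    tagged = tag(tokenize(sentence))
    tags = [t for _, t, _ in tagged]
    glosses = [g for _, _, g in tagged]

    # Assumed structure: Urdu subject-object-verb-auxiliary (habitual present)
    # maps to English subject-verb-object; the auxiliary is dropped.
    if tags == ["PRON", "NOUN", "VERB", "AUX"]:
        subj, obj, verb, aux = glosses
        if aux == "is":  # third person singular takes -s
            verb += "s"
        result = f"{subj} {verb} {obj}"
    else:
        result = " ".join(glosses)

    # Capitalize the first letter; slicing keeps this safe for empty input.
    return result[:1].upper() + result[1:]


print(translate("main chai peeta hoon"))
print(translate("Woh khana khata hai"))
```

A real system would need many such structures and a far larger lexicon; the sketch only shows how spelling variants can share a lexicon entry and how a tag sequence can select a reordering rule.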

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6538981	PMC
http://dx.doi.org/10.1016/j.heliyon.2019.e01780	DOI Listing

Publication Analysis

Top Keywords

roman urdu

urdu english

translation roman

english language

grammatical structures

english translation

roman

urdu

english

translation

Similar Publications

A dataset of Roman Urdu text with spelling variations for sentence level sentiment analysis.

Data Brief

December 2024

Department of Information Technology, University of Sindh, Jamshoro, Pakistan.

Mudasar Ahmed Soomro Rafia Naz Memon Asghar Ali Chandio Mehwish Leghari Muhammad Hanif Soomro

Roman Urdu text is widespread on many websites. People often prefer to post social comments or product reviews in Roman Urdu, which is regarded as a non-standard language. The main reason is that Roman Urdu has no spelling rules, so people create and post their own spellings; for example, "2mro" is a nonstandard spelling of "tomorrow".

An automated approach to identify sarcasm in low-resource language.

PLoS One

December 2024

Department of Computer Science, Al Ain University, Al Ain, UAE.

Shumaila Khan Iqbal Qasim Wahab Khan Aurangzeb Khan Javed Ali Khan

Sarcasm detection has emerged due to its applicability in natural language processing (NLP) but lacks substantial exploration in low-resource languages like Urdu, Arabic, Pashto, and Roman Urdu. Few studies on sarcasm identification have focused on low-resource languages; most of the work is in English. This research addresses the gap by exploring the efficacy of diverse machine learning (ML) algorithms in identifying sarcasm in Urdu.

Roman Urdu hate speech detection using hybrid machine learning models and hyperparameter optimization.

Sci Rep

November 2024

Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, 38541, Republic of Korea.

Waqar Ashiq Samra Kanwal Adnan Rafique Muhammad Waqas Tahir Khurshaid

With the rapid increase in social media users, cyberbullying and hate speech problems have arisen over the past years. Automatic hate speech detection (HSD) from text is an emerging research problem in natural language processing (NLP). Researchers have developed various approaches to automatic hate speech detection using different corpora in various languages; however, research on the Urdu language is rather scarce.

Depression detection with machine learning of structural and non-structural dual languages.

Healthc Technol Lett

August 2024

Department of Computer Science, Emerson University, Multan, Pakistan.

Filza Rehmani Qaisar Shaheen Muhammad Anwar Muhammad Faheem Shahzad Sarwar Bhatti

Depression is a serious mental state that negatively impacts thoughts, feelings, and actions. Social media use is rapidly growing, with people expressing themselves in their regional languages. In Pakistan and India, many people use Roman Urdu on social media.

Language-adaptive artificial intelligence: assessing CHATGPT'S answer to frequently asked questions on total hip arthroplasty questions.

J Pak Med Assoc

April 2024

Department of Surgery, Section of Orthopaedics, Aga Khan University Hospital.

Muhammad Talal Ibrahim Sarah Ashraf Khaskheli Hania Shahzad Shahryar Noordin

ChatGPT is reported to be an acceptable tool to answer a majority of frequently asked patient questions. ChatGPT also converses in other languages including Urdu, which offers immense potential for the education of Pakistani patients. Therefore, this study evaluated ChatGPT's Urdu answers to the ten most frequently asked questions on Total Hip Arthroplasty, which were then rated by an expert.
