Please use this identifier to cite or link to this item: http://repository.futminna.edu.ng:8080/jspui/handle/123456789/18898
Title: Performing Data Augmentation Experiment to Enhance Model Accuracy: A Case Study of BBC News’ Data
Authors: Ugwuoke, Uchenna Cosmas
Aminu, Enesi Femi
Ekundayo, Ayobami
Keywords: Data augmentation
WordNet
BBC news data
LSTM model
Issue Date: Oct-2022
Publisher: ELSEVIER-SSRN
Series/Report no.: ISSN 1556-5068
Abstract: In natural language processing, text classification is an essential task, and machine learning algorithms have become indispensable and significant to research on it. However, solving text classification with traditional models becomes more challenging because of the ambiguities associated with natural languages. A typical example is concept mismatch among synonyms, along with related issues that hinder accurately attributing text to its context. A more robust model with more hidden layers, such as an LSTM, is essential; because of the volume of data involved, exploring strategies for data augmentation is also highly significant. To this end, this research employs WordNet, a semantic lexical database, as a strategy to augment the BBC news textual data obtained from the Kaggle repository. This paves the way for more efficient news classification based on the proposed LSTM model. The BBC news dataset contains 2,225 data points, each assigned to one of five news categories: technology, business, sport, entertainment, and politics. Experimental evaluations are carried out on both the benchmark BBC news dataset and the newly augmented dataset produced within the scope of this study. The accuracy of the LSTM classification model is 90% on the original news dataset and 95% on the augmented dataset. The proposed data augmentation strategy is therefore promising for textual datasets.
Description: Proceedings of the International Conference on Information Systems and Emerging Technologies, 2022.
URI: http://repository.futminna.edu.ng:8080/jspui/handle/123456789/18898
ISSN: 1556-5068 (ELSEVIER-SSRN)
Appears in Collections: Computer Science
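
The abstract describes augmenting news text with synonyms drawn from WordNet before training the classifier. The sketch below illustrates the general idea of synonym-replacement augmentation only; it is not the authors' implementation. The `SYNONYMS` table is a small hypothetical stand-in for WordNet lookups (with NLTK, candidates could instead be gathered from `nltk.corpus.wordnet.synsets(word)` and each synset's `lemma_names()`), and the function names, the `rate` parameter, and the sample texts are all assumptions made for illustration.

```python
import random

# Hypothetical synonym table standing in for WordNet lookups.
SYNONYMS = {
    "shares": ["stocks", "equities"],
    "film": ["movie", "picture"],
    "win": ["victory", "triumph"],
}


def augment(text, synonyms, rate=0.5, seed=0):
    """Return a copy of `text` with some tokens swapped for a synonym.

    Each token that has an entry in `synonyms` is replaced with
    probability `rate`; all other tokens are kept unchanged.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    out = []
    for token in text.split():
        candidates = synonyms.get(token.lower())
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(token)
    return " ".join(out)


def augment_dataset(samples, synonyms, rate=0.5, seed=0):
    """Append one augmented variant per (text, label) pair.

    The label is carried over unchanged, and a variant is added only
    when augmentation actually altered the text.
    """
    augmented = list(samples)
    for i, (text, label) in enumerate(samples):
        new_text = augment(text, synonyms, rate, seed + i)
        if new_text != text:
            augmented.append((new_text, label))
    return augmented


if __name__ == "__main__":
    data = [
        ("shares rose after the announcement", "business"),
        ("the film opened this week", "entertainment"),
    ]
    for text, label in augment_dataset(data, SYNONYMS, rate=1.0):
        print(label, "|", text)
```

Keeping the original samples and appending label-preserving variants is one common way to enlarge a small corpus; whether the study replaced or extended its 2,225 samples is not stated in this record.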

Files in This Item:
File: BBC.pdf
Description: Performing Data Augmentation Experiment to Enhance Model Accuracy
Size: 561.68 kB
Format: Adobe PDF

