Comparative Study of Various Machine Learning Algorithms for Tweet Classification

Abubakar, Umar; Bashir, Sulaimon Adebayo; Abdullahi, Muhammad Bashir; Adewale, Olawale Surajodeen

Please use this identifier to cite or link to this item: http://repository.futminna.edu.ng:8080/jspui/handle/123456789/12293

Title:	Comparative Study of Various Machine Learning Algorithms for Tweet Classification
Authors:	Abubakar, Umar; Bashir, Sulaimon Adebayo; Abdullahi, Muhammad Bashir; Adewale, Olawale Surajodeen
Keywords:	Social Media; Tweets Classification; Feature Extraction; Machine Learning; Artificial Neural Networks; Deep Learning
Issue Date:	Feb-2019
Publisher:	i-manager
Citation:	Umar Abubakar, Sulaimon A. Bashir, Muhammad Bashir Abdullahi, and Olawale S. Adewale. Comparative Study of Various Machine Learning Algorithms for Tweet Classification. i-manager’s Journal on Computer Science (JCOM), Vol. 6, No. 4, pp. 12-24, December 2018 - February 2019
Series/Report no.:	Vol. 6, No. 4, December 2018 - February 2019
Abstract:	Twitter is a social networking platform that has become popular in recent years and is now a versatile information dissemination tool used by individuals, businesses, celebrities, and news organizations. It allows users to share messages, called tweets, with one another. These messages can carry many kinds of information, from the personal opinions of users and advertisements for products from all kinds of businesses to news. Research has shown that tweets can also contain racist, bigoted, offensive, and extremist content. Manual identification of such tweets is impractical because hundreds of millions of tweets are posted every day, so Twitter administrators and intelligence and security analysts need a way to automate their identification through classification. This paper presents a comparative study of traditional machine learning algorithms and deep learning algorithms for tweet classification, with the aim of determining which algorithm performs best at detecting the different categories of abusive language prevalent on social media. Two approaches to building feature vectors were explored: the bag-of-words method and word embeddings. Both feature representations were evaluated using tweets representing five abusive language categories. The experiments show that the deep learning algorithms trained with word embeddings outperformed all the other machine learning algorithms that were trained with feature vectors based on the bag-of-words approach.
URI:	http://repository.futminna.edu.ng:8080/jspui/handle/123456789/12293
ISSN:	2347-2227
Appears in Collections:	Computer Science

Files in This Item:

File	Description	Size	Format
2018 Comparative Study of Various Machine Learning Algorithms for Tweet Classification.pdf		415.75 kB	Adobe PDF
