Proposed Discriminative Lexical Features for Real-time Detection of Malware Uniform Resource Locator

Olalere, Morufu; Abdullah, Mohd Taufik; Mahmod, Ramlan; Abdullah, Azizol

Please use this identifier to cite or link to this item: http://repository.futminna.edu.ng:8080/jspui/handle/123456789/1502

Title:	Proposed Discriminative Lexical Features for Real-time Detection of Malware Uniform Resource Locator
Authors:	Olalere, Morufu; Abdullah, Mohd Taufik; Mahmod, Ramlan; Abdullah, Azizol
Keywords:	Attackers; Blacklist; Lexical Features; Malware URL; Real-time Malware URL Detection
Issue Date:	Dec-2016
Series/Report no.:	Vol 9(46)
Abstract:	Objectives: To identify discriminative lexical features of malware URLs through manual examination, and to study the prevalence of these features, leading to a proposed set of discriminative lexical features for real-time detection of malware URLs. Methods/Statistical Analysis: Manual examination of malware URLs from an existing blacklist allowed the authors to identify discriminative lexical features, and empirical analysis allowed them to determine whether attackers craft malware URLs in a consistent way. The empirical analysis was carried out on both the existing blacklisted malware URLs and newly collected malware URLs. To evaluate the performance of our proposed lexical features, two previously used machine learning models were applied to our training dataset of malware URLs and benign URLs. Using these models lets us compare the performance of our proposed lexical features with the feature groups proposed in previous studies. Findings: Our first step was to manually examine blacklisted malware URLs. This step led to the identification of 12 discriminative lexical features, which were later reduced to 11. The second step was an empirical analysis of the identified features in existing blacklisted malware URLs and newly collected malware URLs, carried out to determine whether attackers craft malware URLs consistently. The results revealed that they do, which implies that the identified lexical features are common to malware URLs. After experimentation, the evaluation results show that our proposed lexical features outperform previously proposed feature groups.
Applications/Improvements: Discriminative features are required to build a real-time malware URL detection system with a machine learning algorithm. The proposed lexical features are a set of discriminative features that rely on the textual properties of malware URLs.
URI:	http://repository.futminna.edu.ng:8080/jspui/handle/123456789/1502
Appears in Collections:	Cyber Security Science
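
Illustrative sketch (not part of the original record): the abstract describes lexical features as properties computed from the text of a URL alone, which is what makes them suitable for real-time use. This record does not list the 11 features the authors propose, so the feature names below are hypothetical stand-ins chosen only to show the general idea. The function name lexical_features and the sample URL are likewise assumptions.

```python
import re
from urllib.parse import urlsplit

# Hypothetical feature set for illustration only; it is NOT the
# 11-feature set proposed in the paper, which this record does not list.
_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def lexical_features(url: str) -> dict:
    """Compute a few text-only features from a URL string.

    No network lookup is performed; everything is derived from the
    characters of the URL itself.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(path),
        "host_dots": host.count("."),
        # Number of non-empty path segments, e.g. /a/b.exe -> 2
        "path_depth": len([seg for seg in path.split("/") if seg]),
        "digit_count": sum(ch.isdigit() for ch in url),
        # True when the host looks like a dotted-quad IPv4 address
        "host_is_ipv4": bool(_IPV4.match(host)),
        "has_query": bool(parts.query),
    }


# 192.0.2.10 is a reserved documentation address, used here as a placeholder.
features = lexical_features("http://192.0.2.10/files/update.exe?id=7")
```

A dictionary like this could be turned into a numeric vector and passed to a classifier. Which features and which models to use are the subject of the paper itself and are not specified in this record.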

Files in This Item:

File	Description	Size	Format
Morufu et al 2016_proposed discriminative.pdf		308.61 kB	Adobe PDF
