Comparative study on SMS spam message detection with different machine learning methods for safety communication
Author(s)
Krishnamoorthy, N
Muthusamy, S
Mirjalili, S
Vignesh, M
Vishnuhari, R
Raja, SKS
Editor(s)
Thillaiarasu, N.
Lata Tripathi, Suman
Dhinakaran, V.
Abstract
The use of short message service (SMS) has grown significantly in recent years, and with it the volume of spam. SMS spam is any unsolicited text, such as promotional content, Web links or other irrelevant messages, sent to a mobile phone for advertising purposes. The low cost of SMS offered by telecom companies is one factor behind its high usage. The surge in unsolicited messages across platforms, including email and SMS, has created a need for better countermeasures, especially against SMS spam, which is a real nuisance to users. A variety of methods have therefore been used to detect spam. We use a dataset of real SMS messages from the UCI Machine Learning Repository. The dataset contains a total of 5,574 SMS messages, of which 747 are spam and 4,827 are ham (legitimate). In preprocessing, we removed stop words, which carry little information. For feature extraction, we use the WordNet Lemmatiser during tokenisation and Count Vectoriser to convert the words into vectors. Four machine learning algorithms are trained and tested on this dataset: naive Bayes, logistic regression, random forest and decision tree classifiers. We evaluate the models using precision, recall and accuracy, and take accuracy as the primary metric for choosing the most effective algorithm for detecting SMS spam. Among these algorithms, the random forest classifier is the most suitable, achieving the highest accuracy, 98.13 percent.
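The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses a small hypothetical set of toy messages in place of the UCI dataset, relies on CountVectorizer's built-in English stop-word list, and omits the WordNet lemmatisation step. The function name `compare_classifiers`, the 75/25 split and all hyperparameters are assumptions made for the example.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy messages (1 = spam, 0 = ham); the study uses the
# UCI SMS Spam Collection instead.
messages = [
    ("Win a free prize now, click the link", 1),
    ("Congratulations, you won a cash prize, call now", 1),
    ("Free entry to win tickets, text WIN now", 1),
    ("Urgent, claim your free voucher today", 1),
    ("Cheap loans approved, reply now for cash", 1),
    ("You have won a free holiday, call to claim", 1),
    ("Exclusive offer, free ringtone, click now", 1),
    ("Claim your prize cash today, limited offer", 1),
    ("Are we still meeting for lunch today", 0),
    ("I will call you when I get home", 0),
    ("Can you send me the notes from class", 0),
    ("Running late, see you in ten minutes", 0),
    ("Happy birthday, hope you have a great day", 0),
    ("Did you finish the report for tomorrow", 0),
    ("Thanks for dinner, it was lovely", 0),
    ("Let me know when you reach the station", 0),
]
texts = [text for text, _ in messages]
labels = [label for _, label in messages]


def compare_classifiers(texts, labels, seed=0):
    """Train four classifiers on word counts and report three metrics each."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )

    # Bag-of-words counts with English stop words removed.
    vectorizer = CountVectorizer(stop_words="english")
    X_train_counts = vectorizer.fit_transform(X_train)
    X_test_counts = vectorizer.transform(X_test)

    models = {
        "naive_bayes": MultinomialNB(),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(random_state=seed),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train_counts, y_train)
        predicted = model.predict(X_test_counts)
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "precision": precision_score(y_test, predicted, zero_division=0),
            "recall": recall_score(y_test, predicted, zero_division=0),
        }
    return results


if __name__ == "__main__":
    for name, scores in compare_classifiers(texts, labels).items():
        print(name, scores)
```

Scores on such a tiny toy set say nothing about the real dataset; the sketch only shows how the vectorisation, training and evaluation steps fit together.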
Book Title
Artificial Intelligence for Internet of Things: Design Principle, Modernization, and Techniques
Edition
1st
Subject
Artificial intelligence
Citation
Krishnamoorthy, N; Muthusamy, S; Mirjalili, S; Vignesh, M; Vishnuhari, R; Raja, SKS, Comparative study on SMS spam message detection with different machine learning methods for safety communication, Artificial Intelligence for Internet of Things: Design Principle, Modernization, and Techniques, 2022, pp. 65-73