ARAUJO DE SOUZA, Gabriel and DA COSTA ABREU, Marjory (2020). Automatic offensive language detection from Twitter data using machine learning and feature selection of metadata. In: IEEE World Congress on Computational Intelligence (IEEE WCCI). IEEE. [Book Section]
Documents
IEEE_WCCI_Artificial_Intelligence.pdf - Accepted Version
Available under License All rights reserved.
Abstract
The popularity of social networks has only increased
in recent years. In theory, social media was intended
to let us share our views online, keep in contact with loved
ones, or share good moments of life. In reality, however,
some people share hate speech,
use these platforms to bully specific individuals,
or even create bots whose only goal is to target specific
situations or people. Identifying who wrote such text is not easy
and there are several possible approaches, such as
natural language processing or machine learning algorithms
that analyse the text and make predictions using its associated metadata. In this work, we present an initial
investigation into which machine learning techniques are best suited
to detecting offensive language in tweets. After an analysis of the
current trends in the literature on recent text classification
techniques, we selected the Linear SVM and Naive Bayes
algorithms for our initial tests. To preprocess the data,
we used several attribute selection techniques, which
are justified in the literature section. In our experiments,
we obtained 92% accuracy and 95% recall in detecting
offensive language with Naive Bayes, and 90% accuracy and
92% recall with Linear SVM. To our understanding, these
results outperform those in the related literature and are a good indication
of the importance of the data description approach we have used.
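
The abstract names Naive Bayes as one of the two classifiers and reports accuracy and recall, but gives no implementation details. The following is only a minimal sketch of how such a classifier and the two metrics might look, assuming plain word-count features, add-one smoothing, and a handful of invented toy sentences. It does not reproduce the authors' metadata-based attribute selection, their dataset, or their results, and all function names are hypothetical.

```python
import math
from collections import Counter


def train_naive_bayes(texts, labels):
    """Fit a multinomial Naive Bayes model from whitespace-tokenised texts."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior for one text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy_and_recall(y_true, y_pred, positive):
    """Accuracy over all examples; recall over examples of the positive class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return correct / len(y_true), true_pos / actual_pos


# Invented toy data, for illustration only (not from the paper).
train_texts = [
    "you are an idiot",
    "what an idiot move",
    "have a nice day",
    "what a nice game",
]
train_labels = ["offensive", "offensive", "clean", "clean"]
model = train_naive_bayes(train_texts, train_labels)
```

Recall is reported alongside accuracy because it measures the share of genuinely offensive messages that the classifier catches, which accuracy alone can hide when one class dominates the data.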