Content-aware Location Inference and Misinformation in Online Social Networks

AJAO, Oluwaseun (2019). Content-aware Location Inference and Misinformation in Online Social Networks. Doctoral, Sheffield Hallam University.

Ajao_2019_PhD_Content-AwareLocation.pdf - Accepted Version
Creative Commons Attribution Non-commercial No Derivatives.

Link to published version: https://doi.org/10.7190/shu-thesis-00252

Abstract

Location inference is of potential use in cybercrime prevention and misinformation detection. Inferring locations from user texts in Online Social Networks (OSN) is a non-trivial and challenging problem with regard to public safety. This work proposes LOCINFER, a novel non-uniform grid-based approach for location inference from Twitter messages using Quadtree spatial partitions. The proposed algorithm uses natural language processing (NLP) for semantic understanding and incorporates hybrid similarity measures for feature vector extraction and dimensionality reduction. LOCINFER addresses the sparsity problem that may be associated with training data by following a biased clustering approach in which densely populated regions within the data are partitioned into larger grids. The clustered grids are then classified using a logistic regression model. The proposed method performed better than the state of the art in grid-based content-only location inference by more than 150 km in Average Error Distance (AED) and almost 300 km in Median Error Distance (MED). It also performed better by 24% in terms of accuracy at 161 km. It was 400 km better in prediction for MED and 250 km better in terms of AED.

Also proposed is SENTDETECT, a technique that detects and classifies fake news messages from Twitter posts using extensive experiments with machine learning and deep learning models, including those without prior knowledge of the domain. Following a text-only approach, SENTDETECT utilises word sentiments as an additional feature alongside the original text of the messages. Incorporating these engineered features into the feature vector enriches the vector space prior to the deep learning classification task, which utilised a Hierarchical Attention Network (HAN) with pre-trained word embeddings. An emotional word ratio (EMORATIO) was deduced following the discovery of a positive relationship between negative emotional words and fake news posts.

Finally, the work aimed to perform automatic detection of misinformation posts and rumours. Much work has been done on detecting the truthfulness or veracity of OSN messages. This work presents a novel feature-augmented approach that uses both text and sentiments to enrich the features used during prediction. The end result performed better by up to 40% in Recall and F-Measure over the state of the art on the benchmark PHEME misinformation dataset, which relied on textual features only. The blend of location inference with misinformation detection provides an effective tool in the fight against vices on social media, for example in curtailing hate speech propagation, cyberbullying and fake news posts.
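
The abstract reports location inference results as Average Error Distance (AED), Median Error Distance (MED) and accuracy at 161 km (roughly 100 miles). As a minimal illustration of how such metrics are commonly computed, and not the thesis's own evaluation code, the sketch below assumes predictions and ground truth are given as (latitude, longitude) pairs in degrees and measures error as great-circle (haversine) distance. The function names, the Earth radius constant and the choice of haversine distance are assumptions made for this example.

```python
import math
from statistics import mean, median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def error_metrics(predicted, actual, threshold_km=161.0):
    """Return AED, MED and the share of predictions within threshold_km.

    predicted and actual are equal-length sequences of (lat, lon) pairs.
    """
    errors = [haversine_km(p[0], p[1], a[0], a[1])
              for p, a in zip(predicted, actual)]
    return {
        "aed_km": mean(errors),      # Average Error Distance
        "med_km": median(errors),    # Median Error Distance
        "acc_at_threshold": sum(e <= threshold_km for e in errors) / len(errors),
    }


if __name__ == "__main__":
    # Hypothetical coordinates, chosen only to exercise the functions.
    predicted = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    actual = [(0.0, 0.0), (0.0, 1.0), (0.0, 10.0)]
    print(error_metrics(predicted, actual))
```

For grid-based methods, the predicted point is typically a representative location of the predicted cell, such as its centre. Which representative point the thesis uses is not stated in this abstract.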

Item Type: Thesis (Doctoral)
Contributors:
Thesis advisor - Zargari, Shahrzad [0000-0001-6511-7646]
Additional Information: Director of studies: Dr. Shahrzad Zargari
Research Institute, Centre or Group - Does NOT include content added after October 2018: Sheffield Hallam Doctoral Theses
Identification Number: https://doi.org/10.7190/shu-thesis-00252
Depositing User: Colin Knott
Date Deposited: 24 Dec 2019 10:34
Last Modified: 03 May 2023 02:08
URI: https://shura.shu.ac.uk/id/eprint/25593
