Identifying Online Conspiracy Theories: A Multiphase Analysis of AI Models

DUBINKO, Nikolai (2025). Identifying Online Conspiracy Theories: A Multiphase Analysis of AI Models. Doctoral, Sheffield Hallam University. [Thesis]

Documents
PDF
Dubinko_2026_PhD_IdentifyingOnlineConspiracy.pdf - Accepted Version
Restricted to Repository staff only until 7 January 2027.
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Download (1MB)
Abstract
The growth of online platforms has amplified the spread of conspiracy theories, presenting a significant challenge to social cohesion and online safety. While platforms like Twitter and Facebook have received considerable academic attention, the unique ecosystem of YouTube comments remains comparatively underexplored as a site of conspiratorial discourse. This dissertation addresses that gap with a systematic empirical investigation into the detection of conspiratorial narratives in YouTube comments, presenting and validating a multiphase methodological framework built around a comparative evaluation of artificial intelligence techniques.

The research first conducts an exploratory discourse analysis of a prominent case study, the 2023 Maui wildfires, to identify the distinct linguistic, emotional, and topical signatures of conspiratorial content. Building on these findings, a novel dataset of over 7,300 YouTube comments is manually annotated. The core of the study is a rigorous comparative evaluation of three tiers of AI detection methods: (1) traditional machine learning models (e.g., SVM, XGBoost) using engineered text and psycholinguistic features; (2) fine-tuned Transformer models (BERT, BERTweet); and (3) zero-shot classification with Large Language Models (LLMs).

The results reveal a clear hierarchy of performance, with LLMs demonstrating superior accuracy and achieving a top F1-score of 0.880. Notably, a hybrid approach combining Google's foundation model embeddings with an XGBoost classifier also performed exceptionally well, outperforming the fine-tuned Transformer models. A key finding is that the explicit addition of sentiment and emotion features did not consistently improve model performance, suggesting that the rich, high-dimensional embeddings from modern foundation models already implicitly encode these psycholinguistic signals, rendering separate feature engineering mostly redundant. This research concludes that the effective detection of nuanced, narrative-driven phenomena like conspiracy theories is increasingly dependent on deep semantic understanding rather than surface-level linguistic markers. The findings hold significant implications for both computational social science and practical content moderation, advocating a shift towards leveraging foundation models and suggesting a pragmatic, structured approach for developing scalable moderation systems.

Keywords: Conspiracy Theories, Disinformation, YouTube, Natural Language Processing (NLP), Large Language Models (LLMs), Machine Learning, Computational Social Science
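The first tier described in the abstract, traditional machine learning over engineered text and psycholinguistic features, can be sketched roughly as below. This is a minimal illustration under assumptions: the abstract does not specify which features or toolkits the thesis used, so the choice of TF-IDF n-grams, VADER sentiment scores, and a linear SVM here is hypothetical, as is the toy data.

```python
# Sketch of the traditional-ML tier: engineered text features (TF-IDF)
# concatenated with simple psycholinguistic features (VADER sentiment
# scores), classified with a linear SVM. All choices are illustrative.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Toy stand-in for the annotated corpus; 1 = conspiratorial, 0 = not.
comments = [
    "They started the fires on purpose to grab the land.",
    "Heartbreaking footage, sending love to everyone in Maui.",
    "The media is hiding what really caused this, wake up.",
    "Thanks for the clear reporting on the evacuation routes.",
]
labels = [1, 0, 1, 0]

# Engineered text features: unigram and bigram TF-IDF.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X_text = vectorizer.fit_transform(comments)

# Psycholinguistic features: neg/neu/pos/compound sentiment per comment.
sia = SentimentIntensityAnalyzer()
X_sent = csr_matrix(
    np.array([list(sia.polarity_scores(c).values()) for c in comments])
)

# Combine both feature blocks and fit the SVM.
X = hstack([X_text, X_sent])
clf = LinearSVC()
clf.fit(X, labels)
print(clf.predict(X))
```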
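The hybrid approach highlighted in the results, foundation-model text embeddings fed into an XGBoost classifier, might look something like the sketch below. The abstract names Google embeddings and XGBoost but no specifics, so the embedding model (`text-embedding-004`), the hyperparameters, and the toy data are all assumptions, not the thesis's actual configuration.

```python
# Sketch of the hybrid tier: Google foundation-model embeddings + XGBoost.
# The embedding model name and all hyperparameters are assumptions.
import google.generativeai as genai
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

genai.configure(api_key="YOUR_API_KEY")  # hypothetical placeholder


def embed_comments(comments: list[str]) -> list[list[float]]:
    """Embed a batch of comments with a Google text-embedding model."""
    result = genai.embed_content(
        model="models/text-embedding-004",  # assumed model choice
        content=comments,
        task_type="classification",
    )
    return result["embedding"]


# Toy stand-in for the ~7,300-comment annotated dataset.
comments = [
    "They started the fires on purpose to grab the land.",
    "Heartbreaking footage, sending love to everyone in Maui.",
    "The media is hiding what really caused this, wake up.",
    "Thanks for the clear reporting on the evacuation routes.",
]
labels = [1, 0, 1, 0]

X = embed_comments(comments)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, stratify=labels, random_state=0
)

clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
clf.fit(X_train, y_train)
print(clf.predict(X_test))
```

Note that no sentiment or emotion columns are added here; per the abstract's finding, the embeddings are assumed to already carry those psycholinguistic signals.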
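Finally, the zero-shot LLM tier amounts to prompting a large language model to label each comment directly, with no task-specific training. A minimal sketch follows; the model name (`gemini-1.5-flash`) and prompt wording are illustrative assumptions, not the prompts evaluated in the thesis.

```python
# Sketch of zero-shot conspiracy-theory classification with an LLM.
# Model choice and prompt are illustrative assumptions only.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # hypothetical placeholder
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model choice

PROMPT = (
    "You are labelling YouTube comments. Answer with exactly one word, "
    "CONSPIRACY or NORMAL, for the following comment:\n\n{comment}"
)


def classify_zero_shot(comment: str) -> int:
    """Return 1 if the LLM labels the comment as conspiratorial, else 0."""
    response = model.generate_content(PROMPT.format(comment=comment))
    return int("CONSPIRACY" in response.text.upper())


print(classify_zero_shot("The lasers that started the fires came from space."))
```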