DUBINKO, Nikolai (2025). Identifying Online Conspiracy Theories: A Multiphase Analysis of AI Models. Doctoral thesis, Sheffield Hallam University.
Documents
PDF: Dubinko_2026_PhD_IdentifyingOnlineConspiracy.pdf (1MB) - Accepted Version
Restricted to Repository staff only until 7 January 2027.
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The growth of online platforms has amplified the spread of conspiracy theories,
presenting a significant challenge to social cohesion and online safety. While platforms
like Twitter and Facebook have received considerable academic attention, the unique
ecosystem of YouTube comments remains comparatively underexplored as a site
of conspiratorial discourse. This dissertation addresses this gap by presenting
a systematic investigation into the detection of conspiratorial narratives within
YouTube comments using a comparative evaluation of artificial intelligence techniques.
To this end, the dissertation develops and validates a multiphase methodological framework. The research first conducts an exploratory discourse analysis of a prominent case study, the 2023 Maui wildfires, to
identify the distinct linguistic, emotional and topical signatures of conspiratorial
content. Building on these findings, a novel dataset of over 7,300 YouTube comments
is manually annotated. The core of the study is a rigorous comparative evaluation of
three tiers of AI detection methods: (1) traditional machine learning models (e.g.,
SVM, XGBoost) using engineered text and psycholinguistic features; (2) fine-tuned
Transformer models (BERT, BERTweet); and (3) zero-shot classification with Large
Language Models (LLMs).
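To make the tier structure concrete, the following is a minimal sketch of a tier-(1) baseline: TF-IDF features feeding a linear SVM, evaluated by F1. The file name, column names ("comment" text and a 0/1 "label"), split, and hyperparameters are illustrative assumptions, not the thesis's actual pipeline.

# Minimal sketch of a tier-1 baseline (TF-IDF + linear SVM).
# Assumptions (not from the thesis): a CSV with "comment" and
# integer 0/1 "label" columns, and default hyperparameters.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

df = pd.read_csv("youtube_comments.csv")  # hypothetical file
X_train, X_test, y_train, y_test = train_test_split(
    df["comment"], df["label"], test_size=0.2,
    random_state=42, stratify=df["label"])

vec = TfidfVectorizer(max_features=20_000, ngram_range=(1, 2))
clf = LinearSVC()  # tier-1 model; XGBoost would slot in the same way
clf.fit(vec.fit_transform(X_train), y_train)

preds = clf.predict(vec.transform(X_test))
print("F1:", f1_score(y_test, preds))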
The results reveal a clear hierarchy of performance, with LLMs demonstrating
superior accuracy, achieving a top F1-score of 0.880. Notably, a hybrid approach
combining Google’s foundation model embeddings with an XGBoost classifier also
performed exceptionally well, outperforming the fine-tuned Transformer models. A
key finding is that the explicit addition of sentiment and emotion features did not consistently improve model performance. This suggests that the rich, high-dimensional
embeddings from modern foundation models already implicitly encode these psycholinguistic signals, rendering separate feature engineering mostly redundant.
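The hybrid pattern reported above (foundation-model embeddings classified by XGBoost) can be sketched as follows. The thesis uses Google's foundation-model embeddings; a sentence-transformers encoder stands in here so the example is self-contained, and the data layout is the same hypothetical one as in the previous sketch.

# Sketch of the hybrid pattern: dense embeddings fed to XGBoost.
# The encoder below is a stand-in for the foundation-model
# embeddings described in the abstract.
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from xgboost import XGBClassifier

df = pd.read_csv("youtube_comments.csv")  # hypothetical file
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in encoder
X = encoder.encode(df["comment"].tolist())  # dense embedding matrix

X_tr, X_te, y_tr, y_te = train_test_split(
    X, df["label"], test_size=0.2, random_state=42,
    stratify=df["label"])

clf = XGBClassifier(n_estimators=300, eval_metric="logloss")
clf.fit(X_tr, y_tr)
print("F1:", f1_score(y_te, clf.predict(X_te)))

Swapping the stand-in encoder for an actual foundation-model embedding endpoint changes only the encode step; the classifier side is unchanged, which is what makes the hybrid design attractive.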
This research concludes that the effective detection of nuanced, narrative-driven
phenomena like conspiracy theories is increasingly dependent on deep semantic understanding rather than surface-level linguistic markers. The findings hold significant
implications for both computational social science and practical content moderation, advocating for a shift towards leveraging foundation models and suggesting a pragmatic, structured approach for developing scalable moderation systems.
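A minimal sketch of the zero-shot, foundation-model approach the abstract advocates: an LLM labels a comment directly from a prompt, with no task-specific training. The model name, prompt wording, and label set are illustrative assumptions rather than the thesis's actual prompts.

# Sketch of tier-3 zero-shot classification with an LLM.
# Model choice and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_comment(comment: str) -> str:
    """Ask the model for one word: 'conspiracy' or 'not_conspiracy'."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You label YouTube comments. Reply with exactly "
                        "one word: 'conspiracy' or 'not_conspiracy'."},
            {"role": "user", "content": comment},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower()

print(classify_comment("The wildfires were started by space lasers!"))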
Keywords: Conspiracy Theories, Disinformation, YouTube, Natural Language Processing (NLP), Large Language Models (LLMs), Machine Learning, Computational
Social Science