All together against hate: ensemble based LLMs for multi class hate speech classification in the football context

Santos, Guto Leoni; Gaboardi dos Santos, Vitor; Kearns, Colm; Sinclair, Gary; Black, Jack; Doidge, Mark; Fletcher, Thomas; Kilvington, Daniel; Liston, Katie; Endo, Patricia; Lynn, Theo

SANTOS, Guto Leoni, GABOARDI DOS SANTOS, Vitor, KEARNS, Colm, SINCLAIR, Gary, BLACK, Jack, DOIDGE, Mark, FLETCHER, Thomas, KILVINGTON, Daniel, LISTON, Katie, ENDO, Patricia and LYNN, Theo (2026). All together against hate: ensemble based LLMs for multi class hate speech classification in the football context. Journal of Big Data, 13 (1): 53. [Article]

Documents

Black (2026b) Santos et al. (2026) Uploaded Version.pdf - Accepted Version
Available under License Creative Commons Attribution.

Black-AllTogetherAgainstHate(VoR).pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

The rise of social media platforms like Twitter has transformed communication, fostering community engagement and knowledge sharing across diverse groups. However, it has also provided a stage for toxic content, including hate speech, which can manifest in harmful ways within specific contexts, such as discussions surrounding football. Hate speech in this domain often targets individuals or groups based on attributes such as race, ethnicity, or nationality, and is exacerbated by the emotionally charged nature of sports discourse. While binary classification models have traditionally been employed to detect hate speech, they struggle to address nuanced and context-specific forms of abuse, including microaggressions and intersectional hate speech. Multi-class classification enables a more detailed understanding by distinguishing between various types of hate speech, but these models face challenges such as lexico-semantic variability and rapidly evolving norms within the football community. In this paper, we propose an ensemble technique leveraging BERT-based transformers to improve hate speech detection in football-related discussions on Twitter. Our method integrates manually-annotated datasets and multiple classifiers within ensemble frameworks to enhance accuracy and robustness. The results demonstrate that our approach significantly improves the identification of diverse forms of hate speech in the football context, contributing to more effective content moderation and fostering safer online communities.
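
The abstract does not specify how the classifiers are combined, so the following is only a minimal sketch of two common ensemble rules (soft voting and hard voting) applied to per-class probabilities from several classifiers. It is not the authors' implementation. The label set, the function names `soft_vote` and `hard_vote`, and the example probabilities are all hypothetical; in practice each probability row would come from a fine-tuned BERT-based model.

```python
from collections import Counter
from typing import List, Sequence

# Hypothetical label set: the abstract does not list the paper's actual classes.
LABELS: List[str] = ["none", "racism", "xenophobia"]


def _argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def soft_vote(prob_rows: Sequence[Sequence[float]]) -> str:
    """Average class probabilities across classifiers, then pick the top class."""
    n_models = len(prob_rows)
    averaged = [
        sum(row[i] for row in prob_rows) / n_models for i in range(len(LABELS))
    ]
    return LABELS[_argmax(averaged)]


def hard_vote(prob_rows: Sequence[Sequence[float]]) -> str:
    """Each classifier votes for its own top class; the most common vote wins."""
    votes = [LABELS[_argmax(row)] for row in prob_rows]
    return Counter(votes).most_common(1)[0][0]


# Made-up outputs of three classifiers for a single tweet (illustrative only).
example = [
    [0.50, 0.45, 0.05],  # classifier A: narrowly "none"
    [0.55, 0.40, 0.05],  # classifier B: narrowly "none"
    [0.05, 0.90, 0.05],  # classifier C: confidently "racism"
]

print("soft vote:", soft_vote(example))
print("hard vote:", hard_vote(example))
```

With these made-up numbers the two rules disagree: hard voting follows the two narrow "none" votes, while soft voting lets the single confident classifier outweigh them. That is one reason an ensemble's combination rule matters for borderline cases.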

More Information

Official URL:

https://link.springer.com/article/10.1186/s40537-0...

Open Access URL:

https://link.springer.com/content/pdf/10.1186/s405...

Open Access Version:

Published version

Uncontrolled Keywords:

College of Health, Wellbeing and Life Sciences : School of Sport and Physical Activity; Centre for Sport and Exercise Science; Humanities Research Centre; Cultural Communication and Computing Research Institute; Sociology, Politics and Policy Research Group; 08 Information and Computing Sciences; 46 Information and computing sciences

Identifiers

Identification Number:

10.1186/s40537-026-01379-8

ORCID for Jack Black: