A Health-Focused Risk Taxonomy for AI: Assessing Unsafe Content Detection with Small Language Models (SLMs)

HEWITT, L, TAMIMI, AKA, COPELAND, Robert, MOORE, R and JHANJI, S (2025). A Health-Focused Risk Taxonomy for AI: Assessing Unsafe Content Detection with Small Language Models (SLMs). Ceur Workshop Proceedings, 3985, 1-9. [Article]

Documents
36001:1001950
[thumbnail of Copeland-AHealth-FocusedRisk(VoR).pdf]
Preview
PDF
Copeland-AHealth-FocusedRisk(VoR).pdf - Published Version
Available under License Creative Commons Attribution.

Download (321kB) | Preview
Abstract
Large Language Models (LLMs) show promise in healthcare. To make the most of this technology, there is a need to address concerns about computational demands and privacy. Small Language Models (SLMs) offer a privacy-preserving alternative for specialised medical applications due to their lower resource needs and potential for local deployment. This paper examines existing LLM safeguarding frameworks and introduces a novel, health-focused risk taxonomy developed through literature review and co-design with healthcare professionals. Furthermore, the ability of 6 SLMs to detect unsafe content using 2 additional risk taxonomies is evaluated and compared. The 8b-parameter Granite Guardian model showed superior adaptation to the novel risk taxonomy (75% accuracy) even without fine-tuning, representing a promising direction for safe and reliable applications of SLMs in clinical settings.
More Information
Statistics

Downloads

Downloads per month over past year

View more statistics

Share
Add to AnyAdd to TwitterAdd to FacebookAdd to LinkedinAdd to PinterestAdd to Email

Actions (login required)

View Item View Item