Evolving rules for document classification

HIRSCH, Laurence, SAEEDI, M and HIRSCH, R (2005). Evolving rules for document classification. In: Genetic programming. Lecture Notes in Computer Science (3447). Berlin, Springer, 85-95.

Official URL: http://link.springer.com/chapter/10.1007/978-3-540...
Link to published version: 10.1007/978-3-540-31989-4_8

Abstract

We describe a novel method for using Genetic Programming to create compact classification rules based on combinations of N-Grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters-21578 dataset. We also suggest that, because the induced rules are meaningful to a human analyst, they may have a number of other uses beyond classification and provide a basis for text mining applications.
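
To make the idea of scoring a rule by precision and recall concrete, here is a minimal sketch in Python. It is not the paper's implementation: the combinators `contains`, `any_of`, `all_of` and `negate` are hypothetical stand-ins for the function and terminal set, which the abstract does not list. Using the F1 measure to combine precision and recall into one fitness value is also an assumption, and the four training documents are invented for illustration.

```python
from typing import Callable, List, Tuple

# A rule maps a document's text to a yes/no category decision.
Rule = Callable[[str], bool]


def contains(ngram: str) -> Rule:
    """Hypothetical terminal: true when the character n-gram occurs in the text."""
    return lambda doc: ngram in doc.lower()


def any_of(*rules: Rule) -> Rule:
    """Hypothetical function node: logical OR over sub-rules."""
    return lambda doc: any(rule(doc) for rule in rules)


def all_of(*rules: Rule) -> Rule:
    """Hypothetical function node: logical AND over sub-rules."""
    return lambda doc: all(rule(doc) for rule in rules)


def negate(rule: Rule) -> Rule:
    """Hypothetical function node: logical NOT of a sub-rule."""
    return lambda doc: not rule(doc)


def fitness(rule: Rule, labelled_docs: List[Tuple[str, bool]]) -> float:
    """Score a rule on labelled training documents.

    Assumption: precision and recall are combined as F1; the abstract
    only says that fitness depends on precision and recall.
    """
    tp = fp = fn = 0
    for text, in_category in labelled_docs:
        predicted = rule(text)
        if predicted and in_category:
            tp += 1
        elif predicted:
            fp += 1
        elif in_category:
            fn += 1
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy training set: (text, belongs to the target category?)
training = [
    ("Wheat exports rose sharply", True),
    ("Grain harvest forecast cut", True),
    ("Bank raises interest rates", False),
    ("Oil prices fall on supply news", False),
]

good_rule = any_of(contains("whea"), contains("grai"))
weak_rule = contains("ra")

good_score = fitness(good_rule, training)
weak_score = fitness(weak_rule, training)
```

On this toy set the first rule matches both in-category documents and nothing else, so it scores higher than the single short n-gram "ra", which also fires on an out-of-category document. A genetic programming run would generate and recombine many such rule trees and keep those with higher fitness. The small rule also stays readable, which is the property the abstract highlights for a human analyst.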

Item Type: Book Section
Additional Information: 8th European Conference, EuroGP 2005, Lausanne, Switzerland, March 30 - April 1, 2005. Proceedings
Research Institute, Centre or Group: Cultural Communication and Computing Research Institute > Communication and Computing Research Centre
Identification Number: 10.1007/978-3-540-31989-4_8
Depositing User: Laurence Hirsch
Date Deposited: 25 Feb 2013 16:37
Last Modified: 09 Nov 2016 22:54
URI: http://shura.shu.ac.uk/id/eprint/6622
