HIRSCH, Laurence, SAEEDI, M and HIRSCH, R (2005). Evolving rules for document classification. In: Genetic programming. Lecture Notes in Computer Science (3447). Berlin, Springer, 85-95.
We describe a novel method for using Genetic Programming to create compact classification rules based on combinations of N-Grams (character strings). Genetic programs acquire fitness by producing rules that are effective classifiers in terms of precision and recall when evaluated against a set of training documents. We describe a set of functions and terminals and provide results from a classification task using the Reuters-21578 dataset. We also suggest that, because the induced rules are meaningful to a human analyst, they may have a number of other uses beyond classification and may provide a basis for text mining applications.
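To make the idea of scoring an N-Gram rule by precision and recall concrete, here is a minimal sketch in Python. It is not the paper's implementation: the rule combinators (`has`, `any_of`, `all_of`, `negate`), the use of F1 as the single fitness value, and the five toy documents are all assumptions introduced for illustration, and the abstract does not specify the actual function set, terminals, or fitness formula.

```python
from typing import Callable, List, Tuple

# A rule maps a document's text to True (assign to the category) or False.
Rule = Callable[[str], bool]

def has(ngram: str) -> Rule:
    """Hypothetical terminal: true if the character n-gram occurs in the text."""
    return lambda text: ngram in text.lower()

def any_of(*rules: Rule) -> Rule:
    """Hypothetical function node: logical OR over sub-rules."""
    return lambda text: any(r(text) for r in rules)

def all_of(*rules: Rule) -> Rule:
    """Hypothetical function node: logical AND over sub-rules."""
    return lambda text: all(r(text) for r in rules)

def negate(rule: Rule) -> Rule:
    """Hypothetical function node: logical NOT."""
    return lambda text: not rule(text)

def precision_recall(rule: Rule, docs: List[Tuple[str, bool]]) -> Tuple[float, float]:
    """Evaluate a rule against labelled training documents."""
    tp = fp = fn = 0
    for text, in_category in docs:
        predicted = rule(text)
        if predicted and in_category:
            tp += 1
        elif predicted and not in_category:
            fp += 1
        elif not predicted and in_category:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f1_fitness(rule: Rule, docs: List[Tuple[str, bool]]) -> float:
    """Assumed fitness: harmonic mean of precision and recall (F1)."""
    p, r = precision_recall(rule, docs)
    return 2 * p * r / (p + r) if p + r else 0.0

# Toy training set invented for this sketch (not Reuters data).
# The label says whether the document belongs to a made-up "crops" category.
docs = [
    ("Wheat exports rose sharply", True),
    ("Grain and wheat prices fell", True),
    ("Corn harvest delayed by rain", True),
    ("Bank raises interest rates", False),
    ("Oil output cut announced", False),
]

# A compact, human-readable rule built from character n-grams.
rule = any_of(has("whea"), all_of(has("corn"), negate(has("oil"))))

print(precision_recall(rule, docs))
print(f1_fitness(has("rain"), docs))
```

Because the terminals are character strings rather than whole words, the single n-gram `rain` matches both "rain" and "Grain" in the toy data, which shows how a short rule can cover several surface forms. In a Genetic Programming setting, trees of such nodes would be generated and recombined, with a fitness value of this kind guiding selection.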
|Item Type:||Book Section|
|Additional Information:||8th European Conference, EuroGP 2005, Lausanne, Switzerland, March 30 - April 1, 2005. Proceedings|
|Research Institute, Centre or Group:||Cultural Communication and Computing Research Institute > Communication and Computing Research Centre|
|Depositing User:||Laurence Hirsch|
|Date Deposited:||25 Feb 2013 16:37|
|Last Modified:||09 Nov 2016 22:54|