Text classification by Convolution Networks for Data-Driven Decision Making

RODRIGUES, Marcos (2017). Text classification by Convolution Networks for Data-Driven Decision Making. In: ATINER 4th International Conference on Library and Information Science, Athens, Greece, 24-27 July 2017. ATINER. (In Press)

Official URL: https://www.atiner.gr/library

Abstract

Recent advances in automation and data-driven intelligence from sophisticated Artificial Intelligence (AI) technologies have affected all areas of knowledge and economic activity. Deep Learning is an AI method for learning and extracting knowledge from large amounts of data. Its algorithms learn iteratively from data, finding hidden features and providing insights without explicitly programmed features. Text classification can be cast as a generic problem whose solution can have a significant impact on data-driven decision processes and Enterprise Resource Planning (ERP) information systems. Normally, classification is carried out against a given taxonomy. Wrong classifications may arise from inconsistent taxonomies, incomplete descriptions, wrong interpretation of a category, inconsistent language translation, human error, algorithm design and so on. In this paper, we address the issue of automatic product classification from unconstrained textual descriptions using machine learning techniques. Rather than defining words in a vocabulary (as is normally the case with, for instance, Google's word2vec technique), this research focuses on character-based classification through a temporal convolution network, as in Crepe (Character-level Convolutional Networks for Text Classification). The advantage is that, instead of defining a vocabulary with tens of thousands of words, the vocabulary is made up of a small character set composed of the letters a-z, the digits 0-9, and special characters. Furthermore, because in any language words are defined by a sequence of characters, the relationships between the characters within a word or across words are learned by the temporal convolution. This removes the need to learn words per se. The research used product descriptions from six categories: bakery; chilled; dairy; drinks; fruit and vegetables; and meat and fish.
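As an illustration of the character-level vocabulary described above, the following is a minimal Python sketch of character quantization (one-hot encoding). It is not the paper's implementation, which used Torch. The exact set of special characters, the fixed length `max_len`, and the names `ALPHABET`, `CHAR_INDEX` and `quantize` are assumptions introduced here for illustration only.

```python
import string

# Assumed alphabet: a-z, 0-9 and ASCII punctuation. The abstract does not
# list the exact special characters, so string.punctuation is a stand-in.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def quantize(text, max_len=64):
    """Encode text as a max_len x len(ALPHABET) one-hot matrix.

    Characters outside the alphabet (spaces, for example) and padding
    positions become all-zero rows. max_len is a hypothetical choice.
    """
    rows = []
    for ch in text.lower()[:max_len]:
        row = [0] * len(ALPHABET)
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    # Pad short descriptions with all-zero rows up to the fixed length.
    while len(rows) < max_len:
        rows.append([0] * len(ALPHABET))
    return rows


# A made-up product description, used only to exercise the function.
encoded = quantize("Milk 2L", max_len=16)
```

Each row of `encoded` corresponds to one character position, so a temporal (one-dimensional) convolution slides along the rows and can pick up character patterns without any word-level vocabulary.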
A total of 8612 samples were used, separated into a training set (7751 samples, or 90% of the data) and an unseen test set (861 samples, or 10% of the data). The network has 15 convolution layers followed by 2 fully connected layers. It was implemented using the Torch framework on a Mac Pro with a 3.5 GHz 6-core Intel Xeon E5 processor and 16 GB of memory, running macOS Sierra. The achieved overall accuracy of 91% is impressive given that the classification features were extracted from character sequences only and that the descriptions are extremely short. The results show that character-based classification is a valid solution for short descriptions; we are now investigating alternative network designs and expanding the training set.
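The 90/10 partition reported above can be reproduced arithmetically with a short Python sketch. Whether the original split was shuffled, stratified by category, or rounded in this way is not stated in the abstract; the function name `split_train_test`, the seed, and the use of integer placeholders instead of real product descriptions are assumptions made for illustration.

```python
import random


def split_train_test(samples, train_fraction=0.9, seed=0):
    """Shuffle samples and split them into training and test lists.

    The shuffle, the seed and the rounding rule are assumptions; the
    abstract only reports the resulting set sizes.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Integer placeholders stand in for the 8612 product descriptions.
samples = list(range(8612))
train, test = split_train_test(samples)
print(len(train), len(test))  # 7751 861
```

With 8612 samples, 90% is 7750.8, which rounds to the 7751 training samples and leaves the 861 test samples stated in the abstract.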

Item Type: Conference or Workshop Item (Paper)
Research Institute, Centre or Group: Cultural Communication and Computing Research Institute > Communication and Computing Research Centre
Departments: Arts, Computing, Engineering and Sciences > Engineering and Mathematics
Depositing User: Marcos Rodrigues
Date Deposited: 09 May 2017 10:54
Last Modified: 22 Nov 2017 20:30
URI: http://shura.shu.ac.uk/id/eprint/15506
