Text classification by Convolution Networks for Data-Driven Decision Making

RODRIGUES, Marcos (2017). Text classification by Convolution Networks for Data-Driven Decision Making. In: PAPANIKOS, Gregory T., (ed.) Abstract book: 4th annual international conference on Library and Information Sciences. Athens, Athens Institute for Education and Research, 30-31.

Official URL: https://www.atiner.gr/library
Abstract

Recent advances in automation and data-driven intelligence from sophisticated Artificial Intelligence (AI) technologies have affected all areas of knowledge and economic activity. Deep learning is a method of learning and extracting knowledge from large amounts of data: the algorithms learn iteratively from the data, finding hidden features and providing insights without those features being explicitly programmed. Text classification can be cast as a generic problem whose solution can have a significant impact on data-driven decision processes and Enterprise Resource Planning (ERP) information systems. Normally, classification is carried out against a given taxonomy. Misclassification may arise from inconsistent taxonomies, incomplete descriptions, wrong interpretation of a category, inconsistent language translation, human error, algorithm design and so on. In this paper, we address the issue of automatic product classification from unconstrained textual descriptions using machine learning techniques. Rather than defining words in a vocabulary (as is normally the case with, for instance, Google’s word2vec technique), this research focuses on character-based classification through a temporal convolution network as in Crepe (Character-level Convolutional Networks for Text Classification). The advantage is that instead of defining a vocabulary with tens of thousands of words, the vocabulary is made up of a small character set composed of the letters a-z, the digits 0-9, and special characters. Furthermore, because in any language words are defined by a sequence of characters, the relationships between the characters within a word or across words are learned by the temporal convolution. This removes the need to learn words per se.
The research used product descriptions from 6 categories: bakery, chilled, dairy, drinks, fruit and vegetables, and meat and fish. A total of 8612 samples were used, separated into a training set (7751 samples, or 90% of the data) and an unseen test set (861 samples, or 10% of the data). The network has 15 convolution layers followed by 2 fully connected layers. It was implemented using the Torch framework on a Mac Pro (3.5GHz 6-core Intel Xeon E5 processor, 16GB of memory) running macOS Sierra. The overall accuracy achieved was 91%, which is impressive given that the classification features were extracted from character sequences only and that the descriptions are extremely short. The results show that character-based classification is a valid solution for short descriptions. We are now investigating alternative network designs and expanding the training set.
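
As an illustration of the character-level input described in the abstract, the following is a minimal sketch of how a short product description could be turned into a sequence of one-hot character vectors before being passed to a temporal convolution network. It is not the author's implementation: the abstract does not list the special characters or the input length, so the punctuation set, the fixed length `MAX_LEN`, and the function name `quantize` are all assumptions made for this example.

```python
import string

# Assumed alphabet: the abstract names a-z, 0-9 and "special characters"
# without listing them, so the punctuation set below is illustrative only.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

# Hypothetical fixed input length; the abstract does not state one.
MAX_LEN = 64


def quantize(text, max_len=MAX_LEN):
    """Encode text as max_len one-hot rows, one row per character.

    Characters outside ALPHABET (including spaces) become all-zero rows,
    and descriptions shorter than max_len are padded with all-zero rows.
    """
    rows = []
    for ch in text.lower()[:max_len]:
        row = [0] * len(ALPHABET)
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    # Pad short descriptions so every sample has the same shape.
    rows.extend([0] * len(ALPHABET) for _ in range(max_len - len(rows)))
    return rows


matrix = quantize("Semi-skimmed milk 2L")
print(len(matrix), len(matrix[0]))  # sequence length x alphabet size
```

With this encoding every description has the same shape regardless of its wording, which is what lets the convolution slide along the character axis without any word-level vocabulary.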

Item Type: Book Section
Research Institute, Centre or Group - Does NOT include content added after October 2018: Cultural Communication and Computing Research Institute > Communication and Computing Research Centre
Departments - Does NOT include content added after October 2018: Faculty of Science, Technology and Arts > Department of Engineering and Mathematics
Page Range: 30-31
Depositing User: Marcos Rodrigues
Date Deposited: 09 May 2017 10:54
Last Modified: 18 Mar 2021 00:45
URI: https://shura.shu.ac.uk/id/eprint/15506
