Text classification by Convolution Networks for Data-Driven Decision Making

RODRIGUES, Marcos (2017). Text classification by Convolution Networks for Data-Driven Decision Making. In: PAPANIKOS, Gregory T., (ed.) Abstract book: 4th annual international conference on Library and Information Sciences. Athens, Athens Institute for Education and Research, 30-31. [Book Section]

Abstract
Recent advances in automation and data-driven intelligence, driven by sophisticated Artificial Intelligence (AI) technologies, have affected all areas of knowledge and economic activity. Deep Learning is an AI method for learning and extracting knowledge from large amounts of data. Its algorithms learn iteratively from data, finding hidden features and providing insights without explicitly programmed features. Text classification can be cast as a generic problem whose solution can have a significant impact on data-driven decision processes and Enterprise Resource Planning (ERP) information systems. Normally, classification is carried out against a given taxonomy. Misclassification may arise from inconsistent taxonomies, incomplete descriptions, misinterpretation of a category, inconsistent language translation, human error, algorithm design and so on. In this paper, we address automatic product classification from unconstrained textual descriptions using machine learning techniques. Rather than defining words in a vocabulary (as is normally the case with, for instance, Google’s word2vec technique), this research focuses on character-based classification through a temporal convolution network, as in Crepe (Character-level Convolutional Networks for Text Classification). The advantage is that instead of defining a vocabulary with tens of thousands of words, the vocabulary is a small character set composed of the letters a-z, the digits 0-9, and special characters. Furthermore, because words in any language are defined by a sequence of characters, the relationships between the characters within and across words are learned by the temporal convolution, which removes the need to learn words per se. The research used product descriptions from 6 categories: bakery, chilled, dairy, drinks, fruit and vegetables, and meat and fish.
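To make the character-level idea concrete, the following is a minimal Python sketch of how a short description might be encoded over a small character set and scanned by a single temporal (1-D) convolution kernel. It is an illustration under stated assumptions, not the implementation used in the paper: the exact set of special characters, the `max_length` parameter, and the function names `one_hot_encode` and `temporal_convolution` are all hypothetical, and the kernel weights are set by hand here, whereas a real network would learn them.

```python
import string

# Assumed alphabet: the abstract names a-z, 0-9 and "special characters";
# the exact punctuation set below is illustrative, not taken from the paper.
ALPHABET = string.ascii_lowercase + string.digits + "-,;.!?:'\"/\\|_@#$%^&*~`+=<>()[]{}"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(text, max_length=64):
    """Return a max_length x len(ALPHABET) matrix of 0/1 values.

    Characters outside the alphabet (including spaces) and padding
    positions are left as all-zero rows.
    """
    matrix = [[0] * len(ALPHABET) for _ in range(max_length)]
    for position, ch in enumerate(text.lower()[:max_length]):
        index = CHAR_TO_INDEX.get(ch)
        if index is not None:
            matrix[position][index] = 1
    return matrix


def temporal_convolution(matrix, kernel):
    """Slide one kernel (width x alphabet size) along the character axis."""
    width = len(kernel)
    outputs = []
    for start in range(len(matrix) - width + 1):
        total = 0.0
        for offset in range(width):
            row, weights = matrix[start + offset], kernel[offset]
            total += sum(x * w for x, w in zip(row, weights))
        outputs.append(total)
    return outputs


# Hand-built kernel that responds to the character pair "mi".
kernel = one_hot_encode("mi", max_length=2)
response = temporal_convolution(one_hot_encode("milk", max_length=6), kernel)
```

In this sketch the kernel responds most strongly at the position where "mi" occurs, which is the sense in which a convolution along the character axis can pick up character relationships without a word vocabulary.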
A total of 8612 samples were used, separated into a training set (7751 samples, 90% of the data) and an unseen test set (861 samples, 10% of the data). The network has 15 convolution layers followed by 2 fully connected layers. It was implemented using the Torch framework on a Mac Pro (3.5 GHz 6-core Intel Xeon E5 processor, 16 GB of memory) running macOS Sierra. The achieved overall accuracy of 91% is impressive given that the classification features were extracted from character sequences only and that the descriptions are extremely short. It is shown that character-based classification is a valid solution for short descriptions; we are now investigating alternative network designs and expanding the training set.
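The 90/10 partition described above can be reproduced arithmetically with a short Python sketch. The function name `split_train_test`, the shuffling step, and the fixed `seed` are assumptions made for illustration; the abstract does not say how the samples were assigned to each set.

```python
import random


def split_train_test(samples, train_fraction=0.9, seed=0):
    """Shuffle a copy of the samples and cut it at the rounded train fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in sample identifiers; the real data are product descriptions.
train_set, test_set = split_train_test(range(8612))
```

With 8612 samples, rounding 90% gives 7751 training samples and leaves 861 for testing, matching the counts reported in the abstract.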