RODRIGUES, Marcos (2018). Improving product classification using generative recurrent networks. In: PAPANIKOS, Gregory T. (ed.), Abstracts of the 2nd International Conference on Electrical Engineering, 23-26 July 2018, Athens, Greece. Athens Institute for Education and Research, 36-37.
Abstract
This paper addresses machine learning techniques for the automatic classification of product descriptions. The problem arises when database entries do not match exactly, so it is unclear whether two descriptions refer to the same item, product, or service. A typical example is merging disparate databases, which is required, for instance, when one business acquires a competitor. An obvious solution is to train an AI system to perform the classification. The difficulty is that deep learning networks require vast amounts of training data, normally tens or hundreds of thousands of samples, and such data are usually not available. We have investigated network models to augment the training set in a flexible but reliable way. The principle is to train a network to generate new data that are similar to, but not exactly the same as, the input data. The newly generated data are validated by a second network trained on the original data, which outputs a simple binary decision (yes/no) on whether the generated data are sufficiently similar to the original. Accepted data eventually become part of an augmented training set, improving the network's ability to classify unseen data. We designed and implemented a recurrent network with Keras, an open-source neural network library written in Python. The network is based on the LSTM (Long Short-Term Memory) model, which has proved useful for a large number of problems with time dependencies. The encoding of product descriptions is character-based, so, once trained, the network outputs a character and predicts what the next character should be. With an appropriate training set from which to learn the structure of the data, such networks can output valid vectors. We show that LSTMs, together with character-based text encoding, are a good solution to this problem, and that they represent the state of the art in recurrent neural networks. Future work involves improvements to the network design, testing SimpleRNN or GRU (Gated Recurrent Unit) layers in place of LSTMs, and fine-tuning of network parameters.
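As a concrete illustration of the approach the abstract describes, the following is a minimal sketch of a character-level next-character LSTM in Keras. The toy product descriptions, the SEQ_LEN context window, and all layer sizes are illustrative assumptions for the sketch, not the authors' actual design or data.

```python
# Sketch of a character-level LSTM text generator in Keras, in the spirit
# of the approach described in the abstract. Data and parameters here are
# hypothetical stand-ins, not the authors' actual code.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical product descriptions standing in for the real training data.
descriptions = [
    "stainless steel bolt m6 x 20mm",
    "hex head screw m6 20 mm steel",
    "wood screw 4 x 40 zinc plated",
]
text = "\n".join(descriptions)

# Character-level encoding: map each distinct character to an integer index.
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}
idx_to_char = {i: c for c, i in char_to_idx.items()}

SEQ_LEN = 10  # assumed context window length

# Build (input sequence, next character) training pairs as one-hot vectors.
X = np.zeros((len(text) - SEQ_LEN, SEQ_LEN, len(chars)), dtype=np.float32)
y = np.zeros((len(text) - SEQ_LEN, len(chars)), dtype=np.float32)
for i in range(len(text) - SEQ_LEN):
    for t, c in enumerate(text[i : i + SEQ_LEN]):
        X[i, t, char_to_idx[c]] = 1.0
    y[i, char_to_idx[text[i + SEQ_LEN]]] = 1.0

# LSTM that predicts the next character from the preceding SEQ_LEN characters.
model = keras.Sequential([
    keras.Input(shape=(SEQ_LEN, len(chars))),
    layers.LSTM(128),
    layers.Dense(len(chars), activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=20, batch_size=32)

# Generate a new description by repeatedly sampling the predicted next
# character, seeding with the start of the training text.
generated = text[:SEQ_LEN]
for _ in range(40):
    x = np.zeros((1, SEQ_LEN, len(chars)), dtype=np.float32)
    for t, c in enumerate(generated[-SEQ_LEN:]):
        x[0, t, char_to_idx[c]] = 1.0
    probs = model.predict(x, verbose=0)[0].astype("float64")
    probs /= probs.sum()  # renormalize to guard against float32 rounding
    generated += idx_to_char[int(np.random.choice(len(chars), p=probs))]
print(generated)
```

The abstract's second, validating network would be a separate binary classifier trained on the original descriptions (for example, an LSTM feeding a single sigmoid output unit) that accepts or rejects each generated string before it is added to the augmented training set.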