On initial population generation in feature subset selection

DENIZ, A and KIZILOZ, Hakan (2019). On initial population generation in feature subset selection. Expert Systems with Applications, 137, 11-21.

initial_population_generation.pdf - Accepted Version
Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://www.sciencedirect.com/science/article/pii/...
Link to published version: https://doi.org/10.1016/j.eswa.2019.06.063

Abstract

The performance of evolutionary algorithms depends on many factors, such as population size, number of generations, and crossover or mutation probability. Generating the initial population is one of the important steps in evolutionary algorithms. A poor initial population may unnecessarily increase the number of searches, or it may cause the algorithm to converge to local optima. In this study, we aim to find a promising method for generating the initial population in the Feature Subset Selection (FSS) domain. FSS is not considered an expert system by itself, yet it constitutes a significant step in many expert systems. It eliminates redundancy in data, which decreases training time and improves solution quality. To achieve our goal, we compare a total of five different initial population generation methods: Information Gain Ranking (IGR), a greedy approach, and three types of random approaches. We evaluate these methods using a specialized Teaching Learning Based Optimization searching algorithm (MTLBO-MD) and three supervised learning classifiers: Logistic Regression, Support Vector Machines, and Extreme Learning Machine. In our experiments, we employ 12 publicly available datasets, mostly obtained from the well-known UCI Machine Learning Repository. According to their feature sizes and instance counts, we manually classify these datasets as small, medium, or large-sized. Experimental results indicate that all tested methods achieve similar solutions on small-sized datasets. For medium-sized and large-sized datasets, however, the IGR method provides a better starting point in terms of execution time and learning performance. Finally, when compared with other studies in the literature, the IGR method proves to be a viable option for initial population generation.
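
The abstract does not say how the IGR method turns a feature ranking into an initial population, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that each individual is a binary feature mask, that continuous features are discretized into equal-width bins before information gain is computed, and that a feature's inclusion probability grows linearly with its rank. The function names and the parameters `bins`, `p_low`, and `p_high` are hypothetical and chosen for illustration. A uniform random generator is included for contrast.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels, bins=4):
    """Information gain of one feature with respect to the labels.

    Assumption: the feature is discretized into equal-width bins.
    """
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])
    conditional = 0.0
    for value in np.unique(binned):
        mask = binned == value
        # Weight each bin's label entropy by the fraction of samples in it.
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional

def random_population(n_individuals, n_features, rng):
    """Baseline: every feature is included with probability 0.5."""
    return rng.random((n_individuals, n_features)) < 0.5

def igr_population(X, y, n_individuals, rng, p_low=0.1, p_high=0.9):
    """Rank-biased population (hypothetical scheme, see assumptions above).

    Returns the boolean population and the per-feature information gains.
    """
    n_features = X.shape[1]
    gains = np.array([information_gain(X[:, j], y) for j in range(n_features)])
    ranks = gains.argsort().argsort()  # 0 = lowest gain, n_features - 1 = highest
    probs = p_low + (p_high - p_low) * ranks / max(n_features - 1, 1)
    population = rng.random((n_individuals, n_features)) < probs
    return population, gains

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.array([0, 1] * 100)
    X = rng.random((200, 5))
    X[:, 0] = y  # make feature 0 perfectly informative
    population, gains = igr_population(X, y, n_individuals=20, rng=rng)
    print("information gains:", np.round(gains, 3))
    print("selection frequency per feature:", population.mean(axis=0))
```

Under these assumptions, the highest-ranked features appear in most individuals while low-ranked ones appear rarely, so the search starts from masks that already favor informative features; the uniform baseline gives every feature the same chance.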

Item Type: Article
Uncontrolled Keywords: 01 Mathematical Sciences; 08 Information and Computing Sciences; 09 Engineering; Artificial Intelligence & Image Processing
Identification Number: https://doi.org/10.1016/j.eswa.2019.06.063
Page Range: 11-21
SWORD Depositor: Symplectic Elements
Depositing User: Symplectic Elements
Date Deposited: 16 May 2022 14:12
Last Modified: 12 Oct 2023 12:15
URI: https://shura.shu.ac.uk/id/eprint/30147
