On initial population generation in feature subset selection

DENIZ, A and KIZILOZ, Hakan (2019). On initial population generation in feature subset selection. Expert Systems with Applications, 137, 11-21.

initial_population_generation.pdf - Accepted Version
Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://www.sciencedirect.com/science/article/pii/...
Link to published version: https://doi.org/10.1016/j.eswa.2019.06.063

    Abstract

    The performance of evolutionary algorithms depends on many factors, such as population size, number of generations, and crossover or mutation probability. Generating the initial population is one of the important steps in evolutionary algorithms. A poor initial population may unnecessarily increase the number of searches, or it may cause the algorithm to converge to local optima. In this study, we aim to find a promising method for generating the initial population in the Feature Subset Selection (FSS) domain. FSS is not considered an expert system by itself, yet it constitutes a significant step in many expert systems. It eliminates redundancy in data, which decreases training time and improves solution quality. To achieve our goal, we compare a total of five different initial population generation methods: Information Gain Ranking (IGR), a greedy approach, and three types of random approaches. We evaluate these methods using a specialized Teaching Learning Based Optimization search algorithm (MTLBO-MD) and three supervised learning classifiers: Logistic Regression, Support Vector Machines, and Extreme Learning Machine. In our experiments, we employ 12 publicly available datasets, mostly obtained from the well-known UCI Machine Learning Repository. According to their feature sizes and instance counts, we manually classify these datasets as small, medium, or large-sized. Experimental results indicate that all tested methods achieve similar solutions on small-sized datasets. For medium-sized and large-sized datasets, however, the IGR method provides a better starting point in terms of execution time and learning performance. Finally, when compared with other studies in the literature, the IGR method proves to be a viable option for initial population generation.
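
To make the idea of ranking-based initialization concrete, the following is a minimal sketch of how an initial population for feature subset selection might be seeded from information gain scores. It is not the procedure from the paper, whose details the abstract does not give. The function names, the linear mapping from rank to inclusion probability, and the parameters `p_min`, `p_max`, and `seed` are all assumptions made for illustration, and the sketch handles discrete features only.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def igr_initial_population(rows, labels, pop_size, p_min=0.1, p_max=0.9, seed=0):
    """Build bit-mask individuals, favouring features with higher information gain.

    Assumption (hypothetical, not from the paper): the inclusion probability
    falls linearly from p_max for the best-ranked feature to p_min for the
    worst-ranked one.
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    # Feature indices sorted from highest to lowest information gain.
    order = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    probs = [0.0] * n_features
    for rank, j in enumerate(order):
        frac = rank / (n_features - 1) if n_features > 1 else 0.0
        probs[j] = p_max - frac * (p_max - p_min)
    # Each individual is a list of 0/1 flags, one per feature.
    population = [
        [1 if rng.random() < probs[j] else 0 for j in range(n_features)]
        for _ in range(pop_size)
    ]
    return gains, population


# Toy data: feature 0 determines the label, feature 1 carries no information.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
gains, population = igr_initial_population(rows, labels, pop_size=200)
```

On this toy data, feature 0 receives the full one bit of information gain and feature 1 receives none, so most generated individuals include feature 0 and few include feature 1. A purely random initializer would instead include each feature with the same probability, which is the kind of baseline the abstract contrasts with IGR.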

    Item Type: Article
    Uncontrolled Keywords: 01 Mathematical Sciences; 08 Information and Computing Sciences; 09 Engineering; Artificial Intelligence & Image Processing
    Identification Number: https://doi.org/10.1016/j.eswa.2019.06.063
    Page Range: 11-21
    SWORD Depositor: Symplectic Elements
    Depositing User: Symplectic Elements
    Date Deposited: 16 May 2022 14:12
    Last Modified: 16 May 2022 14:12
    URI: https://shura.shu.ac.uk/id/eprint/30147
