Determining optimal centroids in repeated K-means simulations

MWITONDI, Kassim and MOUSTAFA, Rida (2012). Determining optimal centroids in repeated K-means simulations. CODATA Data Science Journal. (Unpublished)

Full text not available from this repository.


Most data mining problems involve data with heavy tails or overlapping groups, so one of the main challenges data scientists face is separating noise from meaningful data. Standard non-linear approaches have not always succeeded in detecting naturally arising structures in data. We propose an adaptive method for determining an optimal number of centroids in repeated K-Means simulations. The method derives from particle swarm optimisation (PSO) but, rather than following the conventional heuristic approach of PSO, it adopts a mathematically formalised convergent approach. Applications to solar magnetic activity cycle data show that the proposed method can optimally identify K-Means centroids. We illustrate how the method can be extended to applications in seismology and oceanography.

Key Words: Data Mining, K-Means, PSO, Solar Magnetic Activity Cycles, Supervised Modelling, Unsupervised Modelling
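For readers unfamiliar with the setting, the following is a minimal sketch of what "repeated K-Means simulations" can look like in practice: plain Lloyd's K-Means is run several times from different random starting centroids, and the run with the lowest within-cluster sum of squares is kept. This is not the PSO-derived method proposed in the paper, whose details are not given in this record. The function names (`kmeans`, `repeated_kmeans`), the selection criterion and all parameters are assumptions made purely for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def kmeans(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm; returns (centroids, within-cluster SS)."""
    centroids = rng.sample(points, k)  # random starting centroids from the data
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments no longer change
            break
        centroids = new_centroids
    wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wcss


def repeated_kmeans(points, k, n_runs=20, seed=0):
    """Repeat K-Means n_runs times; keep the run with the lowest WCSS.

    Illustrative selection rule only, not the paper's PSO-based criterion.
    """
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(n_runs)]
    best_centroids, best_wcss = min(runs, key=lambda run: run[1])
    return best_centroids, best_wcss, [w for _, w in runs]
```

Calling `repeated_kmeans(points, k=2)` on a list of coordinate tuples returns the best centroids, their within-cluster sum of squares, and the corresponding value for every run, which makes the variation between repetitions visible.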

Item Type: Article
Research Institute, Centre or Group: Cultural Communication and Computing Research Institute > Communication and Computing Research Centre
Depositing User: Kassim Mwitondi
Date Deposited: 24 Sep 2012 17:21
Last Modified: 24 Sep 2012 17:21
