A robust machine learning approach to SDG data segmentation

MWITONDI, Kassim S., MUNYAKAZI, Isaac and GATSHENI, Barnabas N. (2020). A robust machine learning approach to SDG data segmentation. Journal of Big Data, 7 (1), p. 97.

Mwitondi_RobustMachineLearning(VoR).pdf - Published Version
Creative Commons Attribution.

Open Access URL: https://journalofbigdata.springeropen.com/articles... (Published version)
Link to published version: https://doi.org/10.1186/s40537-020-00373-y


In light of recent advances in computing and the accompanying data explosion, the complex interactions of the Sustainable Development Goals (SDG) present both a challenge and an opportunity to researchers and decision makers across fields and sectors. The deep and wide socio-economic, cultural and technological variations across the globe call for a unified understanding of the SDG project. The complexity of SDG interactions, and the dynamics expressed through their indicators, align naturally with technical and application specifics that require interdisciplinary solutions. We present a consilient approach to expounding the triggers of SDG indicators. Illustrated through data segmentation, it is designed to unify our understanding of the complex overlap of the SDGs by utilising data from different sources. The paper treats each SDG as a Big Data source node, with the potential to contribute towards a unified understanding of applications across the SDG spectrum. Data for five SDGs were extracted from the United Nations SDG indicators data repository and used to model spatio-temporal variations in search of robust and consilient scientific solutions. Based on a number of pre-determined assumptions on socio-economic and geo-political variations, the data are subjected to sequential analyses exploring distributional behaviour, component extraction and clustering. All three methods exhibit pronounced variations across samples, with initial distributional and data segmentation patterns isolating South Africa from the remaining five countries. Data randomness is dealt with via a specially developed algorithm for sampling, measuring and assessing, based on repeated samples of different sizes. Results exhibit consistent variations across samples, based on socio-economic, cultural and geo-political variations that call for a unified understanding across disciplines and sectors. The findings highlight novel paths towards attaining informative patterns for a unified understanding of the triggers of SDG indicators and open new paths to interdisciplinary research.
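
The abstract mentions repeated samples of different sizes being drawn, measured and assessed. The sketch below illustrates only that general idea and is not the authors' algorithm: it draws repeated samples of several sizes from synthetic two-dimensional data, clusters each sample with a plain k-means, and summarises how much the clustering score varies across repeats. The function names (`kmeans`, `sample_measure_assess`, `make_toy_data`), the toy data and the choice of mean squared distance as the measure are all assumptions made for illustration.

```python
import random
import statistics


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on tuples of floats (illustrative, not the paper's method)."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Recompute centroids; keep the old one if a cluster ends up empty.
        centroids = [
            tuple(statistics.fmean(dim) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    # Measure: mean squared distance from each point to its nearest centroid.
    score = statistics.fmean(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )
    return centroids, score


def sample_measure_assess(points, sizes, repeats, k=2, seed=0):
    """For each sample size, draw repeated samples, cluster, and summarise the scores."""
    rng = random.Random(seed)
    summary = {}
    for n in sizes:
        scores = []
        for _ in range(repeats):
            sample = rng.sample(points, n)       # sample without replacement
            _, score = kmeans(sample, k, rng)    # measure
            scores.append(score)
        # Assess: mean and spread of the measure across repeated samples.
        summary[n] = (statistics.fmean(scores), statistics.pstdev(scores))
    return summary


def make_toy_data(seed=1):
    """Hypothetical data: two unequal groups standing in for distinct country profiles."""
    rng = random.Random(seed)
    group_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(150)]
    group_b = [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(50)]
    return group_a + group_b


if __name__ == "__main__":
    for n, (mean, spread) in sample_measure_assess(
            make_toy_data(), sizes=[20, 50, 100], repeats=10).items():
        print(f"n={n}: mean score={mean:.3f}, spread={spread:.3f}")
```

Comparing the spread across sample sizes gives a rough sense of how stable a segmentation is under resampling; the published article should be consulted for the actual procedure and measures used.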

Item Type: Article
Uncontrolled Keywords: 08 Information and Computing Sciences
Identification Number: https://doi.org/10.1186/s40537-020-00373-y
Page Range: p. 97
SWORD Depositor: Symplectic Elements
Depositing User: Symplectic Elements
Date Deposited: 11 Nov 2020 16:31
Last Modified: 17 Mar 2021 20:45
URI: https://shura.shu.ac.uk/id/eprint/27584
