Parallelization of formal concept analysis algorithms

KODAGODA, Gamhewage Nuwan (2018). Parallelization of formal concept analysis algorithms. Doctoral, Sheffield Hallam University. [Thesis]

Documents
Kodagoda_2018_phd_ParallelizationOfFormal.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract
Formal Concept Analysis (FCA) provides a mathematical notation for representing concepts and concept hierarchies, making use of order and lattice theory. It has been used in numerous applications, including software engineering, linguistics, sociology, information science, information technology, genetics, biology and engineering. Several research papers have found that algorithms derived from Kuznetsov's Close-by-One (CbO) provide the most efficient means of computing formal concepts. This thesis discusses key enhancements to the original CbO algorithm in detail and presents their effects both in isolation and in combination. Eight variations of the CbO algorithm, each highlighting key features, were compared on a level playing field by presenting them in the same notation and implementing them from that notation in the same way. The three main enhancements considered are partial closure with incremental closure of intents, inherited canonicity test failures, and a combined depth-first and breadth-first search. The algorithms were implemented in an unoptimized way so that the comparison focuses on the algorithms themselves and not on efficiencies gained by optimizing code. One finding was that partial closure with incremental closure of intents gives a significant performance improvement when used in isolation. However, neither the combined depth-first and breadth-first search nor the inherited canonicity test failure feature gives a significant performance improvement when used in isolation. The inherited canonicity test failure feature needs to be combined with the combined depth-first and breadth-first search to obtain a performance increase. Combining all three enhancements gave the best performance. The main contributions of the thesis are the four new parallel In-Close3 algorithms.
The shared-memory algorithms, Direct Parallel In-Close3 and Queue Parallel In-Close3, and the Distributed Memory In-Close3 algorithm showed significant potential. The shared-memory algorithms were implemented using OpenMP and the distributed-memory algorithm was implemented using MPI. All implementations were validated and showed scalability. Experiments were carried out to test the features of the parallel algorithms and their implementations on the UK national supercomputer ARCHER and on Colfax clusters. The thesis presents the key parallelization strategies used and the experimental results of the parallelization.
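For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of a basic, sequential CbO-style procedure with its canonicity test. It is not taken from the thesis and includes none of the enhancements or parallelization strategies described above. The function name `close_by_one`, the row-of-sets input format and the example context are assumptions made for illustration only.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

# A formal concept is a pair (extent, intent): a set of objects and the
# set of attributes they all share, each maximal with respect to the other.
Concept = Tuple[FrozenSet[int], FrozenSet[int]]


def close_by_one(rows: Sequence[Set[int]], n_attrs: int) -> List[Concept]:
    """Enumerate all formal concepts of a small context (illustrative sketch).

    rows[g] is the set of attribute indices that object g has
    (a hypothetical input format chosen for this example).
    """
    all_objects = frozenset(range(len(rows)))
    # Column view: for each attribute, the set of objects that have it.
    cols = [frozenset(g for g in all_objects if j in rows[g])
            for j in range(n_attrs)]

    def intent_of(extent: FrozenSet[int]) -> FrozenSet[int]:
        # Full closure: every attribute shared by all objects in the extent.
        return frozenset(j for j in range(n_attrs) if extent <= cols[j])

    concepts: List[Concept] = []

    def generate_from(extent: FrozenSet[int], intent: FrozenSet[int],
                      start: int) -> None:
        concepts.append((extent, intent))
        for j in range(start, n_attrs):
            if j in intent:
                continue
            new_extent = extent & cols[j]
            new_intent = intent_of(new_extent)
            # Canonicity test: the closure must not introduce any attribute
            # smaller than j that the parent intent did not already contain.
            if all((k in intent) == (k in new_intent) for k in range(j)):
                generate_from(new_extent, new_intent, j + 1)

    generate_from(all_objects, intent_of(all_objects), 0)
    return concepts


# Example context: three objects, each lacking exactly one of three attributes.
example = close_by_one([{1, 2}, {0, 2}, {0, 1}], 3)
```

The canonicity test is what ensures each concept is generated only once, so no lookup of previously found concepts is needed. In this sketch every closure is computed in full, which is the cost that enhancements such as partial and incremental closure aim to reduce.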