KODAGODA, Gamhewage Nuwan (2018). Parallelization of formal concept analysis algorithms. Doctoral, Sheffield Hallam University. [Thesis]
Documents
Kodagoda_2018_phd_ParallelizationOfFormal.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Formal Concept Analysis provides the mathematical notations for representing concepts and
concept hierarchies, making use of order and lattice theory. It has been used in
numerous applications, including software engineering, linguistics, sociology, information
sciences, information technology, genetics, biology and engineering. Several research papers
have found the algorithms derived from Kuznetsov's Close-by-One (CbO) algorithm to provide
the most efficient means of computing formal concepts. In this thesis, key enhancements to the original
CbO algorithms are discussed in detail. The effects of these key features are presented in both
isolation and combination. Eight variations of the CbO algorithms highlighting the
key features were compared on a level playing field by presenting them in the same notation
and implementing them from that notation in the same way. The three main enhancements
considered are partial closure with incremental closure of intents, inherited canonicity test
failures, and a combined depth-first and breadth-first search. The algorithms were
deliberately implemented without optimization so that the comparison reflects the algorithms themselves
rather than any efficiencies gained by optimizing code.
One of the findings was that there is a significant performance improvement when partial
closure with incremental closure of intents is used in isolation. However, there is no significant
performance improvement when the combined depth-first and breadth-first search or the inherited
canonicity test failure feature is used in isolation. The inherited canonicity test failure feature needs
to be combined with the combined depth-first and breadth-first search to obtain a performance
increase. Combining all three enhancements gave the best performance.
The main contribution of the thesis is the four new parallel In-Close3 algorithms. The shared
memory algorithms, Direct Parallel In-Close3 and Queue Parallel In-Close3, and the
Distributed Memory In-Close3 algorithm showed significant potential. The shared memory
algorithms were implemented using OpenMP and the distributed memory algorithm was
implemented using MPI. All implementations were validated and showed scalability.
Experiments were carried out to test the features of the parallel algorithms and their
implementations using the UK national supercomputer ARCHER and Colfax Clusters. The
thesis presents the key parallelization strategies used and the experimental results of the
parallelization.
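
For readers unfamiliar with the CbO family referred to in the abstract, the following is a minimal sketch of the basic Close-by-One scheme in Python, including its canonicity test. It is not the thesis's implementation and includes none of the enhancements discussed there (partial or incremental closure, inherited canonicity test failures, combined depth-first and breadth-first search, or parallelization). The names `close_by_one`, `context` and `n_attrs`, and the representation of a formal context as a list of attribute sets, are assumptions made for illustration.

```python
def close_by_one(context, n_attrs):
    """Enumerate all formal concepts of a small formal context.

    context : list of frozensets; context[g] holds the attribute indices
              (0 .. n_attrs-1) possessed by object g (hypothetical layout).
    Returns a list of (extent, intent) pairs, each a pair of frozensets.
    """
    concepts = []

    def intent_of(extent):
        # Attributes shared by every object in the extent
        # (all attributes when the extent is empty).
        return frozenset(
            a for a in range(n_attrs) if all(a in context[g] for g in extent)
        )

    def generate(extent, intent, start):
        concepts.append((extent, intent))
        for j in range(start, n_attrs):
            if j in intent:
                continue
            # Restrict the extent to objects that also have attribute j,
            # then close it to obtain the candidate intent.
            new_extent = frozenset(g for g in extent if j in context[g])
            new_intent = intent_of(new_extent)
            # Canonicity test: the closure must not introduce any attribute
            # smaller than j that was absent from the parent intent;
            # otherwise this concept is reached along another branch.
            if all(a in intent for a in new_intent if a < j):
                generate(new_extent, new_intent, j + 1)

    top_extent = frozenset(range(len(context)))
    generate(top_extent, intent_of(top_extent), 0)
    return concepts


# Tiny illustrative context: three objects over attributes 0, 1 and 2.
example_context = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2})]
for extent, intent in close_by_one(example_context, 3):
    print(sorted(extent), sorted(intent))
```

In this sketch every candidate intent is recomputed from scratch by `intent_of`, which is the cost that closure-related enhancements such as those named in the abstract aim to reduce.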