PRATIWI, Lustiana, CHOO, Yun-Huoy, MUDA, Azah Kamilah and PRATAMA, Satrya Fajri (2022). Swarm Intelligence-based Hierarchical Clustering for Identification of ncRNA using Covariance Search Model. International Journal of Advanced Computer Science and Applications (IJACSA), 13 (11): Paper 95, 822-831.
PDF: Pratama -SwarmIntelligenceBasedHierarchicalClustering(VoR).pdf - Published Version, Creative Commons Attribution (673kB)
Abstract
Covariance Models (CMs) have been quite effective in identifying potential members of existing non-coding Ribonucleic Acid (ncRNA) families and have provided excellent accuracy on genome sequence databases. However, they have significant drawbacks in family-specific search. An existing Hierarchical Agglomerative Clustering (HAC) technique merges overlapping sequences into what is known as a combined CM (CCM). However, structural information is discarded, and the sequence features of each family are significantly diluted as the number of original structures increases. Additionally, it can only find members of existing families and is not useful for finding potential members of novel ncRNA families. Furthermore, it is important to construct generic sequence models that can recognize new potential members of novel ncRNA families and assign unknown ncRNA sequences as potential members of known families. To achieve these objectives, this study proposes to implement Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) to ensure that the CCMs have the best quality at every level of the dendrogram hierarchy. This study will also apply a distance matrix as the criterion to measure the compatibility between two CMs. The proposed techniques will use five gene families, with fifty sequences from each family, from the Rfam database, divided into training and testing datasets to test the CM combination method. The proposed techniques will be compared with the existing HAC in terms of identification accuracy, sum of bit-scores, and processing time, and each of these performance measurements will be statistically validated.
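As a rough illustration of the baseline idea the abstract builds on, the sketch below runs plain agglomerative clustering over a precomputed distance matrix and records one merge per dendrogram level. It is not the authors' method: the average-linkage rule, the function name `agglomerate`, the labels, and the toy distances are all assumptions made for illustration, and the PSO and GA steps that the study uses to select high-quality CCMs are not shown.

```python
from itertools import combinations


def agglomerate(dist, labels):
    """Average-linkage agglomerative clustering over a precomputed distance matrix.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    one per dendrogram level. Illustrative only; not the paper's procedure.
    """
    clusters = [(label,) for label in labels]
    index = {label: i for i, label in enumerate(labels)}

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[index[x]][index[y]] for x in a for y in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        # Greedily merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((a, b, linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy distances between four hypothetical models (not data from the paper).
labels = ["cm_a", "cm_b", "cm_c", "cm_d"]
dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
]
merges = agglomerate(dist, labels)
for a, b, d in merges:
    print(a, "+", b, "at distance", d)
```

In this toy run the two closest models are merged first, and each later level joins progressively more distant clusters. In the setting the abstract describes, the distance would instead measure compatibility between two CMs.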
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Covariance model; ncRNA identification; swarm intelligence; hierarchical clustering; 0803 Computer Software; 1005 Communications Technologies; 46 Information and computing sciences |
Identification Number: | https://doi.org/10.14569/IJACSA.2022.0131195 |
Page Range: | 822-831 |
SWORD Depositor: | Symplectic Elements |
Depositing User: | Symplectic Elements |
Date Deposited: | 07 Mar 2023 12:50 |
Last Modified: | 07 Mar 2023 16:27 |
URI: | https://shura.shu.ac.uk/id/eprint/31631 |