A fuzzy neural network based dynamic data allocation model on heterogeneous multi-GPUs for large-scale computations

ZHANG, Chaolong, XU, Yuanping, XU, Zhijie, HE, Jia, WANG, Jing and ADU, Jianhua (2018). A fuzzy neural network based dynamic data allocation model on heterogeneous multi-GPUs for large-scale computations. International Journal of Automation and Computing, 1-13.

Official URL: https://link.springer.com/article/10.1007/s11633-0...
Link to published version: https://doi.org/10.1007/s11633-018-1120-4

Abstract

The parallel computation capabilities of modern GPU (Graphics Processing Unit) processors have attracted increasing attention from researchers and engineers conducting high computational throughput studies. However, current single-GPU engineering solutions often struggle to fulfill their real-time requirements. Thus, the multi-GPU-based approach has become a popular and cost-effective choice for tackling these demands. In such cases, computational load balancing over multiple GPU "nodes" is often the key bottleneck that affects the quality and performance of the runtime system. Existing load balancing approaches are mainly based on the assumption that all GPU nodes in the same computer framework are of equal computational performance, which is often not the case due to cluster design and other legacy issues. This paper presents a novel dynamic load balancing (DLB) model for rapid data division and allocation on heterogeneous GPU nodes based on an innovative fuzzy neural network (FNN). In this research, a 5-state parameter feedback mechanism defining the overall cluster and node performances is proposed. The corresponding FNN-based DLB model is capable of monitoring and predicting individual node performance under different workload scenarios. A real-time adaptive scheduler has been devised to reorganize the data inputs to each node when necessary to maintain their runtime computational performances. The devised model has been implemented on two-dimensional (2D) discrete wavelet transform (DWT) tasks for evaluation. Experiment results show that this DLB model has enabled a high computational throughput while ensuring real-time and precision requirements for complex computational tasks.
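To make the idea concrete, the core of such a scheme can be sketched as two steps: estimate each node's throughput from feedback, then split the next batch of data in proportion to those estimates. The sketch below is an illustration only, not the paper's implementation: it uses a simple exponentially smoothed throughput estimate where the paper employs an FNN predictor with a 5-state parameter feedback, and the function names (`split_workload`, `update_throughput`) are hypothetical.

```python
def split_workload(total_items, throughputs):
    """Divide total_items among nodes in proportion to their estimated
    throughputs (items per second), so faster nodes receive more data."""
    total = sum(throughputs)
    shares = [int(total_items * t / total) for t in throughputs]
    # Assign any rounding remainder to the fastest node.
    fastest = max(range(len(throughputs)), key=lambda i: throughputs[i])
    shares[fastest] += total_items - sum(shares)
    return shares

def update_throughput(prev_estimate, items_done, elapsed_s, alpha=0.5):
    """Feedback step: blend the latest measured throughput with the
    previous estimate (exponential smoothing stands in here for the
    FNN-based prediction described in the paper)."""
    measured = items_done / elapsed_s
    return alpha * measured + (1 - alpha) * prev_estimate
```

In a runtime loop, each node would report its completion time after every batch, `update_throughput` would refresh its performance estimate, and `split_workload` would redivide the next batch accordingly, so a node that slows down (e.g. due to contention) automatically receives less data.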

Item Type: Article
Departments - Does NOT include content added after October 2018: Faculty of Science, Technology and Arts > Department of Computing
Identification Number: https://doi.org/10.1007/s11633-018-1120-4
Page Range: 1-13
Depositing User: Jing Wang
Date Deposited: 13 Mar 2018 15:33
Last Modified: 18 Mar 2021 06:25
URI: https://shura.shu.ac.uk/id/eprint/18882
