A methodology for speeding up matrix vector multiplication for single/multi-core architectures

KELEFOURAS, Vasileios, KRITIKAKOU, Angeliki, PAPADIMA, Elissavet and GOUTIS, Constantinos E. (2015). A methodology for speeding up matrix vector multiplication for single/multi-core architectures. Journal of Supercomputing, 71 (7), 2644-2667.

Kelefouras-MethodologyForSpeedingUpMatrixVector(AM).pdf - Accepted Version
All rights reserved.

Official URL: https://link.springer.com/article/10.1007/s11227-0...
Link to published version: https://doi.org/10.1007/s11227-015-1409-9


This paper presents a new methodology for computing dense matrix-vector multiplication on both embedded processors (without a SIMD unit) and general-purpose processors (single- and multi-core, with a SIMD unit). The methodology achieves higher execution speed than the state-of-the-art ATLAS library, with speedups from 1.2 to 1.45. It does so by fully exploiting the combination of software parameters (e.g., data reuse) and hardware parameters (e.g., data cache associativity), which are treated simultaneously as one problem rather than separately, giving a smaller search space and high-quality solutions. The proposed methodology produces a different schedule for different values of (i) the number of data cache levels; (ii) the data cache sizes; (iii) the data cache associativities; (iv) the data cache and main memory latencies; (v) the data array layout of the matrix; and (vi) the number of cores.
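
As a rough illustration of what data reuse means for matrix-vector multiplication, the sketch below sweeps the matrix in column blocks so that each block of the input vector is reused across every row before the next block is loaded. It is not the schedule produced by the paper's methodology: the function names and the `tile_cols` parameter are hypothetical, and in practice the block size would be chosen from the cache parameters listed above rather than fixed by hand.

```python
def mvm_naive(a, x):
    """Reference y = A x for a row-major list-of-lists matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def mvm_tiled(a, x, tile_cols=4):
    """Compute y = A x, sweeping the columns in blocks of tile_cols.

    tile_cols is a hypothetical tuning knob (an assumption of this
    sketch); it stands in for a value derived from the cache size.
    """
    n_rows = len(a)
    n_cols = len(x)
    y = [0.0] * n_rows
    for j0 in range(0, n_cols, tile_cols):
        j1 = min(j0 + tile_cols, n_cols)
        # This slice of x is reused by every row in the inner loop.
        x_block = x[j0:j1]
        for i in range(n_rows):
            row = a[i]
            acc = 0.0
            for j in range(j0, j1):
                acc += row[j] * x_block[j - j0]
            # Accumulate the partial dot product for this column block.
            y[i] += acc
    return y
```

Both functions compute the same product; only the order in which the elements of the matrix and vector are visited differs, and that order is what a schedule of this kind controls.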

Item Type: Article
Departments - Does NOT include content added after October 2018: Faculty of Science, Technology and Arts > Department of Computing
Identification Number: https://doi.org/10.1007/s11227-015-1409-9
Page Range: 2644-2667
Depositing User: Vasileios Kelefouras
Date Deposited: 27 Mar 2018 11:43
Last Modified: 18 Mar 2021 15:19
URI: https://shura.shu.ac.uk/id/eprint/18332
