Home  | Publications | PPB20

Improved Data Locality Using Morton-Order Curve on the Example of LU Decomposition

MCML Authors

Christian Böhm

Prof. Dr.

Principal Investigator

* Former Principal Investigator

Abstract

The LU decomposition is an essential element used in many linear algebra applications. Furthermore, it is used in LINPACK to benchmark the performance of modern multi-core processor environments. These processors offer a large memory hierarchy including multiple registers and various levels of cache. Registers or L1 data cache are small in size but also very fast. The L2 or L3 cache memory is usually shared among other cores and larger but slower. For the LU decomposition, the latency of fetching data from the main memory to the registers to perform a calculation also depends on the input matrix's memory access pattern. Here, we look at the block factorization algorithm, where the LU decomposition performance depends on the performance of the matrix multiplication. In both cases, the LU decomposition and the matrix multiplication, such a matrix is traversed by three nested loops. In this paper, we propose to traverse such loops in an order defined by a space-filling curve. This traversal dramatically improves data locality and offers effective exploitation of the memory hierarchy. Besides the canonical (or line-by-line) access pattern, we demonstrate the traversal in Hilbert-, Peano and Morton order. Our extensive experiments show that the Morton order (or Z -order) and the inverse Morton order (or Z-order) have a better runtime performance compared to the others.

inproceedings


IEEE BigData 2020

IEEE International Conference on Big Data. Virtual, Dec 10-13, 2020.

Authors

M. Perdacher • C. Plant • C. Böhm

Links

DOI

Research Area

 A3 | Computational Models

BibTeXKey: PPB20

Back to Top