Tile QR factorization with parallel panel processing for multicore architectures

Bilel Hadri*, Hatem Ltaief, Emmanuel Agullo, Jack Dongarra

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

23 Scopus citations

Abstract

To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist of scheduling a Directed Acyclic Graph (DAG) of fine-grained tasks, where nodes represent tasks (either a panel factorization or the update of a block-column) and edges represent the dependencies among them. Although past approaches already achieve high performance on moderate and large square matrices, they process each panel sequentially, which limits performance when factorizing tall and skinny matrices or small square matrices. We present a new fully asynchronous method for computing a QR factorization on shared-memory multicore architectures that overcomes this bottleneck. Our contribution is to adapt an existing algorithm that performs a panel factorization in parallel (named Communication-Avoiding QR and initially designed for distributed-memory machines) to the context of tile algorithms using asynchronous computations. An experimental study shows significant improvement (up to almost 10 times faster) compared to state-of-the-art approaches. We aim to eventually incorporate this work into the Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) library.
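
The parallel panel idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation and not the PLASMA API. It is a minimal, sequential, pure-Python illustration of the tall-and-skinny reduction commonly associated with Communication-Avoiding QR: the panel is split into row blocks, each block is factored independently, and the resulting R factors are merged pairwise up a binary tree. The names `householder_r`, `tsqr_r`, and `block_rows` are hypothetical, only the R factor is computed, and the leaf factorizations are the tasks a runtime could execute concurrently.

```python
import math


def householder_r(a):
    """Return the n x n upper-triangular R factor of an m x n matrix (m >= n)."""
    a = [row[:] for row in a]  # work on a copy so the caller's data is untouched
    m, n = len(a), len(a[0])
    for k in range(n):
        # Norm of column k from the diagonal down.
        norm = math.sqrt(sum(a[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            continue  # column is already zero below the diagonal
        # Householder vector v = x - alpha * e1, with alpha = -sign(x0) * ||x||.
        alpha = -norm if a[k][k] >= 0 else norm
        v = [a[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(x * x for x in v)
        # Apply H = I - 2 v v^T / (v^T v) to columns k..n-1.
        for j in range(k, n):
            dot = sum(v[i - k] * a[i][j] for i in range(k, m))
            scale = 2.0 * dot / vtv
            for i in range(k, m):
                a[i][j] -= scale * v[i - k]
    # Keep the top n rows and zero out rounding noise below the diagonal.
    return [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]


def tsqr_r(a, block_rows):
    """R factor of a tall matrix via independent row blocks and a binary tree."""
    n = len(a[0])
    if len(a) < n or block_rows < n:
        raise ValueError("need at least n rows overall and per block")
    # Leaf stage: split into row blocks; each factorization is independent.
    blocks = [a[i:i + block_rows] for i in range(0, len(a), block_rows)]
    if len(blocks) > 1 and len(blocks[-1]) < n:
        tail = blocks.pop()  # fold a short remainder into the previous block
        blocks[-1] = blocks[-1] + tail
    rs = [householder_r(b) for b in blocks]
    # Reduction stage: stack pairs of R factors and re-factor them.
    while len(rs) > 1:
        merged = [householder_r(rs[i] + rs[i + 1])
                  for i in range(0, len(rs) - 1, 2)]
        if len(rs) % 2:
            merged.append(rs[-1])  # odd one out moves up a level unchanged
        rs = merged
    return rs[0]
```

Calling `tsqr_r(a, block_rows)` on a list of rows returns an upper-triangular `r` satisfying `r^T r = a^T a` up to rounding, so it agrees with a direct QR factorization up to the signs of its rows. In a tile setting, each call to `householder_r` would correspond to one task in the DAG.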

Original language: English (US)
Title of host publication: Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
State: Published - Jul 1 2010
Event: 24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010 - Atlanta, GA, United States
Duration: Apr 19 2010 to Apr 23 2010

Publication series

Name: Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010

Other

Other: 24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010
Country: United States
City: Atlanta, GA
Period: 04/19/10 to 04/23/10

Keywords

  • Communication avoiding
  • Dynamic scheduling
  • Multicore
  • QR factorization
  • Tile algorithms

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
  • Theoretical Computer Science
