Random Matrix Theory: Selected Applications from Statistical Signal Processing and Machine Learning

  • Khalil Elkhalil

Student thesis: Doctoral Thesis


Random matrix theory is an outstanding mathematical tool that has demonstrated its usefulness in many areas ranging from wireless communication to finance and economics. The main motivation behind its use comes from the fundamental role that random matrices play in modeling unknown and unpredictable physical quantities. In many situations, meaningful metrics expressed as scalar functionals of these random matrices arise naturally. Along this line, the present work consists in leveraging tools from random matrix theory in an attempt to answer fundamental questions related to applications from statistical signal processing and machine learning. In a first part, this thesis addresses the development of analytical tools for the computation of the inverse moments of random Gram matrices with one side correlation. Such a question is mainly driven by applications in signal processing and wireless communications wherein such matrices naturally arise. In particular, we derive closed-form expressions for the inverse moments and show that the obtained results can help approximate several performance metrics of common estimation techniques. Then, we carry out a large dimensional study of discriminant analysis classifiers. Under mild assumptions, we show that the asymptotic classification error approaches a deterministic quantity that depends only on the means and covariances associated with each class as well as the problem dimensions. Such result permits a better understanding of the underlying classifiers, in practical large but finite dimensions, and can be used to optimize the performance. Finally, we revisit kernel ridge regression and study a centered version of it that we call centered kernel ridge regression or CKRR in short. Relying on recent advances on the asymptotic properties of random kernel matrices, we carry out a large dimensional analysis of CKRR under the assumption that both the data dimesion and the training size grow simultaneiusly large at the same rate. We particularly show that both the empirical and prediction risks converge to a limiting risk that relates the performance to the data statistics and the parameters involved. Such a result is important as it permits a better undertanding of kernel ridge regression and allows to efficiently optimize the performance.
Date of AwardJun 2019
Original languageEnglish (US)
Awarding Institution
  • Computer, Electrical and Mathematical Science and Engineering
SupervisorTareq Al-Naffouri (Supervisor)


  • Random matrix theory
  • discriminant analysis
  • kernel regression
  • High dimensional statistics

Cite this