DNA sequencing techniques bring novel tools and also statistical challenges to genetic research. In addition to detecting differentially expressed genes, testing the significance of gene sets or pathway analysis has been recognized as an equally important problem. Owing to the “large pp small nn” paradigm, the traditional Hotelling’s T2T2 test suffers from the singularity problem and therefore is not valid in this setting. In this paper, we propose a shrinkage-based diagonal Hotelling’s test for both one-sample and two-sample cases. We also suggest several different ways to derive the approximate null distribution under different scenarios of pp and nn for our proposed shrinkage-based test. Simulation studies show that the proposed method performs comparably to existing competitors when nn is moderate or large, but it is better when nn is small. In addition, we analyze four gene expression data sets and they demonstrate the advantage of our proposed shrinkage-based diagonal Hotelling’s test.
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Numerical Analysis