In this paper, we consider the problem of de-signing Differentially Private (DP) algorithms for Stochastic Convex Optimization (SCO) on heavy-Tailed data. The irregularity of such data violates some key assumptions used in almost all existing DP-SCO and DP-ERM methods, re-sulting in failure to provide the DP guarantees. To better understand this type of challenges, we provide in this paper a comprehensive study of DP-SCO under various settings. First, we con-sider the case where the loss function is strongly convex and smooth. For this case, we propose a method based on the sample-And-Aggregate framework, which has an excess population risk of O( d3 n4 ) (after omitting other factors), where n is the sample size and d is the dimensional-ity of the data. Then, we show that with some additional assumptions on the loss functions, it is possible to reduce the expected excess popula-tion risk to O ( d2 n2 ). To lift these additional condi-tions, we also provide a gradient smoothing and trimming based scheme to achieve excess popula-tion risks of O ( d2 n2 ) and O( d 2 3 (n2) 13 ) for strongly convex and general convex loss functions, respec-tively, with high probability. Experiments sug-gest that our algorithms can effectively deal with the challenges caused by data irregularity.
|Original language||English (US)|
|Title of host publication||37th International Conference on Machine Learning, ICML 2020|
|Publisher||International Machine Learning Society (IMLS)|
|Number of pages||11|
|State||Published - Jan 1 2020|