To achieve good generalization in supervised learning, the training and testing examples are usually required to be drawn from the same source distribution. In this paper we propose a method to relax this requirement in the context of logistic regression. Assuming Dp and Da are two sets of examples drawn from two mismatched distributions, where Da is fully labeled and Dp is partially labeled, our objective is to complete the labels of Dp. We introduce an auxiliary variable μ for each example in Da to reflect its mismatch with Dp. Under an appropriate constraint, the μ's are estimated as a byproduct, along with the classifier. We also present an active learning approach for selecting the labeled examples in Dp. The proposed algorithm, called "Migratory-Logit" or M-Logit, is demonstrated successfully on simulated as well as real data sets.
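The abstract does not spell out the objective, so the following is only a minimal sketch of the general idea, not the paper's formulation. It assumes labels in {-1, +1}, adds a per-example offset μ to the logit of each example in Da, and replaces the unspecified constraint with a simple nonnegativity condition plus a linear penalty on y·μ. The function name `fit_offset_logit` and the parameters `lam`, `lr`, and `n_iter` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_offset_logit(Xp, yp, Xa, ya, lam=0.5, lr=0.1, n_iter=500):
    """Logistic regression with one offset per auxiliary example.

    Xp, yp: labeled examples from Dp; Xa, ya: examples from Da.
    Labels are assumed to be in {-1, +1}.
    ASSUMPTION: the penalty lam * sum(s) with s >= 0 stands in for
    the constraint mentioned in the abstract, which is not given there.
    """
    w = np.zeros(Xp.shape[1])
    s = np.zeros(Xa.shape[0])        # s_i = y_i * mu_i, kept nonnegative
    n = len(yp) + len(ya)
    for _ in range(n_iter):
        zp = yp * (Xp @ w)           # margins of the Dp examples
        za = ya * (Xa @ w) + s       # margins of the Da examples, shifted
        gp = -sigmoid(-zp)           # derivative of log(1 + exp(-z))
        ga = -sigmoid(-za)
        grad_w = (Xp.T @ (gp * yp) + Xa.T @ (ga * ya)) / n
        grad_s = ga + lam
        w -= lr * grad_w
        s = np.maximum(0.0, s - lr * grad_s)   # projected gradient step
    return w, ya * s                 # mu_i = y_i * s_i because y_i**2 == 1
```

In this sketch, an example in Da whose label disagrees with the classifier fitted to both sets tends to receive a nonzero offset, which reduces its influence on `w`; examples that agree keep an offset of zero.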
|Original language||English (US)|
|Title of host publication||ICML 2005 - Proceedings of the 22nd International Conference on Machine Learning|
|Publisher||Association for Computing Machinery|
|Number of pages||8|
|State||Published - Jan 1 2005|