Learning to Rank with Contextual Information

  • Peng Han

Student thesis: Doctoral Thesis

Abstract

Learning to rank is utilized in many scenarios, such as disease-gene association, information retrieval and recommender system. Improving the prediction accuracy of the ranking model is the main target of existing works. Contextual information has a significant influence in the ranking problem, and has been proved effective to increase the prediction performance of ranking models. Then we construct similarities for different types of entities that could utilize contextual information uniformly in an extensible way. Once we have the similarities constructed by contextual information, how to uti- lize them for different types of ranking models will be the task we should tackle. In this thesis, we propose four algorithms for learning to rank with contextual informa- tion. To refine the framework of matrix factorization, we propose an area under the ROC curve (AUC) loss to conquer the sparsity problem. Clustering and sampling methods are used to utilize the contextual information in the global perspective, and an objective function with the optimal solution is proposed to exploit the contex- tual information in the local perspective. Then, for the deep learning framework, we apply the graph convolutional network (GCN) on the ranking problem with the combination of matrix factorization. Contextual information is utilized to generate the input embeddings and graph kernels for the GCN. The third method in this thesis is proposed to directly exploit the contextual information for ranking. Laplacian loss is utilized to solve the ranking problem, which could optimize the ranking matrix directly. With this loss, entities with similar contextual information will have similar ranking results. Finally, we propose a two-step method to solve the ranking problem of the sequential data. The first step in this two-step method is to generate the em- beddings for all entities with a new sampling strategy. Graph neural network (GNN) and long short-term memory (LSTM) are combined to generate the representation of sequential data. Once we have the representation of the sequential data, we could solve the ranking problem of them with pair-wise loss and sampling strategy.
Date of AwardNov 15 2021
Original languageEnglish (US)
Awarding Institution
  • Computer, Electrical and Mathematical Science and Engineering
SupervisorXiangliang Zhang (Supervisor)

Keywords

  • learning to rank
  • matrix factorization
  • deep learning
  • Laplacian regularization

Cite this

'