Extra Material for the paper :

“Regret-based Online Ranking for a Growing Digital Library”

(pdf)

 

Abstract :

 

The most common environment in which ranking is used takes a very specific form. Users

sequentially generate queries in a digital library. For each query, ranking is applied to order

a set of relevant items from which the user selects his favorite. This is the case when

ranking search results for pages on the World Wide Web or for merchandize on an

e-commerce site. In this work, we present a new online ranking algorithm, called NoRegret

KLRank. Our algorithm is designed to use “clickthrough” information as it is provided by

the users to improve future ranking decisions. More importantly, we show that its long term

average performance will converge to the best rate achievable by any competing fixed

ranking policy selected with the benefit of hindsight. We show how to ensure that this

property continues to hold as new items are added to the set thus requiring a richer class of

ranking policies. Finally, our empirical results show that, while in some context NoRegret

KLRank might be considered conservative, a greedy variant of this algorithm actually

outperforms many popular ranking algorithms.

 

Original CORA database can be found in :  http://www.cs.umass.edu/~mccallum/data/cora-classify.tar.gz

(A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. Information Retrieval, 3(2):127–163, 2000.)

 

List of 22000 queries used in our paper : CoraOnlineRankingQueries_22000.tar.gz

 

Updated May 1st, 2009