Friday, June 8, 2012

[Paper Summary Lec_13] Learning to rank: from pairwise approach to listwise approach

Like the previous paper, this work addresses the ranking problem, but it refines the pairwise formulation into a listwise formulation.

This work points out four problems with the previous (pairwise) approach:

First, the objective of learning is formalized as minimizing errors in the classification of document pairs, rather than minimizing errors in the ranking of documents. Second, the training process is computationally costly, because the number of document pairs can be very large. Third, the assumption that the document pairs are generated i.i.d. is too strong. Fourth, the number of generated document pairs varies largely from query to query, which results in a model biased toward queries with more document pairs.

The listwise training procedure can be summarized by the following picture,



where the q(i) are queries, d(i) is the list of documents associated with query q(i), and the judgment y(i)j represents the relevance degree of document d(i)j to q(i).
The x(i) are the corresponding feature vectors and the z(i) are the lists of scores produced by the ranking function; the objective of learning is formalized as minimizing the total loss with respect to the training data.


To define the loss function, they first convert a ranking into probability scores
by calculating the permutation probabilities of the score lists.
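As a minimal sketch of this step (assuming, as in the paper, a Plackett-Luce-style model with an increasing function φ applied to the scores), the probability of a permutation can be computed like this:

```python
import itertools
import math

def permutation_probability(scores, perm, phi=math.exp):
    """Probability of a permutation `perm` of the items, given their scores:
    the product over positions j of phi(s_perm[j]) / sum_{k >= j} phi(s_perm[k])."""
    prob = 1.0
    for j in range(len(perm)):
        numer = phi(scores[perm[j]])
        denom = sum(phi(scores[perm[k]]) for k in range(j, len(perm)))
        prob *= numer / denom
    return prob

# the probabilities over all n! permutations form a distribution (sum to 1)
scores = [3.0, 1.0, 2.0]
total = sum(permutation_probability(scores, p)
            for p in itertools.permutations(range(len(scores))))
```

Since this sums over all n! permutations, it is only feasible for short lists, which motivates the top-one simplification below.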

But to simplify the computation, they use the top-one probability instead.
With top-one probabilities, given two lists of scores, any metric between probability distributions can serve as the distance (the listwise loss function) between the two score lists.


For simplicity again, they express the top-one probability in exponential form,
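With the exponential form, the top-one probability of item j is exp(s_j) / Σ_k exp(s_k), i.e., a softmax over the scores, and the paper's listwise loss is the cross entropy between the two top-one distributions. A minimal sketch:

```python
import math

def top_one_probs(scores):
    """Top-one probability in exponential form: a softmax over the scores."""
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def listwise_loss(true_scores, pred_scores):
    """Cross entropy between the top-one distributions of the two score lists."""
    p = top_one_probs(true_scores)
    q = top_one_probs(pred_scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

The loss is smallest when the predicted scores induce the same top-one distribution as the ground-truth scores.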

And finally they use a neural network as the scoring model and a gradient method to minimize the loss;
the algorithm is summarized as follows:
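The training loop can be sketched as follows. This is a simplified version using a linear scoring function f(x) = w·x in place of the paper's neural network, trained by plain gradient descent on the cross entropy of top-one probabilities; the learning rate and epoch count are illustrative, not from the paper:

```python
import math

def softmax(s):
    # numerically stable softmax (top-one probabilities in exponential form)
    m = max(s)
    exps = [math.exp(v - m) for v in s]
    z = sum(exps)
    return [e / z for e in exps]

def train_listnet(queries, n_features, lr=0.01, epochs=100):
    """queries: list of (feature_vectors, relevance_scores) pairs, one per query.
    Learns w such that f(x) = w.x orders documents like the relevance scores."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for xs, ys in queries:
            scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in xs]
            p = softmax(ys)       # target top-one distribution
            q = softmax(scores)   # model's top-one distribution
            # gradient of the cross entropy wrt w is sum_j (q_j - p_j) * x_j
            for j, x in enumerate(xs):
                for k in range(n_features):
                    w[k] -= lr * (q[j] - p[j]) * x[k]
    return w

# toy query with one feature: a larger feature value means higher relevance
queries = [([[1.0], [0.0]], [2.0, 0.0])]
w = train_listnet(queries, n_features=1)
```

After training, w should assign a higher score to the more relevant document.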


[Paper Summary Lec_12] Support vector learning for ordinal regression

"Support vector learning for ordinal regression," R. Herbrich, ICANN, 1999.

This work aims to reduce the problem of ordinal regression (ranking) to a classification problem.
In ordinal regression, the loss function is defined by the ordering of the data in the original space and in the mapped space, while in a classification problem it is simply 0 or 1.

The goal is to find a mapping function that maps each data point to its rank, and the loss function is defined as follows:

given samples 


And the loss is formulated as follows:

The idea of transforming the ranking problem into a classification problem can be shown as follows:

S is the original data space, and S' is the transformed data that is fed into the SVM for the classification learning process.


To derive the linear mapping function,
and to solve for w,

where z_i is the label used for training.


And the result will be: