
2019, KDD, Fairness-Aware Ranking in Search & RecSys with application to LinkedIn Talent Search #26

Open
Rounique opened this issue Jun 10, 2022 · 5 comments
Labels: literature-review (Summary of the paper related to the work)

@Rounique
Contributor

No description provided.

@Rounique Rounique self-assigned this Jun 10, 2022
@Rounique
Contributor Author

Rounique commented Jun 10, 2022

This paper presents a framework for detecting and mitigating algorithmic bias in ranking and recommendation systems.
First, the authors define measures to quantify bias with respect to protected attributes such as gender and age. They then present algorithms for computing a fairness-aware re-ranking of results: for a given query, the algorithms try to achieve a desired distribution of the top-ranked results with respect to one or more protected attributes. Finally, they report online A/B testing results from applying the framework to LinkedIn Talent Search, where it can help recruiters in the hiring process.

In our work we have a predicted list of authors (T'); first, we show that bias already exists in it, and then, by re-ranking the list, we will reduce or mitigate that bias.

In the LinkedIn Talent Search work, the fairness requirements are given as a desired distribution over protected attributes (gender, age, or both), and the proposed algorithms re-rank the results to satisfy the fairness constraints.
Since our goal is to detect and mitigate popularity bias, our protected attribute, for now, is popular vs. non-popular.

Measures for Bias Evaluation
The first measure used for evaluating bias in recommendations, Skew@k, computes the extent to which the set of top-k ranked results for a search or recommendation task differs, for a given attribute value, from the desired proportion of that attribute value.
(still writing...)

[Image: Skew@k formula — Skew_{a_i}@k = log_e( (proportion of the top-k results with attribute value a_i) / (desired proportion of a_i) )]

[Image: notation table]

Skew basically calculates the log of the ratio between the existing proportion of candidates with attribute value a_i in the top-k results and the desired proportion of a_i. The closer this ratio is to 1 (equivalently, the closer its log is to zero), the less unfairness exists in the distribution.
Negative values indicate under-representation and positive values over-representation of that specific attribute value.
For example, consider the gender attribute with a candidate pool of 50K females and 40K males, so the desired female proportion is 50K/90K ≈ 0.56. If our top-100 ranked list contains 45 female candidates, then
Skew_female@100 = log_e((45/100)/(50K/90K)) = log_e(0.81) ≈ -0.21, which shows that female candidates are represented at only 81% of the desired rate.
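The example above can be sketched in Python (a minimal illustration, not the paper's implementation; the function name `skew_at_k` and the label encoding are my own):

```python
import math

def skew_at_k(ranked, value, desired_prop, k):
    """Skew@k: log of (observed proportion of `value` in the top-k)
    over (desired proportion). Negative => under-represented.
    Note: undefined when `value` is absent from the top-k (log of 0)."""
    observed = sum(1 for a in ranked[:k] if a == value) / k
    return math.log(observed / desired_prop)

# The comment's example: 45 of the top-100 are female, desired 50K/90K.
ranked = ["F"] * 45 + ["M"] * 55
print(round(skew_at_k(ranked, "F", 50_000 / 90_000, 100), 2))  # -> -0.21
```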
Skew is easy to interpret, but it has two disadvantages:

  • It is only defined for one attribute value at a time.
  • It depends on k, so it has to be calculated for multiple values of k to understand the full extent of the bias.

To solve the first problem, two more measures are provided:

  1. MinSkew@k
  2. MaxSkew@k

Since Skew is only calculated for a single attribute value, these new measures compute the minimum/maximum of Skew@k over all the attribute values. MinSkew@k captures what the paper calls the worst disadvantage given to a class, and MaxSkew@k the largest unfair advantage; a large absolute value of either indicates unfairness.
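MinSkew@k and MaxSkew@k can be sketched on top of the same Skew computation (again a minimal illustration; the hypothetical `desired` dict maps each attribute value to its desired proportion):

```python
import math

def skew_at_k(ranked, value, desired, k):
    # Observed proportion of `value` in the top-k vs. its desired proportion.
    observed = sum(1 for a in ranked[:k] if a == value) / k
    return math.log(observed / desired[value])  # undefined if observed == 0

def min_max_skew_at_k(ranked, desired, k):
    # MinSkew@k: worst disadvantage; MaxSkew@k: largest unfair advantage.
    skews = [skew_at_k(ranked, v, desired, k) for v in desired]
    return min(skews), max(skews)

# Same example: desired proportions 5/9 female, 4/9 male.
ranked = ["F"] * 45 + ["M"] * 55
lo, hi = min_max_skew_at_k(ranked, {"F": 5 / 9, "M": 4 / 9}, 100)
print(round(lo, 2), round(hi, 2))  # -> -0.21 0.21
```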

To solve the second problem, another ranking measure, NDKL, is presented.
This formula is based on the Kullback-Leibler divergence: at each rank position it compares the attribute distribution of the results so far against the desired distribution, with a position-based discount. The ideal value is 0, which means the distribution we have exactly matches what we desire.

[Image: NDKL formula — a normalized, log-discounted sum over rank positions i of the KL divergence between the attribute distribution of the top-i results and the desired distribution]

NDKL also has two disadvantages:

  1. As discussed, it is not computed for a specific attribute value, so when it shows a difference from the desired distribution, we cannot tell which attribute value has been under- or over-represented in the top-ranked list.
  2. Compared to Skew, it is harder to interpret.
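The NDKL idea above can be sketched as follows (an illustrative reading, not the paper's code: it takes the log-discounted average of the KL divergence between each top-i attribute distribution and the desired one, skipping zero-probability terms in the KL sum):

```python
import math

def kl_divergence(p, q):
    """KL divergence between two discrete distributions over the same keys.
    Terms with p[v] == 0 contribute zero and are skipped."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p if p[v] > 0)

def ndkl(ranked, desired):
    """Discounted KL divergence of each top-i prefix vs. the desired
    distribution, normalized by the total discount mass Z."""
    n = len(ranked)
    z = sum(1 / math.log2(i + 1) for i in range(1, n + 1))
    counts = {v: 0 for v in desired}
    total = 0.0
    for i, a in enumerate(ranked, start=1):
        counts[a] += 1
        top_i = {v: counts[v] / i for v in desired}
        total += kl_divergence(top_i, desired) / math.log2(i + 1)
    return total / z

# A balanced prefix pattern scores lower (fairer) than an all-F list.
desired = {"F": 0.5, "M": 0.5}
print(ndkl(["F", "M", "F", "M"], desired) < ndkl(["F"] * 4, desired))
```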

@hosseinfani
Member

@Rounique
please rewrite based on your understanding! Please resist the temptation to copy-paste.

@hosseinfani hosseinfani added the literature-review Summary of the paper related to the work label Jun 10, 2022
@Rounique
Contributor Author

@hosseinfani
I wrote it myself! Just a few lines are from the paper.

@Rounique
Contributor Author

@hosseinfani
Could you please review it again now?

@hosseinfani
Member

> @hosseinfani Could you please review it again now?

good job!
