This page reproduces the content of http://www.slideshare.net/shima__shima/future-directions-of-fairnessaware-data-mining-recommendation-causality-and-theoretical-aspects.


Uploaded more than a year ago (2015/07/09) in Technology


Future Directions of Fairness-Aware Data Mining: Recommendation, Causality, and Theoretical Aspects

Invited Talk @ Workshop on Fairness, Accountability, and Transparency in Machine Learning

In conjunction with the ICML 2015 @ Lille, France, Jul. 11, 2015

Web Site: http://www.kamishima.net/fadm/

Handnote: http://www.kamishima.net/archive/2015-ws-icml-HN.pdf

The goal of fairness-aware data mining (FADM) is to analyze data while taking into account potential issues of fairness. In this talk, we will cover three topics in FADM:

1. Fairness in a Recommendation Context: In classification tasks, the term "fairness" is regarded as anti-discrimination. We will present other types of problems related to the fairness in a recommendation context.

2. What is Fairness: Most formal definitions of fairness have a connection with the notion of statistical independence. We will explore other types of formal fairness based on causality, agreement, and unfairness.

3. Theoretical Problems of FADM: After reviewing technical and theoretical open problems in the FADM literature, we will introduce the theory of the generalization bound in terms of accuracy as well as fairness.

Joint work with Jun Sakuma, Shotaro Akaho, and Hideki Asoh

- Future directions of

Fairness-aware Data Mining

Recommendation, Causality, and Theoretical Aspects

Toshihiro Kamishima*1 and Kazuto Fukuchi*2

joint work with Shotaro Akaho*1, Hideki Asoh*1, and Jun Sakuma*2,3

*1National Institute of Advanced Industrial Science and Technology (AIST), Japan

*2University of Tsukuba, and *3JST CREST

Workshop on Fairness, Accountability, and Transparency in Machine Learning

In conjunction with the ICML 2015 @ Lille, France, Jul. 11, 2015

1 - Outline

Food for discussion about new directions of fairness in DM / ML

New Applications of Fairness-Aware Data Mining

Applications of FADM techniques, other than anti-discrimination,

especially in a recommendation context

New Directions of Fairness

Relations of existing formal fairness with causal inference and

information theory

Introducing an idea of a fair division problem and avoiding unfair

treatments

Generalization Bound in terms of Fairness

Theoretical aspects of fairness not on training data, but on test data

✤ We use the term “fairness-aware” instead of “discrimination-aware,” because the word “discrimination” means classification in an ML context, and this technique is applicable to tasks other than avoiding discriminative decisions

2 - PART Ⅰ

Applications of

Fairness-Aware Data Mining

3 - Fairness-Aware Data Mining

Fairness-aware Data mining (FADM)

data analysis taking into account potential issues of fairness

Two major tasks of FADM

[Romei+ 2014]

Unfairness Detection: Finding unfair treatments in database

Unfairness Prevention: Building a model to provide fair outcomes

Unfairness Prevention

S: sensitive feature, representing information that should not influence outcomes

Other factors: Y: target variable, X: non-sensitive feature

Learning a statistical model from potentially unfair data sets so

that the sensitive feature does not influence the model’s outcomes

4 - Anti-Discrimination

[Sweeney 13]

obtaining socially and legally anti-discriminative outcomes

Advertisements indicating arrest records were more frequently

displayed for names that are more popular among individuals of

African descent than those of European descent

[Figure: example advertisements shown for African descent names and European descent names, with the ad texts “Located:” and “Arrested?”]

sensitive feature = users’ socially sensitive demographic information

anti-discriminative outcomes

5 - Unbiased Information

[Pariser 2011, TED Talk by Eli Pariser, http://www.filterbubble.com, Kamishima+ 13]

avoiding biased information that doesn’t meet a user’s intention

Filter Bubble: a concern that personalization technologies narrow

and bias the topics of information provided to people

To fit for Pariser’s preference, conservative people are eliminated from

his friend recommendation list in a social networking service

sensitive feature = political conviction of a friend candidate

unbiased information in terms of candidates’ political conviction

6 - Fair Trading

[Kamishima+ 12, Kamishima+ 13]

equal treatment of content providers

Online retail store

The site owner directly sells items

The site is rented to tenants, and the tenants also sell items

In the recommendation on the retail store, the items sold by the site

owner are constantly ranked higher than those sold by tenants

Tenants will complain about this unfair treatment

sensitive feature = a content provider of a candidate item

site owner and its tenants are equally treated in recommendation

7 - Ignoring Uninteresting Information

[Gondek+ 04]

ignore information unwanted by a user

non-redundant clustering: find clusters that are as independent from a given uninteresting partition as possible

clustering facial images

A simple clustering method finds two

clusters: one contains only faces, and the

other contains faces with shoulders

A data analyst considers this clustering useless and uninteresting

By ignoring this uninteresting information,

more meaningful female- and male-like

clusters could be obtained

sensitive feature = uninteresting information

ignore the influence of uninteresting information

8 - Part Ⅰ: Summary

A brief introduction to FADM and an unfairness prevention task

Learning a statistical model from potentially unfair data sets so that

the sensitive feature does not influence the model’s outcomes

FADM techniques are widely applicable

There are many FADM applications other than anti-discrimination,

such as providing unbiased information, fair trading, and ignoring

uninteresting information

9 - PART Ⅱ

New Directions of Fairness

10 - PART Ⅱ: Outline

Discussion about formal definitions and treatments of fairness

in data mining and machine learning contexts

Related Topics of a Current Formal Fairness

connection between formal fairness and causal inference

interpretation in view of information theory

New Definitions of Formal Fairness

Why statistical independence can be used as fairness

Introducing an idea of a fair division problem

New Treatments of Formal Fairness

methods for avoiding unfair treatments instead of enhancing

fairness

11 - Causality

[Žliobaitė+ 11, Calders+ 13]

Unfairness Prevention task

optimization of accuracy under causality constraints

An example of university admission in [Žliobaitė+ 11]

sensitive feature: S

target variable: Y

gender

acceptance

male / female

accept / not accept

Fair determination: the gender does not influence the acceptance

statistical independence: $Y \perp\!\!\!\perp S$

12 - Information Theoretic Interpretation

Information theoretical view of a fairness condition

[Figure: Venn diagram of the entropies H(S) of the sensitive feature S and H(Y) of the target Y, partitioned into H(S | Y), I(S; Y), and H(Y | S)]

statistical independence between S and Y implies

zero mutual information: I(S; Y) = 0

the degree of influence of S on Y can be measured by I(S; Y)
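As a minimal sketch of this measure, the plug-in estimate of I(S; Y) can be computed from empirical counts. The function name `mutual_information` and both toy data sets are assumptions made for illustration and do not come from the talk.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(S; Y) in nats from a list of (s, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    s_marginal = Counter(s for s, _ in pairs)
    y_marginal = Counter(y for _, y in pairs)
    mi = 0.0
    for (s, y), count in joint.items():
        # p(s, y) * ln( p(s, y) / (p(s) p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (s_marginal[s] * y_marginal[y]))
    return mi

# Hypothetical toy data: the acceptance rate is the same for both genders
independent = [("m", 1), ("m", 0), ("f", 1), ("f", 0)] * 25
# Hypothetical toy data: acceptance is fully determined by gender
dependent = [("m", 1), ("f", 0)] * 50

print(mutual_information(independent))  # zero: S carries no information about Y
print(mutual_information(dependent))    # ln 2: S determines Y completely
```

A value of zero corresponds to the independence condition on the slide, and larger values indicate a stronger influence of S on Y.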

13 - Causality with Explainable Features

[Žliobaitė+ 11, Calders+ 13]

An example of fair determination

even if S and Y are not independent

sensitive feature: S

target variable: Y

gender

acceptance

male / female

accept / not accept

female → medicine=high

medicine → acceptance=low

male → computer=high

computer → acceptance=high

explainable feature: E

(confounding feature)

program

medicine / computer

Removing the pure influence of S on Y, excluding the effect of E

conditional statistical independence: $Y \perp\!\!\!\perp S \mid E$
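The admission example can be made concrete with a small sketch. All applicant counts below are hypothetical and chosen only to reproduce the pattern on the slide: women mostly apply to medicine, men mostly to computer science, and each program accepts at a fixed rate regardless of gender.

```python
# (gender, program) -> (applicants, accepted); every count here is hypothetical
counts = {
    ("female", "medicine"): (80, 16),
    ("female", "computer"): (20, 10),
    ("male", "medicine"): (20, 4),
    ("male", "computer"): (80, 40),
}

def acceptance_rate(gender, program=None):
    """Acceptance rate for a gender, optionally restricted to one program."""
    applicants = accepted = 0
    for (g, p), (n, a) in counts.items():
        if g == gender and (program is None or p == program):
            applicants += n
            accepted += a
    return accepted / applicants

# Marginally, Y depends on S: the overall rates differ by gender
print(acceptance_rate("female"), acceptance_rate("male"))

# Given E (the program), Y no longer depends on S: the rates coincide
for program in ("medicine", "computer"):
    print(program, acceptance_rate("female", program), acceptance_rate("male", program))
```

In this toy data the overall gap is explained entirely by the program choice, which is the situation the conditional independence criterion treats as fair.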

14 - Information Theoretic Interpretation

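[Figure: Venn diagram of the entropies H(S), H(Y), and H(E) of the sensitive feature S, the target Y, and the explainable feature E, partitioned into H(S | Y, E), H(Y | S, E), H(E | S, Y), I(S; Y | E), I(S; E | Y), I(Y; E | S), and I(S; Y; E)]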

the degree of conditional independence between Y and S given E

conditional mutual information: I(S; Y | E)

We can exploit additional information I(S; Y; E) to obtain outcomes
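A hedged sketch of the conditional measure: I(S; Y | E) can be estimated as the average of the per-stratum mutual information, weighted by the empirical probability of each value of E. The function names and the toy admission records are assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict

def mutual_information(pairs):
    """Plug-in estimate of I(S; Y) in nats from (s, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    s_marginal = Counter(s for s, _ in pairs)
    y_marginal = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log(c * n / (s_marginal[s] * y_marginal[y]))
        for (s, y), c in joint.items()
    )

def conditional_mutual_information(triples):
    """Plug-in estimate of I(S; Y | E) from (s, y, e) observations."""
    strata = defaultdict(list)
    for s, y, e in triples:
        strata[e].append((s, y))
    n = len(triples)
    # I(S; Y | E) = sum_e p(e) * I(S; Y | E = e)
    return sum(len(p) / n * mutual_information(p) for p in strata.values())

# Hypothetical admission records: (gender, accepted, program)
records = (
    [("f", 1, "med")] * 16 + [("f", 0, "med")] * 64
    + [("f", 1, "cs")] * 10 + [("f", 0, "cs")] * 10
    + [("m", 1, "med")] * 4 + [("m", 0, "med")] * 16
    + [("m", 1, "cs")] * 40 + [("m", 0, "cs")] * 40
)

marginal = mutual_information([(s, y) for s, y, _ in records])
conditional = conditional_mutual_information(records)
print(marginal, conditional)
```

For these records the marginal estimate is positive while the conditional estimate is zero, so the dependence between S and Y is fully accounted for by E.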

15 - Why are outcomes assumed to be fair?

Why are outcomes assumed to be fair

if a sensitive feature does not influence the outcomes?

All parties agree with the use of this criterion,

maybe because it is objective and reasonable

Is there any way to reach an agreement?

In this view, [Berendt+ 12]’s approach is

regarded as a way of making

agreements in a wisdom-of-crowds way.

The size and color of circles indicate the

size of samples and the risk of

discrimination, respectively

To further examine new directions, we introduce a fair division problem

16 - Fair Division

Alice and Bob want to divide this swiss-roll FAIRLY

20cm

Total length of this swiss-roll is 20cm

Alice and Bob get half each based on an agreed common measure

This approach is adopted in current FADM techniques

17 - Fair Division

Alice and Bob want to divide this swiss-roll FAIRLY

10cm

10cm

divide the swiss-roll into 10cm each

Alice and Bob get half each based on an agreed common measure

This approach is adopted in current FADM techniques

17 - Fair Division

Unfortunately, Alice and Bob don’t have a scale

Alice cuts the swiss-roll exactly in halves based on her own feeling

envy-free division: Alice and Bob each get an equal or larger piece

based on their own measure

18 - Fair Division

Unfortunately, Alice and Bob don’t have a scale

Bob

Bob picks the larger piece based on his own feeling

envy-free division: Alice and Bob each get an equal or larger piece

based on their own measure

18 - Fair Division

There are n parties

Every party i has its own measure $m_i(P_j)$ for each piece $P_j$

Fairness in a fair division context

Envy-Free Division: Every party gets a piece equal to or larger than every other party’s piece, based on its own measure

$m_i(P_i) \ge m_i(P_j), \ \forall i, j$

Proportional Division: Every party gets a piece equal to or larger than 1/n, based on its own measure; an envy-free division is also a proportional division

$m_i(P_i) \ge 1/n, \ \forall i$

Exact Division: Every party gets an equal-sized piece

$m_i(P_j) = 1/n, \ \forall i, j$
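The three conditions can be checked directly once the measures are tabulated. In this sketch, `m[i][j]` holds party i's own measure of piece j, party i is assumed to receive piece i, and each party's measures are assumed to sum to 1. The function names and the numbers are illustrative only.

```python
def is_envy_free(m):
    """Every party values its own piece at least as much as any other piece."""
    n = len(m)
    return all(m[i][i] >= m[i][j] for i in range(n) for j in range(n))

def is_proportional(m):
    """Every party values its own piece at 1/n or more."""
    n = len(m)
    return all(m[i][i] >= 1 / n for i in range(n))

def is_exact(m, tol=1e-9):
    """Every party values every piece at exactly 1/n (up to a tolerance)."""
    n = len(m)
    return all(abs(m[i][j] - 1 / n) <= tol for i in range(n) for j in range(n))

# Cut-and-choose with hypothetical measures: Alice (party 0) cuts the roll into
# two pieces she considers equal; Bob (party 1) picks the piece he values more.
cut_and_choose = [
    [0.5, 0.5],  # Alice's measures of piece 0 and piece 1
    [0.4, 0.6],  # Bob's measures of piece 0 and piece 1
]
print(is_envy_free(cut_and_choose), is_proportional(cut_and_choose), is_exact(cut_and_choose))

# A division where Alice envies Bob according to her own measure
envious = [
    [0.3, 0.7],
    [0.4, 0.6],
]
print(is_envy_free(envious), is_proportional(envious))
```

The cut-and-choose division is envy-free and proportional without being exact, which matches the slide's point that no shared scale is needed.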

19 - Envy-Free in a FADM Context

Current FADM techniques adopt a commonly agreed measure

Can we develop FADM techniques using an envy-free approach?

Such a technique would be applicable without agreement on a fairness criterion

FADM under envy-free fairness

Maximize the utility of analysis, such as prediction accuracy,

under the envy-free fairness constraints

A Naïve method for Classification

Among n candidates, k can be classified as positive

Among all $\binom{n}{k}$ classifications, enumerate those satisfying envy-free

conditions based on parties’ own utility measures

ex. Fair classifiers with different sets of explainable features

Pick the classification whose accuracy is maximum

Open Problem: Can we develop a more efficient algorithm?
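The naïve method above can be sketched as a brute-force search. The slides do not specify the parties' utility measures, so the envy-free condition is passed in as a callable; the toy condition used here (each group compares the number of positive labels it receives with the number the other group receives) is purely a stand-in, and all names and data are hypothetical.

```python
from itertools import combinations

def best_envy_free_classification(y_true, k, is_envy_free):
    """Enumerate all C(n, k) ways to label k candidates positive, keep those
    that pass the envy-free check, and return the most accurate one."""
    n = len(y_true)
    best, best_accuracy = None, -1.0
    for positives in combinations(range(n), k):
        chosen = set(positives)
        if not is_envy_free(chosen):
            continue
        correct = sum((i in chosen) == bool(y_true[i]) for i in range(n))
        accuracy = correct / n
        if accuracy > best_accuracy:
            best, best_accuracy = chosen, accuracy
    return best, best_accuracy

# Hypothetical toy data: six candidates in two groups
group = ["a", "a", "a", "b", "b", "b"]
y_true = [1, 1, 0, 0, 0, 0]

def toy_envy_free(chosen):
    """Stand-in condition: neither group receives fewer positives than the other."""
    a_count = sum(1 for i in chosen if group[i] == "a")
    b_count = sum(1 for i in chosen if group[i] == "b")
    return a_count >= b_count and b_count >= a_count

selection, accuracy = best_envy_free_classification(y_true, k=2, is_envy_free=toy_envy_free)
print(selection, accuracy)
```

The search visits every one of the C(n, k) candidate classifications, so it is only practical for very small n, which is why the slide lists a more efficient algorithm as an open problem.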

20 - Fairness Guardian

Current unfairness prevention methods are designed so as to be fair

Example: Logistic Regression + Prejudice Remover

[Kamishima+ 12]

The objective function is composed of

classification loss and fairness constraint terms

$-\sum_{D} \ln \Pr[Y \mid X, S; \Theta] + \frac{\lambda}{2} \|\Theta\|_2^2 + \eta \, I(Y; S)$

Fairness Guardian Approach

Unfairness is prevented by enhancing fairness of outcomes
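A rough sketch of how such an objective could be evaluated for a logistic regression model follows. It makes several simplifying assumptions that are not taken from the slides: a single weight vector shared across values of S (with S and a bias appended as inputs), a binary target, and a plug-in estimate of I(Y; S) computed from the model's predicted probabilities and normalized by the sample size. The names `objective`, `lam`, and `eta` are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def objective(theta, X, S, Y, lam=1.0, eta=1.0):
    """Negative log-likelihood + (lam / 2) * ||theta||^2 + eta * I(Y; S)."""
    n = len(Y)
    # Predicted Pr[Y = 1 | x, s]; inputs are the features, then s, then a bias
    probs = []
    for x, s in zip(X, S):
        z = sum(t * v for t, v in zip(theta, list(x) + [s, 1.0]))
        probs.append(sigmoid(z))
    nll = -sum(math.log(p if y == 1 else 1.0 - p) for p, y in zip(probs, Y))
    l2 = 0.5 * lam * sum(t * t for t in theta)
    # Plug-in mutual information between the model's outcome and S (in nats)
    p_y1 = sum(probs) / n
    mi = 0.0
    for s_value in set(S):
        group = [p for p, s in zip(probs, S) if s == s_value]
        p_y1_given_s = sum(group) / len(group)
        for cond, marg in ((p_y1_given_s, p_y1), (1.0 - p_y1_given_s, 1.0 - p_y1)):
            if cond > 0.0:
                mi += len(group) / n * cond * math.log(cond / marg)
    return nll + l2 + eta * mi

# Hypothetical toy data: one non-sensitive feature, binary S and Y
X = [[0.2], [0.4], [0.6], [0.8]]
S = [0, 0, 1, 1]
Y = [0, 1, 0, 1]

print(objective([0.0, 0.0, 0.0], X, S, Y))               # model ignores all inputs
print(objective([0.0, 2.0, 0.0], X, S, Y, eta=0.0))      # model relies on S, no penalty
print(objective([0.0, 2.0, 0.0], X, S, Y, eta=1.0))      # same model, penalized
```

With `eta` set to zero the expression reduces to ordinary L2-regularized logistic regression; increasing `eta` penalizes models whose predictions vary with S.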

21 - Fair Is Not Unfair?

A reverse treatment of fairness:

not to be unfair

One possible formulation of an unfair classifier

Outcomes are determined ONLY by a sensitive feature

$\Pr[Y \mid S; \Lambda]$

Ex. Your paper is rejected, just because you are not handsome

Penalty term to maximize the KL divergence between

a pre-trained unfair classifier and a target classifier

$D_{\mathrm{KL}}\big[\Pr[Y \mid S; \Lambda] \,\big\|\, \Pr[Y \mid X, S; \Theta]\big]$
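One way to read this penalty, sketched under assumptions: for a binary target, the divergence between the pre-trained unfair classifier and the target classifier is averaged over samples and negated, so that minimizing the penalty pushes the target classifier away from the unfair one. The function names, the averaging, and the sign convention are illustrative choices, not details given in the talk.

```python
import math

def bernoulli_kl(p, q):
    """D_KL[Bernoulli(p) || Bernoulli(q)] in nats; requires 0 < q < 1."""
    kl = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            kl += a * math.log(a / b)
    return kl

def unfairness_hater_penalty(unfair_probs, target_probs):
    """Negative mean KL divergence: minimizing it maximizes the divergence.

    unfair_probs[i] is Pr[Y = 1 | s_i] from the pre-trained unfair classifier,
    target_probs[i] is Pr[Y = 1 | x_i, s_i] from the classifier being trained.
    """
    n = len(unfair_probs)
    total = sum(bernoulli_kl(p, q) for p, q in zip(unfair_probs, target_probs))
    return -total / n

# Hypothetical predictions for two samples with different sensitive values
unfair = [0.9, 0.1]
print(unfairness_hater_penalty(unfair, [0.9, 0.1]))  # target mimics the unfair model
print(unfairness_hater_penalty(unfair, [0.5, 0.5]))  # target ignores S: lower penalty
```

A target classifier that reproduces the unfair predictions receives the largest (zero) penalty, while one that departs from them is rewarded with a negative value.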

22 - Unfairness Hater

Unfairness Hater Approach

Unfairness is prevented by avoiding unfair outcomes

This approach was almost useless for obtaining fair outcomes, but…

Better Optimization

The fairness-enhanced objective function tends to be non-convex;

thus, adding an unfairness hater may help avoid local minima

Avoiding Unfair Situation

There would be unfair situations that should be avoided;

Ex. Photos of humans were mistakenly labeled as gorillas by auto-tagging [Barr 2015]

There may be many choices between being fair and not being unfair

that should be examined

23 - Part Ⅱ: Summary

Relation of fairness with causal inference and information theory

We review a current formal definition of fairness by relating it with

Rubin’s causal inference; and, its interpretation based on information

theory

New Directions of formal fairness without agreements

We showed the possibility of formal fairness that does not presume a

common criterion agreed between concerned parties

New Directions of treatment of fairness by avoiding unfairness

We discussed FADM techniques for avoiding unfairness, instead

of enhancing fairness.

24 - PART Ⅲ

Generalization Bound

in terms of Fairness

25 - Part Ⅲ: Introduction

There are many technical problems to solve in the FADM literature,

because tools for excluding specific information have not been

actively developed.

Types of Sensitive Features

Non-binary sensitive feature

Analysis Techniques

Analysis methods other than classification or regression

Optimization

Constraint terms make objective functions non-convex

Fairness measure

Interpretable to humans and having convenient properties

Learning Theory

Generalization ability in terms of fairness

26 - Kazuto Fukuchi’s Talk

27 - Conclusion

Applications of Fairness-Aware Data Mining

Applications other than anti-discrimination: providing unbiased

information, fair trading, and excluding unwanted information

New Directions of Fairness

Relation of fairness with causal inference and information theory

Formal fairness introducing an idea of a fair division problem

Avoiding unfair treatment, instead of enhancing fairness

Generalization bound in terms of fairness

Generalization bound in terms of fairness based on f-divergence

Additional Information and codes

http://www.kamishima.net/fadm

Acknowledgments: This work is supported by MEXT/JSPS KAKENHI Grant Numbers 24500194, 24680015,

25540094, and 15K00327

28 - Bibliography I

A. Barr.

Google mistakenly tags black people as ‘gorillas,’ showing limits of algorithms.

The Wall Street Journal, 2015.

⟨http://on.wsj.com/1CaCNlb⟩.

B. Berendt and S. Preibusch.

Exploring discrimination: A user-centric evaluation of discrimination-aware data mining.

In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining,

pages 344–351, 2012.

T. Calders, A. Karim, F. Kamiran, W. Ali, and X. Zhang.

Controlling attribute effect in linear regression.

In Proc. of the 13th IEEE Int’l Conf. on Data Mining, pages 71–80, 2013.

T. Calders and S. Verwer.

Three naive Bayes approaches for discrimination-free classification.

Data Mining and Knowledge Discovery, 21:277–292, 2010.

C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel.

Fairness through awareness.

In Proc. of the 3rd Innovations in Theoretical Computer Science Conf., pages 214–226,

2012.

1 / 4 - Bibliography II

K. Fukuchi and J. Sakuma.

Fairness-aware learning with restriction of universal dependency using f-divergences.

arXiv:1104.3913 [cs.CC], 2015.

D. Gondek and T. Hofmann.

Non-redundant data clustering.

In Proc. of the 4th IEEE Int’l Conf. on Data Mining, pages 75–82, 2004.

T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.

Considerations on fairness-aware data mining.

In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining,

pages 378–385, 2012.

T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.

Enhancement of the neutrality in recommendation.

In Proc. of the 2nd Workshop on Human Decision Making in Recommender Systems,

pages 8–14, 2012.

T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.

Fairness-aware classifier with prejudice remover regularizer.

In Proc. of the ECML PKDD 2012, Part II, pages 35–50, 2012.

[LNCS 7524].

2 / 4 - Bibliography III

T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.

Efficiency improvement of neutrality-enhanced recommendation.

In Proc. of the 3rd Workshop on Human Decision Making in Recommender Systems,

pages 1–8, 2013.

E. Pariser.

The filter bubble.

⟨http://www.thefilterbubble.com/⟩.

E. Pariser.

The Filter Bubble: What The Internet Is Hiding From You.

Viking, 2011.

A. Romei and S. Ruggieri.

A multidisciplinary survey on discrimination analysis.

The Knowledge Engineering Review, 29(5):582–638, 2014.

L. Sweeney.

Discrimination in online ad delivery.

Communications of the ACM, 56(5):44–54, 2013.

I. Žliobaitė, F. Kamiran, and T. Calders.

Handling conditional discrimination.

In Proc. of the 11th IEEE Int’l Conf. on Data Mining, 2011.

3 / 4 - Bibliography IV

R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork.

Learning fair representations.

In Proc. of the 30th Int’l Conf. on Machine Learning, 2013.
