This page reproduces the content of http://www.slideshare.net/rikija/predicting-47969285.

Uploaded on 2015/05/10 in the Politics & Economy category.

Talk material for the following paper

http://jmlr.org/proceedings/papers/v38/takahashi15.html

Predicting Preference Reversals via Gaussian Process Uncertainty Aversion

Rikiya Takahashi (1), Tetsuro Morimura (2)

(1) SmartNews, Inc., rikiya.takahashi@smartnews.com

(2) IBM Research - Tokyo, tetsuro@jp.ibm.com

May 10, 2015

AISTATS 2015

Discrete Choice Modelling

Goal: predict the probability of choosing an option from a choice set.

Why solve this problem?

For business: brand positioning among competitors

For business: sales promotion (though open to some abuse)

To deeply understand how humans make decisions

Random Utility Theory

Each human is a maximizer of random utility.

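$$ i\text{'s choice from } S_i = \operatorname*{arg\,max}_{j \in S_i} \Big[ \underbrace{f_i(v_j)}_{\text{mean utility}} + \underbrace{\varepsilon_{ij}}_{\text{random noise}} \Big] $$

$S_i$: choice set for $i$, $v_j$: vector of $j$'s attributes, $f_i$: $i$'s mean utility function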

Assuming independence among every option’s attractiveness

For both mean and noise: (e.g., logit (McFadden, 1980))

For only mean: (e.g., nested logit (Williams, 1977))
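
As a minimal sketch of the logit case: when the noise terms are i.i.d. standard Gumbel, the choice probabilities reduce to a softmax over the mean utilities. The function names and the example utility values below are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def logit_choice_probs(mean_utilities):
    """Closed-form choice probabilities under i.i.d. standard Gumbel noise."""
    u = np.asarray(mean_utilities, dtype=float)
    z = np.exp(u - u.max())  # subtract the max for numerical stability
    return z / z.sum()

def simulate_choice_shares(mean_utilities, n_draws=200_000, seed=0):
    """Monte Carlo check: draw Gumbel noise and count arg-max choices."""
    rng = np.random.default_rng(seed)
    u = np.asarray(mean_utilities, dtype=float)
    noise = rng.gumbel(size=(n_draws, u.size))
    choices = np.argmax(u + noise, axis=1)
    return np.bincount(choices, minlength=u.size) / n_draws

# Hypothetical mean utilities of three options.
probs = logit_choice_probs([1.0, 0.5, 0.0])
shares = simulate_choice_shares([1.0, 0.5, 0.0])
```

Under this model the ratio of the probabilities of any two options depends only on their own mean utilities, whatever else is in the choice set. That is the independence property the context effects in this talk violate.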

Why Has Random Utility Theory Been Used?

Voices from friends (machine learners & econometricians)

1. Rationality of the independence assumption

Attributes of unchosen options are irrelevant to the chosen option's benefit.

"I bought a diamond. This is the best. It's ridiculous to think that other dirty stones affected my final choice."

2. Computational practicality

Without scoring each option, how can one decide which is best?

Formalizing the data likelihood is straightforward.

Complexity of Real Human’s Choice

An example of choosing a PC (Kivetz et al., 2004)

Each subject chooses 1 option from a choice set.

Option       A     B     C     D     E
CPU [MHz]    250   300   350   400   450
Mem. [MB]    192   160   128   96    64

Choice set   #subjects
{A, B, C}    36:176:144
{B, C, D}    56:177:115
{C, D, E}    94:181:109

Can random utility theory still explain the preference reversals? B ≻ C or C ≻ B?
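
The reversal can be read directly off the counts. The sketch below computes the share of B among subjects who chose either B or C, in each choice set that contains both. The helper name `pairwise_share` is an illustrative assumption; the counts are those of the PC example.

```python
# Choice counts from the PC example (Kivetz et al., 2004), in option order.
pc_counts = {
    ("A", "B", "C"): (36, 176, 144),
    ("B", "C", "D"): (56, 177, 115),
    ("C", "D", "E"): (94, 181, 109),
}

def pairwise_share(counts_by_set, first, second):
    """Share of `first` among subjects choosing `first` or `second`,
    for every choice set that contains both options."""
    shares = {}
    for options, counts in counts_by_set.items():
        if first in options and second in options:
            n_first = counts[options.index(first)]
            n_second = counts[options.index(second)]
            shares[options] = n_first / (n_first + n_second)
    return shares

b_vs_c = pairwise_share(pc_counts, "B", "C")
```

B takes 0.55 of the B-or-C choices in {A, B, C} but only about 0.24 in {B, C, D}. A logit model with independent options would predict the same share in both sets.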

Agenda

1. Introduction of the Goal and Issues

2. Irrational Context Effects

    Similarity Effect

    Attraction Effect

    Compromise Effect

    Prior Work

3. Proposing a Bayesian Model of Mental Conflict

4. Numerical Studies

5. Conclusion

Similarity Effect (Tversky, 1972)

Top-share choice can change due to correlated utilities.

E.g., one color from {Blue, Red} or {Violet, Blue, Red}?

Attraction Effect (Huber et al., 1982)

Introducing an absolutely inferior option A− (= decoy) causes an irregular increase in option A's attractiveness.

This contradicts the natural guess that a decoy never affects the choice:

If D ≻ A, then D ≻ A ≻ A−.

If A ≻ D, then A is superior to both A− and D.

Compromise Effect (Simonson, 1989)

Moderate options within each choice set are preferred.

This is different from a non-linear utility function involving diminishing returns (e.g., $\sqrt{\text{inexpensiveness}} + \sqrt{\text{quality}}$).

Positioning of Our Work in Literature

Sim.: similarity, Attr.: attraction, Com.: compromise

Model     Sim.  Attr.  Com.  Mechanism                      Predict. for Test Set  Likelihood Maximization
SPM       OK    NG     NG    correlation                    OK                     MCMC
MDFT      OK    OK     OK    dominance & indifference       OK                     MCMC
PD        OK    OK     OK    nonlinear pairwise comparison  OK                     MCMC
MMLM      OK    NG     OK    none                           OK                     Non-convex
NLM       OK    NG     NG    hierarchy                      NG                     Non-convex
BSY       OK    OK     OK    Bayesian                       OK                     MCMC
LCA       OK    OK     OK    loss aversion                  OK                     MCMC
MLBA      OK    OK     OK    nonlinear accumulation         OK                     Non-convex
Proposed  OK    NG     OK    Bayesian                       OK                     Convex

MDFT: Multialternative Decision Field Theory (Roe et al., 2001)

PD: Proportional Difference Model (González-Vallejo, 2002)

MMLM: Mixed Multinomial Logit Model (McFadden and Train, 2000)

SPM: Structured Probit Model (Yai, 1997; Dotson et al., 2009)

NLM: Nested Logit Models (Williams, 1977; Wen and Koppelman, 2001)

BSY: Bayesian Model of Shenoy and Yu (2013)

LCA: Leaky Competing Accumulator Model (Usher and McClelland, 2004)

MLBA: Multiattribute Linear Ballistic Accumulator Model (Trueblood, 2014)

Agenda

1. Introduction of the Goal and Issues

2. Irrational Context Effects

3. Proposing a Bayesian Model of Mental Conflict

    Utility Estimation as Dual Personality

    Irrationality by Bayesian Shrinkage

    Convex Optimization when using Posterior Mean

4. Numerical Studies

5. Conclusion

Utility Estimation as Dual Personality

How about regarding utilities as samples in statistics?

Assumption 1: Utility function is partially disclosed to DMS.

1. UC computes the sample value of every option's utility, and sends only these samples to DMS.

2. DMS statistically estimates the utility function.

Mental Conflict as Bayesian Shrinkage

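Assumption 2: DMS does Bayesian shrinkage estimation.

$i \in \{1, \dots, n\}$: context, $y_i \in \{1, \dots, m[i]\}$: final choice

$X_i \triangleq (x_{i1} \in \mathbb{R}^{d_X}, \dots, x_{im[i]})$: features of $m[i]$ options

Objective Data: values of random utilities

$$ v_i \triangleq (v_{i1}, \dots, v_{im[i]}) \sim \mathcal{N}(\mu_i, \sigma^2 I_{m[i]}), \qquad v_{ij} = b + w_\phi^\top \phi(x_{ij}) $$

$\mu_i \in \mathbb{R}^{m[i]}$: vec. of the true mean utility, $\sigma^2$: noise level

$b$: bias term, $\phi: \mathbb{R}^{d_X} \to \mathbb{R}^{d_\phi}$: mapping function, $w_\phi$: vec. of coefficients

Subjective Prior: choice-set-dependent Gaussian process

$$ \mu_i \sim \mathcal{N}(0_{m[i]}, \sigma^2 K(X_i)) \quad \text{s.t.} \quad K(X_i) = (K(x_{ij}, x_{ij'})) \in \mathbb{R}^{m[i] \times m[i]} $$

$\mu_i \in \mathbb{R}^{m[i]}$: vec. of random utilities, $K(\cdot, \cdot)$: similarity between options

Final choice: based on (posterior mean $u_i^*$ + i.i.d. noise) as

$$ u_i^* = K(X_i) \big( I_{m[i]} + K(X_i) \big)^{-1} \big( b 1_{m[i]} + \Phi_i w_\phi \big), $$

$$ y_i = \operatorname*{arg\,max}_j (u_{ij}^* + \varepsilon_{ij}) \quad \text{where } \forall j \;\; \varepsilon_{ij} \sim \text{Gumbel}. $$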
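
The posterior-mean formula follows from standard Gaussian conjugacy. A short derivation is sketched below, assuming that $\Phi_i$ denotes the matrix whose $j$-th row is $\phi(x_{ij})^\top$ (the slide does not define it explicitly), so that $v_i = b 1_{m[i]} + \Phi_i w_\phi$:

```latex
\begin{aligned}
p(\mu_i \mid v_i) &\propto \mathcal{N}(v_i \mid \mu_i, \sigma^2 I_{m[i]}) \,
                           \mathcal{N}(\mu_i \mid 0_{m[i]}, \sigma^2 K(X_i)), \\
u_i^* = \mathbb{E}[\mu_i \mid v_i]
  &= \sigma^2 K(X_i) \big( \sigma^2 K(X_i) + \sigma^2 I_{m[i]} \big)^{-1} v_i \\
  &= K(X_i) \big( I_{m[i]} + K(X_i) \big)^{-1} \big( b 1_{m[i]} + \Phi_i w_\phi \big).
\end{aligned}
```

The noise level $\sigma^2$ cancels because it scales both the likelihood covariance and the prior covariance.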

Irrationality by Bayesian Shrinkage

Implication of (1): similarity-dependent discounting

$$ u_i^* = \underbrace{K(X_i) \big( I_{m[i]} + K(X_i) \big)^{-1}}_{\text{shrinkage factor}} \; \underbrace{\big( b 1_{m[i]} + \Phi_i w_\phi \big)}_{\text{vec. of utility samples}}. \tag{1} $$

Under RBF kernel $K(x, x') = \exp(-\gamma \|x - x'\|^2)$, an option dissimilar to others involves high uncertainty.

Strongly shrunk into prior mean 0.

Context effects as Bayesian uncertainty aversion

[Figure: two panels plotting Final Evaluation against X1 = (5 − X2). Left panel: options A−, A, and D under choice sets {A, D} and {A, A−, D}. Right panel: options A, B, C, and D under choice sets {A, B, C} and {B, C, D}.]
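
A minimal numerical sketch of the shrinkage in (1), showing how it can favor the middle option of a choice set. It assumes four hypothetical options on the trade-off line X2 = 5 − X1, identical utility samples for every option, and an arbitrary kernel width `gamma`; none of these values are taken from the talk.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """RBF similarity matrix K(x, x') = exp(-gamma * ||x - x'||^2)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq_dist)

def shrunk_evaluation(X, v, gamma):
    """Posterior mean u* = K (I + K)^{-1} v for utility samples v."""
    K = rbf_kernel(X, gamma)
    return K @ np.linalg.solve(np.eye(len(X)) + K, v)

# Hypothetical positions X1 of the options; X2 = 5 - X1.
POSITIONS = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}

def evaluate(choice_set, gamma=0.5):
    """Final evaluation of each option in a choice set such as "ABC"."""
    X = np.array([[POSITIONS[n], 5.0 - POSITIONS[n]] for n in choice_set])
    v = np.ones(len(choice_set))  # identical utility samples (assumption)
    return dict(zip(choice_set, shrunk_evaluation(X, v, gamma)))

abc = evaluate("ABC")
bcd = evaluate("BCD")
```

With equal utility samples, the middle option has the most similar neighbors and is therefore shrunk least: B is evaluated highest in {A, B, C}, while C is highest in {B, C, D}. B's own evaluation drops once it becomes an extreme option.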

Convex Optimization when using Posterior Mean

Global fitting of the parameters using data $(X_i, y_i)_{i=1}^n$

Fix the mapping and similarity functions during updates.

Shrinkage factor $H_i \triangleq K(X_i) \big( I_{m[i]} + K(X_i) \big)^{-1}$ is constant!

Obtaining a MAP estimate is convex w.r.t. $(b, w_\phi)$.

$$ \max_{b, w_\phi} \; \sum_{i=1}^{n} \ell\big( b H_i 1_{m[i]} + H_i \Phi_i w_\phi, \; y_i \big) - \frac{c}{2} \|w_\phi\|^2 $$

Context-specific $H_i$ is multiplied.

Exploiting the log-concavity of multinomial logit

$$ \ell(u_i^*, y_i) \triangleq \log \frac{\exp(u_{i y_i}^*)}{\sum_{j=1}^{m[i]} \exp(u_{ij}^*)} $$
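
The MAP problem can be sketched in a few lines once each $H_i$ is precomputed. The example below fits $(b, w_\phi)$ to the PC choice counts by gradient ascent with a backtracking line search. The identity feature mapping, the feature standardization, the kernel width `gamma`, the regularization weight `c`, and the choice of optimizer are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

# Standardized (CPU, memory) features of the PC options; the scaling is an assumption.
FEATURES = {"A": (-1.0, 1.0), "B": (-0.5, 0.5), "C": (0.0, 0.0),
            "D": (0.5, -0.5), "E": (1.0, -1.0)}
# Choice counts per choice set from the PC example (Kivetz et al., 2004).
COUNTS = {"ABC": (36, 176, 144), "BCD": (56, 177, 115), "CDE": (94, 181, 109)}

def shrinkage_factor(X, gamma):
    """H = K (I + K)^{-1} for an RBF kernel K on the rows of X."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq_dist)
    return K @ np.linalg.inv(np.eye(len(X)) + K)

def build_design(gamma=1.0):
    """Precompute A_i = H_i [1, Phi_i], so that u*_i = A_i @ theta, theta = (b, w)."""
    design = []
    for names, counts in COUNTS.items():
        Phi = np.array([FEATURES[n] for n in names])  # identity mapping phi (assumption)
        H = shrinkage_factor(Phi, gamma)
        A = H @ np.column_stack([np.ones(len(Phi)), Phi])
        design.append((A, np.array(counts, dtype=float)))
    return design

def objective_and_grad(theta, design, c):
    """Penalized multinomial-logit log-likelihood and its gradient."""
    w = theta[1:]
    value = -0.5 * c * (w @ w)
    grad = np.concatenate(([0.0], -c * w))
    for A, counts in design:
        u = A @ theta
        log_p = u - u.max() - np.log(np.sum(np.exp(u - u.max())))
        value += counts @ log_p
        grad += A.T @ (counts - counts.sum() * np.exp(log_p))
    return value, grad

def fit_map(design, c=1.0, iters=300):
    """Gradient ascent with a backtracking (Armijo) line search."""
    theta = np.zeros(design[0][0].shape[1])
    value, grad = objective_and_grad(theta, design, c)
    for _ in range(iters):
        step = 1.0
        while step > 1e-12:
            cand = theta + step * grad
            cand_value, cand_grad = objective_and_grad(cand, design, c)
            if cand_value >= value + 1e-4 * step * (grad @ grad):
                theta, value, grad = cand, cand_value, cand_grad
                break
            step *= 0.5
        else:
            break  # no improving step found: treat as converged
    return theta, value

design = build_design()
theta_hat, value_hat = fit_map(design)
value_zero, _ = objective_and_grad(np.zeros(3), design, 1.0)
```

Because each $H_i$ is fixed, the scores are linear in `theta` and the objective is concave, so any ascent method reaches the global optimum. A quasi-Newton solver would be a natural replacement for the simple loop used here.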

Agenda

1. Introduction of the Goal and Issues

2. Irrational Context Effects

3. Proposing a Bayesian Model of Mental Conflict

4. Numerical Studies

5. Conclusion

Experimental Settings

Evaluate accuracy & log-likelihood on real choice data.

Dataset #1: PC ($n = 1{,}088$, $d_X = 2$)

Dataset #2: SP ($n = 972$, $d_X = 2$)

Subjects are asked to choose a speaker.

Option         A     B     C     D     E
Power [Watt]   50    75    100   125   150
Price [USD]    100   130   160   190   220

Choice set   #subjects
{A, B, C}    45:135:145
{B, C, D}    58:137:111
{C, D, E}    95:155:91

Dataset #3: SM ($n = 10{,}719$, $d_X = 23$)

SwissMetro dataset (Antonini et al., 2007)

Subjects are asked to choose one mode of transportation, either from {train, car, SwissMetro} or {train, SwissMetro}.

Attributes of each option: cost, travel time, headway, seat type, and type of transportation.

Cross-Validation Performances

High predictability in addition to the interpretable mechanism.

For SP, successfully detected a combination of the compromise effect & prioritization of power.

Best for PC & SP.

Second best for the higher-dimensional SM: slightly worse than a highly expressive nonparametric version of mixed multinomial logit (McFadden and Train, 2000).

[Figures: Average Log-Likelihood and Classification Accuracy on the PC, SP, and SM datasets for LinLogit, NpLogit, LinMix, NpMix, and GPUA; and evaluation (including Obj. Eval.) of options A to E against Price [USD] for choice sets {A,B,C}, {B,C,D}, and {C,D,E}.]

Conclusion

Introduced a simple & interpretable Bayesian choice model.

Bayesian shrinkage involving mental conflict

Irrational choice-set-dependent Gaussian process prior

Uncertainty aversion as a cause of context effects

Accurate prediction when absolute preference and the compromise effect are mixed.

Future Directions

More active Bayesianism for realistic human models

Integration with other Bayesian discrete choice models (e.g., Shenoy and Yu, 2013)

Explaining the attraction effect

Current limitation: the decoy gets a high share due to its symmetric similarity to the target option.

Extension to time-series decision-making models

E.g., emulating how humans play multi-armed bandits (Zhang and Yu, 2013)

Choice-set optimization avoiding irrational context effects

News channel = set of news articles

Diversified item recommendation (Ziegler et al., 2005)

Via linear submodular bandits (Yue and Guestrin, 2011)

References

Antonini, G., Gioia, C., Frejinger, E., and Thémans, M. (2007). Swissmetro: description of the data. http://biogeme.epfl.ch/swissmetro/examples.html.

Dotson, J. P., Lenk, P., Brazell, J., Otter, T., Maceachern, S. N., and Allenby, G. M. (2009). A probit model with structured covariance for similarity effects and source of volume calculations. http://ssrn.com/abstract=1396232.

González-Vallejo, C. (2002). Making trade-offs: A probabilistic and context-sensitive model of choice behavior. Psychological Review, 109:137–154.

Huber, J., Payne, J. W., and Puto, C. (1982). Adding asymmetrically dominated alternatives: Violations of regularity and the similarity hypothesis. Journal of Consumer Research, 9:90–98.

Kivetz, R., Netzer, O., and Srinivasan, V. S. (2004). Alternative models for capturing the compromise effect. Journal of Marketing Research, 41(3):237–257.

McFadden, D. and Train, K. (2000). Mixed MNL models for discrete response. Journal of Applied Econometrics, 15:447–470.

McFadden, D. L. (1980). Econometric models of probabilistic choice among products. Journal of Business, 53(3):13–29.

Roe, R. M., Busemeyer, J. R., and Townsend, J. T. (2001). Multialternative decision field theory: A dynamic connectionist model of decision making. Psychological Review, 108:370–392.

Shenoy, P. and Yu, A. J. (2013). A rational account of contextual effects in preference choice: What makes for a bargain? In Proceedings of the Cognitive Science Society Conference.

Simonson, I. (1989). Choice based on reasons: The case of attraction and compromise effects. Journal of Consumer Research, 16:158–174.

Trueblood, J. S. (2014). The multiattribute linear ballistic accumulator model of context effects in multialternative choice. Psychological Review, 121(2):179–205.

Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79:281–299.

Usher, M. and McClelland, J. L. (2004). Loss aversion and inhibition in dynamical models of multialternative choice. Psychological Review, 111:757–769.

Wen, C.-H. and Koppelman, F. (2001). The generalized nested logit model. Transportation Research Part B, 35:627–641.

Williams, H. (1977). On the formulation of travel demand models and economic evaluation measures of user benefit. Environment and Planning A, 9(3):285–344.

Yai, T. (1997). Multinomial probit with structured covariance for route choice behavior. Transportation Research Part B: Methodological, 31(3):195–207.

Yue, Y. and Guestrin, C. (2011). Linear submodular bandits and their application to diversified retrieval. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., and Weinberger, K., editors, Advances in Neural Information Processing Systems 24, pages 2483–2491.

Zhang, S. and Yu, A. J. (2013). Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting. In Burges, C., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K., editors, Advances in Neural Information Processing Systems 26, pages 2607–2615. Curran Associates, Inc.

Ziegler, C.-N., McNee, S. M., Konstan, J. A., and Lausen, G. (2005). Improving recommendation lists through topic diversification. In Proceedings of the 14th international conference on World Wide Web (WWW 2005), pages 22–32. ACM.