This page reproduces the content of http://www.slideshare.net/tamurashinichi/20140611-prml-slidesrev.

Uploaded more than a year ago (2015/06/15) in Technology

The presentation material for the reading club of Pattern Recognition and Machine Learning by Bishop.

The contents of the sections cover

- The exponential family and its ML estimation

- Overview of nonparametric methods for density estimation

- Kernel density estimators

- Nearest-neighbour methods and their application to classification

-------------------------------------------------------------------------

These slides (written entirely in English) were prepared for a laboratory reading group on Bishop's "Pattern Recognition and Machine Learning" (PRML).

PRML 2.4-2.5

The exponential family & Nonparametric methods

June 11, 2014

by Shinichi TAMURA

Today's topics

1. The exponential family
   1. What is the exponential family?
   2. Maximum likelihood for EF
   3. How to decide priors for EF
2. Nonparametric methods
   1. What is the point of nonparametric methods?
   2. Kernel density estimator
   3. Nearest-neighbour methods


The Exponential Family

Almost all of the distributions we have studied so far belong to a single class, namely the exponential family.

Parametric distributions

- The exponential family: Bernoulli, multinomial, Gaussian, beta, gamma, von Mises, etc.

- Outside the exponential family: Gaussian mixture, etc.

The exponential family over x given η is the class of distributions of the form

$$p(x|\eta) = h(x)\, g(\eta) \exp\{\eta^T u(x)\}$$

- η is the natural parameter.

- g(η) is the normalizing constant.

- The factor $\exp\{\eta^T u(x)\}$ is where x and η meet.

E.g. 1) The Bernoulli Distribution

$$p(x|\eta) = \mu^x (1-\mu)^{1-x} = \sigma(-\eta) \exp(\eta x)$$

where

$$\eta = \ln\frac{\mu}{1-\mu},$$

so that $h(x) = 1$, $g(\eta) = \sigma(-\eta)$ and $u(x) = x$.
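As a quick numerical check of the Bernoulli example, the following sketch compares the usual form with the exponential-family form. The function names are illustrative and do not come from the slides.

```python
import math


def sigmoid(a: float) -> float:
    """Logistic sigmoid: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def bernoulli_standard(x: int, mu: float) -> float:
    """Usual parameterisation: mu^x * (1 - mu)^(1 - x)."""
    return mu ** x * (1.0 - mu) ** (1 - x)


def bernoulli_natural(x: int, eta: float) -> float:
    """Exponential-family form: g(eta) * exp(eta * u(x)) with
    h(x) = 1, g(eta) = sigma(-eta), u(x) = x."""
    return sigmoid(-eta) * math.exp(eta * x)


mu = 0.3
eta = math.log(mu / (1.0 - mu))  # natural parameter (the log-odds)
for x in (0, 1):
    # Both forms should agree up to floating-point error.
    assert math.isclose(bernoulli_standard(x, mu), bernoulli_natural(x, eta))
```

The check works because $1 - \mu = \sigma(-\eta)$ when $\eta$ is the log-odds of $\mu$.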

E.g. 2) The Multinomial Distribution

$$p(x|\eta) = \prod_{k=1}^{M} \mu_k^{x_k} = \exp(\eta^T x)$$

where

$$\eta = (\ln\mu_1, \dots, \ln\mu_M)^T,$$

so that $h(x) = 1$, $g(\eta) = 1$ and $u(x) = x$.

However, the parameters are constrained:

$$\sum_{k=1}^{M} \exp(\eta_k) = \sum_{k=1}^{M} \mu_k = 1.$$

It's inconvenient!

Remove the constraint by substituting $\mu_M = 1 - \sum_{k=1}^{M-1}\mu_k$ and $x_M = 1 - \sum_{k=1}^{M-1} x_k$:

$$p(x|\mu) = \exp\left\{\sum_{k=1}^{M-1} x_k \ln\mu_k + \left(1 - \sum_{k=1}^{M-1} x_k\right) \ln\left(1 - \sum_{k=1}^{M-1}\mu_k\right)\right\}$$

$$= \exp\left\{\sum_{k=1}^{M-1} x_k \ln\frac{\mu_k}{1 - \sum_{j=1}^{M-1}\mu_j} + \ln\left(1 - \sum_{k=1}^{M-1}\mu_k\right)\right\}$$

$$= \left(1 - \sum_{k=1}^{M-1}\mu_k\right) \exp\left\{\sum_{k=1}^{M-1} x_k \ln\frac{\mu_k}{1 - \sum_{j=1}^{M-1}\mu_j}\right\}.$$

Therefore...

E.g. 2') The Multinomial Distribution w/o constraint

$$p(x|\eta) = \prod_{k=1}^{M} \mu_k^{x_k} = \left(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\right)^{-1} \exp(\eta^T x)$$

where

$$\eta = \left(\ln\frac{\mu_1}{1 - \sum_{j=1}^{M-1}\mu_j}, \dots, \ln\frac{\mu_{M-1}}{1 - \sum_{j=1}^{M-1}\mu_j}, 0\right)^T,$$

so that $h(x) = 1$, $g(\eta) = \left(1 + \sum_{k=1}^{M-1}\exp(\eta_k)\right)^{-1}$ and $u(x) = x$.
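The unconstrained parameterisation can be checked numerically. The sketch below maps probabilities to natural parameters and back (a softmax with the last component fixed to 0), assuming a one-hot vector x. The helper names are hypothetical.

```python
import math


def to_natural(mu):
    """Map (mu_1, ..., mu_M) to eta_k = ln(mu_k / mu_M) for k < M, with eta_M = 0.
    mu_M is recomputed as 1 - sum of the first M-1 entries."""
    mu_last = 1.0 - sum(mu[:-1])
    return [math.log(m / mu_last) for m in mu[:-1]] + [0.0]


def g(eta):
    """Normalizing factor g(eta) = (1 + sum_{k<M} exp(eta_k))^-1."""
    return 1.0 / (1.0 + sum(math.exp(e) for e in eta[:-1]))


def from_natural(eta):
    """Inverse map: mu_k = g(eta) * exp(eta_k) for k < M, and mu_M = g(eta)."""
    norm = g(eta)
    return [norm * math.exp(e) for e in eta[:-1]] + [norm]


def density(x, eta):
    """p(x|eta) = g(eta) * exp(eta^T x) for a one-hot vector x."""
    return g(eta) * math.exp(sum(e * xi for e, xi in zip(eta, x)))


mu = [0.2, 0.5, 0.3]
eta = to_natural(mu)
recovered = from_natural(eta)
assert all(math.isclose(a, b) for a, b in zip(mu, recovered))
assert math.isclose(density([0, 1, 0], eta), 0.5)  # picks out mu_2
```

No constraint on η is needed here: any real values for the first M-1 components give valid probabilities.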

E.g. 3) The Gaussian Distribution

$$p(x|\eta) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\}$$

$$= (2\pi)^{-1/2} (-2\eta_2)^{1/2} \exp\left(\frac{\eta_1^2}{4\eta_2}\right) \exp\left\{\begin{pmatrix}\eta_1 \\ \eta_2\end{pmatrix}^T \begin{pmatrix}x \\ x^2\end{pmatrix}\right\}$$

where

$$\eta = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)^T,$$

so that $h(x) = (2\pi)^{-1/2}$, $g(\eta) = (-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)$ and $u(x) = (x, x^2)^T$.
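The Gaussian case can be verified the same way. This is a minimal sketch with illustrative function names, taking $h(x) = (2\pi)^{-1/2}$ so that the factors multiply back to the usual density.

```python
import math


def natural_params(mu: float, sigma2: float):
    """eta = (mu / sigma^2, -1 / (2 sigma^2))."""
    return mu / sigma2, -1.0 / (2.0 * sigma2)


def gaussian_standard(x: float, mu: float, sigma2: float) -> float:
    """Usual Gaussian density N(x | mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def gaussian_natural(x: float, eta1: float, eta2: float) -> float:
    """h(x) * g(eta) * exp(eta^T u(x)) with u(x) = (x, x^2)."""
    h = (2.0 * math.pi) ** -0.5
    g = math.sqrt(-2.0 * eta2) * math.exp(eta1 ** 2 / (4.0 * eta2))
    return h * g * math.exp(eta1 * x + eta2 * x * x)


eta1, eta2 = natural_params(mu=1.0, sigma2=4.0)
for x in (-2.0, 0.5, 3.0):
    assert math.isclose(gaussian_standard(x, 1.0, 4.0), gaussian_natural(x, eta1, eta2))
```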


Maximum likelihood for EF

OK, we know what the EF looks like.

Then, how do we estimate the parameter?

Maximize the likelihood! (The frequentist way.)

Suppose we have i.i.d. data $X = \{x_1, \dots, x_N\}$. The log-likelihood of η is

$$\ln p(X|\eta) = \ln\prod_{n=1}^{N} p(x_n|\eta)$$

$$= \ln\prod_{n=1}^{N} h(x_n)\, g(\eta) \exp\{\eta^T u(x_n)\}$$

$$= \sum_{n=1}^{N} \ln h(x_n) + N \ln g(\eta) + \eta^T \sum_{n=1}^{N} u(x_n).$$

$$\therefore\ \nabla_\eta \ln p(X|\eta) = N \nabla_\eta \ln g(\eta) + \sum_{n=1}^{N} u(x_n).$$

Set this gradient to zero to find the maximum.

Therefore

$$-\nabla_\eta \ln g(\eta_{ML}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n).$$

Here, $\eta_{ML}$ depends on the data only through $\sum_n u(x_n)$, which is therefore called the "sufficient statistic".

We need to store only $\sum_n u(x_n)$ for estimation.

E.g.) Gaussian distribution

By $g(\eta) = (-2\eta_2)^{1/2}\exp\left(\frac{\eta_1^2}{4\eta_2}\right)$ and $u(x) = (x, x^2)^T$,

$$-\nabla \ln g(\eta) = \begin{pmatrix} -\dfrac{\eta_1}{2\eta_2} \\ -\dfrac{1}{2\eta_2} + \dfrac{\eta_1^2}{4\eta_2^2} \end{pmatrix} = \begin{pmatrix} \mu \\ \sigma^2 + \mu^2 \end{pmatrix}.$$

$$\therefore\ \mu_{ML} = \frac{1}{N}\sum_n x_n, \qquad \sigma^2_{ML} = \frac{1}{N}\sum_n x_n^2 - \left(\frac{1}{N}\sum_n x_n\right)^2.$$

That's what we already know.
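To illustrate that only the sufficient statistics need to be stored, the following sketch computes the Gaussian ML estimates from the running sums of $x_n$ and $x_n^2$ alone. The function name and the toy data are assumptions made for illustration.

```python
def gaussian_ml_from_sufficient_stats(sum_x: float, sum_x2: float, n: int):
    """ML estimates from the sufficient statistics sum(x_n) and sum(x_n^2).

    Solves -grad ln g(eta_ML) = (1/N) sum u(x_n), i.e.
    mu = mean(x) and sigma^2 + mu^2 = mean(x^2).
    """
    mean_x = sum_x / n
    mean_x2 = sum_x2 / n
    mu_ml = mean_x
    var_ml = mean_x2 - mean_x ** 2
    return mu_ml, var_ml


# Accumulate the two sums in one pass; the raw data can then be discarded.
data = [1.0, 2.0, 4.0, 5.0]
sum_x = 0.0
sum_x2 = 0.0
for x in data:
    sum_x += x
    sum_x2 += x * x

mu_ml, var_ml = gaussian_ml_from_sufficient_stats(sum_x, sum_x2, len(data))
print(mu_ml, var_ml)  # 3.0 2.5
```

Note that `var_ml` is the biased (maximum likelihood) variance, dividing by N rather than N - 1.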

By the way, we want to know the relation between $\eta_{ML}$ and $\eta$.

Taking the gradient with respect to η of the normalization condition

$$\int h(x)\, g(\eta) \exp\{\eta^T u(x)\}\,dx = 1$$

gives

$$\nabla g(\eta) \int h(x) \exp\{\eta^T u(x)\}\,dx + \int h(x)\, g(\eta) \exp\{\eta^T u(x)\}\, u(x)\,dx = 0.$$

The first integral equals $1/g(\eta)$ and the second term is $E[u(x)]$, so

$$-\nabla \ln g(\eta) = E[u(x)].$$

This is similar to the maximum likelihood condition

$$-\nabla_\eta \ln g(\eta_{ML}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n).$$

According to the law of large numbers (LLN), the sample mean converges to the expectation, so $\eta_{ML}$ converges to $\eta$:

$$-\nabla_\eta \ln g(\eta_{ML}) = \frac{1}{N}\sum_{n=1}^{N} u(x_n) \ \longrightarrow\ E[u(x)] = -\nabla \ln g(\eta) \qquad (N \to \infty).$$


Priors for EF

If you want to use Bayesian inference, a prior distribution is needed.

Then, how do we decide it if we don't know anything about the parameter?

Three candidates:

1. Conjugate priors ... Easy to handle

2. Uniform distributions ... Principle of indifference

3. Noninformative priors ... Keep the effect of the prior small

Priors for EF – Conjugate priors

Distributions in the EF have the factor $g(\eta)\exp(\eta^T u)$, so the conjugate prior is

$$p(\eta|\chi, \nu) = f(\chi, \nu) \left[g(\eta) \exp\{\eta^T \chi\}\right]^{\nu} = f(\chi, \nu)\, g(\eta)^{\nu} \exp\{\nu \eta^T \chi\},$$

where $f(\chi, \nu)$ is the normalizing constant.

It gives the posterior as follows:

$$p(\eta|X, \chi, \nu) \propto \left[\prod_{n=1}^{N} h(x_n)\, g(\eta) \exp\{\eta^T u(x_n)\}\right] \times g(\eta)^{\nu} \exp\{\nu \eta^T \chi\}$$

$$\propto g(\eta)^{N+\nu} \exp\left\{\eta^T \left(\sum_{n=1}^{N} u(x_n) + \nu\chi\right)\right\}.$$

The posterior has the same functional form as the prior: $\nu$ corresponds to $N + \nu$, and $\nu\chi$ corresponds to $\sum_n u(x_n) + \nu\chi$.
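The posterior above amounts to a simple update of the hyperparameters. A minimal sketch follows, assuming the prior is stored as the pair (χ, ν) and the data are supplied as a list of sufficient-statistic vectors u(x_n). The function name is hypothetical.

```python
def conjugate_update(chi, nu, u_values):
    """Update the conjugate-prior hyperparameters (chi, nu) given data.

    The posterior is proportional to
        g(eta)^(N + nu) * exp{eta^T (sum_n u(x_n) + nu * chi)},
    which is the prior form with
        nu_post  = nu + N
        chi_post = (nu * chi + sum_n u(x_n)) / (nu + N).
    """
    n = len(u_values)
    dim = len(chi)
    sum_u = [sum(u[d] for u in u_values) for d in range(dim)]
    nu_post = nu + n
    chi_post = [(nu * chi[d] + sum_u[d]) / nu_post for d in range(dim)]
    return chi_post, nu_post


# Bernoulli example, where u(x) = x: a prior worth nu = 2 pseudo-observations
# with mean chi = 0.5, followed by four real observations.
chi_post, nu_post = conjugate_update(chi=[0.5], nu=2.0, u_values=[[1], [1], [0], [1]])
print(chi_post, nu_post)
```

In this reading, ν acts as an effective number of prior observations and χ as their average sufficient statistic.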

Priors for EF – Uniform distributions

The uniform distribution is a common choice for a discrete, bounded variable.

C.f.: Principle of insufficient reason (or principle of indifference)

But two problems arise when it is applied to continuous variables:

1. The normalization problem

2. The transformation problem

1. Normalization problem

If the parameter is unbounded, a constant prior cannot be normalized:

$$\int_{-\infty}^{\infty} p(\lambda)\,d\lambda = \int_{-\infty}^{\infty} \text{const}\,d\lambda \to \infty$$

Such priors are called "improper".

Note that these priors can still give proper posteriors, because with a constant prior the posterior is proportional to the likelihood, which can often be normalized.

2. Transformation problem

A non-linear transformation of the variable turns a constant prior into a non-constant one.

E.g.) With $p(\lambda) = 1$ and $\eta = \sqrt{\lambda}$,

$$p(\eta) = p(\lambda)\left|\frac{d\lambda}{d\eta}\right| = 2\eta,$$

which is not constant in η. Think "constant for what?"

(Sometimes, the posteriors are not sensitive to the difference.)

Keep these problems in mind:

1. The normalization problem

2. The transformation problem
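The transformation problem can be seen in a small simulation. The sketch below draws λ uniformly on (0, 1), an interval chosen here only so that the prior is proper, and measures how much mass η = √λ puts on [0.5, 1]. A uniform density on η would give 0.5, whereas p(η) = 2η predicts 0.75.

```python
import random


def fraction_eta_in_upper_half(n_samples: int = 100_000, seed: int = 0) -> float:
    """Estimate P(eta >= 0.5) where lambda ~ Uniform(0, 1) and eta = sqrt(lambda)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    count = 0
    for _ in range(n_samples):
        lam = rng.random()  # p(lambda) = 1 on [0, 1)
        eta = lam ** 0.5    # non-linear change of variable
        if eta >= 0.5:
            count += 1
    return count / n_samples


estimate = fraction_eta_in_upper_half()
exact = 1.0 - 0.5 ** 2  # integral of 2*eta from 0.5 to 1
print(estimate, exact)
```

The estimate should land close to 0.75 rather than 0.5, which shows that "uniform" depends on the variable in which it is stated.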

Priors for EF – Noninformative priors

Two examples of noninformative priors:

1. Priors for location parameters

2. Priors for scale parameters

These are constructed to affect the posterior as little as possible, so that the inference is objective.

1. Priors for location parameters

If the density has the form

$$p(x|\mu) = f(x - \mu),$$

the constant shift $\hat{x} = x + c$ (with $\hat{\mu} = \mu + c$) gives a density of the same form:

$$p(\hat{x}|\hat{\mu}) = f(\hat{x} - \hat{\mu}).$$

This property is "translation invariance", and such a parameter is a "location parameter".

To reflect the translation invariance, the prior should satisfy

$$\int_B^A p(\mu)\,d\mu = \int_B^A p(\mu - c)\,d\mu \quad \text{for } \forall A, B$$

$$\iff p(\mu) = p(\mu - c)$$

$$\iff p(\mu) = \text{constant}.$$

We obtain the uniform distribution after all. But unlike before, we now know when to use it.

E.g.) The mean of a Gaussian

$$p(x|\mu) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu)^2\right\}$$

This has the form $f(x - \mu)$.

This prior is also obtained as a limit of the conjugate prior:

$$p(\mu) = \mathcal{N}(\mu|\mu_0, \sigma_0^2) \xrightarrow{\ \sigma_0^2 \to \infty\ } \text{const.},$$

$$\mu_N = \frac{\sigma^2}{N\sigma_0^2 + \sigma^2}\mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2 + \sigma^2}\mu_{ML} \to \mu_{ML},$$

$$\frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \to \frac{N}{\sigma^2}.$$

2. Priors for scale parameters

If the density has the form

$$p(x|\sigma) = \frac{1}{\sigma} f\left(\frac{x}{\sigma}\right),$$

the constant scaling $\hat{x} = cx$ (with $\hat{\sigma} = c\sigma$) gives a density of the same form:

$$p(\hat{x}|\hat{\sigma}) = \frac{1}{\hat{\sigma}} f\left(\frac{\hat{x}}{\hat{\sigma}}\right).$$

This property is "scale invariance", and such a parameter is a "scale parameter".

To reflect the scale invariance, the prior should satisfy

$$\int_B^A p(\sigma)\,d\sigma = \int_B^A p\left(\frac{\sigma}{c}\right)\frac{1}{c}\,d\sigma \quad \text{for } \forall A, B$$

$$\iff p(\sigma) = \frac{1}{c}\, p\left(\frac{\sigma}{c}\right)$$

$$\iff p(\sigma) \propto \frac{1}{\sigma}$$

$$\iff p(\ln\sigma) = \text{const.}$$

E.g.) The standard deviation of a Gaussian (with zero mean)

$$p(x|\sigma) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{-\frac{x^2}{2\sigma^2}\right\}$$

This has the form $\frac{1}{\sigma} f\left(\frac{x}{\sigma}\right)$.

This prior is also obtained as a limit of the conjugate prior, written in terms of the precision $\lambda = 1/\sigma^2$:

$$p(\lambda) = \text{Gam}(\lambda|a_0, b_0) \xrightarrow{\ a_0, b_0 \to 0\ } \frac{\text{const}}{\lambda},$$

$$a_N = a_0 + \frac{N}{2} \to \frac{N}{2},$$

$$b_N = b_0 + \frac{N}{2}\sigma^2_{ML} \to \frac{N}{2}\sigma^2_{ML}.$$

Two examples of noninformative priors:

1. Priors for location parameters: $p(x|\mu) = f(x - \mu) \implies p(\mu) = \text{const.}$

2. Priors for scale parameters: $p(x|\sigma) = \frac{1}{\sigma} f\left(\frac{x}{\sigma}\right) \implies p(\sigma) \propto \frac{1}{\sigma}$


Nonparametric methods

We learned the "parametric approach"

vs.

We will learn the "nonparametric approach"

What is the difference?

| Parametric | Nonparametric |
|---|---|
| Assumes a specific form of the distribution | Makes few assumptions about the form of the distribution |
| Simple | Complex (depends on the data size) |
| Poor | Rich / Flexible |
| Efficient | Inefficient |

We will learn:

1. Histogram methods

2. Kernel density estimators

3. Nearest-neighbour methods

1. Histogram methods

Split the space into grids (or bins), and count the data points in each bin:

$$p(x) = p_i = \frac{n_i}{N\Delta_i} \quad (x \in i\text{-th bin}),$$

where

$\Delta_i$ = width of the i-th bin (usually the same for all i),

$n_i$ = number of observations assigned to the i-th bin,

$N$ = total number of observations.

This estimate is piecewise constant, hence discontinuous at the bin edges.
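The histogram estimate is short enough to write out directly. This is a minimal one-dimensional sketch, assuming equal-width bins that start at a hypothetical `origin`. The function name and the toy data are illustrative.

```python
import math


def histogram_density(data, x, delta, origin=0.0):
    """Histogram density estimate p(x) = n_i / (N * delta),
    where i is the index of the equal-width bin that contains x."""

    def bin_index(value):
        # Bin i covers [origin + i * delta, origin + (i + 1) * delta).
        return math.floor((value - origin) / delta)

    i = bin_index(x)
    n_i = sum(1 for value in data if bin_index(value) == i)
    return n_i / (len(data) * delta)


data = [0.1, 0.15, 0.3, 0.8]
print(histogram_density(data, x=0.2, delta=0.25))  # bin [0, 0.25) holds 2 of 4 points
print(histogram_density(data, x=0.6, delta=0.25))  # bin [0.5, 0.75) is empty
```

Dividing by `delta` is what turns the bin count into a density, so that the estimate integrates to one over all bins.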

1. Histogram methods – Example

[Figure: histogram density estimates of the same data on the interval [0, 1] for three bin widths Δ]

- Δ = 0.04: too narrow to catch enough points; too spiky (noisy)

- Δ = 0.08: good intermediate value

- Δ = 0.25: too wide to express the data; too smooth (less information)

The number of bins is $M^D$ (curse of dimensionality).

Finding a good value of Δ is very important!

Lessons from histogram methods

Estimate the density at a particular point from the data points in a small local region.

The regions are defined by a "smoothing parameter", which controls the model complexity in relation to the data size.

Other problems

• Discontinuity

• Not scalable (curse of dimensionality)

Nonparametric methods

Lessons from histogram methods

Let's consider a small local region $\mathcal{R}$; then

$$\Pr(K \text{ out of } N \text{ data} \in \mathcal{R}) = \frac{N!}{K!\,(N-K)!}\, P^{K} (1-P)^{N-K},$$

where $P = \int_{\mathcal{R}} p(\mathbf{x})\,\mathrm{d}\mathbf{x}$.

If

1. K is large enough (smoother not too small)

2. p(x) is constant over $\mathcal{R}$ (smoother small enough)

(the two conditions are contradictory, and how well both can be met depends on the data size)

⇒ $p(\mathbf{x}) = \dfrac{K}{NV}$, where V is the volume of $\mathcal{R}$.
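The step from the binomial distribution to the estimate can be spelled out as follows. This is a sketch of the standard argument, with V denoting the volume of the region R.

```latex
\begin{aligned}
\mathbb{E}\!\left[\frac{K}{N}\right] &= P, \qquad
\operatorname{var}\!\left[\frac{K}{N}\right] = \frac{P(1-P)}{N}
  && \text{(mean and variance of the binomial)} \\
K &\simeq N P
  && \text{(condition 1: the distribution is sharply peaked)} \\
P = \int_{\mathcal{R}} p(\mathbf{x})\,\mathrm{d}\mathbf{x} &\simeq p(\mathbf{x})\,V
  && \text{(condition 2: } p(\mathbf{x}) \text{ roughly constant over } \mathcal{R}\text{)} \\
\therefore\quad p(\mathbf{x}) &\simeq \frac{K}{N V}.
\end{aligned}
```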

Kernel density estimators

Fix a region (e.g., a hypercube centred on x with side h)
and count data by kernel function k(u) (Parzen window).

$$k(\mathbf{u}) = \begin{cases} 1, & |u_i| \le 1/2 \quad (i = 1, \dots, D) \\ 0, & \text{otherwise.} \end{cases}$$

k(u) itself is the cube centred on the origin with side 1. It is a discontinuous kernel.

$$K = \sum_{n=1}^{N} k\!\left(\frac{\mathbf{x} - \mathbf{x}_n}{h}\right), \qquad V = h^{D},$$

$$\therefore\quad p(\mathbf{x}) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^{D}}\, k\!\left(\frac{\mathbf{x} - \mathbf{x}_n}{h}\right).$$
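A minimal sketch of the Parzen window estimate with the hypercube kernel is given below. The four data points and the function names are made up for illustration; `data` is assumed to have shape (N, D).

```python
import numpy as np

def cube_kernel(u):
    """k(u) = 1 if |u_i| <= 1/2 for every dimension i, else 0."""
    return np.all(np.abs(u) <= 0.5, axis=-1).astype(float)

def parzen_density(x, data, h):
    """p(x) = (1/N) * sum_n (1/h^D) * k((x - x_n) / h)."""
    data = np.asarray(data, dtype=float)          # shape (N, D)
    n, d = data.shape
    k_count = cube_kernel((x - data) / h).sum()   # K: points inside the cube around x
    return k_count / (n * h**d)                   # K / (N V) with V = h^D

# Hypothetical 2-D data set of four points
data = np.array([[0.0, 0.0], [0.1, 0.1], [0.4, 0.0], [2.0, 2.0]])
print(parzen_density(np.array([0.0, 0.0]), data, h=0.5))  # K = 2, V = 0.25, N = 4 -> 2.0
```

Only the first two points fall inside the cube of side 0.5 centred on the origin, so the estimate is 2 / (4 × 0.25) = 2.0.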

The symmetry of k(u) lets us re-interpret the result in two equivalent ways:

• N data points counted in the single cube centred on x

• N cubes, each centred on a data point $\mathbf{x}_n$, around x

Another choice of k(u): Gaussian

$$k(\mathbf{u}) = \frac{1}{(2\pi)^{D/2}} \exp\!\left(-\frac{\|\mathbf{u}\|^{2}}{2}\right).$$

This kernel gives a continuous density.

You can use any kernel as long as it satisfies

$$k(\mathbf{u}) \ge 0, \qquad \int k(\mathbf{u})\,\mathrm{d}\mathbf{u} = 1.$$
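The Gaussian case can be sketched as follows. The sample, the evaluation grid, and the function name `gaussian_kde` are assumptions made for illustration; the three values of h are the ones used in the example that follows.

```python
import numpy as np

def gaussian_kde(x, data, h):
    """p(x) = (1/N) sum_n (2 pi h^2)^(-D/2) exp(-||x - x_n||^2 / (2 h^2))."""
    data = np.asarray(data, dtype=float)   # shape (N, D)
    x = np.asarray(x, dtype=float)         # shape (M, D)
    d = data.shape[1]
    # Squared distances between every query point and every data point: (M, N)
    sq_dist = ((x[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
    norm = (2.0 * np.pi * h**2) ** (d / 2)
    return np.exp(-sq_dist / (2.0 * h**2)).mean(axis=1) / norm

rng = np.random.default_rng(0)
data = rng.normal(0.5, 0.1, size=(50, 1))      # hypothetical 1-D sample
grid = np.linspace(-1.0, 2.0, 3001)[:, None]
dx = grid[1, 0] - grid[0, 0]

for h in (0.005, 0.07, 0.2):
    p = gaussian_kde(grid, data, h)
    print(f"h={h}: integral ~ {p.sum() * dx:.3f}, max = {p.max():.2f}")
```

Because the kernel is non-negative and integrates to one, the estimate integrates to approximately one for every h; only its smoothness changes.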

Example

Again, we can see that the smoothing parameter h controls
the outcome of the estimation.

[Figure: kernel density estimates of the same data for h = 0.005, h = 0.07, and h = 0.2]

Nearest-neighbour methods

As the region, use a sphere that is centred on x and
contains a fixed number K of data points.

$$p(\mathbf{x}) = \frac{K}{N\,V(\mathbf{x})},$$

where V(x) denotes the volume of the sphere.
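A minimal sketch of this estimator is shown below. It assumes Euclidean distance and uses the standard volume of a D-dimensional ball, π^(D/2) r^D / Γ(D/2 + 1); the sample and the function name are hypothetical.

```python
from math import gamma, pi
import numpy as np

def knn_density(x, data, k):
    """p(x) = K / (N * V(x)), V(x) = volume of the ball reaching the K-th nearest point."""
    data = np.asarray(data, dtype=float)                 # shape (N, D)
    n, d = data.shape
    radius = np.sort(np.linalg.norm(data - x, axis=1))[k - 1]
    volume = pi ** (d / 2) / gamma(d / 2 + 1) * radius**d
    return k / (n * volume)

# Hypothetical 1-D sample, stored with shape (N, 1)
data = np.array([[0.1], [0.2], [0.4], [0.7]])
print(knn_density(np.array([0.3]), data, k=2))   # radius ~ 0.1, V ~ 0.2, so roughly 2.5
```

Here K is fixed and the radius adapts to the data: in dense areas the sphere is small and the estimate is high.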

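Note that this density cannot be normalized.

For x beyond a point x* that is far away from all the data points, the radius r(x)
of the sphere grows in proportion to x, so the density decays only like 1/r(x) and its integral
diverges. In one dimension, writing x† for the K-th nearest data point (taken here to stay the same for every x ≥ x*), we have r(x) = x − x† and

$$\int_{-\infty}^{\infty} \frac{\mathrm{d}x}{r(x)} \ge \int_{x^*}^{\infty} \frac{\mathrm{d}x}{r(x)} = \int_{x^*}^{\infty} \frac{\mathrm{d}x}{x - x^{\dagger}} \to \infty.$$

$$\therefore\quad \int_{\mathbb{R}^D} \frac{K}{N\,V(\mathbf{x})}\,\mathrm{d}\mathbf{x} \propto \int_{\mathbb{R}^D} \frac{\mathrm{d}\mathbf{x}}{r(\mathbf{x})^{D}} \to \infty.$$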
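The divergence can also be observed numerically. The sketch below assumes a hypothetical one-dimensional sample with K = 2 and integrates the K-NN density over wider and wider intervals with the trapezoid rule; the mass keeps growing (roughly logarithmically) instead of settling at one.

```python
import numpy as np

def knn_density_grid(grid, data, k):
    """1-D K-NN density K / (N * 2 r(x)) evaluated at every grid point."""
    dist = np.sort(np.abs(grid[:, None] - data[None, :]), axis=1)
    return k / (data.size * 2.0 * dist[:, k - 1])

data = np.array([0.1, 0.2, 0.4, 0.7])   # hypothetical sample
masses = []
for half_width in (10.0, 100.0, 1000.0):
    grid = np.linspace(-half_width, half_width, 200_001)
    p = knn_density_grid(grid, data, k=2)
    dx = grid[1] - grid[0]
    masses.append(float(np.sum(0.5 * (p[1:] + p[:-1])) * dx))  # trapezoid rule
    print(f"integral over [-{half_width:g}, {half_width:g}] ~ {masses[-1]:.2f}")
```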

Nearest-neighbour estimators

Example

Here again, the smoothing parameter K controls the outcome of the estimation.

Furthermore, we can observe that p(x) → ∞ in the K = 1 case (at each data point the radius of the sphere shrinks to zero).

[Figure: K-nearest-neighbour density estimates of the same data for K = 1, K = 5, and K = 30]

Nonparametric methods

Another problem of Kernels and NNs

These methods need all the observed data for estimation,
so both the time per evaluation and the space complexity are O(N). This is very
inefficient for large data sets.

In that respect, parametric methods are quite efficient
(cf. sufficient statistics).

Histograms are also efficient.
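For contrast, a parametric fit can discard each data point after it has been seen. The sketch below assumes a one-dimensional Gaussian model, whose maximum likelihood fit needs only the sufficient statistics N, Σx and Σx²; the class name and the data stream are made up for illustration.

```python
class RunningGaussian:
    """ML fit of a 1-D Gaussian from the sufficient statistics (N, sum x, sum x^2)."""

    def __init__(self):
        self.n, self.sum_x, self.sum_x2 = 0, 0.0, 0.0

    def update(self, x):
        # O(1) memory: the data point itself is not stored
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def mean(self):
        return self.sum_x / self.n

    def variance(self):
        # ML (biased) variance: E[x^2] - E[x]^2
        return self.sum_x2 / self.n - self.mean() ** 2

stream = [1.0, 2.0, 3.0, 4.0]   # hypothetical data stream
model = RunningGaussian()
for x in stream:
    model.update(x)
print(model.mean(), model.variance())   # 2.5 1.25
```

Kernel and nearest-neighbour estimators have no such summary, so the whole data set must be kept.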

Nonparametric methods

                 Histograms   Kernels      NNs
K                Not fixed    Not fixed    Fixed
V                Not fixed    Fixed        Not fixed
Smoother         ∆            h            K
Continuity       No           It depends   Yes*
Dimensionality   Suffers      Scalable     Scalable
Normalization    Proper       Proper       Improper
Data set         Discard      Keep         Keep

* If K=1, not continuous

Nearest-neighbour methods

Use NNs as a classifier

To do this, use the sphere that contains
K points irrespective of their class.

$$p(\mathbf{x} \mid \mathcal{C}_k) = \frac{K_k}{N_k V}, \qquad p(\mathbf{x}) = \frac{K}{N V},$$

where $K_k$ is the number of points that belong to class k and lie inside the sphere, and $N_k$ is the total number of points in class k.

The class priors are $p(\mathcal{C}_k) = N_k / N$, so

$$p(\mathcal{C}_k \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid \mathcal{C}_k)\, p(\mathcal{C}_k)}{p(\mathbf{x})} = \frac{K_k}{K}.$$
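Substituting the three estimates into Bayes' theorem shows why the volume and the class sizes drop out. This is only the intermediate algebra, written with the same symbols as above.

```latex
p(\mathcal{C}_k \mid \mathbf{x})
  = \frac{K_k}{N_k V} \cdot \frac{N_k}{N} \cdot \frac{N V}{K}
  = \frac{K_k}{K}
```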

Therefore, x is assigned to the class that has
the largest number of representatives among x's
K nearest neighbours.

If K = 1, this is called the “nearest-neighbour rule”.
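The rule can be sketched in a few lines. The training set, labels, and function names below are hypothetical, Euclidean distance is assumed, and ties are broken in favour of the lowest class index simply because that is what `np.argmax` does.

```python
import numpy as np

def knn_posterior(x, data, labels, k, n_classes):
    """Return p(C_k | x) = K_k / K for every class, using the K nearest points."""
    dist = np.linalg.norm(data - x, axis=1)
    nearest = np.argsort(dist)[:k]                              # the K nearest neighbours
    counts = np.bincount(labels[nearest], minlength=n_classes)  # K_k for each class
    return counts / k

def knn_classify(x, data, labels, k, n_classes):
    """Assign x to the class with the largest K_k."""
    return int(np.argmax(knn_posterior(x, data, labels, k, n_classes)))

# Hypothetical 2-D training set with two classes
data = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [0.3, 0.3]])
labels = np.array([0, 0, 1, 1, 1])
x = np.array([0.25, 0.25])

print(knn_classify(x, data, labels, k=1, n_classes=2))   # nearest point has label 1
print(knn_posterior(x, data, labels, k=3, n_classes=2))  # two of three neighbours are class 0
```

With K = 1 the query follows its single nearest neighbour (class 1), while with K = 3 the majority of the neighbours is class 0, so the decision changes with K.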

Use NNs as classifier – Example

[Figure: K-nearest-neighbour classification results for K = 1, K = 3, and K = 31, plotted with x6 on the horizontal axis and x7 on the vertical axis]

As in the discussion so far, K acts here as the
smoothing parameter.
