The presentation material for the reading club of Pattern Recognition and Machine Learning by Bishop.

The sections cover:

- K-means clustering and its application to image compression

- Introduction of latent variables

- Mixtures of Gaussians and their update via the EM algorithm


PRML 9.1-9.2

K-means Clustering & Mixtures of Gaussians

July 16, 2014

by Shinichi TAMURA

Today's topics

1. K-means Clustering
   1. Clustering Problem
   2. K-means Clustering
   3. Application for Image Compression
2. Mixtures of Gaussians
   1. Introduction of latent variables
   2. Problem of ML estimates
   3. EM-algorithm for Mixture of Gaussians


Clustering Problem

An unsupervised machine learning problem:
divide the data into groups (= clusters) so that

- similar data > same group
- dissimilar data > different group


Minimize   Σ_{n=1}^{N} ||x_n − μ_{k(n)}||²

where μ_{k(n)} is the center of the cluster to which x_n is assigned.


Given a data set X = {x_1, ..., x_N} and the number of clusters K.
Let μ_k be the cluster representative and r_nk be the assignment indicator (r_nk = 1 if x_n ∈ C_k).

Minimize   J = Σ_{n=1}^{N} Σ_{k=1}^{K} r_nk ||x_n − μ_k||²

Here, J is called the "distortion measure".
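As a small illustration (not from the slides), the distortion measure can be evaluated directly with NumPy; the names `X`, `R`, `Mu` for the data, assignments, and centers are ours:

```python
import numpy as np

def distortion(X, R, Mu):
    """Distortion measure J = sum_n sum_k r_nk ||x_n - mu_k||^2."""
    # Squared distances between every point and every center: shape (N, K)
    sq_dists = ((X[:, None, :] - Mu[None, :, :]) ** 2).sum(axis=2)
    return float((R * sq_dists).sum())

# Tiny example: two well-separated points, two centers
X = np.array([[0.0, 0.0], [10.0, 0.0]])
Mu = np.array([[0.0, 0.0], [10.0, 0.0]])
R = np.eye(2)  # each point assigned to its own center
print(distortion(X, R, Mu))  # 0.0 for this perfect assignment
```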


K-means Clustering

How to solve that?

μ_k and r_nk are dependent on each other
> No closed-form solution


> Use an iterative algorithm!

Strategy

μ_k and r_nk can't be updated simultaneously


> Update them one by one


Update of r_nk (assignment)

Since each x_n can be determined independently,
J will be minimized if each x_n is assigned to the nearest μ_k.


Therefore,

    r_nk = 1   if k = argmin_j ||x_n − μ_j||²,
           0   otherwise.
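This assignment rule is just an argmin over centers; a minimal NumPy sketch (variable names are illustrative, not from the slides):

```python
import numpy as np

def assign(X, Mu):
    """E step: r_nk = 1 iff k = argmin_j ||x_n - mu_j||^2 (returned as indices)."""
    sq_dists = ((X[:, None, :] - Mu[None, :, :]) ** 2).sum(axis=2)  # shape (N, K)
    return sq_dists.argmin(axis=1)  # index of the nearest center for each point

X = np.array([[0.0], [1.0], [9.0]])
Mu = np.array([[0.0], [10.0]])
print(assign(X, Mu))  # [0 0 1]
```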


Update of μ_k (parameter estimation)

The optimal μ_k is obtained by setting the derivative to 0:

    ∂/∂μ_k  Σ_{n=1}^{N} Σ_{k′=1}^{K} r_{nk′} ||x_n − μ_{k′}||² = 0
    ⟺  2 Σ_{n=1}^{N} r_nk (x_n − μ_k) = 0.

    ∴ μ_k = ( Σ_{n=1}^{N} r_nk x_n ) / ( Σ_{n=1}^{N} r_nk ) = (1/N_k) Σ_{x_n ∈ C_k} x_n,

i.e. the mean of the cluster.

Note that the cost function J thus corresponds to the sum of within-class variances.

K-means algorithm

1. Initialize μ_k and r_nk
2. Repeat the following two steps until convergence:
   i) Assign each x_n to the closest μ_k
   ii) Update each μ_k to the mean of its cluster

Step i) is the E step and step ii) is the M step.
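The alternating loop can be sketched in a few lines of NumPy; a minimal illustration under our own variable names (`X`, `Mu`, `labels`), not the slides' code:

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Plain K-means: alternate E step (assignment) and M step (means)."""
    rng = np.random.default_rng(seed)
    Mu = X[rng.choice(len(X), K, replace=False)]  # init centers from the data
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # E step: assign each point to its nearest center
        labels = ((X[:, None, :] - Mu[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # M step: move each center to the mean of its cluster
        new_Mu = np.array([X[labels == k].mean(axis=0) if (labels == k).any()
                           else Mu[k] for k in range(K)])
        if np.allclose(new_Mu, Mu):  # centers stopped moving: converged
            break
        Mu = new_Mu
    return Mu, labels

# Two obvious blobs in 1-D
X = np.array([[0.0], [0.2], [0.1], [5.0], [5.1], [4.9]])
Mu, labels = kmeans(X, K=2)
print(np.sort(Mu.ravel()))  # centers near 0.1 and 5.0
```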

Convergence property

Neither step ever increases J, so every iteration yields a result at least as good as the previous one.

Since the number of possible assignments r_nk is finite, the algorithm converges after a finite number of iterations.


Demo of algorithm

(Figures: alternating E steps and M steps on a 2-D data set until the assignments stop changing.)

Calculation performance

E step ... comparison of every data point x_n with every cluster mean μ_k
> O(KN)

Not good; it can be improved with k-d trees, the triangle inequality, etc.

M step ... calculation of the mean of every cluster
> O(N)


Here, two variations will be introduced:

1. On-line version

2. General dissimilarity


[Variation] 1. On-line version

The case where data points are observed one at a time.


> Apply Robbins-Monro algorithm


    μ_k^new = μ_k^old + η_n (x_n − μ_k^old).

Here η_n is the learning rate, which decreases with the iterations.
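A sketch of the sequential update, under the common choice η_n = 1/n (our assumption, not stated on the slides); only the nearest center is moved:

```python
import numpy as np

def online_kmeans_step(x, Mu, counts):
    """One Robbins-Monro update: mu_k_new = mu_k_old + eta_n (x - mu_k_old)."""
    k = ((Mu - x) ** 2).sum(axis=1).argmin()   # nearest center
    counts[k] += 1
    eta = 1.0 / counts[k]                      # decreasing learning rate (assumed 1/n)
    Mu[k] = Mu[k] + eta * (x - Mu[k])
    return Mu, counts

# With eta_n = 1/n the center tracks the running mean of its points
Mu = np.array([[0.0]])
counts = np.zeros(1, dtype=int)
for x in [np.array([1.0]), np.array([2.0]), np.array([3.0])]:
    Mu, counts = online_kmeans_step(x, Mu, counts)
print(Mu[0, 0])  # 2.0, the mean of 1, 2, 3
```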


[Variation] 2. General dissimilarity

Euclidean distance is not
- appropriate for categorical data, etc.
- robust to outliers.


> Use a general dissimilarity measure V(x, x′)


E step ... No difference


M step ... J is no longer guaranteed to be easy to minimize


To make the M step easy, restrict μ_k to a vector chosen from {x_n}

> A solution can then be obtained with a finite number of comparisons


    μ_k = argmin_{x_n} Σ_{x_{n′} ∈ C_k} V(x_n, x_{n′})
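A sketch of this restricted M step (the K-medoids style update), using the L1 distance as an example dissimilarity V; names and the choice of V are ours:

```python
import numpy as np

def medoid(cluster):
    """mu_k = argmin over x_n in C_k of the total dissimilarity sum V(x_n, x_n')."""
    V = np.abs(cluster[:, None, :] - cluster[None, :, :]).sum(axis=2)  # pairwise L1
    return cluster[V.sum(axis=1).argmin()]  # member minimizing total dissimilarity

cluster = np.array([[0.0], [1.0], [100.0]])  # one outlier
print(medoid(cluster))  # picks 1.0, robust to the outlier
```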


Application for Image Compression

The K-means algorithm can be applied to image compression and segmentation.


Basic idea: treat similar pixels as the same one.

Each pixel of the original data is replaced by its cluster center (the palette / code-book vector).

= the so-called "vector quantization"

Demo

(Figures: the original image and its compressed versions for several values of K.)

Compression rate

Original image ... 24N bits (N = # of pixels)


Compressed image ... 24K + N log₂K bits (K = # of palette entries)


16.7% if N~1M, K=10
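The figure can be checked with a quick calculation; it matches if we assume each palette index is stored in ⌈log₂K⌉ = 4 whole bits:

```python
import math

N = 1_000_000          # number of pixels
K = 10                 # palette (code-book) size
original = 24 * N      # 24 bits per RGB pixel
bits_per_index = math.ceil(math.log2(K))      # 4 whole bits per pixel index
compressed = 24 * K + N * bits_per_index      # palette + indices
print(round(100 * compressed / original, 1))  # 16.7
```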


Introduction of Latent Variables

In K-means, all assignments are "all or nothing": every point in a cluster is treated the same.

Is this "hard" assignment appropriate?

> We want to introduce a "soft", probabilistic assignment


Introduce a random variable z with a 1-of-K representation
> It controls the unobserved "states"

Once the state is determined, x is drawn from the Gaussian of that state:

    p(x | z_k = 1) = N(x | μ_k, Σ_k).

(Graphical representation: z → x)

Here the distribution over x is

    p(x) = Σ_z p(z) p(x|z)
         = Σ_{k=1}^{K} p(z_k = 1) p(x | z_k = 1)
         = Σ_{k=1}^{K} π_k N(x | μ_k, Σ_k).

(The sum over z reduces to K terms because z has the 1-of-K representation.)

Gaussian Mixtures!
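The mixture density can be evaluated directly from the formula; a 1-D sketch with weights and components chosen purely for illustration:

```python
import math

def gauss(x, mu, sigma):
    """1-D Gaussian density N(x | mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def mixture(x, pis, mus, sigmas):
    """p(x) = sum_k pi_k N(x | mu_k, sigma_k^2)."""
    return sum(p * gauss(x, m, s) for p, m, s in zip(pis, mus, sigmas))

# Two components; the density is a weighted sum, not a single Gaussian
p = mixture(0.0, pis=[0.3, 0.7], mus=[0.0, 4.0], sigmas=[1.0, 1.0])
print(round(p, 6))
```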


Estimate (or "explain") which state x came from:

    γ(z_k) ≡ p(z_k = 1 | x) = p(z_k = 1) p(x | z_k = 1) / Σ_j p(z_j = 1) p(x | z_j = 1)
                            = π_k N(x | μ_k, Σ_k) / Σ_j π_j N(x | μ_j, Σ_j).

Here γ(z_k) is the posterior, π_k is the prior, and N(x | μ_k, Σ_k) is the likelihood.

This value is also called the "responsibility".
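Responsibilities are just the normalized weighted likelihoods; a 1-D sketch with hypothetical component parameters:

```python
import math

def responsibilities(x, pis, mus, sigmas):
    """gamma(z_k) = pi_k N(x|mu_k, sigma_k^2) / sum_j pi_j N(x|mu_j, sigma_j^2)."""
    dens = [p * math.exp(-(x - m) ** 2 / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))
            for p, m, s in zip(pis, mus, sigmas)]
    total = sum(dens)
    return [d / total for d in dens]  # normalized: sums to 1

# A point midway between two equal components is explained 50/50
g = responsibilities(2.0, pis=[0.5, 0.5], mus=[0.0, 4.0], sigmas=[1.0, 1.0])
print([round(v, 3) for v in g])  # [0.5, 0.5]
```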


Example of Gaussian Mixtures

(Figure: three scatter plots on [0, 1]²: (a) samples coloured by true state, (b) samples with no state info, (c) samples coloured by responsibility.)


Mixtures of Gaussians

Problems of ML estimates

ML estimates of mixtures of Gaussians have

two problems:

i. Presence of Singularities

ii. Identifiability


i) Presence of Singularities

What if a mean collides with a data point?  ∃ j, m : μ_j = x_m


The likelihood can then be made arbitrarily large by letting σ_j → 0:

    L ∝ ( 1/σ_j + Σ_{k≠j} p_{k,m} ) · Π_{n≠m} [ (1/σ_j) exp(−(x_n − μ_j)² / (2σ_j²)) + Σ_{k≠j} p_{k,n} ] → ∞,

since the factor for n = m contains 1/σ_j → ∞, while every other factor stays bounded away from 0 thanks to the contributions p_{k,n} of the other components.


This doesn't occur for a single Gaussian:

    L ∝ (1/σ_j^N) Π_{n≠m} exp(−(x_n − μ_j)² / (2σ_j²)) → 0,

because the exponential factors vanish faster than 1/σ_j^N grows.


It doesn't occur in the Bayesian approach either.


ii) Identifiability

Optimal solutions are not unique: given one solution, there are K! − 1 other equivalent solutions (component relabellings).


This matters when interpreting the components, but not when the mixture is used only as a model of the density.


EM-algorithm for Gaussian Mixtures

The conditions of ML are obtained from

    ∂L/∂μ_k = 0,   ∂L/∂Σ_k = 0,   ∂/∂π_k [ L + λ( Σ_j π_j − 1 ) ] = 0,

where L(π, μ, Σ) = Σ_{n=1}^{N} ln Σ_{k=1}^{K} π_k N(x_n | μ_k, Σ_k).


The conditions of ML:

    μ_k = (1/N_k) Σ_{n=1}^{N} γ_n(z_k) x_n,
    Σ_k = (1/N_k) Σ_{n=1}^{N} γ_n(z_k) (x_n − μ_k)(x_n − μ_k)ᵀ,
    π_k = N_k / N,

where N_k = Σ_{n=1}^{N} γ_n(z_k).


Note that γ_n(z_k) appears in all three conditions.


Recall that

    γ_n(z_k) = π_k N(x_n | μ_k, Σ_k) / Σ_j π_j N(x_n | μ_j, Σ_j).

The parameters appear inside γ_n(z_k) as well
= No closed-form solution


Again, use an iterative algorithm!


EM algorithm for Gaussian Mixtures

1. Initialize the parameters
2. Repeat the following two steps until convergence:
   i) Calculate γ_n(z_k) = π_k N(x_n | μ_k, Σ_k) / Σ_j π_j N(x_n | μ_j, Σ_j)
   ii) Update the parameters

Step i) is the E step and step ii) is the M step.
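The two steps can be sketched for the 1-D case with pure-Python code; a minimal illustration under our own names (`em_gmm_1d`, `xs`), not the slides' implementation. The tiny variance floor is our addition to sidestep the singularities discussed above:

```python
import math

def em_gmm_1d(xs, pis, mus, sigmas, n_iter=50):
    """EM for a 1-D Gaussian mixture: alternate E step and M step."""
    for _ in range(n_iter):
        # E step: gamma_n(z_k) proportional to pi_k N(x_n | mu_k, sigma_k^2)
        gammas = []
        for x in xs:
            w = [p * math.exp(-(x - m) ** 2 / (2 * s ** 2)) / s
                 for p, m, s in zip(pis, mus, sigmas)]  # sqrt(2*pi) cancels below
            tot = sum(w)
            gammas.append([v / tot for v in w])
        # M step: closed-form updates from the ML conditions
        for k in range(len(pis)):
            Nk = sum(g[k] for g in gammas)
            mus[k] = sum(g[k] * x for g, x in zip(gammas, xs)) / Nk
            var = sum(g[k] * (x - mus[k]) ** 2 for g, x in zip(gammas, xs)) / Nk
            sigmas[k] = max(math.sqrt(var), 1e-6)  # floor to dodge singularities
            pis[k] = Nk / len(xs)
    return pis, mus, sigmas

# Two well-separated groups of points
xs = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
pis, mus, sigmas = em_gmm_1d(xs, [0.5, 0.5], [-1.0, 6.0], [1.0, 1.0])
print(sorted(round(m, 2) for m in mus))  # means near 0.0 and 5.0
```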

Demo of algorithm

(Figures: EM iterations on a 2-D data set; the component ellipses gradually fit the data.)

Comparison with K-means

(Figure: EM for Gaussian Mixtures vs. K-means Clustering on the same data; EM gives soft assignments, K-means hard ones.)
