This page reproduces the content of http://www.slideshare.net/mamoruk/snlp2013-komachi (uploaded 2013/09/09).

Slides presented at the Summer Camp of Natural Language Processing 2013. Tim Van de Cruys, Thierry Poibeau and Anna Korhonen. A Tensor-based Factorization Model of Semantic Compositionality. ACL 2013.

- A Tensor-based Factorization Model of Semantic Compositionality

Tim Van de Cruys, Thierry Poibeau and Anna Korhonen (ACL 2013)

Presented by Mamoru Komachi <komachi@tmu.ac.jp>

The 5th summer camp of NLP, 2013/08/31

- The principle of compositionality

Dates back to Gottlob Frege (1892)

“… meaning of a complex expression is a function of the meaning of its parts and the way those parts are (syntactically) combined”

2 - Compositionality is modeled as a multi-way interaction between latent factors

Propose a method for computation of compositionality within a distributional framework

Compute a latent factor model for nouns

The latent factors are used to induce a latent model of three-way (subject, verb, object) interactions, represented by a core tensor

Evaluate on a similarity task for transitive phrases (SVO)

3 - Previous work

Distributional framework for semantic composition

4 - Previous work: Mitchell and Lapata (ACL 2008)

Explore a number of different models for vector composition:

Vector addition: p_i = u_i + v_i

Vector multiplication: p_i = u_i · v_i

Evaluate their models on a noun-verb phrase similarity task

Multiplicative model yields the best results

One of the first approaches to tackle compositional phenomena (baseline in this work)
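A minimal sketch of the two composition operations, using toy distributional vectors with hypothetical values (not data from the paper):

```python
import numpy as np

def compose_add(u, v):
    """Additive composition: p_i = u_i + v_i."""
    return u + v

def compose_mul(u, v):
    """Multiplicative composition: p_i = u_i * v_i (element-wise)."""
    return u * v

# Toy distributional vectors for a noun and a verb (hypothetical values).
horse = np.array([0.2, 0.8, 0.0, 0.1])
run   = np.array([0.5, 0.6, 0.3, 0.0])

added      = compose_add(horse, run)
multiplied = compose_mul(horse, run)
```

The multiplicative model acts as an intersection: a dimension survives only if both words are active on it, which is one intuition for why it outperformed addition in Mitchell and Lapata's evaluation.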

5 - Previous work: Grefenstette and Sadrzadeh (EMNLP 2011)

An instantiation of Coecke et al. (Linguistic Analysis 2010)

A sentence vector is a function of the Kronecker product of its word vectors:

svo = (sub ⊗ obj) * verb

Assume that relational words (e.g. adjectives or verbs) have a rich (multi-dimensional) structure

Proposed model uses an intuition similar to theirs (the other baseline in this work)
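A sketch of this formula in a 2-dimensional toy space, with hypothetical subject/object vectors and verb matrix: the subject-object outer (Kronecker) product is combined element-wise with the verb's matrix representation:

```python
import numpy as np

s = np.array([0.3, 0.7])          # subject vector (hypothetical)
o = np.array([0.6, 0.4])          # object vector (hypothetical)
V = np.array([[0.9, 0.1],
              [0.2, 0.8]])        # verb represented as a matrix (hypothetical)

# (sub ⊗ obj): outer product of subject and object vectors
so = np.outer(s, o)

# Sentence representation: element-wise product with the verb matrix
svo = so * V
```

The resulting matrix is the sentence representation; two SVO sentences can then be compared by vectorizing their matrices.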

6 - Overview of compositional semantics

model                                                   | input           | target           | operation
Mitchell and Lapata (2008)                              | Vector          | Noun-verb        | Add & mul
Baroni and Zamparelli (2010)                            | Vector          | Adjective & noun | Linear transformation (matrix mul)
Coecke et al. (2010), Grefenstette and Sadrzadeh (2011) | Vector          | Sentence         | Kronecker product
Socher et al. (2010)                                    | Vector + matrix | Sentence         | Vector & matrix mul

7 - Methodology

The composition of SVO triples

8 - Construction of latent noun factors

[Figure: nouns-by-context-words matrix V factorized as V ≈ W × H]

Non-negative matrix factorization (NMF): minimizes the KL divergence between an original matrix V (I×J) and W H (W: I×K, H: K×J), such that all values of the three matrices are non-negative
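A minimal NMF sketch using the classic Lee & Seung multiplicative updates for the KL objective. This is not the authors' implementation, and the toy noun-by-context co-occurrence counts are hypothetical:

```python
import numpy as np

def nmf_kl(V, k, iters=200, seed=0):
    """NMF minimizing KL divergence via Lee & Seung multiplicative
    updates: V (I x J) ~= W (I x k) @ H (k x J), all entries >= 0."""
    rng = np.random.default_rng(seed)
    I, J = V.shape
    W = rng.random((I, k)) + 1e-3
    H = rng.random((k, J)) + 1e-3
    for _ in range(iters):
        WH = W @ H + 1e-9
        H *= (W.T @ (V / WH)) / W.sum(axis=0, keepdims=True).T
        WH = W @ H + 1e-9
        W *= ((V / WH) @ H.T) / H.sum(axis=1, keepdims=True).T
    return W, H

# Toy noun-by-context-word co-occurrence matrix (hypothetical counts)
V = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])
W, H = nmf_kl(V, k=2)
```

Each row of W is a noun's K-dimensional latent vector; these are the vectors reused in the tensor model below.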

9 - Tucker decomposition

[Figure: a subjects × verbs × objects tensor decomposed into a k × k × k core tensor multiplied by a factor matrix along each mode]

Generalization of the SVD

Decompose a tensor into a core tensor, multiplied by a matrix along each mode
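The reconstruction step of a Tucker decomposition can be sketched with einsum, using random toy factors (computing the decomposition itself would need, e.g., higher-order orthogonal iteration, which is omitted here):

```python
import numpy as np

# Tucker decomposition expresses a tensor X (I x J x K) via a small core
# tensor G (P x Q x R) multiplied by a factor matrix along each mode:
#   X[i,j,k] ~= sum_pqr G[p,q,r] * A[i,p] * B[j,q] * C[k,r]
rng = np.random.default_rng(0)
G = rng.random((2, 2, 2))     # core tensor: latent-factor interactions
A = rng.random((4, 2))        # subject-mode factor matrix
B = rng.random((5, 2))        # verb-mode factor matrix
C = rng.random((6, 2))        # object-mode factor matrix

# Multiply the core tensor by each factor matrix along its mode
X = np.einsum('pqr,ip,jq,kr->ijk', G, A, B, C)
```

The core tensor G is what the paper interprets as the model of three-way subject-verb-object interactions.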

10 - Decomposition w/o the latent verb

[Figure: the subjects × verbs × objects tensor decomposed with latent factors (k) along the subject and object modes only; the verb mode is kept explicit]

Only the subject and object modes are represented by latent factors (to be able to efficiently compute the similarity of verbs)

11 - Extract the latent vectors from noun matrix

[Figure: Y_<athlete,race> = w_athlete ∘ w_race, a k × k matrix]

The athlete runs a race.

Compute the outer product (∘) of subject and object.

12 - Capturing the latent interactions with verb matrix

[Figure: Z_run,<athlete,race> = G_run * Y_<athlete,race>]

Take the Hadamard product (*) of matrix Y with verb matrix G, which yields our final matrix Z.

13 - Examples & Evaluation

14 - Semantic features of the subject combine with semantic features of the object

Animacy: 28, 40, 195; Sport: 25; Sport event: 119; Tech: 7, 45, 89

15 - Verb matrix contains the verb semantics computed over the complete corpus

‘Organize’ sense: <128, 181>; <293, 181>

‘Transport’ sense: <60, 140>

‘Execute’ sense: <268, 268>

16 - Tensor G captures the semantics of the verb

Most similar verbs from Z:

Z_run,<athlete,race>: finish (.29), attend (.27), win (.25)

Z_run,<user,command>: execute (.42), modify (.40), invoke (.39)

Z_damage,<man,car>: crash (.43), drive (.35), ride (.35)

Z_damage,<car,man>: scare (.26), kill (.23), hurt (.23)

Similarity is calculated by measuring the cosine of the vectorized representation of the verb matrix

Can distinguish word order
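A sketch of the similarity computation; the Z matrices here are small hypothetical stand-ins, chosen so that a related verb scores higher than an unrelated one:

```python
import numpy as np

def verb_similarity(Z1, Z2):
    """Cosine similarity between two verb matrices after vectorizing them."""
    a, b = Z1.ravel(), Z2.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical contextualized verb matrices
Z_run  = np.array([[0.09, 0.29], [0.00, 0.05]])
Z_win  = np.array([[0.08, 0.25], [0.01, 0.04]])
Z_kill = np.array([[0.01, 0.02], [0.30, 0.10]])

# A verb whose matrix activates similar factor interactions scores higher
sim_related   = verb_similarity(Z_run, Z_win)
sim_unrelated = verb_similarity(Z_run, Z_kill)
```

Because Y = w_sub ∘ w_obj is not symmetric, swapping subject and object yields a different Z, which is how the model distinguishes word order (damage,<man,car> vs. damage,<car,man> above).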

17 - Transitive (SVO) sentence similarity task

Extension of the similarity task (Mitchell and Lapata, ACL 2008)

http://www.cs.ox.ac.uk/activities/CompDistMeaning/GS2011data.txt

2,500 similarity judgments, 25 participants

p  | target | subject | object    | landmark | sim
19 | meet   | system  | criterion | visit    | 1
21 | write  | student | name      | spell    | 6

18 - Latent model outperforms previous models

model          | contextualized | non-contextualized
baseline       |              .23
multiplicative | .32            | .34
categorical    | .32            | .35
latent         | .32            | .37
upper bound    |              .62

Multiplicative (Mitchell and Lapata, ACL 2008)

Categorical (Grefenstette and Sadrzadeh, EMNLP 2011)

Upper bound = inter-annotator agreement (Grefenstette and Sadrzadeh, EMNLP 2011)

19 - Conclusion

Proposed a novel method for computation of compositionality within a distributional framework

Compute a latent factor model for nouns

The latent factors are used to induce a latent model of three-way (subject, verb, object) interactions, represented by a core tensor

Evaluated on a similarity task for transitive phrases and exceeded the state of the art
