This page reproduces the content of http://www.slideshare.net/stephenbach/psl-overview-22992504 (uploaded 2013/06/14, category: Technology).

An overview of probabilistic soft logic, a language for templating hinge-loss Markov random fields.

- Probabilistic Soft Logic: An Overview

Lise Getoor, Stephen Bach, Bert Huang

University of Maryland, College Park

http://psl.umiacs.umd.edu

Additional contributors: Matthias Broecheler, Shobeir Fakhraei, Angelika Kimmig, Ben London, Alex Memory, Hui Miao, Lily Mihalkova, Eric Norris, Jay Pujara, Theo Rekatsinas, Arti Ramesh

- Probabilistic Soft Logic (PSL)

A declarative language based on logic for expressing collective probabilistic inference problems.

Predicate = relationship or property
Atom = (continuous) random variable
Rule = captures a dependency or constraint
Set = defines aggregates

PSL Program = Rules + Input DB

- Entity Resolution

Entities: people references
Attributes: name
Relationships: friendship

Goal: identify references that denote the same person.

[Figure: references A ("John Smith") and B ("J. Smith") with name attributes and friends C-H; "=" edges mark candidate matches.]

Use rules to express evidence:

- "If two people have similar names, they are probably the same":
  A.name ≈{str_sim} B.name => A≈B : 0.8
- "If two people have similar friends, they are probably the same":
  {A.friends} ≈{} {B.friends} => A≈B : 0.6
- "If A=B and B=C, then A and C must also denote the same person":
  A≈B ∧ B≈C => A≈C : ∞

- Link Prediction

Entities: people, emails
Attributes: words in emails
Relationships: communication, work relationship

Goal: identify work relationships (supervisor, subordinate, colleague).

- Link Prediction

People, emails, words, communication, relations.

Use rules to express evidence:

- "If email content suggests type X, it is of type X."
- "If A sends deadline emails to B, then A is the supervisor of B."
- "If A is the supervisor of B, and A is the supervisor of C, then B and C are colleagues."

- Node Labeling

[Figure: a network of nodes with unknown labels.]

- Voter Opinion Modeling

[Figure: a voter social network with spouse, colleague, and friend edges; observed signals include tweets and status updates.]

vote(A,P) ∧ spouse(B,A) => vote(B,P) : 0.8
vote(A,P) ∧ friend(B,A) => vote(B,P) : 0.3

- Multiple Ontologies

[Figure: two example ontologies. Left: Organization, with Employees (Developer, Staff), Customers, and Service & Products (Hardware, Software, IT Services, Sales Person), linked by relations work for, provides, buys, helps, interacts, sells to, develops. Right: Company, with Employee (Technician, Accountant), Customer, and Products & Services (Hardware, Software Dev, Consulting, Sales), linked by relations works for, develop, buys, helps, interacts with, sells.]

- Ontology Alignment

[Figure: the same two ontologies, with candidate alignments drawn between them.]

Match, don't match? Similar to what extent?

- Logic Foundation
- Rules

Atoms are real valued:

- Interpretation I, atom A: I(A) ∈ [0,1]
- We will omit the interpretation and write A ∈ [0,1]

∨, ∧ are combination functions:

- T-norms: [0,1]^n → [0,1]

H1 ∨ ... ∨ Hm ← B1 ∧ B2 ∧ ... ∧ Bn

(over ground atoms)

[Broecheler, et al., UAI '10]

- Rules

Combination functions (Lukasiewicz t-norm and t-conorm):

A ∨ B = min(1, A + B)
A ∧ B = max(0, A + B - 1)

H1 ∨ ... ∨ Hm ← B1 ∧ B2 ∧ ... ∧ Bn
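The Lukasiewicz operators can be sketched directly; a minimal Python version (function names are ours), generalized to n arguments:

```python
def luk_or(*vals):
    # Lukasiewicz disjunction: A ∨ B = min(1, A + B)
    return min(1.0, sum(vals))

def luk_and(*vals):
    # Lukasiewicz conjunction: A ∧ B = max(0, A + B - 1),
    # generalized to n arguments as max(0, sum - (n - 1))
    return max(0.0, sum(vals) - (len(vals) - 1))

# luk_and(0.7, 0.8) is approximately 0.5; luk_or(0.7, 0.8) saturates at 1.0
```
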

[Broecheler, et al., UAI '10]

- Satisfaction

A rule is satisfied when ∨(H1,..,Hm) ≥ ∧(B1,..,Bn).

H1 ∨ ... ∨ Hm ← B1 ∧ B2 ∧ ... ∧ Bn

Example: H1 ← B1:0.7 ∧ B2:0.8 is satisfied when H1 ≥ 0.5, since ∧(0.7, 0.8) = max(0, 0.7 + 0.8 - 1) = 0.5.

[Broecheler, et al., UAI '10]

- Distance to Satisfaction

Distance to satisfaction:

- max( ∧(B1,..,Bn) - ∨(H1,..,Hm), 0 )

H1 ∨ ... ∨ Hm ← B1 ∧ B2 ∧ ... ∧ Bn

Examples (the body B1:0.7 ∧ B2:0.8 has value 0.5):

H1:0.7 ← B1:0.7 ∧ B2:0.8 → distance 0.0
H1:0.2 ← B1:0.7 ∧ B2:0.8 → distance 0.3
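A minimal sketch of the distance computation (Python; our naming):

```python
def distance_to_satisfaction(body_vals, head_vals):
    # Body combined with Lukasiewicz AND, head with Lukasiewicz OR
    body = max(0.0, sum(body_vals) - (len(body_vals) - 1))
    head = min(1.0, sum(head_vals))
    # The rule head <- body is fully satisfied when head >= body
    return max(body - head, 0.0)

# The two examples above: body B1:0.7, B2:0.8 has value 0.5
d1 = distance_to_satisfaction([0.7, 0.8], [0.7])  # 0.0: satisfied
d2 = distance_to_satisfaction([0.7, 0.8], [0.2])  # ~0.3: unsatisfied
```
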

[Broecheler, et al., UAI '10]

- W: H1 ∨ ... ∨ Hm ← B1 ∧ B2 ∧ ... ∧ Bn

Rule Weights

Weighted distance to satisfaction:

- d(R,I) = W · max( ∧(B1,..,Bn) - ∨(H1,..,Hm), 0 )

[Broecheler, et al., UAI '10]

- Let's Review

Given a data set and a PSL program, we can construct a set of ground rules. Some of the atoms have fixed truth values and some have unknown truth values. For every assignment of truth values to the unknown atoms, we get a set of weighted distances from satisfaction.
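A toy illustration of grounding (Python; the domain and the rule template are invented for illustration):

```python
from itertools import product

# Hypothetical template rule: friend(A, B) ∧ vote(A, P) => vote(B, P)
# Grounding substitutes every combination of constants for the variables.
people = ["alice", "bob", "carol"]
parties = ["p1"]

ground_rules = [
    {"body": [("friend", a, b), ("vote", a, p)], "head": [("vote", b, p)]}
    for a, b, p in product(people, people, parties)
    if a != b  # skip trivial self-substitutions
]
# 3 * 3 - 3 = 6 ground rules for this domain
```
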

How do we decide which assignment is best?

- Probabilistic Foundation
- Probabilistic Model

Probability density over interpretations I:

f(I) = (1/Z) exp( - Σ_{r ∈ R} w_r · d_r(I)^p )

where Z is the normalization constant, R is the set of ground rules, w_r is rule r's weight, d_r(I) is rule r's distance to satisfaction, and the distance exponent p ∈ {1, 2}.

- Hinge-loss Markov Random Fields

Subject to arbitrary linear constraints

PSL models ground out to HL-MRFs
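The unnormalized HL-MRF density, exp(-Σ_r w_r d_r^p), is easy to state in code. This is a sketch, not the PSL implementation:

```python
import math

def unnormalized_density(weights, distances, p=1):
    # f(I) ∝ exp(-sum_r w_r * d_r(I)**p), with exponent p in {1, 2}
    return math.exp(-sum(w * d**p for w, d in zip(weights, distances)))

# An interpretation satisfying every ground rule (all distances 0)
# has the maximum possible unnormalized density, 1.0.
assert unnormalized_density([0.8, 0.3], [0.0, 0.0]) == 1.0
```
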

Log-concave!

- MPE Inference
- Inferring Most Probable Explanations

Objective: minimize the total weighted distance to satisfaction, a convex optimization problem.

Decomposition-based inference algorithm using the ADMM framework.

- Alternating Direction Method of Multipliers

• We perform inference using the alternating direction method of multipliers (ADMM).
• Inference with ADMM is fast, scalable, and straightforward.
• Subproblems (ground rules) are optimized independently, in parallel.
• Auxiliary variables enforce consensus across subproblems.
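A minimal consensus-ADMM sketch on a toy objective (not PSL's implementation; the quadratic subproblems, step size, and iteration count are our choices):

```python
def consensus_admm(targets, rho=1.0, iters=200):
    """Minimize sum_i (x - t_i)^2 by giving each term its own local copy x_i,
    with a global variable z and scaled duals u_i enforcing consensus."""
    n = len(targets)
    x = [0.0] * n  # local variable copies, one per subproblem
    u = [0.0] * n  # scaled Lagrange multipliers
    z = 0.0        # global consensus variable
    for _ in range(iters):
        # Solve each subproblem independently (parallelizable):
        # argmin_x (x - t)^2 + (rho/2)(x - z + u)^2 has a closed form.
        x = [(2 * t + rho * (z - ui)) / (2 + rho) for t, ui in zip(targets, u)]
        # Averaging updates the global variable; PSL also clips to [0, 1].
        z = min(1.0, max(0.0, sum(xi + ui for xi, ui in zip(x, u)) / n))
        # Dual update pushes the local copies toward consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z
```

For example, `consensus_admm([0.0, 1.0])` converges to the minimizer of (x - 0)^2 + (x - 1)^2, which is 0.5.
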

[Bach et al., NIPS '12]

- Inference Algorithm

1. Initialize local copies of the variables and the Lagrange multipliers.
2. Begin inference iterations.
3. Simple updates solve the subproblems for each potential and each constraint.
4. Average to update the global variables, and clip to [0,1].

[Bach, et al., under review]

- ADMM Scalability

[Figure: inference time in seconds (0-600) vs. number of potential functions and constraints (125,000-325,000), comparing ADMM against an interior-point method.]

[Bach, et al., NIPS '12]

- ADMM Scalability

[Bach, et al., under review]
- Distributed MPE Inference:

MapReduce Implementation

[Figure: MapReduce dataflow. Mappers solve subproblems on local variable copies (z1, ..., zq); a reducer updates each global variable component; the global variable X is loaded as side data at job bootstrap; subproblems and the global variable are read from and written to HDFS or HBase.]

Hui Miao, Xiangyang Liu

- Distributed MPE Inference:

GraphLab Implementation

[Figure: GraphLab vertex program. Subproblem nodes gather the global z, apply updates to y and x, and scatter to notify z; global-variable components gather the local z and y values, apply the z update, and scatter to notify X unless converged, alternating updates i and i+1.]

Hui Miao, Xiangyang Liu

- Weight Learning
- Weight Learning

Learn weights from training data; no need to hand-code rule weights.

Various methods:

- approximate maximum likelihood [Broecheler et al., UAI '10]
- maximum pseudolikelihood
- large-margin estimation [Bach, et al., UAI '13]

- Weight Learning

State-of-the-art supervised-learning performance on:

- collective classification
- social-trust prediction
- preference prediction
- image reconstruction

[Bach, et al., UAI '13]

- Foundations Summary

Design probabilistic models using a declarative language with syntax based on first-order logic.

Inference of the most-probable explanation is fast convex optimization (ADMM).

Learning algorithms train rule weights from labeled data.

- PSL Applications
- Document Classification

[Figure: a network of documents labeled A or B, with several unlabeled nodes marked "A or B?".]

Given a networked collection of documents, observe some labels and predict the remaining labels using link direction and inferred class labels.

[Bach, et al., UAI '13]

- Computer Vision Applications

Low-level vision:

- image reconstruction

High-level vision:

- activity recognition in videos

- Image Reconstruction

[Figure: RMSE reconstruction error comparison.]

[Bach, et al., UAI '13]

- Activity Recognition in Videos

Activity classes: crossing, waiting, queueing, walking, talking, dancing, jogging.

[London, et al., under review]

- Results on Activity Recognition

Recall matrix between different activity types; accuracy metrics compared against baseline features.

[London, et al., under review]

- Social Trust Prediction

Competing models from the social psychology of strong ties:

- structural balance [Granovetter '73]
- social status [Cosmides et al., '92]

Leskovec, Huttenlocher, & Kleinberg [2010]: effects of both models are present in online social networks.

- Structural Balance vs. Social Status

Structural balance: strong ties are governed by a tendency toward balanced triads (e.g., the enemy of my enemy...).

Social status: strong ties indicate unidirectional respect, "looking up to", or expertise status (e.g., patient-nurse-doctor, advisor-advisee).

- Structural Balance in PSL

[Figure: PSL rules encoding structural balance.] [Huang, et al., SBP '13]

- Social Status in PSL

[Figure: PSL rules encoding social status.] [Huang, et al., SBP '13]

- Experiments

User-user trust ratings from two different online social networks; observe some ratings, predict the held-out ones.

Eight-fold cross validation on two data sets:

- FilmTrust: movie review network, trust ratings from 1-10
- Epinions: product review network, trust / distrust ratings {-1, 1}

[Huang, et al., SBP '13]

- Compared Methods

TidalTrust: graph-based propagation of trust; predicts trust via breadth-first search, combining the closest known relationships.

EigenTrust: spectral method for trust; predicts the trustworthiness of nodes from eigenvalue centrality of the weighted trust network.

Average baseline: predict the average trust score for all relationships.

[Huang, et al., SBP '13]

- FilmTrust Experiment

Normalize [1,10] ratings to [0,1]. Prune the network to its largest connected component: 1,754 users, 2,055 relationships.

Compare mean average error, Spearman's rank coefficient, and Kendall-tau distance.*

* measured on only non-default predictions

• PSL-Status and PSL-Balance disagree on 514 relationships
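The rating normalization is a simple affine map; a sketch (the helper name is ours):

```python
def normalize_rating(r, lo=1.0, hi=10.0):
    # Map a trust rating in [1, 10] onto PSL's [0, 1] atom scale
    return (r - lo) / (hi - lo)

assert normalize_rating(1) == 0.0   # minimum trust
assert normalize_rating(10) == 1.0  # maximum trust
assert normalize_rating(5.5) == 0.5 # midpoint
```
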

[Huang, et al., SBP '13]

- Epinions Experiment

Snowball sample of 2,000 users from the Epinions data set; 8,675 trust scores normalized to {0,1}.

Measure area under the precision-recall curve for distrust edges (the rarer class).

[Huang, et al., SBP '13]

- Knowledge Graph Identification

Problem: collectively reason about noisy, inter-related fact extractions.

Task: NELL fact promotion (web-scale information extraction)

- millions of extractions, with entity ambiguity and confidence scores
- rich ontology: Domain, Range, Inverse, Mutex, Subsumption

Goal: determine which facts to include in NELL's knowledge base.

Jay Pujara, Hui Miao, William Cohen

- PSL Rules

Candidate Extractions:

- CandRel_CMC(E1,E2,R) => Rel(E1,E2,R)

- CandLbl_Morph(E1,L) => Lbl(E1,L)

Entity Resolution:

- SameEntity(E1,E2) & Lbl(E1,L) => Lbl(E2,L)

- SameEntity(E1,E2) & Rel(E1,X,R) => Rel(E2,X,R)

Ontological Rules:

- Domain(R,L) & Rel(E1,E2,R) => Lbl(E1,L)

- Mut(L1,L2) & Lbl(E,L1) => ~Lbl(E,L2)

- Results

Data: Iteration 165 of NELL

- 1.2M extractions

- 68K ontological constraints

- 14K labeled instances (Jiang, 2012)

- NELL & MLN baselines (Jiang, 2012)
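F1 is the harmonic mean of precision and recall, so the figures in the results table below can be cross-checked (a sanity check we added, not from the slides):

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# The reported (precision, recall) pairs reproduce the reported F1 scores
# to rounding error:
assert abs(f1(0.801, 0.580) - 0.673) < 5e-4  # NELL
assert abs(f1(0.773, 0.922) - 0.841) < 5e-4  # PSL
assert abs(f1(0.837, 0.836) - 0.836) < 1e-3  # MLN
```
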

       F1     Precision  Recall  AUC-PR
NELL   .673   .801       .580    .765
MLN    .836   .837       .836    .899
PSL    .841   .773       .922    .882

- Schema Matching

Correspondences between source and target schemas.

Matching rules:

- "If two concepts are the same, they should have similar subconcepts."
- "If the domains of two attributes are similar, they may be the same."

[Figure: fragments of the two schemas, Organization (provides Service & Products; Customers buy) and Company (develops Products & Services, which Portfolios include; Customers buy).]

Example matches:

develop(A, B) <= provides(A, B)
Company(A) <= Organization(A)
Products&Services(B) <= Service&Products(B)

- Schema Mapping

Input: schema matches. Output: source-target query pairs (tuple-generating dependencies, TGDs) for exchange or mediation.

Mapping rules:

- "Every matched attribute should participate in some TGD."
- "The solutions to the queries in TGDs should be similar."

[Figure: the same source and target schema fragments.]

Example TGD:

∃Portfolio P, develop(A, P) ∧ includes(P, B) <= provides(A, B) . . .

Alex Memory, Angelika Kimmig

- Learning Latent Groups

Can we better understand political discourse in social media by learning groups of similar people?

Case study: the 2012 Venezuelan presidential election. Incumbent: Hugo Chávez. Challenger: Henrique Capriles.

(Left photograph by Agência Brasil, CC Attribution 3.0 Brazil license. Right photograph by Wilfredor, CC Attribution-Share Alike 3.0 Unported license.)

[Bach, et al., under review]

- Learning Latent Groups

South American tweets collected from a 48-hour window around the election.

Selected 20 top users: candidates, campaigns, media, and the most retweeted.

1,678 regular users interacted with at least one top user and used at least one hashtag in another tweet; those regular users had 8,784 interactions with non-top users.

[Bach, et al., under review]

- Learning Latent Groups

[Bach, et al., under review]

- Other LINQS Projects

- Drug-target interaction: Shobeir Fakhraei
- MOOC user-role prediction and expert identification: Arti Ramesh
- Semantic-role labeling: Arti Ramesh and Alex Memory
- Optimal source selection for data integration: Theo Rekatsinas
- Learning theory in non-i.i.d. settings: Ben London, Bert Huang; collective stability [London et al., ICML '13]
- Scalable graph analysis using GraphLab: Hui Miao, Walaa Eldin Moustafa
- Uncertain graph models: Walaa Eldin Moustafa

- Conclusion

PSL is:

- an expressive framework for structured problems
- scalable

Much ongoing work, including incorporating hidden variables, structure learning, and distributed inference.

We are very interested in applying it to a variety of domains, especially computational social science.

We encourage you to try it! http://psl.umiacs.umd.edu