This page reproduces the content of http://www.slideshare.net/starrysky2/kun-liu-nips2014-developing-webscale-machine-learning-at-linkedin-42746527

by Kun Liu

Uploaded 2014/12/16, in Technology

Practical challenges we are facing when developing web-scale machine learning at LinkedIn, and lessons we learned.

- Developing Web-scale Machine Learning at LinkedIn - from Soup to Nuts

Kun Liu

NIPS 2014 Software Engineering for Machine Learning

December 13, 2014 - Online Recommendations @ LinkedIn

2 - Many Practical Challenges, but

Fast iteration is desired

– large-scale machine-learning framework

Models should adapt to large dynamic data

– cold start model + warm start model + explore/exploit

Offline metrics do not always reflect online

– online A/B test

Real-time feedback is important

– near real-time event stream

3 - The 30,000-foot Overview

4

[Architecture diagram: the Data Pipeline (data tracking & logging, feature extraction, feature transformation, user modeling) feeds the Model Fitting Pipeline on Hadoop, where offline model fitting produces the cold-start model on a daily/weekly cadence and nearline model fitting produces the warm-start model hourly. Fitted models power the Online Serving System (candidate generation, multi-pass rankers); real-time feedback streams back from serving, and models are compared through online A/B tests and model evaluation.] - Fast Iteration is Desired

5 - Fast Iteration is Desired

Every machine learning task is different

– still, the steps to solving the problem are often quite similar

Fast iteration is desired

– successful systems require lots of tuning & experimentation

– reusable modules and easy-to-configure workflows dramatically improve productivity

– these free engineers to concentrate on the unique aspects of the project

6 - Large-scale Machine-learning Framework at LinkedIn

7

[Workflow diagram: a scheduled workflow on Azkaban. Feature and target sources are joined (feature/target join), partitioned into training data and test data, and sampled; the sampled data drives model training, scoring, and model evaluation, which compares the previous model against the best model before model deployment. Snapshots are kept along the way.] - Models Should Adapt to Large Dynamic Data

8 - Ads Click Prediction Problem As an Example

A member comes to LinkedIn

The ads platform prepares a list of eligible advertising campaigns

The statistical model predicts CTR for <member, ad, context>

Rank all ads by (predicted CTR × bid), and show the top-k results to the member
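The ranking step above can be sketched as follows; `predict_ctr` stands in for the statistical model, and all names and numbers are illustrative, not from the talk.

```python
# Hypothetical sketch: score each eligible campaign by predicted CTR x bid
# and return the top-k. predict_ctr is an assumed stand-in for the model.

def rank_ads(member, context, campaigns, predict_ctr, k=3):
    scored = [(predict_ctr(member, ad, context) * ad["bid"], ad["id"])
              for ad in campaigns]
    scored.sort(reverse=True)                 # highest expected value first
    return [ad_id for _, ad_id in scored[:k]]

campaigns = [
    {"id": "ad-1", "bid": 2.0},   # high bid, lower CTR
    {"id": "ad-2", "bid": 0.5},   # low bid, higher CTR
]
fake_ctr = {"ad-1": 0.01, "ad-2": 0.08}
top = rank_ads("member:1", {}, campaigns,
               lambda m, ad, c: fake_ctr[ad["id"]], k=1)
# top == ["ad-2"]: 0.08 x 0.5 beats 0.01 x 2.0
```

Note how the low-bid, high-CTR ad wins: the ranking optimizes expected revenue per impression, not the raw bid.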

9 - Models Should Adapt to Large Dynamic Data

Very large-scale data

– Billions of records, large feature space (~100k covariates)

Low positive rates (e.g. CTR)

– Sparsity issue

Data is dynamic

– New ads come into the system at any time

Models have to be adaptive

– Otherwise: bad user experience and $ loss

10 - Cold-start & Warm-start Model

Cold-start & Warm-start

Cold-start component θcold is relatively stable

– Less frequent updates

– Large-scale model fitting using a large amount of historical data

Warm-start component θwarm is more dynamic

– Updated as frequently as possible

– Trained using fresh data to explain the residual of the cold-start-only model
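The two-component model can be written as follows (an illustrative formulation not spelled out on the slides; x and z denote the cold-start and item-dependent warm-start feature vectors, and σ is the logistic function):

```latex
% Sketch: logistic model with a shared cold-start term and a per-item
% warm-start correction (x, z, and sigma are assumed notation).
\[
  p(\text{click}) = \sigma\!\left(x^{\top}\theta_{\text{cold}} + z^{\top}\theta_{\text{warm}}\right),
  \qquad \sigma(u) = \frac{1}{1 + e^{-u}}
\]
```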

11

[Diagram: the cold-start component is a large-scale logistic regression; the warm-start component is a per-item logistic regression.] - Cold-start Model Fitting

Alternating Direction Method of Multipliers (ADMM)

– Stephen Boyd et al. 2011

– Constraint that each partition's coefficients βi equal the global consensus β
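A toy sketch of consensus ADMM for logistic regression, in the spirit of Boyd et al. 2011: each partition fits local coefficients, a consensus step averages them, and dual variables pull the locals toward the consensus. The partition count, step sizes, and synthetic data are illustrative assumptions, not LinkedIn's actual setup.

```python
import numpy as np

# Consensus ADMM sketch: K partitions each fit a local logistic regression,
# then a consensus computation combines the per-partition coefficients.

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def local_update(X, y, beta, z, u, rho, steps=50, lr=0.1):
    # Approximately minimize logistic loss + (rho/2)||beta - z + u||^2
    # by a few gradient-descent steps (good enough for this toy).
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ beta) - y) / len(y) + rho * (beta - z + u)
        beta = beta - lr * grad
    return beta

rng = np.random.default_rng(0)
true_beta = np.array([1.5, -2.0])
X = rng.normal(size=(400, 2))
y = (rng.random(400) < sigmoid(X @ true_beta)).astype(float)

K, rho, d = 4, 1.0, 2
parts = np.array_split(np.arange(400), K)
betas = [np.zeros(d) for _ in range(K)]
us = [np.zeros(d) for _ in range(K)]
z = np.zeros(d)

for _ in range(20):                                          # ADMM iterations
    betas = [local_update(X[p], y[p], b, z, u, rho)
             for p, b, u in zip(parts, betas, us)]
    z = np.mean([b + u for b, u in zip(betas, us)], axis=0)  # consensus step
    us = [u + b - z for b, u in zip(betas, us)]              # dual update

print(z)  # consensus coefficients, tracking the signs of true_beta
```

The local updates are embarrassingly parallel, which is what makes the scheme map naturally onto Hadoop partitions.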

12 - Large Scale Logistic Regression via ADMM

13

[Diagram, repeated across slides 13-15: BIG DATA is split into Partition 1 through Partition K; in each ADMM iteration a logistic regression is fit on every partition in parallel, followed by a consensus computation that combines the per-partition coefficients. The sequence steps through iteration 1 and iteration 2.] - Warm-start Model Fitting

Warm-start θwarm is trained with fresh data to explain the residual of the cold-start-only model

Select a small set of item-dependent features for fast training

Train as frequently as possible

16 - Explore/Exploit

Model serving and data feedback loop

– Ad A: true CTR 5%, 500 clicks, 10000 views

– Ad B: true CTR 10%, 0 click, 10 views

We end up always serving A!

Multi-armed bandit problem

– ε-greedy: random selection for ε% traffic

– UCB (Upper Confidence Bound)

– Gittins Index

– Thompson sampling
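The first strategy in the list is the simplest to sketch: ε-greedy serves a random ad for a small fraction of traffic and otherwise exploits the best observed CTR. The counts reuse the A/B example above; ε and the traffic volume are made-up illustrations.

```python
import random

# Minimal epsilon-greedy sketch: explore with probability epsilon,
# otherwise exploit the ad with the best observed CTR.

def epsilon_greedy(stats, epsilon=0.1):
    # stats: {ad: (clicks, views)}
    if random.random() < epsilon:
        return random.choice(list(stats))            # explore
    return max(stats, key=lambda ad: stats[ad][0] / max(stats[ad][1], 1))

random.seed(0)
stats = {"A": (500, 10000), "B": (0, 10)}
served = [epsilon_greedy(stats) for _ in range(1000)]
# B is no longer starved: the random 10% of traffic keeps sampling it,
# so its CTR estimate eventually gets a chance to recover.
```

Without the ε branch, B's zero observed CTR would keep it out of serving forever, exactly the feedback-loop failure described above.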

17 - Thompson Sampling

Instead of using the MAP estimate of θ, we sample θ from its posterior distribution

We sample from the warm-start model only

Compute CTR using the sampled coefficients

Assumptions

– Cold-start coefficients have 0 variance

– Covariance between warm-start coefficients is 0
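Thompson sampling can be illustrated on the two-ad example with Beta posteriors over CTR; this is a standard simplification for exposition, whereas the talk samples warm-start regression coefficients instead.

```python
import random

# Thompson sampling sketch: sample a CTR from each ad's Beta posterior
# and serve the ad with the highest sampled value. Uncertain ads (wide
# posteriors) get explored automatically.

def choose_ad(stats):
    # stats: {ad: (clicks, views)}; Beta(1 + clicks, 1 + views - clicks)
    samples = {ad: random.betavariate(1 + c, 1 + v - c)
               for ad, (c, v) in stats.items()}
    return max(samples, key=samples.get)

random.seed(0)
true_ctr = {"A": 0.05, "B": 0.10}
stats = {"A": (500, 10000), "B": (0, 10)}
picks = {"A": 0, "B": 0}
for _ in range(2000):
    ad = choose_ad(stats)
    picks[ad] += 1
    clicked = random.random() < true_ctr[ad]
    c, v = stats[ad]
    stats[ad] = (c + clicked, v + 1)
# B's posterior starts wide, so it keeps getting sampled; as evidence of
# its higher true CTR accumulates, B comes to dominate the serving.
```

Unlike ε-greedy, no exploration rate needs tuning: the amount of exploration shrinks on its own as the posteriors sharpen.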

18 - Explore/Exploit Experiments

19

[Chart: CTR lift percentage by campaign warmness segment (1 through 8, from almost no training data to lots of training data), comparing LASER (without E/E) against LASER-EE (with E/E). Cold segments show an exploration cost; warm segments show a winner's curse.] - Offline Metrics Do Not Always Reflect Online

20 - Offline Metrics Do Not Always Reflect Online

Offline Metrics

– ROC, AUC

– Test log likelihood

Online Metrics

– eCPC (effective cost per click)

– eCPM (effective cost per thousand impressions)

– downstream or sitewide effects are hard to measure offline…

How to select the best model to ramp

– A/B testing in a scientific and controlled manner
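The "scientific and controlled" comparison usually comes down to a significance test on the online metric; a two-proportion z-test on CTR is one common choice (the talk does not specify the test, and the traffic counts below are made up).

```python
import math

# Two-proportion z-test sketch for comparing control vs treatment CTR.

def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    p = (clicks_a + clicks_b) / (views_a + views_b)           # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / se                                    # z statistic

z = two_proportion_z(500, 10000, 640, 10000)   # 5.0% vs 6.4% CTR
# |z| > 1.96 corresponds to p < 0.05 (two-sided): a significant lift
```

With these illustrative counts the lift is comfortably significant; with only tens of views per arm it would not be, which is why sample size drives how long an experiment must run.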

21 - A/B Testing in One Slide

22

[Diagram: traffic is split 80% control / 20% treatment (e.g. two variants of a "Join now" page); results are collected to determine which one is better.] - Ads CTR

23

[Chart: Ads CTR by date for the profile top ads slot, showing a sudden CTR drop.] - Root Cause

24

[Screenshot: the navigation bar grew by 5 pixels, shifting the profile top ads.] - Growth of Experiments at LinkedIn

• 200+ experiments

• 800+ site-wide and vertical-specific metrics

• billions of experiment events

all on a daily basis, and counting …

25

[Chart: growth in the # of experiments over time.] - Simple Experiments Management

26

Easy design and deployment of experiments, with many built-in targeting attributes to select from - Automatic Analysis

Statistical analysis

– Statistical significance test (p-value, conf. interval)

– Deep-dive: slicing & dicing capability

Metrics design and management

De-centralized ownership

Centralized onboarding and management

27 - Who Moved My Cheese?

28

Experiment Owners and Metrics Owners

• Provide visibility, ensure responsible ramping, and encourage communication

Most Impactful Experiments - Real-time Feedback is Important

29 - Real-time Feedback is Important

Recommendations are served every millisecond

How frequently do users keep seeing the same item?

High frequency leads to user fatigue and low engagement

Impression discounting penalizes relevance scores by impression counts

30 - Impression Discounting

31

Item Id    Relevance Score    Action    Counts    Adjusted Score
item:10    0.9                VIEW      3         0.9 × 0.2231 = 0.20
item:15    0.8                VIEW      0         0.8
item:20    0.7                VIEW      2         0.7 × 0.3679 = 0.26

Impression discounting

score ← score × exp(−exposure / factor),
where exposure is a function of the impression counts.

Member: member:1234567
Time Range: within last 12 hours

Item Id    Action    Counts
item:10    VIEW      3
item:10    CLICK     1
item:15    VIEW      0
item:20    VIEW      2
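The adjusted-score column can be reproduced with the discounting rule, assuming exposure is simply the view count and factor = 2; both choices are inferred because they match the numbers shown (e.g. exp(−3/2) = 0.2231, exp(−2/2) = 0.3679).

```python
import math

# Impression-discounting sketch: score <- score * exp(-exposure / factor),
# with exposure = view count and factor = 2 (assumptions that reproduce
# the slide's adjusted scores).

def discounted_score(score, views, factor=2.0):
    return score * math.exp(-views / factor)

rows = [("item:10", 0.9, 3), ("item:15", 0.8, 0), ("item:20", 0.7, 2)]
adjusted = {item: round(discounted_score(s, v), 2) for item, s, v in rows}
# adjusted == {"item:10": 0.2, "item:15": 0.8, "item:20": 0.26}
```

An item with zero impressions keeps its full score (exp(0) = 1), matching item:15 in the table.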

Real-time impression tracking

Sounds like a simple idea! But … - How to Achieve (Near) Real-time Tracking

Apache Kafka: a high-throughput distributed messaging system

Voldemort: distributed key-value storage system
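The flow can be mimicked in miniature: impression events (as they would arrive on a Kafka topic) are folded into per-member, per-item action counts (as they would be stored in Voldemort). A plain dict plays the key-value store here; this is a toy stand-in, not either system's API.

```python
from collections import defaultdict

# Toy stand-in for near-real-time impression tracking: consume events
# and maintain (member, item, action) -> count in a key-value store.

store = defaultdict(int)

def consume(event):
    key = (event["member"], event["item"], event["action"])
    store[key] += 1

events = [
    {"member": "member:1234567", "item": "item:10", "action": "VIEW"},
    {"member": "member:1234567", "item": "item:10", "action": "VIEW"},
    {"member": "member:1234567", "item": "item:10", "action": "CLICK"},
]
for e in events:
    consume(e)

views = store[("member:1234567", "item:10", "VIEW")]   # 2
```

At serving time, the ranker reads these counts back by key to apply the impression discounting from the previous slide.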

32 - Conclusions

Fast iteration is desired

Models should adapt to large dynamic data

Offline metrics do not always reflect online

Real-time feedback is important

and there is something else just as important

33

We are hiring! - Questions

34