This page reproduces the content of http://www.slideshare.net/ThoughtWorks/big-data-agile-analytics-by-ken-collier-director-agile-analytics-thoughtworks.

Uploaded on 2014/04/01 in Technology

We are in the midst of an exciting time. There is an explosion of very interesting data, along with the emergence of powerful new technologies for harnessing it and devices that enable humans to receive tremendous benefits from it. What is required are innovative processes that enable the creation and delivery of value from all of that data. More often than not, it is the predictive (what will happen?) and prescriptive (how to make it happen!) analytics that produce this value, not the raw data itself.

Agile software teams continually work on projects that involve rich, complex, and messy data. Often this data represents innovative analytics opportunities. Being analytics-aware gives these teams the opportunity to collaborate with stakeholders to innovate by creating additional value from the data. This session is aimed at making Agile software teams more analytics-aware so that they will recognize these innovation opportunities.

The trouble with conventional analytics (like conventional software development) is that it proceeds through phased, sequential steps that take too long and fail to deliver actionable results. This talk will examine the convergence of the following elements of an exciting emerging field called Agile Analytics:

• sophisticated analytics techniques, plus

• lean learning principles, plus

• agile delivery methods, plus

• so-called "big data" technologies

Learn:

• The analytical modeling process and techniques

• How analytical models are deployed using modern technologies

• The complexities of data discovery, harvesting, and preparation

• How to apply agile techniques to shorten the analytics development cycle

• How to apply lean learning principles to develop actionable and valuable analytics

• How to apply continuous delivery techniques to operationalize analytical models

- [Figure: chart with axes labeled Value and Complexity] Four kinds of analytics, each paired with the question it answers:
  Descriptive Analytics: What happened?
  Diagnostic Analytics: Why did it happen?
  Predictive Analytics: What will happen?
  Prescriptive Analytics: How can we make it happen?
  A second build of the slide adds the group labels Traditional Business Intelligence and Advanced Analytics.
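The gap between "What happened?" and "What will happen?" can be made concrete with a small sketch. The order counts and the function names below are hypothetical and not from the talk. The straight-line trend is only one simple stand-in for a predictive model, chosen because it fits in a few lines.

```python
from statistics import mean

# Hypothetical monthly order counts (illustration only, not from the talk).
ORDERS = [100, 110, 120, 130, 140, 150]


def describe(series):
    """Descriptive analytics: summarize what already happened."""
    return {"total": sum(series), "mean": mean(series)}


def fit_line(series):
    """Least-squares fit of y = slope * t + intercept for t = 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def predict_next(series):
    """Predictive analytics: extrapolate the fitted trend one step ahead."""
    slope, intercept = fit_line(series)
    return slope * len(series) + intercept


if __name__ == "__main__":
    print(describe(ORDERS))
    print(predict_next(ORDERS))
```

The descriptive summary only restates the past, while the prediction commits to a number that can later be checked against what actually happens.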

- [Figure: Agile Analytics concept map] Labels: Agile Analytics, Big Data Solutions, Advanced Analytics, Agile Delivery, Lean Learning, Thinking, Ethics, Impact. A second build of the figure adds Velocity, Volume, Variety, Complexity, NoSQL, and Polyglot Persistence.

- Big Data Analytics Pipeline
  [Figure: pipeline diagram; the arrows and exact layout are not recoverable from the extraction]
  Data labels include: Operational Data, External Data, Clean Data, Training Data, Test Data, Dimensional Data
  Step labels include: Integration, Sampling, Data Partitioning, Feature Selection, Analytical Modeling, Validation, Dimension Mapping
  Output labels include: Candidate Model, Accepted Model, Reporting Engine, Report
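Several of the labels on the Big Data Analytics Pipeline slide (data partitioning, training and test data, candidate and accepted models) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the synthetic data set, the function names, and the one-feature threshold model are hypothetical stand-ins, not the pipeline from the talk.

```python
import random


def make_rows(n=100):
    """Build a hypothetical labeled data set (illustration only)."""
    rows = []
    for i in range(n):
        label = i % 2 == 0
        rows.append({
            "signal": 10.0 if label else 0.0,  # feature that tracks the label
            "noise": float(i % 7),             # feature largely unrelated to the label
            "label": label,
        })
    return rows


def partition(rows, train_pct=60, validation_pct=20, seed=0):
    """Data partitioning: shuffle, then split into training, validation, and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * validation_pct // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def train_candidate(training, feature):
    """Candidate model: a threshold halfway between the two class means.

    Assumes both classes appear in the training data.
    """
    pos = [r[feature] for r in training if r["label"]]
    neg = [r[feature] for r in training if not r["label"]]
    pos_mean = sum(pos) / len(pos)
    neg_mean = sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    positive_above = pos_mean > neg_mean

    def predict(row):
        return (row[feature] > threshold) == positive_above

    return predict


def accuracy(model, rows):
    """Fraction of rows the model labels correctly."""
    return sum(model(r) == r["label"] for r in rows) / len(rows)


def run_pipeline(rows, features):
    """Train one candidate per feature, accept the best on validation, score it on test."""
    training, validation, test = partition(rows)
    candidates = {f: train_candidate(training, f) for f in features}
    best = max(candidates, key=lambda f: accuracy(candidates[f], validation))
    return best, accuracy(candidates[best], test)


if __name__ == "__main__":
    print(run_pipeline(make_rows(), ["signal", "noise"]))
```

Keeping the test set out of both training and model selection is the design point: the accepted model is scored on data it has never influenced.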


- How Advanced Analytics Works
  Analytical Opportunities: "If we knew X, we could do Y"
  Steps shown: Discover, Harvest, Filter, Integrate, Augment, Analyze, Act
  Stage labels: Data Convergence, Analytical Divergence; Discover & Explore, Analyze & Act

- Traditional Analytics
  The same diagram, annotated with a Typical Timeline: 3-6 months, 2 months, 2-4 months

- [Figure: Agile Analytics concept map, repeated with added labels] First build adds Continuous Integration, Evolve, Collaboration, and Continuous Delivery; the next adds Hypothesis, Build, Measure, and Learn.

- Agility in Analytics
  The same diagram (analytical opportunities of the form "If we knew X, we could do Y"; Data Convergence and Analytical Divergence; Discover, Harvest, Filter, Integrate, Augment, Analyze, Act), overlaid with a BUILD, MEASURE, LEARN loop.
  Repeat this cycle, solving small problems every few days.

- Like this example…
  High value business goal: Retain high value customers
  What’s the smallest, simplest thing we can do? Common features of defectors?
  Is it useful & actionable?
  Repeat! Shopping behaviors of defectors?
  Questions added in later builds of the slide:
  • What do defectors say about us?
  • What leads to customers leaving?
  • Customers’ sentiment before defecting?
  • Do incentives reduce defection rates?
  • What encourages customers to stay?
  Problem solved or continue?
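The first question in the example, "Common features of defectors?", is the kind of small problem that fits one short build-measure-learn cycle. As a rough sketch, one could compare defection rates across the values of each customer attribute and see which attribute separates defectors most clearly. The customer records, attribute names, and function names below are hypothetical and not taken from the talk.

```python
from collections import defaultdict

# Hypothetical customer records (illustration only).
CUSTOMERS = [
    {"plan": "basic", "region": "east", "defected": True},
    {"plan": "basic", "region": "west", "defected": True},
    {"plan": "basic", "region": "east", "defected": True},
    {"plan": "basic", "region": "west", "defected": False},
    {"plan": "premium", "region": "east", "defected": False},
    {"plan": "premium", "region": "west", "defected": False},
    {"plan": "premium", "region": "east", "defected": False},
    {"plan": "premium", "region": "west", "defected": True},
]


def defection_rate_by(customers, feature):
    """Measure: defection rate for each value of one attribute."""
    counts = defaultdict(lambda: [0, 0])  # value -> [defectors, total]
    for customer in customers:
        counts[customer[feature]][1] += 1
        if customer["defected"]:
            counts[customer[feature]][0] += 1
    return {value: d / total for value, (d, total) in counts.items()}


def most_telling_feature(customers, features):
    """Learn: pick the attribute whose values differ most in defection rate."""
    def spread(feature):
        rates = defection_rate_by(customers, feature).values()
        return max(rates) - min(rates)
    return max(features, key=spread)


if __name__ == "__main__":
    print(defection_rate_by(CUSTOMERS, "plan"))
    print(most_telling_feature(CUSTOMERS, ["plan", "region"]))
```

With this made-up data, plan separates defectors from non-defectors and region does not. Whether such a finding is useful and actionable is the check the slide asks for before repeating the cycle with the next question.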

- [Figure: Agile Analytics concept map, repeated with added labels] This build adds Machine Learning, Data Science, and Statistics.

- THE “DATA SCIENTIST”
  Machine Learning and Statistical Modeling: Bayesian Classification, Artificial Neural Networks, Monte Carlo Simulation, Decision Tree Learning, Logistic Regression, Support Vector Machines, K-Nearest Neighbor, Clustering, …and many more…
  Domain Knowledge: Data Semantics, Business Understanding, Business Communication
  Programming Skills: Functional Programming, Data “Wrangling”, Map/Reduce, SQL, & NoSQL
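To give a feel for one of the techniques named on the slide, here is a minimal sketch of K-Nearest Neighbor classification. The training points, the two features (monthly visits and monthly spend), and the labels are hypothetical, and a production implementation would normally come from a library rather than be written by hand.

```python
from collections import Counter
from math import dist

# Hypothetical training data: (monthly_visits, monthly_spend) -> outcome.
TRAINING = [
    ((1, 1), "defects"),
    ((1, 2), "defects"),
    ((2, 1), "defects"),
    ((8, 8), "stays"),
    ((8, 9), "stays"),
    ((9, 8), "stays"),
]


def knn_predict(training, point, k=3):
    """Label a point by majority vote among its k nearest training points."""
    nearest = sorted(training, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAINING, (2, 2)))
    print(knn_predict(TRAINING, (7, 9)))
```

The method has no training step beyond storing the data, which makes it a convenient first candidate model when a team wants a result within days.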

- [Figure: Agile Analytics concept map, repeated with added labels] This build adds Visual Storytelling.

- Data Visualization
  drones.pitchinteractive.com

- [Figure: Agile Analytics concept map, repeated with added labels] This build adds Data Reduction.

- “Little Data”
  [Figure] Labels: Objective Truth, Discoverable Truth, Irrelevant, Noise, Not Actionable, Uninterpretable, Impactful New Insights

- [Figure: Agile Analytics concept map, final builds] Successive builds add Knowledge, Insight, Action, and Disruption; then Business vs. IT, Focus vs. Platform, and Monitor & Measure; then Privacy Controls, Radical Transparency, Data Democracy, and Open Data.

- Cool New Technologies + Sophisticated Analytics + Lean Learning Principles + Fast Agile Delivery = Value Creation

Ken Collier, Director, Agile Analytics

kcollier@thoughtworks.com