This page reproduces the content of http://www.slideshare.net/Eniod/tutorial-of-topologicaldataanalysispart1basic.

Uploaded 7 months ago (2016/04/15) in the Learning category

Tutorial of topological data analysis (basis _ part 1)

- Tutorial of Topological Data Analysis

Part I - Basic Concepts

Tran Quoc Hoan

The University of Tokyo

Hasegawa lab., Tokyo

haduonght.wordpress.com/

@k09ht

- My TDA = Topology Data Analysis 's road

Part I - Basic concepts & applications

Part II - Advanced computation

Part III - Mapper Algorithm

Part IV - Software Roadmap

Part V - Applications in…

Part VI - Applications in…

He is following me

TDA Road

2 - Outline

1. Topology and holes

2. Simplicial complexes

3. Definition of holes

4. Persistent homology

5. Some of applications

TDA - Basic Concepts

4 - Topology

The properties of space that are preserved under continuous deformations, such as stretching and bending, but not tearing or gluing

[Figure: pairs of shapes related by ≅ (topologically equivalent)]

I - Topology and Holes

5 - Invariant

Question: what are invariant things in topology?

Number of connected components / rings / cavities, one row per group of ≅ shapes (figures not shown):

1 / 0 / 0

2 / 0 / 0

1 / 1 / 0

1 / 0 / 1

I - Topology and Holes

6 - Holes and dimension

✤ Features that determine a shape: connected components, rings, cavities

• 0-dimensional “hole” = connected component

• 1-dimensional “hole” = ring

• 2-dimensional “hole” = cavity

Topology: considers continuous deformations that preserve the holes of each dimension

How to define a “hole”? Use algebra: the homology group

I - Topology and Holes

7 - Homology group

✤ For a geometric object X, the homology groups $H_q(X)$ are described by the Betti numbers $k_q$:

$k_0$ : number of connected components

$k_1$ : number of rings

$k_2$ : number of cavities

$k_q$ : number of q-dimensional holes

Image source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

I - Topology and Holes

8 - Outline

1. Topology and holes

2. Simplicial complexes

3. Definition of holes

4. Persistent homology

5. Some of applications

TDA - Basic Concepts

9 - Simplicial complexes

Simplicial complex:

A set of vertices, edges, triangles, tetrahedra, … that is closed under taking faces and has no improper intersections

vertex (0-dimensional), edge (1-dimensional), triangle (2-dimensional), tetrahedron (3-dimensional), …, k-simplex

[Figure: one example that is a simplicial complex and one that is not]

2 - Simplicial complexes

10 - Simplicial

n-simplex:

The convex hull (the smallest convex set containing them) of n+1 affinely independent points:

$\sigma = |v_0 v_1 \dots v_n| = \{\lambda_0 v_0 + \lambda_1 v_1 + \dots + \lambda_n v_n \mid \lambda_0 + \dots + \lambda_n = 1,\ \lambda_i \ge 0\}$

vertex (0-dimensional), edge (1-dimensional), triangle (2-dimensional), tetrahedron (3-dimensional), …, n-simplex

An m-face of σ is the convex hull $\tau = |v_{i_0} \dots v_{i_m}|$ of a non-empty subset of $\{v_0, v_1, \dots, v_n\}$ (it is proper if the subset is not the entire set); we write $\tau \le \sigma$
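The face relation can be made concrete with a few lines of Python. This is only a sketch: a simplex is represented by a tuple of vertex labels, and the helper names `faces` and `proper_faces` are hypothetical, not taken from the slides.

```python
from itertools import combinations

def faces(simplex, m=None):
    """Return the faces of a simplex given as a tuple of vertex labels.

    With m=None every non-empty subset is returned; otherwise only the
    m-faces (subsets with m + 1 vertices).
    """
    vertices = tuple(sorted(simplex))
    sizes = range(1, len(vertices) + 1) if m is None else [m + 1]
    return [face for k in sizes for face in combinations(vertices, k)]

def proper_faces(simplex):
    """Faces whose vertex set is not the entire vertex set of the simplex."""
    return [f for f in faces(simplex) if len(f) < len(simplex)]

triangle = ("v0", "v1", "v2")
print(faces(triangle, m=1))          # the three edges of the triangle
print(len(faces(triangle)))          # 2**3 - 1 = 7 faces in total
print(len(proper_faces(triangle)))   # 6 proper faces
```

An n-simplex has 2^(n+1) - 1 faces, one for each non-empty subset of its vertices.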

2 - Simplicial complexes

11 - Simplicial

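Orientation of a simplex:

Two orderings of the vertices give the same orientation when they differ by an even permutation; the oriented simplex is written $\langle v_{i_0} v_{i_1} \dots v_{i_n} \rangle$

[Figure: oriented 1-simplex, 2-simplex, and 3-simplex]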

2 - Simplicial complexes

12 - Simplicial complex

Definition:

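A simplicial complex is a finite collection of simplices K such that

(1) If $\sigma \in K$, then $\tau \in K$ for every face $\tau \le \sigma$

(2) If $\sigma, \tau \in K$ and $\sigma \cap \tau \neq \emptyset$, then $\sigma \cap \tau \le \sigma$ and $\sigma \cap \tau \le \tau$

The maximum dimension of a simplex in K is the dimension of K

$K_2 = \{|v_0v_1v_2|, |v_0v_1|, |v_0v_2|, |v_1v_2|, |v_0|, |v_1|, |v_2|\}$

$K = K_2 \cup \{|v_3v_4|, |v_3|, |v_4|\}$

[Figure: one configuration marked YES (a simplicial complex) and one marked NOT]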
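As a rough illustration, the closure condition (1) can be checked mechanically when each simplex is stored as a set of vertex indices. The sketch below assumes that representation; condition (2) is geometric and is not checked here. The helper names are hypothetical.

```python
from itertools import combinations

def simplex(*vertices):
    """A simplex stored as a frozenset of vertex indices."""
    return frozenset(vertices)

def is_closed_under_faces(K):
    """Condition (1): every non-empty proper subset of every simplex is in K."""
    return all(
        frozenset(face) in K
        for s in K
        for k in range(1, len(s))
        for face in combinations(s, k)
    )

def dimension(K):
    """Largest simplex dimension in K (a q-simplex has q + 1 vertices)."""
    return max(len(s) for s in K) - 1

# K2 and K from the slide, with vertex v_i written as the integer i.
K2 = {simplex(0, 1, 2), simplex(0, 1), simplex(0, 2), simplex(1, 2),
      simplex(0), simplex(1), simplex(2)}
K = K2 | {simplex(3, 4), simplex(3), simplex(4)}
broken = K - {simplex(3)}   # the edge |v3v4| is now missing one of its faces

print(is_closed_under_faces(K2), dimension(K2))   # True 2
print(is_closed_under_faces(K), dimension(K))     # True 2
print(is_closed_under_faces(broken))              # False
```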

2 - Simplicial complexes

13 - Simplicial complexes

Hemoglobin

simplicial complex

Image source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

2 - Simplicial complexes

14 - Nerve

✤ Let $\Phi = \{B_i \mid i = 1, \dots, m\}$ be a covering of $X = \bigcup_{i=1}^{m} B_i$

✤ The nerve of $\Phi$ is a simplicial complex $N(\Phi) = (V, \Sigma)$
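In the usual definition of the nerve, the vertices are the indices of the cover sets, and a set of indices spans a simplex exactly when the corresponding cover sets have a common point. The sketch below assumes that definition and a cover made of finite sets; indices start at 0 rather than 1, and the function name `nerve` is hypothetical.

```python
from itertools import combinations

def nerve(cover, max_dim=None):
    """Nerve of a cover given as a list of finite sets B_0, ..., B_{m-1}.

    A tuple of indices (i0, ..., ik) is a k-simplex when the corresponding
    cover sets have a common point.
    """
    m = len(cover)
    top = m if max_dim is None else min(m, max_dim + 1)
    simplices = []
    for size in range(1, top + 1):
        for idx in combinations(range(m), size):
            if set.intersection(*(set(cover[i]) for i in idx)):
                simplices.append(idx)
    return simplices

# Three arcs covering a circle sampled at 6 points: they overlap pairwise
# but have no common point, so the nerve is a hollow triangle.
cover = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}]
print(nerve(cover))
# [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
```

The hollow triangle has one ring, like the circle that the three arcs cover.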

2 - Simplicial complexes

15 - Nerve theorem

✤ If $X \subset \mathbb{R}^N$ is covered by a collection of convex closed sets $\Phi = \{B_i \mid i = 1, \dots, m\}$, then X and $N(\Phi)$ are homotopy equivalent

2 - Simplicial complexes

16 - Cech complex

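✤ $P = \{x_i \in \mathbb{R}^N \mid i = 1, \dots, m\}$

$B_r(x_i) = \{x \in \mathbb{R}^N \mid \|x - x_i\| \le r\}$ : ball with radius r

✤ The Cech complex $C(P, r)$ is the nerve of $\Phi = \{B_r(x_i) \mid x_i \in P\}$

✤ From nerve theorem: $X_r = \bigcup_{i=1}^{m} B_r(x_i) \simeq C(P, r)$

✤ Filtration: $C(P, r) \subset C(P, s)$ for $r < s$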
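A minimal sketch of the Cech complex up to triangles, assuming Euclidean balls: a set of closed balls of radius r has a common point exactly when the smallest ball enclosing their centers has radius at most r. For two or three centers that radius has a closed form, which is what the hypothetical helper `enclosing_radius` computes. Higher simplices would need a general smallest-enclosing-ball routine and are left out.

```python
from itertools import combinations
from math import dist, sqrt

def enclosing_radius(points):
    """Radius of the smallest ball containing 1, 2 or 3 points."""
    if len(points) == 1:
        return 0.0
    if len(points) == 2:
        return dist(*points) / 2
    a, b, c = sorted(dist(p, q) for p, q in combinations(points, 2))
    if c * c >= a * a + b * b:       # right, obtuse or degenerate triangle
        return c / 2
    s = (a + b + c) / 2              # Heron's formula for the area
    area = sqrt(s * (s - a) * (s - b) * (s - c))
    return a * b * c / (4 * area)    # circumradius of an acute triangle

def cech_complex(P, r):
    """Simplices of C(P, r) up to dimension 2, as tuples of point indices."""
    return [idx
            for size in (1, 2, 3)
            for idx in combinations(range(len(P)), size)
            if enclosing_radius([P[i] for i in idx]) <= r]

P = [(0.0, 0.0), (2.0, 0.0), (1.0, sqrt(3.0))]   # equilateral triangle, side 2
for r in (0.5, 1.1, 1.2):
    print(r, cech_complex(P, r))
```

For this point set the edges appear once r reaches 1 and the triangle once r reaches 2/√3 ≈ 1.155, so the three radii give 3, 6, and 7 simplices, and each complex contains the previous one.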

2 - Simplicial complexes

17 - Cech complex

✤ The weighted Cech complex $C(P, R)$ is the nerve of $\Phi = \{B_{r_i}(x_i) \mid x_i \in P\}$ (balls with different radii)

✤ Computations to check the intersections of balls are not easy → Alpha complex

2 - Simplicial complexes

18 - Voronoi diagrams and Delaunay complex

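✤ $P = \{x_i \in \mathbb{R}^N \mid i = 1, \dots, m\}$

$V_i = \{x \in \mathbb{R}^N \mid \|x - x_i\| \le \|x - x_j\|,\ j \neq i\}$ : Voronoi cell

$\mathbb{R}^N = \bigcup_{i=1}^{m} V_i$ : Voronoi decomposition

✤ $\Phi = \{V_i \mid i = 1, \dots, m\}$

$D(P) = N(\Phi)$ : Delaunay complex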

2 - Simplicial complexes

19 - General position

✤ $x_1, \dots, x_{N+2} \in \mathbb{R}^N$ are in a general position if there is no $x \in \mathbb{R}^N$ s.t. $\|x - x_1\| = \dots = \|x - x_{N+2}\|$

✤ If every combination of N+2 points in P is in a general position, then P is in a general position

✤ If P is in a general position, then

• The dimensions of Delaunay simplices are $\le N$

• The geometric representation of D(P) can be embedded in $\mathbb{R}^N$

2 - Simplicial complexes

20 - Alpha complex

✤ The alpha complex is the nerve of $\Psi = \{B_r(x_i) \cap V_i \mid i = 1, \dots, m\}$: $\alpha(P, r) = N(\Psi)$

✤ From Nerve theorem: $X_r \simeq \alpha(P, r)$

2 - Simplicial complexes

21 - Alpha complex

✤

if P is in a general position

✤ Filtration of alpha complexes: $\alpha(P, r) \subset \alpha(P, s)$ for $r < s$

✤ The weighted alpha complex is defined with different radii

2 - Simplicial complexes

22 - Alpha complex

✤ Computations are much easier than Cech complexes

✤ Software: CGAL

• Constructs alpha complexes of point cloud data in $\mathbb{R}^N$ with $N \le 3$

Filtration of alpha complex

Image source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

2 - Simplicial complexes

23 - Outline

1. Topology and holes

2. Simplicial complexes

3. Definition of holes

4. Persistent homology

5. Some of applications

TDA - Basic Concepts

24 - Definition of holes

Simplicial complex (geometrical object) → Chain complex (algebraic object) → Homology group (algebraic object) → Holes

3 - Definition of Holes

25 - What is hole?

✤ 1-dimensional hole: ring

Ring = 1-dimensional graph without boundary?

However, NOT every such graph is a ring:

• A graph with boundary: not a ring

• A 1-dimensional graph without boundary: has a ring

• A 1-dimensional graph without boundary that is a 2-dimensional graph's boundary: without ring

Ring = 1-dimensional graph without boundary that is not the boundary of a 2-dimensional graph

3 - Definition of Holes

26 - What is hole?

✤ 2-dimensional hole: cavity

Cavity = 2-dimensional graph without boundary?

However, NOT every such graph is a cavity:

• A graph with boundary: not a cavity

• A 2-dimensional graph without boundary: has a cavity

• A 2-dimensional graph without boundary that is a 3-dimensional graph's boundary: without cavity

Cavity = 2-dimensional graph without boundary that is not the boundary of a 3-dimensional graph

3 - Definition of Holes

27 - Hole and boundary

q-dimensional hole = q-dimensional graph without boundary that is not the boundary of a (q+1)-dimensional graph

We make this precise in algebraic language

3 - Definition of Holes

28 - Chain complexes

Definition:

Let K be a simplicial complex with dimension n. The group of q-chains is defined as below:

$C_q(K) := \{\sum_i \alpha_i \langle v_{i_0} \dots v_{i_q} \rangle \mid \alpha_i \in \mathbb{R},\ \langle v_{i_0} \dots v_{i_q} \rangle : q\text{-simplex in } K\}$, if $0 \le q \le n$

$C_q(K) := 0$, if $q < 0$ or $q > n$

An element of $C_q(K)$ is called a q-chain.

3 - Definition of Holes

29 - Boundary

Definition:

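The boundary of a q-simplex is the sum of its (q-1)-dimensional faces:

$\partial_q |v_{i_0} \dots v_{i_q}| := \sum_{l=0}^{q} |v_{i_0} \dots \hat{v}_{i_l} \dots v_{i_q}|$ ($\hat{v}_{i_l}$ means that $v_{i_l}$ is omitted)

$\partial |v_0v_1v_2| := |v_0v_1| + |v_1v_2| + |v_0v_2|$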
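The unsigned sum of faces can be sketched with Z2 coefficients, where a chain is just a set of simplices and adding a simplex twice cancels it. Z2 is an assumption made here to match the unsigned formula (the chain groups defined earlier use real coefficients), and the function name `boundary` is hypothetical.

```python
from itertools import combinations

def boundary(chain):
    """Boundary of a chain with Z2 coefficients.

    A chain is a set of simplices (frozensets of vertices). Each q-simplex
    contributes its (q-1)-faces, and faces that occur an even number of
    times cancel, which is what symmetric difference does.
    """
    result = set()
    for simplex in chain:
        if len(simplex) == 1:
            continue  # the boundary of a vertex is zero
        for face in combinations(sorted(simplex), len(simplex) - 1):
            result ^= {frozenset(face)}
    return result

triangle = {frozenset({"v0", "v1", "v2"})}
edges = boundary(triangle)
print(sorted(sorted(e) for e in edges))   # the three edges of the triangle
print(boundary(edges))                    # set(): a boundary has no boundary
```

The second line of output illustrates the identity ∂∂ = 0 on this example: every vertex belongs to two edges of the triangle and cancels.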

3 - Definition of Holes

30 - Hole and boundary

q-dimensional hole = q-dimensional graph without boundary (1) that is not the boundary of a (q+1)-dimensional graph (2)

(1) $Z_q(K) := \ker \partial_q$

(2) $B_q(K) := \operatorname{im} \partial_{q+1}$

The elements of $Z_q(K)$ that remain after $B_q(K)$ is made zero are the holes. The operator that does this is written Q.

For $z' = z + b$ with $z, z' \in \ker \partial_q$ and $b \in \operatorname{im} \partial_{q+1}$:

$Q(z') = Q(z) + Q(b) = Q(z)$

(z and z' are equivalent in $\ker \partial_q$ with respect to $\operatorname{im} \partial_{q+1}$)

q-dimensional hole = an equivalence class of vectors

3 - Definition of Holes

33 - Homology group

Homology groups

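The qth homology group $H_q$ is defined as $H_q = \operatorname{Ker} \partial_q / \operatorname{Im} \partial_{q+1} = \{z + \operatorname{Im} \partial_{q+1} \mid z \in \operatorname{Ker} \partial_q\} = \{[z] \mid z \in \operatorname{Ker} \partial_q\}$

The classes form a group under the operation [z] + [z'] = [z + z']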

H0(K): connected component

H1(K): ring

H2(K): cavity

Betti Numbers

The qth Betti Number is defined as the dimension of Hq

bq = dim(Hq)

3 - Definition of Holes

34 - Computing Homology

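[Figure: square with vertices v0, v1, v2, v3 and edges |v0v1|, |v1v2|, |v2v3|, |v3v0|]

All vectors in the column space of $\operatorname{Ker} \partial_0$ are equivalent with respect to $\operatorname{Im} \partial_1$

$b_0 = \dim(H_0) = 1$

$\operatorname{Im} \partial_2$ has only the zero vector

$H_1 = \{\alpha(|v_0v_1| + |v_1v_2| + |v_2v_3| + |v_3v_0|)\}$

$b_1 = \dim(H_1) = 1$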

3 - Definition of Holes

35 - Computing Homology

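[Figure: square with vertices v0, v1, v2, v3 and oriented edges]

All vectors in the column space of $\operatorname{Ker} \partial_0$ are equivalent with respect to $\operatorname{Im} \partial_1$

$b_0 = \dim(H_0) = 1$

$\operatorname{Im} \partial_2$ has only the zero vector

$H_1 = \{\alpha(\langle v_0v_1 \rangle + \langle v_1v_2 \rangle + \langle v_2v_3 \rangle - \langle v_0v_3 \rangle)\}$

$b_1 = \dim(H_1) = 1$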
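The Betti numbers of the square can be reproduced numerically from ranks of boundary matrices, since b_q = dim Ker ∂_q − rank ∂_{q+1} and dim Ker ∂_q = (number of q-simplices) − rank ∂_q. The sketch below assumes rational coefficients, oriented simplices written as sorted vertex tuples with sign (−1)^l for the face omitting the l-th vertex, and a complex that is closed under faces. All function names are hypothetical.

```python
from fractions import Fraction

def boundary_matrix(rows, cols):
    """Matrix of the boundary map from q-simplices (cols) to (q-1)-simplices (rows)."""
    index = {s: i for i, s in enumerate(rows)}
    M = [[Fraction(0)] * len(cols) for _ in rows]
    for j, s in enumerate(cols):
        for l in range(len(s)):
            face = s[:l] + s[l + 1:]          # omit the l-th vertex
            M[index[face]][j] = Fraction((-1) ** l)
    return M

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def betti_numbers(simplices):
    """b_q = (number of q-simplices) - rank(d_q) - rank(d_{q+1})."""
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(sorted(s)))
    top = max(by_dim)
    ranks = {q: rank(boundary_matrix(by_dim[q - 1], by_dim[q]))
             for q in range(1, top + 1)}
    return [len(by_dim[q]) - ranks.get(q, 0) - ranks.get(q + 1, 0)
            for q in range(top + 1)]

# The hollow square from the slide, with vertex v_i written as the integer i.
square = [(0,), (1,), (2,), (3,), (0, 1), (1, 2), (2, 3), (0, 3)]
print(betti_numbers(square))   # [1, 1]
```

The result agrees with the slide: one connected component and one ring.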

3 - Definition of Holes

36 - Outline

1. Topology and holes

2. Simplicial complexes

3. Definition of holes

4. Persistent homology

5. Some of applications

TDA - Basic Concepts

37 - Persistent Homology

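✤ Consider a filtration of finite type (t: time)

$\mathbb{K} : K^0 \subset K^1 \subset \dots \subset K^t \subset \dots$

$\exists\, \Theta$ s.t. $K^j = K^\Theta,\ \forall j \ge \Theta$

✤ $K = \bigcup_{t \ge 0} K^t$ : total simplicial complex

$K_k$ : all k-simplices in K

$K^t_k$ : all k-simplices in K at time t

$T(\sigma) = t$ : birth time of the simplex $\sigma \in K^t \setminus K^{t-1}$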

Persistent homology

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

38 - Persistent Homology

✤ Z2 - vector space

✤ Z2[x] - graded module

✤ Inclusion map

✤ Ck(K) is a free Z2[x] module with the base

Persistent homology

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

39 - Persistent Homology

✤ Boundary map

face of σ

(graded homomorphism)

✤ From the graded structure

✤ Persistent homology

Persistent homology

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

40 - Persistent Homology

✤ From the structure theorem of Z2[x] (PID)

✤ Persistent interval

Ii(b): inf of Ii, Ii(d): sup of Ii

✤ Persistent diagram
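For connected components (degree 0) the persistence intervals can be computed without any module theory: process the edges in order of birth time and merge components with a union-find structure. This is a sketch of that special case only, not the general algorithm from the slides. It assumes the filtration is given as birth times of vertices and edges, and the function name `h0_persistence` is hypothetical.

```python
def h0_persistence(vertex_births, edges):
    """Persistence intervals of connected components (degree 0).

    vertex_births: dict vertex -> birth time T(v)
    edges: list of (birth time, u, v), each born no earlier than its endpoints
    Returns a sorted list of (birth, death); death is float("inf") for
    components that never merge.
    """
    parent = {v: v for v in vertex_births}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    intervals = []
    for t, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue   # the edge closes a ring; no component dies
        # Elder rule: the component born later dies at time t.
        if vertex_births[ru] > vertex_births[rv]:
            ru, rv = rv, ru
        intervals.append((vertex_births[rv], t))
        parent[rv] = ru
    roots = {find(v) for v in vertex_births}
    intervals += [(vertex_births[r], float("inf")) for r in roots]
    return sorted(intervals)

births = {"a": 0, "b": 0, "c": 1}
edges = [(2, "a", "b"), (3, "b", "c"), (4, "a", "c")]
print(h0_persistence(births, edges))
# [(0, 2), (0, inf), (1, 3)]
```

Each pair is a persistence interval (inf, sup); plotting the pairs as points gives the degree-0 persistence diagram. The edge born at time 4 creates a ring and would show up in degree 1 instead.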

Persistent homology

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

41 - Persistent Homology

[Figure: persistence diagram, horizontal axis: birth time, vertical axis: death time]

✤ Detect the “structure hole”

✤ A “hole” that appears close to the diagonal may be “noise”

✤ A “hole” that appears far from the diagonal is likely a robust structure, not “noise”
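Distance from the diagonal is proportional to the lifetime death − birth, so a simple way to act on this reading of the diagram is to threshold the lifetime. The diagram values and the threshold below are made up for illustration, and the helper names are hypothetical.

```python
def persistence(point):
    """Lifetime of a feature: death time minus birth time."""
    birth, death = point
    return death - birth

def split_by_persistence(diagram, threshold):
    """Separate points far from the diagonal from points close to it."""
    robust = [p for p in diagram if persistence(p) > threshold]
    noise = [p for p in diagram if persistence(p) <= threshold]
    return robust, noise

# Made-up diagram: one long-lived hole and two short-lived ones.
diagram = [(0.2, 0.3), (0.5, 2.5), (1.0, 1.1)]
robust, noise = split_by_persistence(diagram, threshold=0.5)
print(robust)   # [(0.5, 2.5)]
print(noise)    # [(0.2, 0.3), (1.0, 1.1)]
```

The right threshold depends on the data, so treating it as a tunable parameter is an assumption of this sketch.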

Persistent homology

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

42 - Outline

1. Topology and holes

2. Simplicial complexes

3. Definition of holes

4. Persistent homology

5. Some of applications

see more at part2 of tutorial

TDA - Basic Concepts

43 - Applications

• Persistence to Protein compressibility

Marcio Gameiro et al., Japan J. Indust. Appl. Math. (2015) 32:1-17

5 - Some of applications

44 - Protein Structure

amino acid 1

amino acid 2

peptide bond

folding

1-dim structure of protein

3-dim structure of hemoglobin

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

45 - Protein Structure

✤ Van der Waals radius of an atom

H: 1.2, C: 1.7, N: 1.55, O: 1.52, S: 1.8, P: 1.8 (Å)

Van der Waals ball model of hemoglobin

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

46 - Alpha Complex for Protein Modeling

✤

: position of atoms

: radius of i-th atom

✤

: weighted Voronoi Decomposition

: power distance

✤

: ball with radius ri

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

47 - Alpha Complex for Protein Modeling

✤ Alpha complex

nerve

k - simplex

✤ Nerve lemma

✤ Changing radius

to form a filtration (by w)

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

48 - Topology of Ovalbumin

[Figure: persistence diagrams PD1 and PD2 (horizontal axis: birth time, vertical axis: death time), with the 1st and 2nd Betti plots]

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

49 - Compressibility

Functionality

3-dim structure

Softness

…..

…..

Compressibility

holes

Experiments

Quantification

(Difficult)

Select generators and fitting parameters

Persistence diagrams

with experimental compressibility

Persistence to protein compressibility

50 - Denoising

✤ Non-robust topological features depend on the state of fluctuations

✤ The quantification should not depend on the state of fluctuations

✤ Topological noise

[Figure: persistence diagram, horizontal axis: birth time, vertical axis: death time]

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

51 - Holes with Sparse or Dense Boundary

✤ A sparse hole structure is deformable to a much larger extent than a dense hole → greater compressibility

[Figure legend: van der Waals ball; enlarged ball]

✤ Effective sparse holes

[Figure: persistence diagram, horizontal axis: birth time, vertical axis: death time]

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

52 - # of generators v.s. compressibility

[Figure: plot of compressibility (vertical axis) against the topological measurement Cp (horizontal axis)]

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to protein compressibility

53 - Applications

• Persistence to Phylogenetic Trees

5 - Some of applications

54 - Protein Phylogenetic Tree

✤ A phylogenetic tree is defined by a distance matrix for a set of species (human, dog, frog, fish, …)

✤ The distance matrix is calculated by a score function based on the similarity of amino acid sequences

[Figure: amino acid sequences of human, frog, and fish hemoglobin; the distance matrix of hemoglobin; a tree with leaves fish, frog, human, dog]

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to Phylogenetic Trees

55 - Persistence Distance and Classification of Proteins

✤ The score function based on amino acid sequences does not contain information about the 3-dim structure of proteins

✤ The Wasserstein distance (of degree p) on persistence diagrams (Cohen-Steiner, Edelsbrunner, Harer, and Mileyko, FCM, 2010) reflects the similarity of the persistence diagrams (3-dim structures) of proteins
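A brute-force sketch of the degree-p Wasserstein distance between two small persistence diagrams follows. It assumes the common formulation in which points are matched by a bijection, any point may instead be matched to the diagonal, and the cost of a pair is the p-th power of its L∞ distance. It tries every bijection, so it is only usable for a handful of points, and all names are hypothetical.

```python
from itertools import permutations

DIAGONAL = None   # placeholder for "matched to the diagonal"

def cost(a, b, p):
    """p-th power of the L-infinity distance; the diagonal absorbs points."""
    if a is DIAGONAL and b is DIAGONAL:
        return 0.0
    if a is DIAGONAL:
        return ((b[1] - b[0]) / 2) ** p   # distance from b to the diagonal
    if b is DIAGONAL:
        return ((a[1] - a[0]) / 2) ** p
    return max(abs(a[0] - b[0]), abs(a[1] - b[1])) ** p

def wasserstein(X, Y, p=2):
    """Degree-p Wasserstein distance between diagrams X and Y (lists of (birth, death))."""
    # Pad each diagram with one diagonal slot per point of the other diagram.
    A = list(X) + [DIAGONAL] * len(Y)
    B = list(Y) + [DIAGONAL] * len(X)
    best = min(sum(cost(a, b, p) for a, b in zip(A, perm))
               for perm in permutations(B))
    return best ** (1 / p)

# Made-up diagrams for illustration.
X = [(0.0, 4.0), (1.0, 2.0)]
Y = [(0.0, 5.0)]
print(wasserstein(X, Y, p=1))   # 1.5
print(wasserstein(X, X, p=2))   # 0.0
```

In the first call the best matching pairs (0, 4) with (0, 5) at cost 1 and sends (1, 2) to the diagonal at cost 0.5. For larger diagrams an assignment solver would replace the loop over permutations.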

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to Phylogenetic Trees

56 - Persistence Distance and Classification of Proteins

[Figure: two persistence diagrams (horizontal axis: birth time, vertical axis: death time) and a bijection between them, used for the Wasserstein distance]

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to Phylogenetic Trees

57 - Distance between persistence diagrams

Persistence of sublevel sets

[Figure: persistence diagram, horizontal axis: birth time, vertical axis: death time]

Stability Theorem (Cohen-Steiner et al., 2010)

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

Persistence to Phylogenetic Trees

58 - Phylogenetic Tree by Persistence

✤ Apply the distance on persistence diagrams to classify proteins

2FZB

1C40

1FAW

3LQD

1QPW

3D1A

3DHT

The persistence diagrams use the same noise band as in the computations of compressibility

Persistence to Phylogenetic Trees

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

59 - Future work

✤ Principle to de-noise fluctuations in persistence diagrams (NMR experiments)

✤ Finding minimum generators to identify specific regions in a protein (e.g., a region inducing high compressibility, hereditarily important regions)

✤ Zigzag persistence for robust topological features among a specific group of proteins (quiver representation)

✤ Multi-dimensional persistence (PID → Gröbner basis)

Slide source: http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

TDA - Basic Concepts

60 - Applications more in part … of tutorials

One of the pioneers in applications: Prof. Robert Ghrist, Department of Mathematics, University of Pennsylvania

✤ Robotics

✤ Computer vision

✤ Sensor network

✤ Concurrency & database

✤ Visualization

Michael Farber, Edelsbrunner, Zomorodian, Carlsson, Mischaikow, Gaucher, Bubenik

5 - Some of applications

61 - Software

• Alpha complex by CGAL: http://www.cgal.org/

• Persistence diagrams by Perseus (coded by Vidit Nanda): http://www.sas.upenn.edu/~vnanda/perseus/index.html

• CHomP project: http://chomp.rutgers.edu/Project.html

TDA - Basic Concepts

62 - Reference links

• Yasuaki Hiraoka (associate professor) homepage:

http://www2.math.kyushu-u.ac.jp/~hiraoka/site/About_Me.html

http://www2.math.kyushu-u.ac.jp/~hiraoka/protein_homology.pdf

• Applications in sensor network:

www.msys.sys.i.kyoto-u.ac.jp/~kazunori/paper/nist20081219.pdf

TDA - Basic Concepts
