Machine Learning Basics for Web Application Developers Etsuji Nakai Cloud Solutions Architect at Google 2016/08/19 ver1.2 Google confidential | Do not distribute
$ who am i
▪ Etsuji Nakai
  Cloud Solutions Architect at Google
  Twitter @enakai00
Machine Learning Basics
Linear Binary Classifier
▪ Build a model that classifies two types of data with a straight line.
  ● The model predicts the probability that new data belongs to the positive class.
  ● It’s like predicting whether a patient is infected with a specific virus based on a preliminary check result.
▪ Observe how the model is trained on “Neural Network Playground”.
  ● http://goo.gl/A2G4Hv
[Figure legend: x : Positive, o : Negative]
Logistic Regression
▪ The straight line can be represented by a linear function f, whose value can be translated to a probability through the logistic function σ.
▪ “To train the model” is to adjust the parameters so that the model fits the training dataset.
[Figure labels: “The value of f increases in this direction”; “Probability of being positive”; “Logistic function σ”]
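A minimal sketch of this translation in Python. It assumes the line is written as f(x1, x2) = w0 + w1*x1 + w2*x2 = 0 and the probability as σ(f); the symbol names and the parameter values are assumptions chosen for illustration, not taken from the slide.

```python
import math

def sigmoid(a):
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def f(x1, x2, w0, w1, w2):
    """Linear function; the dividing straight line is where f = 0."""
    return w0 + w1 * x1 + w2 * x2

def prob_positive(x1, x2, w0, w1, w2):
    """Probability of being in the positive class: sigmoid(f(x1, x2))."""
    return sigmoid(f(x1, x2, w0, w1, w2))

# Hypothetical parameters: the line x1 + x2 - 1 = 0.
w0, w1, w2 = -1.0, 1.0, 1.0

on_line = prob_positive(0.5, 0.5, w0, w1, w2)          # f = 0
positive_side = prob_positive(3.0, 3.0, w0, w1, w2)    # f = 5
negative_side = prob_positive(-2.0, -2.0, w0, w1, w2)  # f = -5
print(on_line, positive_side, negative_side)
```

A point on the line gets probability 0.5, and the probability moves toward 1 or 0 as the point moves away from the line on either side.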
How to measure “fitness” of the model
▪ You define the “loss function”, which indicates the non-fitness of the model. Then ML algorithms adjust the parameters to minimize the loss function.
  ● In logistic regression, you adjust the parameters to maximize the probability of giving a perfect prediction for the training dataset.
  ● For example, suppose that the n-th data and its correct label (1=x, 0=o) are given. Then the probability that the model gives the correct prediction for this data is the predicted probability of being positive if the label is 1, and one minus that probability if the label is 0.
  ● Hence the probability of giving correct predictions for all data is the product of these probabilities over the training dataset.
  ● By defining the loss function E so that it becomes smaller as this probability becomes larger (typically its negative logarithm), you can tell ML algorithms to minimize it.
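The reasoning on this slide can be written out as follows. This is a sketch of the standard logistic regression formulation, and the symbols are assumptions: x_n is the n-th data, t_n (1 or 0) is its correct label, P(x) is the probability of being positive predicted by the model, and N is the number of training data.

```latex
\begin{align*}
% Probability that the model gives the correct prediction for the n-th data
P_n &= P(\mathbf{x}_n)^{t_n}\,\bigl(1 - P(\mathbf{x}_n)\bigr)^{1 - t_n} \\
% Probability of giving correct predictions for all data
P &= \prod_{n=1}^{N} P_n \\
% Loss function: the negative logarithm of P
E &= -\log P
   = -\sum_{n=1}^{N} \Bigl[\, t_n \log P(\mathbf{x}_n)
     + (1 - t_n) \log\bigl(1 - P(\mathbf{x}_n)\bigr) \Bigr]
\end{align*}
```

Because the logarithm is monotonically increasing, minimizing E is the same as maximizing P, and the sum is easier to handle numerically than a product of many small probabilities.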
Graphical Understanding of Linear Classifier
▪ Drawing a 3-dimensional graph of the linear function f, you can see that the “tilted flat plane” divides the plane into two classes.
Linear Multiclass Classifier (Hardmax)
▪ How can you divide the plane into three classes (instead of two)?
▪ You can define three linear functions and classify the point based on “which of them has the maximum value at that point.”
  ● It is equivalent to dividing the plane with three tilted flat planes.
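The “which of them has the maximum value” rule can be sketched in a few lines of Python. The three weight tuples below are hypothetical values chosen only so that the result is easy to check by hand.

```python
def linear(w, x1, x2):
    """Value of one linear function w0 + w1*x1 + w2*x2 at the point (x1, x2)."""
    w0, w1, w2 = w
    return w0 + w1 * x1 + w2 * x2

def hardmax_class(weights, x1, x2):
    """Index of the linear function with the maximum value at (x1, x2)."""
    values = [linear(w, x1, x2) for w in weights]
    return values.index(max(values))

# Hypothetical parameters (w0, w1, w2) for three classes.
weights = [
    (0.0, 1.0, 0.0),   # f_0 = x1
    (0.0, -1.0, 0.0),  # f_1 = -x1
    (0.0, 0.0, 1.0),   # f_2 = x2
]

points = [(2.0, 0.5), (-2.0, 0.5), (0.5, 3.0)]
labels = [hardmax_class(weights, x1, x2) for (x1, x2) in points]
print(labels)
```

Each point is assigned to exactly one class, with no notion of how confident the assignment is.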
Linear Multiclass Classifier (Softmax)
▪ You can define the probability that the point belongs to the i-th class as below:
▪ This translates the magnitude of each linear function into a probability satisfying the following conditions.
[Figure: One dimensional example of softmax translation.]
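A minimal sketch of the softmax translation, assuming the usual definition P_i = exp(f_i) / Σ_j exp(f_j), where f_i is the value of the i-th linear function at the point. The symbol names and input values are assumptions for illustration.

```python
import math

def softmax(values):
    """Translate a list of real values into probabilities."""
    # Subtracting the maximum avoids overflow in exp() and does not change the result.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values of three linear functions at one point.
f_values = [2.0, 1.0, -1.0]
probs = softmax(f_values)
print(probs)
```

With this definition, each output lies between 0 and 1, the outputs sum to 1, and a larger f_i always gives a larger probability.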
Image Classification with Neural Network
Classifying Images with Softmax function
▪ For example, a grayscale image with 28x28 pixels can be represented as a 784-dimensional vector (i.e., a collection of 784 floating-point numbers).
  ● In other words, it corresponds to a single point in a 784-dimensional space!
▪ When you spread a bunch of images into this 784-dimensional space, similar images may come together to form clusters of images.
  ● If this assumption is correct, you can classify the images by dividing the 784-dimensional space with the softmax function.
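The idea can be sketched in plain Python: flatten the image into a 784-dimensional vector, evaluate one linear function per class, and apply softmax. The number of classes (10), the random placeholder image, and the untrained random parameters are all assumptions used only to show the shapes involved.

```python
import math
import random

WIDTH, HEIGHT = 28, 28
NUM_CLASSES = 10  # assumption, e.g. ten kinds of images

def flatten(image):
    """Turn a 28x28 grid of pixel values into one 784-dimensional vector."""
    return [pixel for row in image for pixel in row]

def softmax(values):
    """Translate a list of real values into probabilities."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def classify(vector, weights, biases):
    """One linear function per class over the 784 inputs, followed by softmax."""
    scores = [sum(w * x for w, x in zip(weights[k], vector)) + biases[k]
              for k in range(len(biases))]
    return softmax(scores)

random.seed(0)
# Placeholder image and untrained parameters, for illustration only.
image = [[random.random() for _ in range(WIDTH)] for _ in range(HEIGHT)]
weights = [[random.uniform(-0.01, 0.01) for _ in range(WIDTH * HEIGHT)]
           for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

vector = flatten(image)
probs = classify(vector, weights, biases)
print(len(vector), len(probs))
```

Training would adjust `weights` and `biases` so that the highest probability falls on the correct class for the training images.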
Let’s try with TensorFlow
▪ You can see the code and its result (92% accuracy). http://goo.gl/rGqjYh
  * Comments are in Japanese.
[Figure labels: Correct, Incorrect]
Improving Accuracy using CNN
▪ Instead of feeding the raw image data into the softmax function, you can extract “features” of images through convolutional filters and pooling layers.
[Figure: Raw Image → Convolution Filter → Pooling Layer → Convolution Filter → Pooling Layer → Fully-connected Layer → Dropout Layer → Softmax Function]
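To make “convolutional filter” and “pooling layer” concrete, here is a minimal pure-Python sketch of one filter followed by 2x2 max pooling. The 6x6 image and the edge-detecting filter are hypothetical. In a real CNN the filter values are learned during training, and a library such as TensorFlow would be used instead of hand-written loops.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h = len(feature_map) // 2 * 2
    w = len(feature_map[0]) // 2 * 2
    return [[max(feature_map[i][j], feature_map[i][j + 1],
                 feature_map[i + 1][j], feature_map[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

# Hypothetical 6x6 image: dark on the left half, bright on the right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3x3 filter that responds to vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = convolve2d(image, kernel)  # 4x4 feature map
pooled = max_pool2x2(features)        # 2x2 map after pooling
print(features)
print(pooled)
```

The feature map is large only where the filter overlaps the edge, and pooling shrinks the map while keeping that response.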
Other Possible Architectures
▪ Providing additional data to a pre-trained model to fine-tune it for your specific purpose.
  ● Technically referred to as “Transfer Learning.”
▪ Running a trained model on the client.
  ● You need a lot of computing resources to train the model, but you can use the trained model directly on the client.
▪ Realtime model training on the client?
  ● Considering the increasing computing resources available on the client, you may be able to train the model dynamically on the client using realtime data (such as webcam images) available there.
Similarity between model training and application development
[Figure labels: Existing Models; Additional Data; Preprocess and feed; Training; Revised Model; Test (success / fail); Fix and retry; Model tunings; Final Model; Version control of models; Deploy new models; Upgrade models; Production environment; API access; Applications]
▪ This resembles the software development model (CI/CD).
▪ There will be some de facto standard tools to build this framework in the near future (maybe).
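The train, test, fix-and-retry, and deploy cycle can be sketched as a small loop. Everything here is hypothetical: `train`, `test`, and `deploy` are placeholder callables supplied by the caller, and the “model” in the usage example is just an integer accuracy in percent.

```python
def train_test_deploy(train, test, deploy, max_retries=3):
    """Retrain until the revised model passes the test, then deploy it.

    train(attempt) -> model, test(model) -> bool, deploy(model) -> None
    are hypothetical callables supplied by the caller.
    """
    for attempt in range(1, max_retries + 1):
        model = train(attempt)   # training with (re)tuned settings
        if test(model):          # test success: this becomes the final model
            deploy(model)
            return model
        # test fail: fix and retry on the next attempt
    return None                  # nothing deployed; the existing model stays in production

# Toy stand-ins: accuracy in percent improves by 5 points per attempt.
deployed = []
result = train_test_deploy(
    train=lambda attempt: 80 + 5 * attempt,
    test=lambda model: model >= 90,
    deploy=deployed.append,
)
print(result, deployed)
```

Only a model that passes the test reaches production, which is the same gate a CI/CD pipeline places between a build and its deployment.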