Introduction

These days, I have been watching Andrew Ng’s machine learning videos on Coursera. They are awesome, so I am writing up lecture notes on my blog.

Evaluating A Learning Algorithm

1. Deciding what to try next

Typical actions to try when a model has a large error:

Get more training examples.
- One thing you should know: sometimes, getting more data doesn’t actually help.
Try smaller sets of features.
Try getting additional features.
Try adding polynomial features (\(x_1^2, x_2^2, x_1 x_2\), etc.).
Try increasing or decreasing the regularization parameter \(\lambda\).

Many people pick one of the actions above at random to improve performance. Diagnostics of an algorithm give you insight into which action you should try next.
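As a small illustration of one action from the list, here is a minimal sketch of adding polynomial features for two inputs. The function name `add_polynomial_features` is my own choice for this note, not something from the lecture.

```python
def add_polynomial_features(x1, x2):
    """Return the two original features plus their degree-2 terms.

    Hypothetical helper for illustration: x1^2, x2^2 and x1*x2 are
    appended to the original features.
    """
    return [x1, x2, x1 ** 2, x2 ** 2, x1 * x2]


features = add_polynomial_features(2.0, 3.0)
print(features)
```

Whether extra features like these help depends on the diagnostics; this only shows what the feature vector looks like.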

2. Evaluating a hypothesis

Split the dataset into a training set, a cross validation set, and a test set.
Compute the training error.
Compute the test error.
Compute the misclassification error.
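The steps above can be sketched roughly as follows. This is only a sketch under my own assumptions: examples are `(x, y)` pairs, a hypothesis is any Python function `predict(x)`, the squared error uses the 1/(2m) convention, and the helper names (`split_dataset`, `squared_error`, `misclassification_error`) are made up for this note.

```python
import random


def split_dataset(examples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle the examples and split them into train, cv, and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # shuffle so the split is not ordered
    n_train = round(len(shuffled) * train_frac)
    n_cv = round(len(shuffled) * cv_frac)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]  # whatever remains is the test set
    return train, cv, test


def squared_error(predict, examples):
    """Average squared error, using the 1/(2m) convention (an assumption)."""
    m = len(examples)
    return sum((predict(x) - y) ** 2 for x, y in examples) / (2 * m)


def misclassification_error(predict, examples):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for x, y in examples if predict(x) != y)
    return wrong / len(examples)
```

The same `squared_error` function gives the training error or the test error depending on which list you pass in. For a classifier, `misclassification_error` plays that role instead.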

3. Model selection and Train/Valid/Test sets

Typical split ratio of the dataset: 0.6 (training set), 0.2 (cross validation set), 0.2 (test set)
How to select a model?
1. Train several models on the training set.
2. Choose the best model based on cross validation error.

The model is fitted using only the training and cross validation sets, so its error on the test set is an estimate of the generalization error.
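To make the procedure concrete, here is a minimal sketch with two toy candidate models: a constant predictor and a straight line fitted by least squares. The candidates, the tiny dataset, and all function names are assumptions of mine for illustration, not the models used in the lecture.

```python
def fit_constant(train):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Candidate 2: one-variable least-squares line y = slope * x + intercept."""
    m = len(train)
    mean_x = sum(x for x, _ in train) / m
    mean_y = sum(y for _, y in train) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def squared_error(predict, examples):
    """Average squared error with the 1/(2m) convention (an assumption)."""
    return sum((predict(x) - y) ** 2 for x, y in examples) / (2 * len(examples))


def select_model(candidates, train, cv):
    """Fit every candidate on train, then keep the one with the lowest cv error."""
    fitted = {name: fit(train) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: squared_error(fitted[name], cv))
    return best, fitted[best]


# Toy data generated from y = 2x + 1, already split by hand.
train = [(0, 1), (1, 3), (2, 5), (3, 7)]
cv = [(4, 9), (5, 11)]
test = [(6, 13), (7, 15)]

candidates = {"constant": fit_constant, "line": fit_line}
best_name, best_model = select_model(candidates, train, cv)

# The test set was not used for fitting or selection, so this is the
# number to report as the estimate of generalization error.
test_error = squared_error(best_model, test)
print(best_name, test_error)
```

The key design point is that `test` never appears inside `select_model`; it is touched only once, after the choice has been made.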