Evaluating A Learning Algorithm
Introduction
These days, I have been watching Andrew Ng’s machine learning videos on Coursera. They are awesome, so I am writing up lecture notes on them for my blog.
Evaluating A Learning Algorithm
1. Deciding what to try next
Typical actions to take when a hypothesis has a large error:
- Get more training examples.
- One thing you should know: sometimes getting more data doesn’t actually help.
- Try smaller sets of features.
- Try getting additional features.
- Try adding polynomial features (\(x_1^2, x_2^2, x_1 x_2\), etc.).
- Try increasing or decreasing the regularization parameter \(\lambda\).
Many people pick one of the actions above at random to improve performance. Diagnostics of a learning algorithm give you insight into which action to try next.
2. Evaluating a hypothesis
- Split the dataset into a training set, a cross validation set, and a test set.
- Compute training error.
- Compute test error.
- Compute the misclassification error.
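For a classifier, the misclassification error is simply the fraction of examples that the hypothesis labels incorrectly. Here is a minimal sketch in plain Python. The function names, the 0.5 threshold, and the five example values are my own assumptions for illustration, not something taken from the lecture.

```python
def threshold_predict(probabilities, threshold=0.5):
    """Turn predicted probabilities h(x) into 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def misclassification_error(predictions, labels):
    """Fraction of examples whose predicted label differs from the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    mistakes = sum(1 for p, y in zip(predictions, labels) if p != y)
    return mistakes / len(labels)


# Hypothetical classifier outputs on a 5-example test set (made-up numbers).
test_probabilities = [0.9, 0.2, 0.6, 0.4, 0.8]
test_labels = [1, 0, 0, 0, 1]

test_predictions = threshold_predict(test_probabilities)
test_error = misclassification_error(test_predictions, test_labels)
print(test_error)
```

The same function works for the training error: just pass in the predictions and labels of the training set instead.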
3. Model selection and Train/Valid/Test sets
- Typical dataset split ratio: 0.6 (training set), 0.2 (cross validation set), 0.2 (test set).
- How to select a model?
- Train several models on the training set.
- Choose the best model based on its cross validation error.
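The parameters of each model are fitted on the training set, and the best model is chosen using the cross validation set. The test set is used in neither step, so the test error of the chosen model is a fair estimate of its generalization error.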
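The whole procedure can be sketched in a few lines of plain Python. Everything concrete here is my own assumption for illustration: the synthetic data, the toy model \(h(x) = \theta x\) with a closed-form regularized fit, and the list of candidate \(\lambda\) values. The lecture selects among models in the same way, but this is not its exact example.

```python
import random


def split_dataset(examples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle, then split into training / cross validation / test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_cv = round(len(shuffled) * cv_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_cv],
            shuffled[n_train + n_cv:])


def fit_theta(train, lam):
    """Closed-form fit of the toy model h(x) = theta * x with L2 penalty lam."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)


def squared_error(theta, examples):
    """Average squared error 1/(2m) * sum((theta*x - y)^2), without the penalty."""
    return sum((theta * x - y) ** 2 for x, y in examples) / (2 * len(examples))


# Synthetic data: y = 2x plus Gaussian noise (made up for this sketch).
rng = random.Random(42)
data = [(i / 10, 2.0 * (i / 10) + rng.gauss(0, 1)) for i in range(1, 101)]

train_set, cv_set, test_set = split_dataset(data)

candidate_lambdas = [0.0, 0.1, 1.0, 10.0, 100.0]
# 1) Train one model per candidate, using the training set only.
models = {lam: fit_theta(train_set, lam) for lam in candidate_lambdas}
# 2) Choose the candidate with the lowest cross validation error.
cv_errors = {lam: squared_error(theta, cv_set) for lam, theta in models.items()}
best_lambda = min(cv_errors, key=cv_errors.get)
# 3) Report the test error of the chosen model only.
test_error = squared_error(models[best_lambda], test_set)

print(len(train_set), len(cv_set), len(test_set))
print(best_lambda, test_error)
```

Note that the test set is touched exactly once, at the very end. If you used the test error to pick \(\lambda\), it would no longer be a fair estimate of how the model generalizes.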