Classification
In binary classification, the target takes only two values: $y \in \{0, 1\}$. (When $y$ can take more than two values, $y \in \{0, 1, \dots, n\}$, that's called a multiclass classification problem; we will discuss it later.)
So we use a model called Logistic Regression, whose hypothesis always takes values between 0 and 1. That is to say, $0 \le h_\theta(x) \le 1$.
Hypothesis Representation
- Logistic Regression: $h_\theta(x) = g(\theta^T x)$, where $g$ is the sigmoid (logistic) function $g(z) = \frac{1}{1 + e^{-z}}$.
- Interpretation of the hypothesis output: the value of $h_\theta(x)$ equals the estimated probability that $y = 1$ on input $x$, parameterized by $\theta$. That is to say, $h_\theta(x) = P(y = 1 \mid x; \theta)$, and therefore $P(y = 0 \mid x; \theta) = 1 - P(y = 1 \mid x; \theta)$.
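As a quick sketch of the hypothesis above (plain Python; the helper names `sigmoid` and `hypothesis` are mine, not from the course):

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); output is always in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x), with theta and x as plain lists."""
    z = sum(t_j * x_j for t_j, x_j in zip(theta, x))
    return sigmoid(z)

# g(0) = 0.5, so theta^T x = 0 gives an estimated probability of exactly 0.5.
print(sigmoid(0.0))                          # 0.5
print(hypothesis([0.5, -1.0], [1.0, 2.0]))   # g(-1.5), roughly 0.182
```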
Decision Boundary
The decision boundary is the set of points where $h_\theta(x) = 0.5$, i.e. where $\theta^T x = 0$: since $g(z) \ge 0.5$ exactly when $z \ge 0$, we predict $y = 1$ whenever $\theta^T x \ge 0$, and $y = 0$ otherwise.
Emphasis: the decision boundary is a property of the hypothesis function and its parameters, not of the training set.
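A minimal sketch of that boundary rule (the `predict` helper and the parameter values below are illustrative, not from the notes): since $g(z) \ge 0.5$ exactly when $z \ge 0$, predicting $y = 1$ reduces to checking the sign of $\theta^T x$.

```python
def predict(theta, x):
    """Predict y = 1 when theta^T x >= 0, i.e. when h_theta(x) >= 0.5."""
    z = sum(t_j * x_j for t_j, x_j in zip(theta, x))
    return 1 if z >= 0 else 0

# Example: theta = [-3, 1, 1] (intercept first) gives the boundary x1 + x2 = 3.
theta = [-3.0, 1.0, 1.0]
print(predict(theta, [1.0, 1.0, 1.0]))  # 1 + 1 < 3  -> predict 0
print(predict(theta, [1.0, 2.0, 2.0]))  # 2 + 2 >= 3 -> predict 1
```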
Logistic Regression: How to Fit the Parameters $\theta$
- Cost function of Logistic Regression:
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big) \log \big( 1 - h_\theta(x^{(i)}) \big) \right]$
- Gradient Descent: to minimize $J(\theta)$,
Repeat {
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big) x_j^{(i)}$
} (simultaneously updating all $\theta_j$)
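The cost function and update rule can be sketched as batch gradient descent in plain Python (the function names and the tiny toy data set are mine, chosen only to show the cost decreasing):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Logistic hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def cost(theta, X, y):
    """J(theta) = -(1/m) * sum[ y*log(h) + (1-y)*log(1-h) ]."""
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        p = h(theta, xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return -total / m

def gradient_step(theta, X, y, alpha):
    """One simultaneous update of every theta_j:
    theta_j := theta_j - alpha * (1/m) * sum[(h(x_i) - y_i) * x_i[j]]."""
    m = len(y)
    errors = [h(theta, xi) - yi for xi, yi in zip(X, y)]
    return [t - alpha / m * sum(e * xi[j] for e, xi in zip(errors, X))
            for j, t in enumerate(theta)]

# Tiny 1-feature example (first column is the intercept term 1).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 0, 1, 1]
theta = [0.0, 0.0]          # initial cost here is log(2), about 0.693
for _ in range(100):
    theta = gradient_step(theta, X, y, alpha=0.5)
print(cost(theta, X, y))    # noticeably lower than the initial cost
```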
And we can see that this algorithm looks identical to linear regression!
But actually their hypotheses are different:
Linear Regression: $h_\theta(x) = \theta^T x$
Logistic Regression: $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
Besides, we should "…use a vectorized implementation, so that a vectorized implementation can update all of these $n+1$ parameters all in one fell swoop."
Advanced Optimization
- Gradient Descent
- Conjugate Gradient
- BFGS
- L-BFGS
- …
- Advantages: no need to manually pick the learning rate $\alpha$; often faster than gradient descent.
- Disadvantages: more complex.
Multi-class classification: One-vs-all
- For example, to solve a three-class problem, we can "turn this into three separate two-class classification problems."
- On a new input $x$, to make a prediction, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$.
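The prediction step above can be sketched as follows (pure Python; the per-class parameter vectors are made-up stand-ins for the three binary classifiers you would already have trained):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Binary logistic hypothesis h_theta(x) = g(theta^T x)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def predict_one_vs_all(thetas, x):
    """thetas[i] parameterizes the binary classifier for class i;
    return the class whose classifier reports the highest probability."""
    probs = [h(theta, x) for theta in thetas]
    return max(range(len(probs)), key=lambda i: probs[i])

# Three made-up two-class classifiers for a 3-class problem (intercept first).
thetas = [
    [ 2.0, -1.0, -1.0],   # classifier for class 0
    [-1.0,  2.0, -1.0],   # classifier for class 1
    [-1.0, -1.0,  2.0],   # classifier for class 2
]
x = [1.0, 0.2, 1.5]       # new input; first component is the intercept 1
print(predict_one_vs_all(thetas, x))  # 2
```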