MY MEMO

[MACHINE LEARNING] Feature Scaling and Normal Equation

l_j_yeon 2017. 3. 29. 20:57

+) This post is based on the lectures and content in the Coursera (https://www.coursera.org/) Machine Learning class, taught by the professor.

+) You can derive each new algorithm from the original gradient descent update rule, so you will find that the resulting functions are equivalent.

Feature Scaling

We can speed up gradient descent by having each of our input values in roughly the same range.

−1 ≤ x(i) ≤ 1

or

−0.5 ≤ x(i) ≤ 0.5

These aren't exact requirements; we are only trying to speed things up. The goal is to get all input variables into roughly one of these ranges, give or take a few.

Two techniques to help with this are feature scaling and mean normalization. Feature scaling involves dividing the input values by the range (i.e. the maximum value minus the minimum value) of the input variable, resulting in a new range of just 1. Mean normalization involves subtracting the average value of an input variable from its values, resulting in a new average of just 0.

Formula:

xi := (xi − μi) / si

where μi is the average of all the values for feature (i) and si is the range of values (max − min); si can also be the standard deviation.

For example, if xi represents price with mean μi = 1000 and range si = 1900, then xi := (price − 1000) / 1900.
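The formula above can be sketched in a few lines of NumPy; the price values below are made-up illustrative data, not from the lecture:

```python
import numpy as np

# Hypothetical feature column: prices (illustrative values only)
x = np.array([100.0, 400.0, 1000.0, 1600.0, 2000.0])

mu = x.mean()          # average of the feature (the μi above)
s = x.max() - x.min()  # range, max - min (the si above)

# Mean normalization with range scaling: values land roughly in [-0.5, 0.5]
x_scaled = (x - mu) / s
print(x_scaled)
```

After scaling, the new feature has mean 0 and a total range of 1, which is within the rough target ranges mentioned above.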




Learning Rate

To summarize:

If α is too small: slow convergence.

If α is too large: J(θ) may not decrease on every iteration and thus may not converge.
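Both failure modes can be seen on a toy objective J(θ) = θ²; the α values and step count below are arbitrary choices for illustration:

```python
def gradient_descent(alpha, theta=10.0, steps=50):
    """Minimize J(theta) = theta**2 with a fixed learning rate alpha."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta  # gradient of theta**2 is 2*theta
    return theta

small = gradient_descent(alpha=0.01)  # converges, but slowly
good = gradient_descent(alpha=0.1)    # converges much faster
large = gradient_descent(alpha=1.5)   # overshoots: |theta| grows every step
print(small, good, large)
```

With α = 1.5 each update multiplies θ by (1 − 2α) = −2, so the iterates diverge instead of settling at the minimum θ = 0.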



Features and Polynomial Regression


Polynomial Regression

Our hypothesis function need not be linear (a straight line) if that does not fit the data well.

We can change the behavior or curve of our hypothesis function by making it a quadratic, cubic or square root function (or any other form).

One important thing to keep in mind is that if you choose your features this way, then feature scaling becomes very important: e.g. if x1 has range 1–1000, then the range of x1² becomes 1–1,000,000 and that of x1³ becomes 1–1,000,000,000.
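A minimal sketch of building polynomial features from a single input and then rescaling them (the x1 values are hypothetical):

```python
import numpy as np

x1 = np.array([1.0, 10.0, 100.0, 1000.0])  # original feature, range 1-1000

# Polynomial features: the column ranges explode to 1e6 and 1e9
X = np.column_stack([x1, x1**2, x1**3])

# Divide each column by its range (after subtracting its mean)
# so all features end up on a comparable scale for gradient descent
X_scaled = (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(X_scaled.round(3))
```

After scaling, every column spans a total range of 1, regardless of the original powers.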



Normal Equation


The normal equation solves for θ analytically in one step:

θ = (XᵀX)⁻¹ Xᵀy

There is no need to do feature scaling with the normal equation.

The following is a comparison of gradient descent and the normal equation:

| Gradient Descent          | Normal Equation                              |
| ------------------------- | -------------------------------------------- |
| Need to choose alpha      | No need to choose alpha                      |
| Needs many iterations     | No need to iterate                           |
| O(kn²)                    | O(n³), need to calculate the inverse of XᵀX  |
| Works well when n is large| Slow if n is very large                      |


If XᵀX is noninvertible, the common causes might be:
  • Redundant features, where two features are very closely related (i.e. they are linearly dependent)
  • Too many features (e.g. m ≤ n). In this case, delete some features or use "regularization" (to be explained in a later lesson).

Even then, the 'pinv' (pseudo-inverse) function will still give you a value.
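The normal equation maps directly onto NumPy's `np.linalg.pinv`; the design matrix and targets below are a made-up toy example:

```python
import numpy as np

# Toy design matrix with an intercept column of ones (hypothetical data)
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])  # exactly y = 2 * x, zero intercept

# Normal equation: theta = pinv(X'X) X'y
# pinv returns a sensible answer even when X'X is noninvertible
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
print(theta)  # approximately [0., 2.]
```

Because pinv computes the Moore–Penrose pseudo-inverse, this line keeps working when redundant features make XᵀX singular, which is why the lecture recommends it over a plain inverse.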

