MY MEMO

[MACHINE LAERNING] Basic Conception of the Machine Learning and Cost Function 본문

MACHINE LEARNING/Stanford University

[MACHINE LAERNING] Basic Conception of the Machine Learning and Cost Function

l_j_yeon 2017. 3. 29. 20:43

+) this post is based on the lecture and content in the coursera(https://www.coursera.org/) machine learning class 

(professor)


Supervised Learning

=> regression problem : we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. 

=> classification problem : we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

Example 1:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture. (continuous output)

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.(discrete output)



Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

+) 구분하려고 하는 각 class에 대한 아무런 지식이 없는 상태에서 분류 (classify) 하는 것이므로 자율학습 (Unsupervised Learning) 에 해당한다.

Non-clustering: The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).



Cost Function

We can measure the accuracy of our hypothesis function by using a cost function. This takes an average difference (actually a fancier version of an average) of all the results of the hypothesis with inputs from x's and the actual output y's.

A contour plot is a graph that contains many contour lines. A contour line of a two variable function has a constant value at all points of the same line. An example of such a graph is the one to the right below.


Gradient Descent

+) to find the minimum value

The way we do this is by taking the derivative (the tangential line to a function) of our cost function. The slope of the tangent is the derivative at that point and it will give us a direction to move towards. We make steps down the cost function in the direction with the steepest descent. The size of each step is determined by the parameter α, which is called the learning rate.

θj:=θjαθjJ(θ0,θ1


+) how can you find alpha

"Batch" Gradient Descent

: "Batch" : Each step of gradient descent uses all the training examples.

Comments