Machine Learning Level 0


By Sridhar Gangavarapu, MBA, MS.

Identifying patterns, finding anomalies, solving puzzles, and working out the probability of winning a card game are things I have loved doing since middle school. Weekly and monthly magazines published such puzzles and pattern-recognition quizzes, and I found myself hooked on solving them.

While studying Computer Science as an undergraduate, I took a course in artificial intelligence. At the time I found it quite boring, probably because we saw no practical applications of it.

I worked as an Enterprise Software Developer for over a decade. During that time I felt the need to solve problems before they occurred. A few possible solutions were identifying patterns and forecasting future performance based on the "trend". Until a couple of years ago I did not know much about Machine Learning. But once I learned about it, it was love at first sight.

Thanks to coursera.com and udemy.com for their wonderful courses. I recommend taking one or more of the ML courses to understand the fundamentals. The Statistics course in my MBA helped me tremendously in understanding various ML concepts. There are also several other free sources of ML knowledge on the Internet.

Here is a high-level diagram of creating an ML model, improving it, and then producing the final test results:




The following is just a summary of what I have learned over the past months. It is definitely not a comprehensive list, but it gives some insight.

Steps

0) Learn Python programming. R could also be useful, but Python scales well and runs on virtually any environment.

1) Learn to use Pandas or SFrame. I personally feel that SFrame is much richer in functionality and, most importantly, does not require the dataset to fit in memory. Pandas is in-memory, which means your RAM size limits how big the input dataset can be.

2) Understand the concepts of supervised and unsupervised learning.
The former uses past data with known outcomes to predict the future, while the latter just throws the data at the wall to identify clusters or some commonality. The latter could be used for targeted marketing.

3) Supervised learning:
Historical input data is used to predict the future. The data is split into a training set and a testing set. The regression model is fitted on the training set and then evaluated on the testing set to measure its error. A good model has low error on both the training set and the testing set; in other words, it predicts well on data it has not seen.

Linear Regression: tries to capture the relationship between an independent variable x and a dependent variable y, where y = f(x) = mx + c
m = slope of the fitted line; c = y-intercept.
The fitted f(x) is chosen using a cost function (most commonly RSS = Residual Sum of Squares). The idea is that the line with the lowest cost is the best fit.

A simple example commonly used is predicting home prices given square footage of the house.
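Here is a minimal sketch of that home-price example, assuming made-up square footage and price figures. The helper names `rss` and `fit_line` are my own for illustration; `fit_line` uses the standard closed-form least-squares formulas for the slope m and intercept c, which give the line with the lowest RSS.

```python
def rss(xs, ys, m, c):
    """Residual Sum of Squares for the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

def fit_line(xs, ys):
    """Least-squares slope and intercept, i.e. the (m, c) that minimises RSS."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c

# Made-up numbers: square footage and sale price (in thousands)
sqft = [1000, 1500, 2000, 2500, 3000]
price = [200, 270, 340, 390, 480]

m, c = fit_line(sqft, price)
best_cost = rss(sqft, price, m, c)

# Any other line costs more, for example one with a slightly steeper slope
steeper_cost = rss(sqft, price, m + 0.01, c)

print(f"m = {m:.3f}, c = {c:.1f}, RSS = {best_cost:.1f}")
print(f"RSS with a steeper slope = {steeper_cost:.1f}")
print(f"predicted price for 1800 sq ft = {m * 1800 + c:.1f}")
```

Nudging m or c away from the fitted values always raises the RSS, which is what "lowest cost is the best fit" means in practice.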

Note: More than one independent variable could be used in the regression.
Also, f(x) could be a quadratic or a higher-degree polynomial function. But we need to watch for over-fitting (chasing a lower training RSS with an overly complex model) and under-fitting (a model too simple to capture the pattern).
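To see over-fitting and under-fitting in numbers, here is a rough sketch that fits polynomials of different degrees with NumPy's `np.polyfit`. Everything about the data is an assumption: it is a made-up quadratic trend with random noise, and every fourth point is held out for testing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: a quadratic trend plus noise
x = np.linspace(-3, 3, 40)
y = 0.5 * x**2 - x + 2 + rng.normal(0, 1.0, size=x.size)

# Hold out every fourth point as the test set
test_mask = np.arange(x.size) % 4 == 0
x_train, y_train = x[~test_mask], y[~test_mask]
x_test, y_test = x[test_mask], y[test_mask]

def rss(coeffs, xs, ys):
    """Residual Sum of Squares of a fitted polynomial on the given points."""
    return float(np.sum((ys - np.polyval(coeffs, xs)) ** 2))

# degree -> (training RSS, test RSS)
results = {}
for degree in (1, 2, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (rss(coeffs, x_train, y_train), rss(coeffs, x_test, y_test))
    print(f"degree {degree}: train RSS = {results[degree][0]:.2f}, "
          f"test RSS = {results[degree][1]:.2f}")
```

The training RSS can only go down as the degree goes up, so it cannot be used alone to pick a model. The straight line (degree 1) under-fits this curved data, while a high degree chases the noise; the test RSS is the number to watch.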








Coming soon...

> Classification, Sentiment Analysis
> Clustering, K-Means 
> Recommendation system











