Recommendation Engine

By: Sridhar Gangavarapu

I came across the question "How do you build an Amazon-style recommendation engine?" and was very curious about the answer. It is a very relevant question for today's service industry. I spent quite some time over several weekends learning the why, what, and how of recommendation engines.

Recommendation: It is the ability to propose/suggest a set of related products/services/ideas based on historical evidence.

Personalization is all about providing content relevant to the user's context, need(s) and interest(s).

Again, the recommendation should be based on the current context. Otherwise, working through millions of records may still produce irrelevant recommendations. The context can also include the state of the user (habits, age, gender, etc.). Habits change, so the context needs to stay current (right now vs. 6 years ago) and be re-evaluated periodically.

The best-known machine learning use cases for recommendations are:

YouTube videos
Facebook users
Netflix movies
Drug interactions

Solution:

Popularity of the product: How popular is the item compared to other products?

This is based on reviews, purchase counts, how much the item is talked about, etc., first across the entire population and then within a subset/sample of the population that is closer to your context.

Example: Let's consider this week's top blockbuster movie list. The list is based on how people reviewed the movies released that week. The order is the same for you and me, but what if our interests are different? Then the list makes less sense to some people, because no personalization went into the algorithm that generated it.
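A popularity ranking of this kind can be sketched in a few lines of Python. The watch log, user names, and movie names here are invented for illustration, and counting views is only one possible popularity signal (reviews or ratings would work the same way). The point is that every user gets the same list.

```python
from collections import Counter

# Hypothetical watch log: (user, movie) pairs, invented for illustration.
watch_log = [
    ("alice", "movie_a"), ("bob", "movie_a"), ("carol", "movie_a"),
    ("alice", "movie_b"), ("dave", "movie_b"),
    ("bob", "movie_c"),
]

def top_items(log, n=2):
    """Rank items by how many times they appear in the log."""
    counts = Counter(item for _user, item in log)
    return [item for item, _count in counts.most_common(n)]

# The same list is returned no matter who is asking: no personalization.
popular = top_items(watch_log)
print(popular)  # ['movie_a', 'movie_b']
```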

Classification model:
Some customization needs to be added on top of popularity to determine whether a specific product would interest you. The classification is based on a feature set used to decide whether a product is your type. The features can include, but are not limited to, user info, product information, and (recent) purchase history. This works well on Amazon for users who have enough recent purchase history and an accurate profile. If age, gender, location, etc. are incorrect or missing, the recommendations will be correspondingly poor.

But the classification model has its own limitations. As we have seen, it relies heavily on the feature input. For example, if Amazon does not know your age or gender, or if the product description is unclear or the product is poorly classified, the model has little to work with. Collaborative filtering is one concept that reduces this ambiguity in the classification model.

Collaborative filtering: It uses a co-occurrence matrix to provide insight based on past purchase history.

In simple terms: people who bought item X also bought A, B, and C.


Co-occurrence matrix: It is a symmetric matrix with the same number of rows and columns. Each cell holds the number of times the row product and the column product were purchased together.
The co-occurrence matrix below shows iPhone purchases against other things purchased at the same time. A person who bought an iPhone is less likely to buy diapers. These are not real numbers, just an example to demonstrate the model.
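As a rough sketch of how such a matrix could be built, the Python below counts item pairs across shopping baskets. The baskets, the item names, and the helper names `build_cooccurrence` and `also_bought` are all hypothetical, chosen only to mirror the iPhone vs. diapers idea.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical baskets (items bought together); not real purchase data.
baskets = [
    {"iphone", "iphone_case", "charger"},
    {"iphone", "iphone_case"},
    {"iphone", "charger"},
    {"diapers", "baby_wipes"},
    {"iphone", "diapers"},
]

def build_cooccurrence(baskets):
    """Count, for every pair of items, how many baskets contain both."""
    matrix = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            matrix[a][b] += 1
            matrix[b][a] += 1  # mirror the count so the matrix stays symmetric
    return matrix

def also_bought(item, matrix, n=3):
    """Items most often bought with `item`; ties are broken alphabetically."""
    row = matrix[item]
    return sorted(row, key=lambda other: (-row[other], other))[:n]

cooc = build_cooccurrence(baskets)
print(also_bought("iphone", cooc))  # ['charger', 'iphone_case', 'diapers']
```

Reading one row of the matrix and sorting it by count is all it takes to answer "people who bought X also bought ...".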




But let's assume that one of the co-occurring items is very popular and drowns out all other recommendations. In that situation the recommendations are skewed toward the most popular item. For example, take a cool toy like the Fidget Spinner. Everyone, including me, bought one. But what relevance does that have to purchasing an iPhone case? This problem needs to be addressed with normalization, to dampen the effect of these super-hot products. The normalization could also be weighted to add a personalization touch.

The idea is to weight each co-occurring item by (number of people who bought both A and B) / (number of people who bought A or B), a ratio known as the Jaccard similarity. In the case of the Fidget Spinner, the denominator would be gigantic, making the resulting number very small. For genuinely related products the result stays appropriately high. This is all well and good, but it does not work very well for a newbie who has no purchase history.
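The effect of this normalization can be seen in a small Python sketch. The buyer sets below are made up: the fidget spinner is bought by everyone, so its raw overlap with iPhone buyers (4) beats the iPhone case (3), yet its normalized score ends up lower.

```python
# Hypothetical data: the set of buyers for each item (user ids are made up).
buyers = {
    "iphone":         {1, 2, 3, 4},
    "iphone_case":    {1, 2, 3},
    "fidget_spinner": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12},
}

def jaccard(a, b):
    """|bought A AND B| / |bought A OR B|; 0.0 if nobody bought either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

case_score = jaccard(buyers["iphone"], buyers["iphone_case"])        # 3 / 4
spinner_score = jaccard(buyers["iphone"], buyers["fidget_spinner"])  # 4 / 12
print(case_score, spinner_score)
```

With these numbers the iPhone case scores 0.75 and the fidget spinner about 0.33, so the case ranks first even though more iPhone buyers own a spinner.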

Matrix factorization enables us to take the user's likes and dislikes into consideration, describe each product in the same terms, and then blend the two to get a personalized recommendation.
Let's take an example of product recommendation based on user interest:

Product categories: [Baby, Phones, Furniture, Grocery, Electronics]
User interest vector U = [10, 1, 2, 4, 5]: this person is far more interested in Baby products than in Phones.

Product vector P = [0, 0, 3, 1, 2]. U · P = 10×0 + 1×0 + 2×3 + 4×1 + 5×2 = 0 + 0 + 6 + 4 + 10 = 20. This gives us an idea that this product is not of great interest, as it is not a baby product.
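The score above is a plain dot product, which is easy to check in Python. The vectors U and P come from the example; `baby_product` and the `score` helper are hypothetical additions, included only to show what a higher-scoring product looks like for this user.

```python
categories = ["Baby", "Phones", "Furniture", "Grocery", "Electronics"]
user_interest = [10, 1, 2, 4, 5]  # U from the example
product = [0, 0, 3, 1, 2]         # P from the example
baby_product = [5, 0, 0, 0, 0]    # hypothetical product, for comparison only

def score(user, item):
    """Dot product of a user interest vector and a product vector."""
    return sum(u * p for u, p in zip(user, item))

print(score(user_interest, product))       # 20
print(score(user_interest, baby_product))  # 50
```

In an actual matrix factorization model these vectors would be learned from data rather than written by hand; the scoring step stays the same.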

Linear classification and feature engineering are very important aspects of recommendation. The classification model can predict whether a customer would like a product, and recommendations can be provided based on that prediction.
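To make the idea concrete, here is a minimal sketch of a linear classifier in Python, in the form used by logistic regression. Everything specific in it is an assumption: the three features, the weights, the bias, and the 0.5 threshold are invented for illustration, and a real model would learn the weights from purchase history.

```python
import math

# Hypothetical features for one (user, product) pair, each 0 or 1.
FEATURES = ["category_match", "bought_similar_recently", "price_above_usual"]
# Hypothetical weights and bias; a real model would learn these from data.
WEIGHTS = [2.0, 1.5, -1.0]
BIAS = -1.0

def like_probability(x):
    """Weighted sum of the features squashed into the range (0, 1)."""
    z = BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))

def would_like(x, threshold=0.5):
    """Predict 'like' when the probability reaches the threshold."""
    return like_probability(x) >= threshold

print(would_like([1, 1, 0]))  # matching category, recent similar purchase
print(would_like([0, 0, 1]))  # no match, pricier than usual
```

Feature engineering is the work of choosing what goes into `FEATURES`; with missing or wrong inputs the weighted sum has nothing useful to work with, which is the limitation described earlier.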

References:
Coursera, Udemy, and lots of study material on the internet.

