Mining Web Graphs for Recommendations

Recommendation on the Web is a general term for a type of information filtering that attempts to present information items (queries, movies, images, books, Web pages, etc.) that are likely to interest users. With the diverse and explosive growth of Web information, organizing and using that information effectively and efficiently has become increasingly critical. This is especially important for Web 2.0 applications, since user-generated information is more freestyle and less structured, which makes it harder to mine useful information from these data sources. To satisfy the information needs of Web users and improve the user experience in many Web applications, recommender systems have been widely studied and deployed.

Typically, recommender systems are based on collaborative filtering, a technique that automatically predicts the interest of an active user by collecting rating information from other similar users or items. The underlying assumption is that the active user will prefer the items that other similar users prefer. Based on this simple but effective intuition, collaborative filtering has been widely employed in large, well-known commercial systems, including product recommendation at Amazon and movie recommendation at Netflix. However, collaborative filtering algorithms require a user-item rating matrix, which contains user-specific rating preferences, to infer users' characteristics. In most cases such rating data are unavailable, since information on the Web is less structured and more diverse.
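
The intuition behind collaborative filtering can be illustrated with a small sketch. This is not taken from the report: the toy ratings, the helper names `cosine` and `predict`, and the choice of cosine similarity over co-rated items are all assumptions made for illustration. It predicts a missing rating as a similarity-weighted average of the ratings given by other users.

```python
from math import sqrt

# Hypothetical toy data (not from the report): user -> {item: rating}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(ratings, active, item):
    """Similarity-weighted average of other users' ratings for `item`.

    Returns None when no other user has rated the item, which is the
    missing-data situation the text describes.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == active or item not in their:
            continue
        w = cosine(ratings[active], their)
        num += w * their[item]
        den += abs(w)
    return num / den if den else None

# Alice has not rated "D"; the estimate leans toward Bob, whose past
# ratings resemble hers more than Carol's do.
print(predict(ratings, "alice", "D"))
```

The sketch also shows the limitation noted above: without an explicit rating matrix, `predict` has nothing to average and returns `None`.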

Fortunately, no matter what types of data sources are used for recommendations on the Web, in most cases they can be modeled as graphs of various types. If we can design a general graph recommendation algorithm, we can solve many recommendation problems. However, designing a framework for recommendations on the Web raises several challenges. The first challenge is that it is not easy to recommend latent semantically relevant results to users. Take query suggestion as an example: several outstanding issues can degrade the quality of the recommendations. The first is ambiguity, which commonly exists in natural language. For example, queries containing ambiguous terms may confuse the algorithms, so that the suggestions fail to satisfy the users' information needs.

Moreover, users tend to submit short queries consisting of only one or two terms under most circumstances, and short queries are more likely to be ambiguous. In addition, in most cases users perform a search because they have little or no knowledge of the topic they are searching for. To find satisfactory answers, users have to rephrase their queries constantly. The second challenge is how to take personalization into account. Personalization is desirable in many scenarios where different users have different information needs. For instance, Amazon.com was an early adopter of personalization technology, recommending products to shoppers on its site based on their previous purchases. Personalization not only filters out information that is irrelevant to a person, but also provides more specific information that is increasingly relevant to that person's interests.

The last challenge is that it is time-consuming and inefficient to design different recommendation algorithms for different recommendation tasks. Hence, a general framework is needed to unify the recommendation tasks on the Web. Moreover, most existing methods are complicated and require tuning a large number of parameters. In this paper, aiming to solve the problems analyzed above, a general framework is proposed for recommendations on the Web. This framework is built upon heat diffusion on both undirected and directed graphs, and has several advantages.

  • It is a general method, which can be applied to many recommendation tasks on the Web.
  • It can provide latent semantically relevant results to the original information need.
  • It provides a natural treatment for personalized recommendations.
  • The designed recommendation algorithm is scalable to very large data sets.
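
To make the idea of heat diffusion on a graph concrete, the following is a minimal sketch on an undirected graph. It is a generic explicit-step approximation, not necessarily the formulation used in the report; the toy query-URL edges, the function names `build_graph` and `diffuse`, and the parameters `alpha` and `steps` are all assumptions made for illustration. Heat is placed on the node representing the user's query and flows along edges, so nodes that are only indirectly connected to the query still receive some heat.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def diffuse(graph, sources, alpha=1.0, steps=100):
    """Approximate heat diffusion over one unit of time.

    Each small step moves heat along every edge in proportion to the
    temperature difference between its endpoints. `alpha` (assumed
    name) is the thermal conductivity; alpha / steps times the largest
    node degree should stay well below 1 for the update to be stable.
    """
    dt = 1.0 / steps
    heat = {node: float(sources.get(node, 0.0)) for node in graph}
    for _ in range(steps):
        new_heat = {}
        for node, nbrs in graph.items():
            flow = sum(heat[n] - heat[node] for n in nbrs)
            new_heat[node] = heat[node] + alpha * dt * flow
        heat = new_heat
    return heat

# Hypothetical click data: each pair links a query to a URL clicked for it.
edges = [
    ("jaguar", "u1"), ("jaguar", "u2"),
    ("jaguar car", "u1"),
    ("big cats", "u2"), ("big cats", "u3"),
    ("lion", "u3"),
    ("python", "u4"),
]
graph = build_graph(edges)

# The user's query is the only heat source.
heat = diffuse(graph, {"jaguar": 1.0})

# Rank the other queries by the heat they received.
candidates = ["jaguar car", "big cats", "lion", "python"]
ranked = sorted(candidates, key=lambda q: heat[q], reverse=True)
print(ranked)
```

In this toy graph, "lion" shares no URL with "jaguar" yet still receives a small amount of heat through "big cats", while the unconnected "python" receives none. In a sketch like this, one possible route to personalization would be to seed the heat sources from an individual user's own history; that reading is an assumption here, not a detail confirmed by the text.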

Empirical analysis on several large-scale data sets shows that the proposed framework is effective and efficient for generating high-quality recommendations.
