Commit ec33d181 authored by Winnie Uyen Nguyen

Update README.md

parent 45f5b32c
# Tuned Asymmetric SVD - Recommender System
<br />
**HIGHLIGHTS**
- I propose Asymmetric SVD with tuned hyper-parameters
- Goals/Expectations: <br />
  - Provide a suboptimal approximation with a lower running time <br />
  - Give an upper bound on the error generated by Asymmetric SVD <br />
  - Show the advantages of my algorithm on the MovieLens 100K dataset <br />
<br />
**OBJECTIVE**
In recent years, the need for more accurate recommender systems that improve user interaction and provide more personalized services on eCommerce platforms such as Amazon and Netflix has been increasing globally. The motivation is a desire to help users find products that fit their tastes and meet a variety of special needs, enhancing users’ satisfaction and loyalty. However, given the vast amounts of customer data, recommender systems face challenges in processing data robustly and accurately. This proposal focuses on designing a movie recommendation system that works for both offline and online processes, meaning that it performs well when new data is added to the original dataset. The base algorithm in my paper is Singular Value Decomposition (SVD), a matrix factorization method applied in the item-based collaborative filtering model. To address scalability issues and reduce the expensive matrix factorization steps, the Asymmetric SVD model with tuned hyper-parameters is expected to improve both prediction accuracy and run time.
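
For orientation, the prediction rule of Asymmetric SVD can be sketched as follows. This is a minimal sketch that assumes the formulation published by Koren (2008), in which a user is represented only through the items they rated; the README does not confirm that the project uses exactly this form. The function name, argument names, and the choice to treat the rated items as the implicit-feedback set are all assumptions made for illustration, and plain Python lists are used instead of an array library to keep the example dependency-free.

```python
from math import sqrt

def asymmetric_svd_predict(user_ratings, item, mu, b_user, b_item, Q, X, Y):
    """Predict one rating with an Asymmetric SVD rule (assumed: Koren, 2008).

    user_ratings: dict {item_index: rating} for the items this user rated.
    mu: global mean rating; b_user: this user's bias; b_item: list of item biases.
    Q, X, Y: per-item factor vectors, each a list of n_factors floats.
    """
    baseline_target = mu + b_user + b_item[item]
    rated = sorted(user_ratings)
    if not rated:
        # No history for this user: fall back to the baseline estimate.
        return baseline_target

    n_factors = len(Q[item])
    norm = 1.0 / sqrt(len(rated))  # the |R(u)|^(-1/2) normalization
    p_u = [0.0] * n_factors        # user vector assembled from item factors only
    for j in rated:
        residual = user_ratings[j] - (mu + b_user + b_item[j])
        for f in range(n_factors):
            # Explicit part (residual * x_j) plus implicit part (y_j).
            p_u[f] += norm * (residual * X[j][f] + Y[j][f])

    return baseline_target + sum(q * p for q, p in zip(Q[item], p_u))
```

Because no per-user factor vector is stored, a prediction for a user with new ratings only needs that user's rating dictionary, which is one reason this family of models is attractive when data arrives online.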
<br />
<br />
**DATA**
In this project, I used the “MovieLens 100K” dataset, developed by the GroupLens research lab at the University of Minnesota.
The initial dataset consisted of ~100,000 user ratings; it was reduced to ~47,000 after users and movies with a low number of ratings were removed to combat the high matrix sparsity of around 93%. My final dataset included information on userId, movieId, rating, title, genres, and year.
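
The filtering step and the sparsity figure described above can be sketched roughly as below. This is an illustration under stated assumptions, not the project's actual preprocessing: the function names and the threshold parameters are hypothetical (the README does not state the cut-offs used), and ratings are modeled as plain `(userId, movieId, rating)` tuples with at most one rating per user-movie pair.

```python
from collections import Counter

def filter_sparse(ratings, min_user_ratings, min_movie_ratings):
    """Drop ratings from users or for movies with too few ratings.

    ratings: list of (userId, movieId, rating) tuples.
    The thresholds are hypothetical parameters. This is a single pass:
    counts are taken before filtering, so some kept users or movies may
    end up slightly below the threshold afterwards.
    """
    user_counts = Counter(u for u, _, _ in ratings)
    movie_counts = Counter(m for _, m, _ in ratings)
    return [
        (u, m, r)
        for u, m, r in ratings
        if user_counts[u] >= min_user_ratings
        and movie_counts[m] >= min_movie_ratings
    ]

def sparsity(ratings):
    """Fraction of empty cells in the user-by-movie rating matrix."""
    users = {u for u, _, _ in ratings}
    movies = {m for _, m, _ in ratings}
    return 1.0 - len(ratings) / (len(users) * len(movies))
```

Calling `sparsity` before and after `filter_sparse` shows how much denser the rating matrix becomes once rarely rated movies and low-activity users are removed.
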
**MAIN PARTS**
1. *Exploratory Data Analysis and Post-Modeling:* examines the spread of movie ratings through their distributions and visualizes the popular movies in each genre. The post-modeling EDA step also helps me identify the differences between the initial ratings and the ratings predicted by the model. From that, we can judge how accurately the model performed.
2. *Recommender Modeling:* Starting with a baseline SVD model, without tuning any hyperparameters or changing the algorithm, I explore how Asymmetric SVD with tuned hyper-parameters performs in terms of accuracy and speed. Run **TuningAsym_SVD.py** first to tune the algorithm's hyper-parameters, then apply the results in **FinalResultAsym_SVD.py**, which applies the tuned Asymmetric SVD algorithm to the dataset and returns two evaluation metrics: Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).
3. [Software demonstration video](https://youtu.be/SYulEglaKlQ)
4. [Poster](https://portfolios.cs.earlham.edu/wp-content/uploads/2022/05/zdnguyen18-CS488-Capstone_Final.pdf)
5. [Paper](https://portfolios.cs.earlham.edu/wp-content/uploads/2022/05/CS488_FinalPaper.pdf)
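
The two evaluation metrics named in the Recommender Modeling step follow their standard definitions, sketched below. The function names are illustrative and are not taken from **FinalResultAsym_SVD.py**; both functions assume two equally long, non-empty sequences of actual and predicted ratings.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

def mae(actual, predicted):
    """Mean Absolute Error: the average size of the prediction error."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)
```

RMSE is never smaller than MAE on the same predictions, so a large gap between the two suggests that a few predictions are far off rather than many being slightly off.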