I pity those of you who want to get this project up and running on your local env.
If you still want to try, you'll probably want to use Ubuntu with MongoDB.
That being said, the system is adapted to a very specific set of data which I cannot distribute for legal reasons. The reason I am sharing the code is that, well, maybe someone will take some good ideas from it one of these days!
- In this particular case, the DB is MongoDB.
To accomplish this, we generate training data from the real ratings that users have given to products, as stored in the DB. We take as many userXproduct --> rating transactions as possible; for each one we extract the logged rating along with the characteristics of the user (gender, nationality and age) and of the product (average rating and category), and then feed the NN a training set of user_characteristicsXproduct_characteristics --> rating pairs.
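Below is a minimal sketch of that extraction step, assuming a Python stack with pymongo. The database, collection and field names (`recommender`, `ratings`, `users`, `products`, `avg_rating`, and so on) are illustrative, not the real schema:

```python
# Sketch of building the training set out of logged ratings.
# Assumes hypothetical collections: ratings, users, products.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["recommender"]  # hypothetical DB name

def build_training_set():
    X, y = [], []
    for r in db.ratings.find():  # each doc: {user_id, product_id, rating}
        user = db.users.find_one({"_id": r["user_id"]})
        product = db.products.find_one({"_id": r["product_id"]})
        if user is None or product is None:
            continue
        features = [
            1.0 if user["gender"] == "female" else 0.0,  # simple binary encoding
            float(user["nationality_code"]),             # pre-assigned numeric code
            float(user["age"]),                          # raw age, NOT normalized
            float(product["avg_rating"]),
            float(product["category_code"]),
        ]
        X.append(features)
        y.append(float(r["rating"]))
    return X, y
```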
This leaves our NN trained and ready to make predictions. We can now use it to fill the holes in our sparse usersXproducts rating matrix. For this we just repeat the same process, taking {user, product} combinations and calling the NN's predict() method.
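A sketch of that hole-filling loop, under the same assumptions as above; `nn.predict()` stands in for whatever NN library is actually used, and `make_features()` for the feature extraction shown earlier:

```python
# For every {user, product} pair with no logged rating, ask the trained
# NN for an estimate. Known ratings are kept as-is; only holes are filled.
def fill_rating_matrix(nn, users, products, known_ratings):
    # known_ratings: set of (user_id, product_id) pairs with a real rating
    predicted = {}
    for u in users:
        for p in products:
            if (u["_id"], p["_id"]) in known_ratings:
                continue  # keep the real rating, only fill the holes
            x = make_features(u, p)  # same 5-feature vector as in training
            predicted[(u["_id"], p["_id"])] = nn.predict([x])[0]
    return predicted
```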
In order to answer the server blazingly fast, we do the recommendation work beforehand, in a continuous batch process that runs what I just described and then stores, for each user, a list of categories with their recommendations. As mentioned before, each category holds 10 products of the category it represents. All categories are calculated, but only three are stored as final and sent back to the server: the ones whose top 10 individual estimated product ratings add up to the highest total.
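Here is how that category selection could look in the batch job. The data shapes and names are assumptions, but the logic is the one described: keep the 10 best-rated products per category, then keep the 3 categories whose top-10 sums are highest:

```python
# Per-user category ranking: every category keeps its 10 best estimated
# products, and only the 3 categories with the highest top-10 sums survive.
def top_categories(estimates_by_category, keep=3, per_category=10):
    # estimates_by_category: {category: [(product_id, estimated_rating), ...]}
    scored = []
    for category, items in estimates_by_category.items():
        top = sorted(items, key=lambda it: it[1], reverse=True)[:per_category]
        scored.append((sum(r for _, r in top), category, top))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(category, top) for _, category, top in scored[:keep]]
```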
Interestingly, I noticed that the ReLU learned much faster than other neuron models. My best guess is that this is because my inputs were not normalized to [0, 1]; instead, I was sending raw values such as age in a range of [14, 100] (and ReLU has no upper activation limit). I observed that the NN was good at determining things like ageing being a good or a bad factor depending on the product category (say, good for books, bad for videogames), also making distinctions by gender, and almost always giving a big weight to the average rating.
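A tiny illustration of that saturation point (not from the project, just to show the effect): with raw ages as input, a sigmoid neuron is pinned at ~1.0 and its gradient vanishes, while ReLU keeps passing the raw magnitude through:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

for age in (14, 40, 100):
    # sigmoid(14), sigmoid(40) and sigmoid(100) are all ~1.0, so the
    # inputs become indistinguishable and gradients vanish; relu does not
    # have this upper limit, which matches the faster learning observed.
    print(age, sigmoid(age), relu(age))
```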
It is quite impressive how a simple model like this one was capable of learning so much from a reduced set of data. It would definitely be worth taking a look at what's going on in Spotify's recommender, for instance.