The Last Book Bender

2024/01-04/2024 Python BERT Recommendation Systems ReactJS Flask

Basic Info

This was a project that I worked on in the class Data and Visual Analytics (CSE-6242) at Georgia Tech’s OMSA program, parterned with teamates Ryan Tsedevsuren, Tyler David, Terrance Patric Lenaghan, Timothy Zhang, Swathi Naik.

The Idea

We wanted to create a book webapp that allows interactive query and recommendation visualizations, built on top of the datasets goodbooks-10k and Project Gutenberg.

We want to implement both the CF (collaborative filtering) and CBF (content-based filtering) functionalities; an brief overview of what these do can be found in this Wikipedia link, or this textbook¹. We intended the user to interact with the app in the following ways:

In the first way, the users may enter a prompt, indicating their preferences, and our CBF backend API endpoint will search for top books that matches this response utilizing the bert-base-uncased model.
In the second way, the users may have an anonymized view of other users, and if their is a match in preferences, our CF backend API endpoint will find the top books that the users might also like. We tried to look for existing implementations, such as the surprise library. However we found that some sparse implementations and fine-tweaks might be needed to fit our needs, so I decided to implment it myself.
In both ways, the API endpoints are implemented in flask.
In both ways, we visualize the books as force graphs, where each book is a node. The users can drag around, zoom in/out on the force graphs, and right click to expand a node’s most similar nodes. Here we implemented the UI and force graphs functionalities via material-ui, react-force-graph React libraries.

To appear: System Design Diagram

My contributions

My contributions are summarized in the following table:

Components	Implementation	Role
Frontend - UI	`material-ui`	Main implementation
Frontend - Visualization	`react-force-graph`	Main implementation
Backend	`flask`	Exposed CF Functionality as an API endpoint
Modeling - CF	From scratch	Implemented CF from scratch, and tested different similarity metrics via 9-fold CV

Animated Screenshots

The final app has the following look (animated GIF generated from our unlisted YouTube Video Demo, produced with CLI tools yt-dlp, ffmpeg, magick):

Demo of Webapp

Issues and Rooms for Improvements

I thought the idea of project was pretty cool, but I will talk about some improvement directions and related takeaways on my side (some of which are things that I wished I had more time to spent working on on my part):

Deployment Issues: We tried to find free solutions for deploying this webapp online, but the final webapp was pretty large, so we decided to keep it as an app that can be built locally. The main reason was that the backend functionality for CBF uses the entire bert-base-uncased model, which was about 0.5GB in size (without accounting for the size of other Python libraries needed). A possible direction for improvement is to use distilled version of it (either distilbert-base-uncased or TinyBERT_General_4L_312D).
Exploration Issues: The exploration ability of the force graph is actually pretty limited - each time we generate top 5 recommendations, but we often ended up in a complete graph, and each node expands to nodes that we have seen before, making the exploration process stagnant. We do not know if this is an issue in our modeling process (whether the CBF or the CF part), our data (where we have very strong nodes dominating the whole graph).
UI/UX Issues: The view for existing users is extremely inconvenient - I implemented in a way that one can only click alone one direction to look at one book at a time. This is one part that I wanted to improve, but I was not very adept with the MUI library at that time.
RWD Design: I didn’t have time finishing the webapp design for adaptability for devices of various size, which is a crucial aspect of modern webapp development, although breakpoints for media queries were used.

Final Thoughts

I learned a lot from this project:

I used to have some “have watched some Udemy React course” React skill, and vague experience with other frontend development stuffs, but this project really pushed me to put these into works.
This is my first time reading Flask backend codes, connecting frontend apps to it, as well as designing some additional endpoints. I used to have a vague idea on how maybe Python can be used to do such things, but working with teammates that have backend skills enabled me to see how this actually works.
This is my first time learning about the BERT model. I made up my mind at that time to learn about this tool and possibly use it in the future when encountering analytics tasks involving NLP components.

Most importantly, this is my first time trying to build an end-to-end pseudo-complete data product with a team. It might be a weird to say this, but I always find it exciting and happy to see when datas “doing things”.

Back to Projects Page

Back to Resume

Aggarwal, Charu C. Recommender systems. Vol. 1. Cham: Springer International Publishing, 2016. ↩︎

Samuel Wade Wang

Title here