Movie Recommendation Engine

2024 – 2025 academic

Leaving SaaS

After a short six-month stint at the SaaS company Proactis (explained in BucketRace), I became convinced of a few things.

  1. AI was going to be almost everything in the world of SaaS.
  2. The leadership at a lot of SaaS companies had no strategy and, quite frankly, no clue what 'AI' is.
  3. Many were slow-moving and structurally incapable of capitalising on the opportunities that were coming.

Given those conclusions, I knew I needed to get out.

After a company-wide Teams call full of hot air, waffling on about AI this and AI that with zero substance, delivered shortly after firing an entire unit of people who might actually have been able to implement AI, I handed in my notice.

Within about 20 minutes.

Realising My Skills Were Unstructured

At that point, I had roughly three solid years of programming under my belt, but much of it was unstructured. I was writing ad hoc scripts, trying to build full-stack systems in Google Sheets, creating terminal apps that only worked for me, and renting VMs to host automated scripts and software that, again, only helped me. I knew I needed to learn how to build things end to end.

Learning the Foundations Properly

So I started learning Django, Docker, and Flask, and worked through Harvard’s CS50 courses. Around then, a friend told me about a Harvard data science qualification he was doing, focused on R. I signed up as well, thinking I would finish it in a few months. It took me 18.

Falling Down the Machine Learning Rabbit Hole

The main reason was the rabbit hole I went down. People who know me will tell you I can be fairly determined to understand the fundamentals, which you'll see in the report below. I did not just want to pass. I wanted to understand everything I could, including machine learning, stochastic gradient descent, decision trees, linear regression, ensemble models, and neural networks. How do these algorithms actually produce actionable data? Is it all fluff? Do they really work, and how well? How does search work? Why do Google Maps and Waze give different results? What's the best type of average? What is really going on under the hood?

The Result

After more than 20 models, 10,000 words, and 2,000 lines of code, I had my answer. The results are in the report below.

Technologies

R, GitHub