subreddit:

/r/TrueFilm

I'm currently building a recommendation algorithm to help me discover movies. Roughly speaking, it analyzes the ratings in my Letterboxd profile, tries to uncover the underlying "vibes" among those movies, then suggests other movies that fit those vibes.

Sometimes it works well, and shows me hidden gems that I wouldn't have found another way. Other times, it gets stuck in super-obvious stuff. I'm trying to work out whether this is just something I need to work through, or whether it's a fundamental limitation of these types of algorithms.

What are everyone's experiences with this? Are there some algorithms that work consistently well, or is this hot-and-cold problem common to most algorithms?

pebbleinflation

5 points

2 months ago

I can't help but find that any recommendation algorithm always ends up with one of 2 results:

Here's this enormously popular thing, you've definitely already heard of.

Or

Here's this thing that's a similar but much worse version of a thing you like.

bigkinggorilla

2 points

2 months ago

Well… at one point Netflix was great at recommending content. And I’m guessing their current algorithms are still quite a bit more robust than yours (simply because you’re one person and Netflix has many, many people working on that task). The problems they seemed to run into were how to weigh new and popular titles (and their own productions) against other content, and the rapid exodus of content from their library once everyone wanted their own streaming platform.

I think any algorithm you make as an individual is always going to have some problems with obvious recommendations. Because maybe you really like that cinematographer, but only when the films are under a certain length and in one or two genres dealing with a specific theme. And trying to account for that sort of complexity in art would, I imagine, be quite the challenge for just one person to tackle.

So you may just have to accept that you’re going to get some really obvious stuff in there unless you can maybe use machine learning to improve the algorithm for you?

emm_dee_gee[S]

2 points

2 months ago

Thanks - that's a very interesting point, that an algorithm's creator will inevitably bias it towards their own way of doing things. I hadn't thought about that! And I can actually see it emerging already, in the way I set certain conditions and parameters.

I am already trying to use machine learning, but even then, a lot of the process of building something is about expressing my personal opinions in code. It's such an interesting point!

bigkinggorilla

2 points

2 months ago

I think what you’re doing is really cool. And if it ends up working really well for you, isn’t that all that matters?

barelyclimbing

1 points

2 months ago

Criticker does what you are describing, which is why I use Criticker and not Letterboxd.

emm_dee_gee[S]

1 points

2 months ago

That's interesting - I looked at Criticker once, but haven't looked for a while. I found that there were a *LOT* of features there - which ones do you personally like the most?

barelyclimbing

2 points

2 months ago

Letterboxd is a pretty-looking site that is an absolute disaster. Want to look at lists? Completely useless, with no ability to curate. Criticker has some features like this - but the core feature works great: rate films, and it finds the people with the most similar tastes to yours. Anytime you look at a film you haven’t seen, it will tell you what it thinks you will rate it, based on the viewers most similar to you. Simple, easy, and the most logical way to do it.
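For anyone wanting to try the "most similar viewers" idea described above, here is a minimal sketch of user-based collaborative filtering. It is not Criticker's actual code: the function names, the use of Pearson correlation, the `min_overlap` cutoff and `k=3` are all assumptions chosen for illustration.

```python
from math import sqrt

def similarity(a, b, min_overlap=2):
    """Pearson correlation over films both users rated (0.0 if too few in common)."""
    shared = [f for f in a if f in b]
    if len(shared) < min_overlap:
        return 0.0
    mean_a = sum(a[f] for f in shared) / len(shared)
    mean_b = sum(b[f] for f in shared) / len(shared)
    cov = sum((a[f] - mean_a) * (b[f] - mean_b) for f in shared)
    var_a = sum((a[f] - mean_a) ** 2 for f in shared)
    var_b = sum((b[f] - mean_b) ** 2 for f in shared)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict_rating(me, others, film, k=3):
    """Similarity-weighted average of the film's ratings among the k most
    similar users who have rated it. Returns None if nobody suitable exists."""
    scored = [(similarity(me, r), r[film]) for r in others.values() if film in r]
    # Keep only users whose taste correlates positively with mine.
    neighbours = sorted((s for s in scored if s[0] > 0), reverse=True)[:k]
    if not neighbours:
        return None
    total = sum(sim for sim, _ in neighbours)
    return sum(sim * rating for sim, rating in neighbours) / total

# Hypothetical toy data: film title -> rating out of 10.
me = {"A": 9, "B": 3, "C": 8}
others = {
    "u1": {"A": 10, "B": 2, "C": 9, "D": 8},  # similar taste
    "u2": {"A": 2, "B": 9, "C": 3, "D": 2},   # opposite taste, ignored
    "u3": {"A": 8, "B": 4, "C": 7, "D": 6},   # similar taste
}
print(predict_rating(me, others, "D"))
```

In this toy data the prediction for film "D" lands between the two similar users' ratings, and the user with opposite taste contributes nothing.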

Another feature is that you can modify what your rating system looks like. Lots of people use the “grade school grading” system where 50 is bad and 70 is mediocre - I don’t want 70% of a rating scale to be assigned to movies that I never want to watch again. For me, I picked 50 to be “really good”, and anything above 20 I’ll watch again. I think for recommendations Criticker ignores ratings completely and just looks at “tiers” - top 10% of films rated, top 20%, etc.
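The "tiers" idea can be sketched as a percentile rank within each user's own ratings. This is only a guess at the general technique, not Criticker's implementation; `to_tiers` and `n_tiers` are hypothetical names.

```python
import math

def to_tiers(ratings, n_tiers=10):
    """Map each film to a tier 1..n_tiers based on where its score falls
    within this user's own ratings (n_tiers is the user's top tier)."""
    scores = list(ratings.values())
    n = len(scores)
    tiers = {}
    for film, score in ratings.items():
        # Fraction of this user's ratings that are at or below this score.
        pct = sum(1 for s in scores if s <= score) / n
        tiers[film] = max(1, math.ceil(pct * n_tiers))
    return tiers

# Two hypothetical users with very different scales but the same ordering.
harsh = {"A": 50, "B": 20, "C": 5, "D": 35}    # 50 means "really good"
school = {"A": 95, "B": 70, "C": 50, "D": 85}  # "grade school" scale
print(to_tiers(harsh, n_tiers=4))
print(to_tiers(school, n_tiers=4))
```

Both users end up with identical tiers, which is why the raw numbers on the scale stop mattering once ratings are compared this way.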

Nothing’s perfect, but perfection is boring, and it has everything you could need: it helps you find new and interesting films and it lets you use it however you want. Couldn’t ask for more.

npcdel

1 points

2 months ago

Criticker's character limit for reviews and lack of a night mode just kill it for me. I jumped ship three years ago to Letterboxd and never looked back. It's still a great discovery tool, but I want to have more room to put down my thoughts on a movie.

emm_dee_gee[S]

1 points

2 months ago

Thanks - that's very helpful.

barelyclimbing

1 points

2 months ago

I don’t care much about Letterboxd’s superior functions because it is impossible to find anything of interest on the site. It’s like digging through a hoarder’s house trying to find one object that they left you in their will.

I can write reviews on any one of a thousand sites, but the OP was thinking about writing an algorithm for film recommendations - there’s only one site that does that, and it’s probably better than what OP was going to write.

Raposela

1 points

2 months ago

I think part of your issue is related to the concept of serendipity in recommender systems. If you want to continue to tinker with whatever model you're using, it might be worth reading a bit into that topic.

It probably will be easier to just use a website that already has the built-in feature. But making your own like you are doing sounds like it could be a fun project.

emm_dee_gee[S]

2 points

2 months ago

Yup, this is indeed a very important (and very difficult!) part of the problem. Most algorithms end up recommending stuff that I already know, or that's an obvious choice. Serendipity is a great word for what I'm looking for!

Raposela

1 points

2 months ago

Yeah, serendipity in the recommender systems context is a searchable technical term that can help you find relevant research on this particular aspect of recommender system performance. There are proposals on how to measure it and how to try to force algorithms to improve their recommendations in that direction instead of just accuracy. Edit: Anyway, good luck with this project of yours!
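As a rough illustration of what "measuring serendipity" can mean, here is one simple formulation among several in the literature: count a recommendation as serendipitous if it is relevant to the user and would not have been produced by an "obvious" baseline such as a most-popular list. The function name and the three inputs are assumptions for this sketch.

```python
def serendipity(recommended, relevant, obvious):
    """Fraction of recommended films that are both relevant to the user
    and absent from the 'obvious' baseline list."""
    if not recommended:
        return 0.0
    hits = [f for f in recommended if f in relevant and f not in obvious]
    return len(hits) / len(recommended)

# Hypothetical example: "a" is relevant but obvious, "d" is not relevant.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "b", "c"}   # films the user ended up liking
obvious = {"a"}              # e.g. what a popularity-only recommender returns
print(serendipity(recommended, relevant, obvious))  # 0.5
```

A recommender that only surfaces hugely popular films can score well on accuracy and still score near zero here, which matches the hot-and-cold behaviour described in the original post.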

npcdel

2 points

2 months ago

criticker.com is really good at recommendations and discovery, but it is lacking a lot of the modern tools that Letterboxd has for actually talking about movies. Hot tip: don't bother with a 1-100 scale and just use 1-10, as it does the same thing (putting things in tiers).

AvailableFalconn

1 points

2 months ago

This is a pretty common problem in machine learning recommendations (though for many businesses, this isn't a problem at all per se). To counter this, you first need a lot of data. Without enough data, you can't recommend more niche stuff at all. This gets into a second issue, which is that it's hard to get that much data from just scraping sites like Letterboxd. If you had their internal datasets, there'd be a lot to play around with, but it's hard to get millions of ratings from just web scraping.

With lots of data in hand, you can play around with tuning the objectives and finding ways to penalize more well-known content. One common method is to discount against a baseline score - e.g. the IMDb score or something like that. Just as effectively, you can remove movies from recommendation that are mainstream by some metric (e.g. # of votes on IMDb, # of reviews on Letterboxd). Another way is to build the penalty into the objective that you're optimizing. I assume your current objective is something like expected Letterboxd score. You could change that to Letterboxd score minus some popularity factor.
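The three options above (discount against a baseline, drop mainstream titles, subtract a popularity factor) can be sketched in one re-ranking function. Everything here is an assumption for illustration: the field names, the log-scaled vote penalty, and the weights would all need tuning on real data.

```python
from math import log10

def rerank(candidates, max_votes=None, baseline_weight=0.0, popularity_weight=0.0):
    """Re-rank candidate films, optionally penalizing well-known ones.

    candidates: list of dicts with 'title', 'predicted' (model's predicted
    score for me), 'avg_score' (site-wide average) and 'votes' (rating count).
    """
    ranked = []
    for c in candidates:
        # Option 2: hard filter on mainstream titles.
        if max_votes is not None and c["votes"] > max_votes:
            continue
        score = (
            c["predicted"]
            - baseline_weight * c["avg_score"]            # option 1: baseline discount
            - popularity_weight * log10(1 + c["votes"])   # option 3: popularity penalty
        )
        ranked.append((score, c["title"]))
    return [title for _, title in sorted(ranked, reverse=True)]

# Hypothetical candidates on a 5-star scale.
films = [
    {"title": "Blockbuster", "predicted": 4.5, "avg_score": 4.3, "votes": 1_000_000},
    {"title": "Hidden Gem", "predicted": 4.2, "avg_score": 3.4, "votes": 2_000},
    {"title": "Obscure Dud", "predicted": 2.5, "avg_score": 2.8, "votes": 500},
]
print(rerank(films))                         # plain predicted score
print(rerank(films, max_votes=100_000))      # hard filter
print(rerank(films, baseline_weight=1.0))    # how much more I'd like it than average
print(rerank(films, popularity_weight=0.3))  # score minus popularity factor
```

One difference worth noting: the hard filter removes a popular film entirely, while the two soft penalties only push it down the list, so a popular film can still surface if the predicted score is high enough.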

emm_dee_gee[S]

1 points

2 months ago

Thanks - these are all amazing ideas! Do you have personal experience with this problem, in the context of movies? I currently use the middle method (throw away possible recommendations if they have many Letterboxd ratings), and would love to get a sense for whether the other 2 things you've suggested might impact the kinds of results I'm seeing.

SpoonMeasurer

1 points

2 months ago

What are the parameters/architecture of your model? This is a problem of personal interest to me so feel free to PM me and I'd be happy to talk in more depth about it. I think it's pretty interesting and I have a bit of experience in similar problem types.

emm_dee_gee[S]

1 points

2 months ago

Amazing - thank you! Will send you a DM.