Personalized Recommendations of Manga #1220

peachblacky · 2024-12-19T14:05:52Z

Describe your suggested feature

As of now (as far as I can see from the source code), the Recommendations block in the app just suggests a bunch of random manga to scroll through.
I think it would be a great idea to use a more thoughtful algorithm for this, for example a linear recommendation model (like EASE or SANSA).
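
For readers unfamiliar with EASE, here is a minimal sketch of the idea, not code from Kotatsu or from this proposal. EASE learns an item-to-item weight matrix in closed form from a binary user-item interaction matrix: invert the regularised Gram matrix, divide each column by its diagonal entry, negate, and zero the diagonal. The function names, the toy data, and the regularisation value `lam=0.5` are assumptions made for illustration. It is written in plain Python to stay dependency-free; a real implementation would more likely use NumPy or SciPy.

```python
def gram(X, lam):
    """Return X^T X + lam * I for a users x items 0/1 matrix X."""
    n = len(X[0])
    G = [[float(sum(row[i] * row[j] for row in X)) for j in range(n)] for i in range(n)]
    for i in range(n):
        G[i][i] += lam
    return G


def invert(M):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(M)
    A = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def ease_weights(X, lam=0.5):
    """Closed-form EASE weights: B[i][j] = -P[i][j] / P[j][j], with a zero diagonal."""
    P = invert(gram(X, lam))
    n = len(P)
    return [[0.0 if i == j else -P[i][j] / P[j][j] for j in range(n)] for i in range(n)]


def recommend(user_row, B, k=3):
    """Score every item as user_row @ B and return the top-k items not yet seen."""
    n = len(B)
    scores = [sum(user_row[i] * B[i][j] for i in range(n)) for j in range(n)]
    unseen = [j for j in range(n) if not user_row[j]]
    return sorted(unseen, key=lambda j: scores[j], reverse=True)[:k]


# Toy interactions (hypothetical): rows are users, columns are manga.
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
weights = ease_weights(X)
print(recommend([1, 0, 0, 0], weights, k=1))
```

In this toy data, items 0 and 1 are always read together, so a user who has only read item 0 gets item 1 ranked first. Training is a single matrix inversion over the item-item matrix, so the cost grows with the number of titles rather than the number of users.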

Things to take into account

I am currently working as a Recommendation System Engineer, and I think I could try to approach this problem and build good recommendations that help people find new manga.
But to do this, a couple of things should be considered:

Computational resources. Are we constrained to the app itself, or does the team have some sort of server/cluster where we could set up services to train models and run an API for them?
Does the app currently collect any statistics about users? Like which manga a user has interacted with, when that happened, etc.
It would be great if someone could discuss these points with me :-)

Acknowledgements

This is not a duplicate of an existing issue. Please look through the list of open issues before creating a new one.

Koitharu · 2024-12-19T15:07:21Z

I guess using AI for recommendation is overengineering: this is a secondary functionality that will require a lot of resources. But any ideas are welcome

MariusAlbrecht · 2024-12-21T11:34:55Z

> Like which manga a user has interacted with, when that happened, etc.

I don't want this app to track my every move and then send that data somewhere on the web.

peachblacky · 2024-12-21T16:36:41Z

> > Like which manga a user has interacted with, when that happened, etc.
>
> I don't want this app to track my every move and then send that data somewhere on the web.

User data is always anonymous during statistics collection, especially when the app is open-source... So nothing will be leaked; anonymization is easy to do.

Additionally, we can just ask users for permission to collect their data for statistics.
Everybody will still get recommendations, but models will only be trained on data from those who have given their consent.

peachblacky · 2024-12-21T16:41:17Z

> I guess using AI for recommendation is overengineering: this is a secondary functionality that will require a lot of resources. But any ideas are welcome

Advanced models could indeed lead to serious resource demands.
But "AI" covers a very wide variety of options in terms of resource "weight".
There are some simple statistical models (I think some of them could even be implemented on-device).

I think that since there is already a "Recommendations" block in the app, it should at least contain a thoughtful list of titles, and not the random, duplicated stuff that can be found there now.
The current Recommendations section is practically useless for the user, in my opinion: the user still needs to sort through all the irrelevant stuff manually...

I think that such functionality will greatly encourage users to explore more manga and use Kotatsu more :)

MariusAlbrecht · 2024-12-22T19:07:15Z

> User data is always anonymous during statistics collection, especially when the app is open-source... So nothing will be leaked; anonymization is easy to do.

I'd be careful here, anonymising data can be very hard. Large sets of usage data can, depending on the circumstances, quite easily identify an individual even when the data doesn't contain anything that directly identifies said individual.

> Additionally, we can just ask users for permission to collect their data for statistics.
> Everybody will still get recommendations, but models will only be trained on data from those who have given their consent.

That, I'm happy with. Earlier it sounded like the model should be deployed on a central server with clients sending requests (including their personal usage data) to that server to get recommendations.

MariusAlbrecht · 2024-12-22T19:11:32Z

Maybe we could also just rely on the recommendations provided by some sources instead of coming up with our own?
I could, for example, imagine the following scheme to not be great but "good enough":

1. Somehow figure out a couple of topics (genres, pieces of media, whatever) the user likes. For example, look at all favourites and get the top 5 genres, and look at tracker entries and get the top 5 highest-rated ones.
2. Somehow figure out sources which could provide reasonable recommendations for those topics. For example, use the top 5 most-used sources which provide recommendations.
3. Get the recommendations for the previously determined topics from the previously determined sources.
4. Do some filtering on the results. For example, rank manga which occurred multiple times higher and consider ratings on the source site.
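
The first and last steps of this scheme can be sketched roughly as follows. This is only an illustration under assumed data shapes, not Kotatsu code: favourites are taken to be dicts with a `genres` list, and each source is assumed to return a list of `(title, rating)` pairs. The function names are hypothetical, and choosing the sources and fetching their recommendations is left out.

```python
from collections import Counter


def top_genres(favourites, n=5):
    """Return the n genres that occur most often across the user's favourites."""
    counts = Counter(genre for manga in favourites for genre in manga["genres"])
    return [genre for genre, _ in counts.most_common(n)]


def rank_candidates(results_per_source, exclude=()):
    """Merge per-source lists of (title, rating) pairs into one ranking.

    Titles returned by more sources come first; ties are broken by the best
    rating seen on any source, then alphabetically. Titles in `exclude`
    (for example, manga already in the library) are dropped.
    """
    hits = Counter()
    best_rating = {}
    for results in results_per_source:
        seen_here = set()
        for title, rating in results:
            if title in exclude:
                continue
            if title not in seen_here:  # count each title once per source
                hits[title] += 1
                seen_here.add(title)
            best_rating[title] = max(rating, best_rating.get(title, rating))
    return sorted(hits, key=lambda t: (-hits[t], -best_rating[t], t))
```

Everything here works on data the app would already hold or fetch, so nothing about the user has to leave the device.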

peachblacky · 2024-12-23T05:11:50Z

> > User data is always anonymous during statistics collection, especially when the app is open-source... So nothing will be leaked; anonymization is easy to do.
>
> I'd be careful here, anonymising data can be very hard. Large sets of usage data can, depending on the circumstances, quite easily identify an individual even when the data doesn't contain anything that directly identifies said individual.
>
> > Additionally, we can just ask users for permission to collect their data for statistics.
> > Everybody will still get recommendations, but models will only be trained on data from those who have given their consent.
>
> That, I'm happy with. Earlier it sounded like the model should be deployed on a central server with clients sending requests (including their personal usage data) to that server to get recommendations.

Well, regarding the second point: we do need to store some information about user behaviour to produce recommendations, but it might be a much smaller amount of data collection.
It would be best to just ask the user for permission to collect data and then show the recommender block.

peachblacky · 2024-12-23T05:13:06Z

> Maybe we could also just rely on the recommendations provided by some sources instead of coming up with our own? I could, for example, imagine the following scheme to not be great but "good enough":
>
> 1. Somehow figure out a couple of topics (genres, pieces of media, whatever) the user likes. For example, look at all favourites and get the top 5 genres, and look at tracker entries and get the top 5 highest-rated ones.
> 2. Somehow figure out sources which could provide reasonable recommendations for those topics. For example, use the top 5 most-used sources which provide recommendations.
> 3. Get the recommendations for the previously determined topics from the previously determined sources.
> 4. Do some filtering on the results. For example, rank manga which occurred multiple times higher and consider ratings on the source site.

Sounds interesting, but we need to hear from people who have worked with sources to understand the complexity of such a solution.

Perhaps we will still need to store some statistics, at least locally.

MariusAlbrecht · 2024-12-24T11:14:16Z

The majority of usage statistics are already being stored locally. The only thing I can think of that isn't stored is the ratings the user leaves with the tracking services.

We are also already getting recommendations from the sources. That might even mean that we don't need any new functionality in the parsers (which is important, as that'd be a lot of work).

Caellian · 2025-02-03T07:28:58Z

> Computational resources. Are we constrained to the app itself, or does the team have some sort of server/cluster where we could set up services to train models and run an API for them?

Any backend requirements can (and likely will) break the functionality in the future. Any free tier isn't enough to cover the metrics that would need to be collected (besides maybe gradually building a model over time). Hosting user information is probably out of the picture as well.

Even with cheapest hosting, assuming the total cost ends up being as low as 5 USD/mo (which it won't due to number of users), there's no guarantee the owner could/would want to continuously pay for the hosting. The moment they stop, the recommendation feature is reverted to old behavior (or broken).

So basically what @MariusAlbrecht said: something that recommends based on history/favorites by weighing most read topics and showing random top results of those.
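
As a rough sketch of what such a backend-free approach could look like: weight each genre by how much of it the user has read, score candidate titles by summing the weights of their genres, and show a random selection from the best-scoring ones. The data shapes (history entries as `(genres, chapters_read)` pairs, candidates as dicts with `title` and `genres`) and all names are assumptions for illustration, not existing Kotatsu structures.

```python
import random
from collections import Counter


def genre_weights(history):
    """Turn (genres, chapters_read) history entries into normalised genre weights."""
    totals = Counter()
    for genres, chapters_read in history:
        for genre in genres:
            totals[genre] += chapters_read
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {genre: count / grand_total for genre, count in totals.items()}


def suggest(candidates, weights, pool_size=10, k=5, rng=random):
    """Pick k random titles from the pool_size best-scoring candidates."""
    def score(manga):
        return sum(weights.get(genre, 0.0) for genre in manga["genres"])

    pool = sorted(candidates, key=score, reverse=True)[:pool_size]
    return rng.sample(pool, min(k, len(pool)))
```

Sampling from a pool rather than always showing the very top results keeps the block from looking identical on every visit, and passing a seeded `random.Random` as `rng` makes the selection reproducible for testing.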

peachblacky added the feature request label Dec 19, 2024

Caellian mentioned this issue Feb 3, 2025

Suggestions #1254
