-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add S3Fifo eviction for memory #303
Conversation
@MrCroxx Thank for you maintaining foyer. I found this library is very well designed and are currently evaluating if it could be used in https://github.com/chroma-core/chroma for various caching scenarios. Chroma is one of the most popular embedding stores for RAG. Chroma will use both memory and SSD for caches for different scenarios. We will also need different eviction policies for different scenarios. S3Fifo is a very promising policy and that's why I would like to collaborate with you on this. The current PR is till a draft and are subject to change and improvements. I would love to discuss with you more on how we can collaborate and discuss a bit more on how to use foyer as a hybrid cache. Looking forward to speaking with you! BTW, I used to work at PingCAP before joining Chroma. |
Hi @Ishiihara . Thank you for your attention and contribution to foyer. I apologize for just seeing your PR now as I was on call these days. It will be an honor to have foyer as a component of Chroma. To be honest, I'm not very familiar with what the workload of Chroma is like. And foyer still lacks the ability to handle small KVs as efficiently as it could be. I'm willing to hear from you about where you will use foyer and how can we make foyer better. 🥰 I'm glad to review this PR carefully after my shift. And I glanced at it briefly, it is implemented very well. And just a reminder, you can run And I'm glad to meet my former colleague here. I'm looking forward to discussing more via IM or offline. But I'm really careful to leave my personal IM on Github. I'll send you an email with my contact. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution @Ishiihara 🥰. This PR generally LGTM. There is only some code styles don't match foyer. I can help fix them if you like, cause I've also waiting for a S3Fifo implementation for a long time. I've not finished reviewing (I'm still on-calling). I'll finish it soon.
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #303 +/- ##
==========================================
+ Coverage 77.10% 77.40% +0.30%
==========================================
Files 61 62 +1
Lines 7478 7671 +193
==========================================
+ Hits 5766 5938 +172
- Misses 1712 1733 +21 ☔ View full report in Codecov by Sentry. |
Hi @Ishiihara , thanks again for your contribution. I've modified the related codes to make them more compatible with foyer. This PR still needs some unit tests with s3fifo (e.g. https://github.com/MrCroxx/foyer/blob/main/foyer-memory/src/eviction/lfu.rs#L425-L576). Could you please add some? 🙏 Then the PR is ready to be merged. |
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
Hi @xiaguan @Little-Wallace , maybe you are also interested in this PR. 🥰 |
hit ratio bench result:
|
Thanks so much for your contribution. I've actually tried to code up an S3FIFO before and ended up failing miserably 😭 . |
Regarding a more long-term integration with foyer. Could you briefly describe your workload, cache size, expected goals, etc? |
Hi @xiaguan . Are you talking about disk cache? If it is, foyer doesn't use LSM trees as the disk cache. We can discuss on it in another issue. (Or maybe I should transfer this kind of issues into github discussions. foyer repo has already enabled it but is not used properly.) |
I tryied to impl ghost queue based on your code. |
Thanks @xiaguan . Can you send another PR with your changes? I'm interested in it. Let's find out why. |
Signed-off-by: MrCroxx <mrcroxx@outlook.com>
* Add S3Fifo eviction for memory * fix: refine s3fifo, fix some bugs Signed-off-by: MrCroxx <mrcroxx@outlook.com> * fix: fix license Signed-off-by: MrCroxx <mrcroxx@outlook.com> * refactor: expose s3fifo, fix hakari, add s3fifo fuzzy test Signed-off-by: MrCroxx <mrcroxx@outlook.com> * bench: add s3fifo to hit ratio bench Signed-off-by: MrCroxx <mrcroxx@outlook.com> * test: add s3fifo uts Signed-off-by: MrCroxx <mrcroxx@outlook.com> --------- Signed-off-by: MrCroxx <mrcroxx@outlook.com> Co-authored-by: MrCroxx <mrcroxx@outlook.com>
What's changed and what's your intention?
Checklist
make all
(ormake fast
instead if the old tests are not modified) in my local environment.Related issues or PRs (optional)