@EddyLXJ EddyLXJ commented Oct 27, 2025

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2068

As title.
If one table in a TBE uses feature score eviction, then all tables in that TBE must use the same policy. Feature score eviction already supports TTL-based eviction. This diff adds support for no eviction within the feature score eviction policy.

Differential Revision: D84660528
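
To illustrate the constraint described above, here is a minimal sketch of how per-table requests might be resolved into a single TBE-wide policy. All names (`Requested`, `TableEviction`, `resolve_tbe_policy`) are hypothetical and are not the FBGEMM or TorchRec API. The behavior when no table uses feature score eviction is also an assumption, since the summary does not describe that case.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Requested(Enum):
    """Eviction behavior each table asks for (hypothetical names)."""
    NO_EVICTION = "no_eviction"
    TTL = "ttl"
    FEATURE_SCORE = "feature_score"


@dataclass(frozen=True)
class TableEviction:
    table: str
    policy: str      # policy recorded for the table at the TBE level
    mode: Requested  # per-table behavior inside that policy


def resolve_tbe_policy(requests: Dict[str, Requested]) -> List[TableEviction]:
    """Resolve per-table requests for the tables of one TBE."""
    uses_feature_score = any(
        r is Requested.FEATURE_SCORE for r in requests.values()
    )
    if not uses_feature_score:
        # Assumption: without feature score eviction, tables keep their own policy.
        return [TableEviction(t, r.value, r) for t, r in requests.items()]
    # One table uses feature score eviction, so every table in the TBE is
    # placed under that policy; TTL and no-eviction become per-table modes.
    return [TableEviction(t, "feature_score", r) for t, r in requests.items()]
```

For example, calling `resolve_tbe_policy` with one `FEATURE_SCORE` table, one `TTL` table, and one `NO_EVICTION` table yields three entries that all carry the `feature_score` policy while each keeps its own mode.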

meta-codesync bot commented Oct 27, 2025

@EddyLXJ has exported this pull request. If you are a Meta employee, you can view the originating Diff in D84660528.

meta-cla bot added the CLA Signed label Oct 27, 2025
EddyLXJ added a commit to EddyLXJ/FBGEMM-1 that referenced this pull request Oct 27, 2025
Summary:
X-link: meta-pytorch/torchrec#3488

X-link: facebookresearch/FBGEMM#2068

EddyLXJ added a commit to EddyLXJ/torchrec that referenced this pull request Oct 27, 2025
Summary:
X-link: pytorch/FBGEMM#5059


X-link: facebookresearch/FBGEMM#2068

…#3490)

Summary:
X-link: pytorch/FBGEMM#5062


X-link: facebookresearch/FBGEMM#2070

Previously, KVZCH used the ID_COUNT and MEM_UTIL eviction trigger modes. Both are tricky: it is hard for a model engineer to decide what value to use for the ID count or memory utilization threshold. In addition, the eviction start time drifts out of sync after some time in training, which can cause a large QPS drop during eviction.

This diff adds support for a free-memory eviction trigger. Every N batches, each rank checks how much free memory is left; if free memory is below the threshold, eviction is triggered in all TBEs on all ranks via an all-reduce. This forces eviction to start at the same time on all ranks.

Differential Revision: D85604160
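
The trigger described above could be sketched roughly as follows. This is a single-process simulation, not the actual implementation: `all_reduce_max` stands in for a real collective, and the function and parameter names are hypothetical. Using a MAX reduction (any rank below the threshold triggers eviction everywhere) is an assumption; the summary only says an all-reduce is used.

```python
from typing import List


def all_reduce_max(values: List[int]) -> int:
    """Stand-in for a collective all-reduce with MAX across ranks."""
    return max(values)


def should_evict(
    batch_idx: int,
    free_mem_gb_per_rank: List[float],
    threshold_gb: float,
    check_every_n: int,
) -> List[bool]:
    """Return the eviction decision seen by each rank for this batch."""
    num_ranks = len(free_mem_gb_per_rank)
    # Only check free memory every N batches.
    if batch_idx % check_every_n != 0:
        return [False] * num_ranks
    # Each rank computes a local flag: 1 if its free memory is below threshold.
    local_flags = [1 if free < threshold_gb else 0 for free in free_mem_gb_per_rank]
    # Assumption: MAX reduction, so one low-memory rank triggers all ranks.
    global_flag = all_reduce_max(local_flags)
    # Every rank receives the same result, so eviction starts in sync.
    return [bool(global_flag)] * num_ranks
```

Because every rank acts on the reduced flag rather than its local one, all ranks either start eviction on the same batch or none do, which is the synchronization property the summary is after.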

Labels: CLA Signed, fb-exported, meta-exported
