Refer to the draft [PR 2343](https://github.com/pytorch/torchtitan/pull/2343).

**[Detailed `torch.sort(indices_dtype)` Performance Benchmark Results & Latency Analysis](https://github.com/pytorch/pytorch/pull/170978#issuecomment-3865412916)**

### Status & Todo

- [x] Integrate `_indices_dtype_by_sort_size` logic into `torchtitan.distributed.deepep`.
- [x] Update `TokenReorderer` in `torchtitan.models.moe`.
- [ ] **Work in progress:** [pytorch/pytorch#170978](https://github.com/pytorch/pytorch/pull/170978)
- [ ] **Optional:** Once the sort change is merged, `topk` can be modified to support `indices_dtype` in its out-variant op.
- [ ] **Optional:** Support general integer indexing to take full advantage of the `indices_dtype` feature of `torch.sort`.
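
For reviewers unfamiliar with the idea, here is a minimal sketch of what "indices dtype by sort size" could mean: pick the narrowest signed integer type that can still hold every index of the sorted dimension. The function name `pick_indices_dtype`, the candidate widths, and the thresholds are assumptions for illustration only. They are not taken from `_indices_dtype_by_sort_size` in this PR, which may choose differently.

```python
# Hypothetical sketch, not the torchtitan implementation.
# Candidate signed integer widths, narrowest first, mapped to the
# largest index value each one can represent.
_MAX_INDEX_BY_DTYPE = {
    "int16": 2**15 - 1,
    "int32": 2**31 - 1,
    "int64": 2**63 - 1,
}


def pick_indices_dtype(sort_size: int) -> str:
    """Return the name of the narrowest dtype able to index `sort_size` elements."""
    if sort_size < 0:
        raise ValueError("sort_size must be non-negative")
    # Indices run from 0 to sort_size - 1, so only the last one must fit.
    max_index = max(sort_size - 1, 0)
    for name, limit in _MAX_INDEX_BY_DTYPE.items():
        if max_index <= limit:
            return name
    raise OverflowError(f"sort_size {sort_size} exceeds the int64 index range")
```

The returned name can be resolved to a real dtype with `getattr(torch, name)`. Passing it to `torch.sort` depends on the `indices_dtype` argument proposed in pytorch/pytorch#170978, which is still in progress.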