
Actions: ayushdg/NeMo-Curator

All workflows


Showing runs from all workflows
54 workflow runs


[REVIEW] Fix Sem Dedup (#478)
Create PR to main with cherry-pick from release #2: Commit 7cfda44 pushed by ayushdg
January 14, 2025 19:46 18s main
[REVIEW] Fix Sem Dedup (#478)
GPU CI/CD #2: Commit 7cfda44 pushed by ayushdg
January 14, 2025 19:46 6s main
[REVIEW] Fix Sem Dedup (#478)
Test Python package #23: Commit 7cfda44 pushed by ayushdg
January 14, 2025 19:46 4m 50s main
Clean up internal column logic in _run_classifier_helper function (…
Build, test, and publish a PyPi wheel (to testpypi) #1: Commit 694970a pushed by ayushdg
January 6, 2025 20:50 18s main
Clean up internal column logic in _run_classifier_helper function (…
Create PR to main with cherry-pick from release #1: Commit 694970a pushed by ayushdg
January 6, 2025 20:50 16s main
January 6, 2025 20:50 6s
Clean up internal column logic in _run_classifier_helper function (…
Test Python package #22: Commit 694970a pushed by ayushdg
January 6, 2025 20:50 4m 32s main
[REVIEW] Speedup Connected Components (#302)
Test Python package #21: Commit 36fcf50 pushed by ayushdg
October 30, 2024 18:47 5m 19s main
Write to file without including "filename" column (#317)
Test Python package #20: Commit 7d7767b pushed by ayushdg
October 23, 2024 23:49 5m 12s main
Fix enabling spilling by enabling it on client process (#275)
Test Python package #19: Commit d9c414b pushed by ayushdg
October 3, 2024 18:43 5m 11s main
Enabled nightly build using RAPIDS nightly (#237)
Test Python package #18: Commit c89c115 pushed by ayushdg
September 19, 2024 21:11 5m 10s main
Add option to skip false positive checks during Fuzzy Deduplication (…
Test Python package #17: Commit 982e7ec pushed by ayushdg
September 6, 2024 21:24 9m 38s main
Change combinations() to pairwise() when constructing a list of edges in _BucketsToEdges
Test Python package #16: Pull request #2 opened by yury-tokpanov
September 3, 2024 18:50 6m 3s patch-1
Fix a few bugs in fuzzy dedup and docs (#156)
Test Python package #15: Commit e654281 pushed by ayushdg
July 30, 2024 00:28 5m 27s main
Enable Sem-dedup (#130)
Test Python package #14: Commit e557ee3 pushed by ayushdg
July 8, 2024 19:40 5m 15s main
Fix #116. Fix task-decontamination broken links (#117)
Test Python package #13: Commit 462b964 pushed by ayushdg
June 18, 2024 22:16 6m 31s main
Update index.rst (#113)
Test Python package #12: Commit f1e993b pushed by ayushdg
June 14, 2024 00:03 5m 49s main
Applying SEO Best Pratices (#104)
Test Python package #11: Commit 38b0ac1 pushed by ayushdg
June 12, 2024 19:57 5m 33s main
Update readme (#93)
Test Python package #10: Commit e814736 pushed by ayushdg
June 3, 2024 17:44 5m 28s main
Fuzzy Dedup: Use text_field instead of hardcoded text column (#74)
Test Python package #9: Commit 8755cdc pushed by ayushdg
May 22, 2024 23:18 5m 21s main
Remove argparse from get_client function signature (#12)
Test Python package #8: Commit 5e46cd8 pushed by ayushdg
May 22, 2024 21:49 5m 18s main
Align extract_partitioning_index logic with upstream shuffling (#60)
Test Python package #7: Commit ecd4f4b pushed by ayushdg
May 15, 2024 20:46 5m 12s main
[Tutorials] Add a tutorial for PEFT data curation (#45)
Test Python package #6: Commit 06ee061 pushed by ayushdg
May 10, 2024 18:33 7m 59s main
Fix issue #43 (empty files creation) and improve reading/writing spee…
Test Python package #5: Commit 72b9775 pushed by ayushdg
May 9, 2024 17:09 5m 26s main
High level fuzzy duplicates module (#46)
Test Python package #4: Commit 52270ea pushed by ayushdg
May 3, 2024 23:56 5m 11s main