Commit 96bdf12
Handle label imbalance in binary classification tasks on text benchmark (#376)
Labels in the text benchmarks are imbalanced and weighting the positive
labels improves performance.
Experiments done on `fake` dataset (5% positive labels) with
`text_embedded` and `RoBERTa` encodings:
- `ResNet` result changes 91.1% -> 93.4%
- `FTTransformer` result remains unchanged
- `Trompt` result changes 95.2% -> 95.8%
The differences were even more stark with distilled RoBERTa, but we
aren't reporting those anywhere so I didn't note them down.
More results are pending.
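
For illustration, here is a minimal sketch of what weighting the positive labels can look like in a binary cross-entropy loss. It is not the code from this commit (the diff body is not shown here). The helper names `pos_weight_from_labels`, `softplus`, and `weighted_bce_with_logits` are hypothetical, and the negatives-to-positives ratio is just one common heuristic for picking the weight. The loss follows the same formula that `torch.nn.BCEWithLogitsLoss` applies when given its `pos_weight` argument with mean reduction.

```python
import math


def pos_weight_from_labels(labels):
    """Return num_negatives / num_positives (an assumed heuristic, not the commit's choice)."""
    num_pos = sum(labels)
    num_neg = len(labels) - num_pos
    if num_pos == 0:
        raise ValueError("labels contain no positive examples")
    return num_neg / num_pos


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logits, labels, pos_weight=1.0):
    """Mean binary cross-entropy on raw logits, with positives scaled by pos_weight."""
    total = 0.0
    for x, y in zip(logits, labels):
        # -log(sigmoid(x)) == softplus(-x) and -log(1 - sigmoid(x)) == softplus(x)
        total += pos_weight * y * softplus(-x) + (1 - y) * softplus(x)
    return total / len(labels)


# Toy data mirroring the 5% positive rate of the `fake` dataset.
labels = [1] * 5 + [0] * 95
logits = [0.0] * 100  # an uninformative model that predicts p = 0.5 everywhere

w = pos_weight_from_labels(labels)
unweighted = weighted_bce_with_logits(logits, labels)
weighted = weighted_bce_with_logits(logits, labels, pos_weight=w)
```

With this weight, the 5 positive examples contribute as much to the loss as the 95 negatives, so errors on the rare class are no longer drowned out by the majority class.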
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 893678f commit 96bdf12
1 file changed, +2 -1 lines (original line 460 replaced by new lines 460 and 461)