[Bug] predict_with_generate=True causes 1D dimension collapse of eval_prediction.predictions in custom metrics (DDP) #8256

@coldchair

Checklist

  • I have searched existing issues, and this is a new bug report.

Bug Description

Version
4.0.0.dev0

Describe the bug
When --predict_with_generate true is enabled in a multi-GPU (DDP) run, the eval_prediction.predictions passed to a custom EvalMetrics subclass arrives as a flattened 1D array instead of a 2D tensor of shape (BatchSize, SeqLen).

As a result, Serializer.from_tensor(preds) fails with _pickle.UnpicklingError or IndexError, because the gathered data no longer has the layout the serializer expects.

Root Cause
In Seq2SeqTrainer.prediction_step, pad_sequence pads only within each GPU's local batch. If GPU 0 and GPU 1 have different maximum sequence lengths for their local batches, the per-rank tensors differ in size along the sequence dimension, so accelerator.gather cannot stack them and flattens the tensors instead.
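
The shape mismatch can be illustrated without any GPUs. The sketch below simulates two ranks with NumPy arrays and shows that padding every local batch to the global maximum length before concatenating keeps the result 2D. The names PAD_TOKEN_ID, pad_to_length and gather_simulated are hypothetical and exist only for this illustration; this is not the trainer's actual code. In a real DDP run, Accelerate's pad_across_processes (called with dim=1 and the tokenizer's pad id as pad_index) is one plausible way to do the same thing before gathering, but that is an assumption about a possible fix, not something confirmed in this issue.

```python
import numpy as np

PAD_TOKEN_ID = 0  # hypothetical pad id, used only for this illustration


def pad_to_length(batch: np.ndarray, target_len: int, pad_id: int = PAD_TOKEN_ID) -> np.ndarray:
    """Right-pad a (batch, seq_len) array along axis 1 up to target_len."""
    missing = target_len - batch.shape[1]
    if missing < 0:
        raise ValueError("target_len is shorter than the batch")
    return np.pad(batch, ((0, 0), (0, missing)), constant_values=pad_id)


def gather_simulated(per_rank_batches: list) -> np.ndarray:
    """Simulate a gather: pad each rank's batch to the global max length, then stack."""
    global_max = max(b.shape[1] for b in per_rank_batches)
    padded = [pad_to_length(b, global_max) for b in per_rank_batches]
    return np.concatenate(padded, axis=0)


# Two simulated ranks whose local batches were padded to different lengths.
rank0 = np.full((2, 5), 11)
rank1 = np.full((2, 7), 22)

gathered = gather_simulated([rank0, rank1])
print(gathered.shape)  # (4, 7)
```

Concatenating rank0 and rank1 directly along axis 0 raises a ValueError in NumPy, which mirrors why the per-rank tensors cannot be stacked into a single (BatchSize, SeqLen) result without a shared sequence length.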

How to Reproduce

1. Run SFT with --predict_with_generate true on 2+ GPUs.
2. Use a custom EvalMetrics that accesses eval_prediction.predictions.
3. Observe that preds.ndim is 1 and the evaluation crashes.
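
To make the symptom in the last step fail loudly instead of surfacing later as an unpickling error, a custom metric could check the shape before decoding. The helper below is a minimal sketch; check_prediction_shape is a hypothetical name and not part of the library.

```python
import numpy as np


def check_prediction_shape(preds) -> np.ndarray:
    """Return preds as an array, raising early if it is not (batch_size, seq_len)."""
    arr = np.asarray(preds)
    if arr.ndim != 2:
        raise ValueError(
            f"expected predictions of shape (batch_size, seq_len), got shape {arr.shape}"
        )
    return arr
```

Calling this on eval_prediction.predictions at the top of the metric would report the collapsed shape directly on the affected multi-GPU runs.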

Additional Information

No response

Labels: bug