FeatureUnion: error with transform_output="pandas" when transformer aggregates (index length mismatch)

### User Request

FeatureUnion not working when aggregating data and pandas transform output selected

#### Describe the bug
I would like to use the `pandas` transform output together with a custom transformer in a feature union that aggregates data. This combination raises an error, whereas the default `numpy` output works fine.

#### Steps/Code to Reproduce
```python
import pandas as pd
from sklearn import set_config
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import make_union

# 96 hourly rows covering 4 days.
index = pd.date_range(start="2020-01-01", end="2020-01-05", inclusive="left", freq="H")
data = pd.DataFrame(index=index, data=[10] * len(index), columns=["value"])
data["date"] = index.date


class MyTransformer(BaseEstimator, TransformerMixin):
    def fit(self, X: pd.DataFrame, y: pd.Series | None = None, **kwargs):
        return self

    def transform(self, X: pd.DataFrame, y: pd.Series | None = None) -> pd.Series:
        # Aggregates to one row per date, so the output has 4 rows, not 96.
        return X["value"].groupby(X["date"]).sum()


# This works.
set_config(transform_output="default")
print(make_union(MyTransformer()).fit_transform(data))

# This does not work.
set_config(transform_output="pandas")
print(make_union(MyTransformer()).fit_transform(data))
```

#### Expected Results
No error is thrown when using `pandas` transform output.

#### Actual Results
```
ValueError: Length mismatch: Expected axis has 4 elements, new values have 96 elements
```
(Full stack trace shows failure inside `sklearn/utils/_set_output.py` when assigning the original input index to the pandas output.)

#### Versions
Python 3.10.6; sklearn 1.2.1; pandas 1.4.4; numpy 1.23.5; macOS 11.3

### Researcher Specification

#### Root Cause
- In `sklearn/utils/_set_output.py`, when `transform_output="pandas"`, `_wrap_in_pandas_container` unconditionally sets the returned pandas object's index to the original input's index:
  ```python
  if index is not None:
      data_to_wrap.index = index
  ```
- If a transformer returns a pandas `DataFrame`/`Series` with a different number of rows (e.g., aggregated data), assigning the original input index raises a pandas `ValueError` because the lengths do not match.
- `FeatureUnion` and `ColumnTransformer` combine pandas outputs with `pd.concat`, which aligns on the index and tolerates differing row counts, but the index assignment in the wrapper raises before such a result can be returned.
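
The length mismatch can be reproduced with pandas alone, outside scikit-learn. The following is a minimal sketch, assuming a 96-row input and a 4-row aggregated output to mirror the reproduction above; the variable names are illustrative and not taken from the scikit-learn source.

```python
import pandas as pd

# 96 labels, standing in for the original input's index.
input_index = pd.RangeIndex(96)

# 4 aggregated rows, standing in for the transformer's output.
aggregated = pd.DataFrame({"value": [240, 240, 240, 240]})

error_message = None
try:
    # Same kind of assignment as `data_to_wrap.index = index` in the wrapper.
    aggregated.index = input_index
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The assignment is rejected by pandas itself, which is why the error surfaces regardless of which estimator produced the shorter output.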

#### Proposed Change (Minimal, Backward-Compatible)
- Modify `_wrap_in_pandas_container` to assign the original input index only when lengths match. Otherwise, preserve the transformer’s output index.
- Keep column naming behavior unchanged. For ndarray outputs, apply the provided index only when lengths match; otherwise, use the default index to avoid the `ValueError`.

Illustrative adjustment:
```python
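# Fragment of the body of `_wrap_in_pandas_container`; `pd`, `data_to_wrap`,
# `columns` and `index` are assumed to be bound already at this point.
if isinstance(data_to_wrap, pd.DataFrame):
    if columns is not None:
        data_to_wrap.columns = columns
    if index is not None:
        try:
            # Reuse the input index only when it has one label per output row.
            if len(index) == len(data_to_wrap):
                data_to_wrap.index = index
        except TypeError:
            # `index` has no length; keep the transformer's own index.
            pass
    return data_to_wrap

# For ndarray outputs: fall back to the default index on a length mismatch.
index_to_use = None
if index is not None:
    try:
        if len(index) == len(data_to_wrap):
            index_to_use = index
    except TypeError:
        pass
return pd.DataFrame(data_to_wrap, index=index_to_use, columns=columns)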
```
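
To try the length check in isolation, the same logic can be written as a standalone function. This is only a sketch: `wrap_with_length_check` is a hypothetical name, it is not part of scikit-learn, and it omits the `TypeError` guard from the fragment above for brevity.

```python
import numpy as np
import pandas as pd


def wrap_with_length_check(data_to_wrap, *, columns=None, index=None):
    """Hypothetical stand-in for the adjusted wrapper, not scikit-learn code."""
    # Reuse the input index only when it has exactly one label per output row.
    use_index = index is not None and len(index) == len(data_to_wrap)

    if isinstance(data_to_wrap, pd.DataFrame):
        if columns is not None:
            data_to_wrap.columns = columns
        if use_index:
            data_to_wrap.index = index
        return data_to_wrap

    # ndarray path: a mismatched index is dropped in favor of the default one.
    return pd.DataFrame(
        data_to_wrap, index=index if use_index else None, columns=columns
    )


input_index = pd.Index(range(100, 196))  # 96 labels from the original input

# Aggregated DataFrame (4 rows): its own index survives.
daily = pd.DataFrame({"value": [240] * 4}, index=list("abcd"))
wrapped_daily = wrap_with_length_check(daily, index=input_index)

# Equal-length ndarray (96 rows): aligned to the input index as before.
wrapped_equal = wrap_with_length_check(np.zeros((96, 1)), index=input_index)

# Mismatched ndarray (4 rows): falls back to the default index.
wrapped_short = wrap_with_length_check(np.zeros((4, 1)), index=input_index)
```

The three calls correspond to the three cases the proposal distinguishes: a pandas output with a different length, an equal-length output, and an ndarray output with a different length.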

#### Tests to Add/Update
- Unit tests in `sklearn/utils/tests/test_set_output.py`:
  - Preserve index when lengths differ for pandas `DataFrame`/`Series` outputs.
  - Ignore provided index for ndarray outputs when lengths differ.
  - Confirm alignment still happens when lengths match.
- Integration tests in `sklearn/tests/test_pipeline.py`:
  - `FeatureUnion` with an aggregate transformer under `transform_output="pandas"` should not raise and should preserve the aggregated index.
  - Confirm that equal-length outputs still align to the original index.
- Optional: `sklearn/compose/tests/test_column_transformer.py` similar aggregate case.

#### Reproduction and Observed Failure
- See code above; reproduces `ValueError` originating from `_wrap_in_pandas_container` assigning a mismatched index to a pandas output.

#### Risk and Compatibility Notes
- Behavior change is scoped to `transform_output="pandas"` and mismatched lengths.
- Equal-length behavior remains unchanged (still aligns to original index).
- Enables aggregate-style transformers with pandas outputs to pass through and be concatenated by index, consistent with `pd.concat` semantics already used by `FeatureUnion`/`ColumnTransformer`.
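
One consequence of relying on `pd.concat` semantics is worth illustrating: column-wise concatenation outer-joins on the index, so combining a row-preserving output with an aggregated one yields the union of both indexes, padded with `NaN`. The sketch below uses pandas only, with made-up data and names.

```python
import pandas as pd

# Row-preserving output: one row per input sample (positions 0-5).
per_row = pd.DataFrame({"scaled": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]})

# Aggregated output: one row per group, indexed by a group key.
per_group = pd.DataFrame({"total": [3.0, 7.0]}, index=[100, 101])

# axis=1 concatenation outer-joins on the index by default.
combined = pd.concat([per_row, per_group], axis=1)

n_missing = int(combined.isna().sum().sum())
print(combined.shape, n_missing)
```

Here the two indexes share no labels, so every row is missing one of the two columns. Unions that mix aggregating and non-aggregating transformers may therefore need an explicit alignment step downstream.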
