It appears that `train_ensemble` accesses `Subset.dataset`, which refers to the whole underlying dataset, so the full dataset is used instead of the individual splits.

<img width="387" alt="Image" src="https://github.com/user-attachments/assets/3549882c-fb29-4e9c-a424-0835b1803287" /> <img width="235" alt="Image" src="https://github.com/user-attachments/assets/c8fb7057-535b-4b54-9be1-5319a7504ce5" />
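
A minimal sketch of the suspected cause, assuming the splits are `torch.utils.data.Subset` objects: the `.dataset` attribute of a `Subset` is the wrapped parent dataset, while the split itself is defined by `.indices`. The names `full`, `train_split`, and `val_split` are hypothetical and not taken from the repository, and the fallback class exists only so the snippet runs without PyTorch installed.

```python
try:
    from torch.utils.data import Subset
except ImportError:
    # Stand-in exposing the same two attributes, used only if PyTorch is missing.
    class Subset:
        def __init__(self, dataset, indices):
            self.dataset, self.indices = dataset, indices

        def __getitem__(self, i):
            return self.dataset[self.indices[i]]

        def __len__(self):
            return len(self.indices)

full = list(range(10))                      # hypothetical 10-sample dataset
train_split = Subset(full, list(range(8)))  # first 8 samples
val_split = Subset(full, [8, 9])            # last 2 samples

split_len = len(train_split)                # size of the split itself
unwrapped_len = len(train_split.dataset)    # size of the wrapped parent dataset
split_items = [train_split[i] for i in range(len(train_split))]

print(split_len, unwrapped_len)
```

If this is what happens in `train_ensemble`, passing the `Subset` object itself (rather than its `.dataset`) to the data loader should keep the split intact.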