
Fix LSTM benchmark to evaluate on test set#263

Merged
jasonlyik merged 4 commits into dev from correct_mg_bench on Jun 20, 2025
Conversation

@YounesBouhadjar (Contributor)

Fixes #262
The benchmark is corrected to evaluate LSTM on the test set, using the hidden states warmed up at the end of training.

@codecov

codecov bot commented Jun 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.63%. Comparing base (aefdbaf) to head (a2a7026).
Report is 5 commits behind head on dev.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev     #263   +/-   ##
=======================================
  Coverage   77.63%   77.63%           
=======================================
  Files          43       43           
  Lines         805      805           
  Branches      119      119           
=======================================
  Hits          625      625           
  Misses        133      133           
  Partials       47       47           
Flag        Coverage     Δ
unittests   77.63% <ø>   (ø)

Flags with carried forward coverage won't be shown.


@jasonlyik (Contributor)

Thanks @YounesBouhadjar, I think there was just one small issue.

When the hidden states are populated in the last epoch of training at L139, there is still an optimizer step / weight update after that.

So the hidden states used to initialize testing would reflect the weights after the second-to-last update, rather than the weights after the final update.

I changed it to re-run the forward pass on the train data before the benchmark run on the test set, and also re-ran the benchmark; the difference in results is negligible.
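To make the staleness argument concrete, here is a toy illustration (not the repo's actual code; the one-weight "model", the sequence, and the hand-rolled gradient step are all invented for the sketch): a recurrent state warmed up during the last training epoch is computed with the pre-update weights, so re-running the forward pass after the final optimizer step yields a different warm-up state.

```python
def forward(w, xs, h=0.0):
    """Toy recurrence h <- h + w * x over a sequence; returns the final state.

    Stands in for an LSTM forward pass that warms up its hidden state.
    """
    for x in xs:
        h = h + w * x
    return h

train_seq = [1.0, 2.0, 3.0]
w = 0.5   # hypothetical model weight
lr = 0.1  # hypothetical learning rate

# Last training epoch: the forward pass populates the warm-up state...
h_stale = forward(w, train_seq)  # computed with w BEFORE the final update

# ...but an optimizer step still follows, changing the weights.
w = w - lr * 1.0  # stand-in for the final gradient step

# Fix: re-run the forward pass on the train data with the FINAL weights
# before benchmarking on the test set.
h_fresh = forward(w, train_seq)

print(h_stale, h_fresh)  # the two warm-up states differ
```

The gap between `h_stale` and `h_fresh` is tiny when the last weight update is small, which matches the observation above that re-doing the benchmark changed the results only negligibly.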

@jasonlyik (Contributor) left a comment

Update corrects the issue with the test dataset for the LSTM Mackey-Glass benchmark.

@jasonlyik jasonlyik merged commit 4fa2cfb into dev Jun 20, 2025
6 checks passed
@jasonlyik jasonlyik deleted the correct_mg_bench branch December 30, 2025 16:27

3 participants