Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better output if the optimization process gets terminated during the execution of an epoch. #180

Closed

Conversation

gaurav-singh1998
Copy link
Contributor

@gaurav-singh1998 gaurav-singh1998 commented Mar 15, 2020

Epoch 1/54
1563/1875 [===================================================================================>................] 83% - ETA: 0s - loss: 249.826

After the feature which has been implemented in this pull request the output is like

Epoch 1/54
1562/1875 [===================================================================================>................] 83% - ETA: 0s - loss: 249.985
1563/1875 [===================================================================================>................] 83% - ETA: 0s - loss: 249.826
Optimization terminated because of the entire dataset not being passed to the optimizer.

As it can be seen that there is an improvement in the output shown but if there is any way if the progress bar is shown only once rather than being shown twice as it can be seen from the above-pasted output kindly suggest it. Thanks.

@mlpack-bot
Copy link

mlpack-bot bot commented Mar 15, 2020

Thanks for opening your first pull request in this repository! Someone will review it when they have a chance. In the mean time, please be sure that you've handled the following things, to make the review process quicker and easier:

  • All code should follow the style guide
  • Documentation added for any new functionality
  • Tests added for any new functionality
  • Tests that are added follow the testing guide
  • Headers and license information added to the top of any new code files
  • HISTORY.md updated if the changes are big or user-facing
  • All CI checks should be passing

Thank you again for your contributions! 👍

@shrit
Copy link
Member

shrit commented Mar 17, 2020

@gaurav-singh1998 Just a small question, Did you keep the macroENS_PRINT_INFO ON when it printed the progress bar twice? if yes, Did both of them were progressing forward at the same time? or you had only the second one printed out when the first one finished?

@gaurav-singh1998
Copy link
Contributor Author

gaurav-singh1998 commented Mar 18, 2020

Hi, @shrit no I didn't keep the macro ENS_PRINT_INFO ON while I tested this but when I did I got the result as,

Epoch 1/1
156/1875 [========>...........................................................................................] 8% - ETA: 15s - loss: 2049.38
157/1875 [========>...........................................................................................] 8% - ETA: 10s - loss: 2039.62
SGD: maximum iterations (10000) reached; terminating optimization.ssed to the optimizer.

As it can be seen that this line and the line Optimization terminated because of the entire dataset not being passed to the optimizer. do not get flushed properly on the screen so to remedy this I changed

output << "\n" << "Optimization terminated because of the entire "
                       << "dataset not being passed to the optimizer."
                       << "\r";

to

output << "\n" << "Optimization terminated because of the entire "
                       << "dataset not being passed to the optimizer."
                       << "\n" << "\r";

and the output I got was

Epoch 1/1
156/1875 [========>...........................................................................................] 8% - ETA: 33s - loss: 2049.38
Optimization terminated because of the entire dataset not being passed to the optimizer.
157/1875 [========>...........................................................................................] 8% - ETA: 26s - loss: 2039.62
Optimization terminated because of the entire dataset not being passed to the optimizer.
SGD: maximum iterations (10000) reached; terminating optimization.

In my opinion, we should either scrape this idea of including the print message in the ProgressBar() callback function and solve this issue by just uncommenting this line because the optimizer message, in my opinion, is clear enough or we can include this check if ENS_PRINT_INFO ON is not defined. But this still wouldn't solve the problem of progress bar getting printed twice. Let me know your opinion on this and if I am wrong somewhere. Thanks.

@shrit
Copy link
Member

shrit commented Mar 18, 2020

@gaurav-singh1998, Actually I would prefer adding a line here explaining the message of SGD that is printed from a callback. Because I would not like to have all the output of the library for only one line especially if I am using it inside other libraries and other software. Imagine in a worse case, I might not have access to the stdout in an embedded system. What do you think?

@gaurav-singh1998
Copy link
Contributor Author

Oh okay! thanks for clarifying that @shrit. So we need to find a way so that the line gets printed only once.

{
if (progress == (size_t)((double)(optimizer.MaxIterations()) /
(double)(function.NumFunctions()) * 100) &&
(int)(steps * optimizer.BatchSize() -
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @shrit I made a minor change in this line changing step to steps and got the expected output i.e.

Epoch 1/6
157/1875 [========>...........................................................................................] 8% - ETA: 0s - loss: 2039.62
Optimization terminated because of the entire dataset not being passed to the optimizer.

and now also if ENS_PRINT_INFO is turned on then the output is,

157/1875 [========>...........................................................................................] 8% - ETA: 0s - loss: 2039.62
Optimization terminated because of the entire dataset not being passed to the optimizer.
SGD: maximum iterations (10000) reached; terminating optimization.

Let me know what you think of this approach. Thanks.

@gaurav-singh1998
Copy link
Contributor Author

Hi, @shrit have a look at this whenever you get the chance. Thanks.

@rcurtin
Copy link
Member

rcurtin commented Apr 9, 2020

Hey @gaurav-singh1998, thanks for doing this, I think it will be a lot clearer to users what's going on. But I wonder if we can modify the way the progress bar looks so that it's a bit more obvious what happened, maybe using a different character than =. Here's a few ideas: (I don't think they're the right width, but you get the idea)

Epoch 1/6
157/1875 [========------------------------------------------------------------] 8% - ETA: 0s - loss: 2039.62
Optimization terminated because of the entire dataset not being passed to the optimizer.
Epoch 1/6
157/1875 [========|__________________________________________] 8% - ETA: 0s - loss: 2039.62
Optimization terminated because of the entire dataset not being passed to the optimizer.
Epoch 1/6
157/1875 [========|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] 8% - ETA: 0s - loss: 2039.62
Optimization terminated because of the entire dataset not being passed to the optimizer.

I'm not sure which one of those is best. I might lean towards the x? What do others think?

Also, we could probably improve the message. It's true that the optimization is indeed terminated because the entire dataset did not get passed to the optimizer, but a user will probably want to know why. For now we could perhaps use a simpler, more generic message like Optimization finished before end of epoch.---the term terminated could imply that it was a failure or that something's wrong, and that may not be the case.

For a longer-term solution, we could modify the EndOptimization callback to also take a const std::string& reason parameter, that would contain a little text on why the optimization has ended. There might need to be a little bit more change to properly pass a reason for termination from anywhere, I'm not sure. So maybe that's not the best way to do it. But maybe I sparked some ideas, I don't know. :)

@gaurav-singh1998
Copy link
Contributor Author

Hi @rcurtin, sorry for being so late on this. As per your comment, I have made the necessary changes. Now the output, if the entire dataset is not passed is,

Epoch 1/6
158/1875 [========|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] 8% - ETA: 0s - loss: 2026.82
Optimization finished before the end of an epoch because of the entire dataset not being passed to the optimizer.

About the third point of your comment, stating to modify the EndOptimization() please give some more time to think about a solution.

@mlpack-bot
Copy link

mlpack-bot bot commented May 16, 2020

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@mlpack-bot mlpack-bot bot added the s: stale label May 16, 2020
@gaurav-singh1998
Copy link
Contributor Author

Hi, @rcurtin how to keep this open?

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gaurav-singh1998, sorry for the slow review on this. Thanks for working on it! It would be really nice to get this fixed. I took a look through, but I wonder if some refactoring of the approach might be necessary here. Let me know what you think of my comments.

Also, don't forget to add to HISTORY.md. :)

@@ -21,7 +21,7 @@
namespace ens {

/**
* CMA-ES - Covariance Matrix Adaptation Evolution Strategy is s a stochastic
* CMA-ES - Covariance Matrix Adaptation Evolution Strategy is a stochastic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks! 👍

objective / (double) step << "\r";
output << "\n" << "Optimization finished before the end of an epoch "
<< "because of the entire dataset not being passed to "
<< "the optimizer." << "\n" << "\r";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that we could improve this error message? To a user I don't think it's clear what's going on. I think maybe something more like this could be clearer:

Optimization terminated; maximum iterations reached before the end of an epoch.

Also, why the \r at the end of the line? \n or std::endl should be sufficient. I don't imagine that we are printing anything after termination such that the line needs to be rolled back.


if (optimizer.MaxIterations() < function.NumFunctions() &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this is the right way to check this condition. This will fail if the termination happens during the second epoch. Here's some example code:

#include <ensmallen.hpp>

int main()
{
  arma::mat data(10, 1000, arma::fill::randu);
  arma::Row<size_t> responses(1000);
  for (size_t i = 0; i < 500; ++i)
    responses[i] = 0;
  for (size_t i = 500; i < 1000; ++i)
    responses[i] = 1;

  ens::test::LogisticRegressionFunction<> lrf(data, responses, 0.1);

  ens::Adam adam;
  adam.BatchSize() = 1;
  adam.MaxIterations() = 1500;
  arma::mat coordinates = arma::randu<arma::mat>(1, 11);
  adam.Optimize(lrf, coordinates, ens::ProgressBar());
}

If you try running that on this branch, you'll see:

$ ./test && echo ""
Epoch 1/2
1000/1000 [====================================================================================================] 100% - 0s 0ms/step - loss: 1.23509
Epoch 2/2
500/1000 [==================================================>.................................................] 50% - ETA: 0s - loss: 0.836281
$

(The && echo "" is necessary because the last \r is printed unnecessarily.)

I wonder if we need to do this in a separate way, where a final call to this callback happens when the method terminates. Perhaps in the destructor of ProgressBar? I'm not sure if that could work, but it might.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants