Possible problem with requests timing out #106

Open
BRMatt opened this issue Aug 19, 2013 · 7 comments

@BRMatt
Contributor

BRMatt commented Aug 19, 2013

It seems l2met is taking a long time to accept some requests, causing them to time out and trigger Heroku 503 errors. The logs show the 30 seconds or so leading up to the errors. Is there any other information that would be useful?

The instance is running cfa2fc0 on Heroku with a two-line change to measure the size of deadline misses.

@ryandotsmith
Owner

@BRMatt Interesting. Can you show me your Procfile and any relevant environment variables?

@BRMatt
Contributor Author

BRMatt commented Aug 19, 2013

Procfile:

web: ./l2met -receiver=true -outlet=true -port=$PORT -outlet-ttl=10s -recv-deadline=4

Added the -recv-deadline flag this morning after those errors were reported. The env vars are pretty standard: METCHAN_URL, APP_NAME and SECRETS.

@BRMatt
Contributor Author

BRMatt commented Aug 25, 2013

Updated the gist with another onslaught of 5XX errors. It seems the app received a large number of log payloads in a short amount of time and was unable to cope with new connections?

@ryandotsmith
Owner

@BRMatt How strange. Can you enable Heroku runtime metrics on this app? I would be curious to see the metal metrics on this dyno during these turbulent times.

From the logs, it looks like you are doing fewer than 100 HTTP requests per second. I have benchmarked l2met at much higher throughput.

@BRMatt
Contributor Author

BRMatt commented Aug 25, 2013

Sure thing, here are some more logs covering about 2 minutes prior to some request timeouts. By the looks of things, the dyno is not under any stress at all.

@ryandotsmith
Owner

Do you have to perform any actions to bring the system back to a healthy state? How do you recover? Also, how are you noticing these problems?

@BRMatt
Contributor Author

BRMatt commented Aug 26, 2013

> Do you have to perform any actions to bring the system back to a healthy state? How do you recover?

We don't do anything; the errors are sporadic and clear up on their own. Some seem to be near a dyno restart (in which case we get H13, "Connection closed without response" errors), though most appear to happen randomly.

> Also, how are you noticing these problems?

The logs are piped through Papertrail, and it emails me when l2met returns status codes other than 200.
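The alert rule described above (flag any router log line whose status is not 200) can be sketched in Go as follows. This is only an illustration of the rule, not how Papertrail implements its alerts; the function names `routerStatus` and `needsAlert` are hypothetical, and the sample lines are made up in the general shape of Heroku router logs rather than taken from the gist.

```go
package main

import (
	"fmt"
	"strings"
)

// routerStatus extracts the value of the status= field from a router
// log line. ok is false when the line carries no status field.
// Simplification: fields are split on whitespace, so quoted values that
// contain spaces are broken apart; this does not affect status=NNN.
func routerStatus(line string) (status string, ok bool) {
	for _, field := range strings.Fields(line) {
		if strings.HasPrefix(field, "status=") {
			return strings.TrimPrefix(field, "status="), true
		}
	}
	return "", false
}

// needsAlert reports whether a line has a status other than 200.
// Lines without a status field are ignored.
func needsAlert(line string) bool {
	status, ok := routerStatus(line)
	return ok && status != "200"
}

func main() {
	// Made-up sample lines, for illustration only.
	lines := []string{
		`at=info method=POST path="/logs" dyno=web.1 connect=1ms service=12ms status=200 bytes=0`,
		`at=error code=H12 desc="Request timeout" method=POST path="/logs" dyno=web.1 connect=2ms service=30000ms status=503 bytes=0`,
	}
	for _, l := range lines {
		if needsAlert(l) {
			fmt.Println("ALERT:", l)
		}
	}
}
```

Matching on the status field alone catches both timeouts (H12) and closed connections (H13), since the router reports both as 503.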
