This repository has been archived by the owner on Feb 18, 2021. It is now read-only.

Advertisement Timeout SLA #11

Open
rssathe opened this issue Sep 24, 2015 · 3 comments

rssathe commented Sep 24, 2015

{
  "log": "{\"error\":{\"stack\":\"TchannelRequestTimeoutError: request timed out after 500ms (limit was 500ms)\\n    at Object.createError [as RequestTimeoutError] (/home/udocker/chronotrigger/node_modules/tchannel/node_modules/error/typed.js:31:22)\\n    at V2OutRequest.onTimeout (/home/udocker/chronotrigger/node_modules/tchannel/out_request.js:555:31)\\n    at TimeHeap.callExpiredTimeouts (/home/udocker/chronotrigger/node_modules/tchannel/time_heap.js:169:14)\\n    at TimeHeap.drainExpired (/home/udocker/chronotrigger/node_modules/tchannel/time_heap.js:160:14)\\n    at TimeHeap.onTimeout (/home/udocker/chronotrigger/node_modules/tchannel/time_heap.js:144:10)\\n    at onTimeout [as _onTimeout] (/home/udocker/chronotrigger/node_modules/tchannel/time_heap.js:135:14)\\n    at Timer.listOnTimeout [as ontimeout] (timers.js:112:15)\",\"type\":\"tchannel.request.timeout\",\"message\":\"request timed out after 500ms (limit was 500ms)\",\"id\":871,\"start\":1443038570760,\"elapsed\":500,\"timeout\":500,\"logical\":true,\"name\":\"TchannelRequestTimeoutError\",\"fullType\":\"tchannel.request.timeout\"},\"serviceName\":\"chronotrigger\",\"level\":\"error\",\"message\":\"HyperbahnClient: advertisement failure, marking server as sick\"}\n",
  "stream": "stderr",
  "time": "2015-09-23T20:02:51.260517901Z"
}

@Raynos @jcorbin The deployment process results in hyperbahn advertisement timeouts, as in the log above.
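
The entry above is doubly encoded: the outer object looks like a container log line (`log`, `stream`, `time`), and its `log` field holds the application's own JSON record as a string. A minimal Python sketch of unwrapping it follows. The helper name `decode_app_record` is hypothetical, and the sample is an abbreviated copy of the entry above with the stack trace left out.

```python
import json

# Abbreviated copy of the entry above; the "stack" field is omitted for brevity.
container_entry = {
    "log": "{\"error\":{\"type\":\"tchannel.request.timeout\",\"elapsed\":500,\"timeout\":500},\"serviceName\":\"chronotrigger\",\"level\":\"error\",\"message\":\"HyperbahnClient: advertisement failure, marking server as sick\"}\n",
    "stream": "stderr",
    "time": "2015-09-23T20:02:51.260517901Z",
}


def decode_app_record(entry):
    """Parse the JSON string stored in the entry's "log" field (hypothetical helper)."""
    return json.loads(entry["log"])


record = decode_app_record(container_entry)
error = record["error"]

# Pull out the fields that matter for the SLA discussion.
summary = (record["serviceName"], error["type"], error["elapsed"], error["timeout"])
print(summary)
```

Once decoded, the record shows the advertise request using up its full 500ms budget (`elapsed` equals `timeout`) before the client marks the server as sick.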


Raynos commented Sep 24, 2015

When we deploy hyperbahn we see an increase in advertise timeouts in edge clients.

This is not acceptable. We should figure out how to stick within our 500ms SLA using more tricks and/or drain.


jcorbin commented Sep 24, 2015

@Raynos, it could be that we shouldn't be exempting the hyperbahn protocol itself from drain; see https://github.com/uber/hyperbahn/blob/master/app.js#L78 for the exemption. Without it, you'd instead see elevated declined responses to advertisements during a deploy. Declined advertises are okay and shouldn't necessarily trigger a re-ad by themselves. What should trigger one, and what would help in this case, is the re-ad on re-conn work we keep mentioning.
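
The client-side policy described here could be sketched roughly as below. This is an illustration in Python, not the HyperbahnClient code. The timeout type string comes from the log in this issue; the `tchannel.declined` string, the `Action` enum, and both function names are assumptions made for the example, as is treating every non-declined error like a timeout.

```python
from enum import Enum


class Action(Enum):
    IGNORE = "ignore"            # keep current state, wait for the next scheduled advertise
    MARK_SICK = "mark_sick"      # what the logged client does on an advertisement failure
    READVERTISE = "readvertise"  # advertise again right away


# Error type taken from the log in this issue.
TIMEOUT = "tchannel.request.timeout"
# Assumed type string for a declined response; not confirmed by this thread.
DECLINED = "tchannel.declined"


def on_advertise_error(error_type):
    """Decide what to do when an advertise request fails (hypothetical policy)."""
    if error_type == DECLINED:
        # A declined advertise during a deploy is expected and harmless.
        return Action.IGNORE
    # Timeouts (and, by assumption, any other error) mark the server as sick.
    return Action.MARK_SICK


def on_reconnect():
    """Re-advertise as soon as the connection comes back (re-ad on re-conn)."""
    return Action.READVERTISE


print(on_advertise_error(DECLINED), on_advertise_error(TIMEOUT), on_reconnect())
```

The point of the split is that a decline is a fast, explicit answer that stays inside the 500ms budget, while recovery is driven by the reconnect event instead of by the failure itself.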


Raynos commented Sep 24, 2015

@jcorbin OOPS. Yes we should not exclude it from drain.
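
A toy model of the difference the exemption makes, in Python. It does not use the tchannel drain API; `DrainingServer`, its `exempt` predicate, and matching on the service name `"hyperbahn"` are all assumptions made for illustration.

```python
class DrainingServer:
    """Toy model of a server that declines new requests while draining."""

    def __init__(self, exempt=None):
        self.draining = False
        # exempt(service) -> True means the request bypasses the drain.
        self.exempt = exempt if exempt is not None else (lambda service: False)

    def drain(self):
        self.draining = True

    def handle(self, service):
        if self.draining and not self.exempt(service):
            # Fast, explicit answer: the caller can move on immediately.
            return "declined"
        # Accepted work may never finish if the process exits mid-deploy,
        # which the caller would only observe as a timeout.
        return "accepted"


with_exemption = DrainingServer(exempt=lambda service: service == "hyperbahn")
without_exemption = DrainingServer()
for server in (with_exemption, without_exemption):
    server.drain()

print(with_exemption.handle("hyperbahn"))     # accepted
print(without_exemption.handle("hyperbahn"))  # declined
```

With the exemption in place, advertise requests are still accepted by an instance that is about to go away; without it, they get an immediate decline.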
