assertion hit after attaching dead coroutine #145

daurnimator · 2016-09-06T15:05:47Z

co = coroutine.create(function() end)
coroutine.resume(co)
cq = require"cqueues".new()
cq:attach(co)
cq:step()

lua: /home/daurnimator/src/cqueues/src/cqueues.c:2017: cqueue_resume: Assertion `nargs >= 0' failed.
Aborted (core dumped)

- Instead of re-pushing all arguments, just use lua_insert to get thread below args
- Need to check the new stack to ensure we have enough slots

daurnimator · 2016-09-06T15:07:17Z

src/cqueues.c

 	} else {
 		nargs = lua_gettop(T->L);
-		if (status != LUA_YIELD) {
+		if (status == LUA_OK && lua_getstack(T->L, 0, &(lua_Debug){}) > 0) {


This little bit was adapted from the coroutine.status implementation: http://www.lua.org/source/5.3/lcorolib.c.html#luaB_costatus

wahern · Sep 8, 2016

I think the test should be status == LUA_OK && lua_getstack(...) <= 0, or even better status == LUA_OK && lua_getstack(...) != 1. IIUC, nargs is decremented because in the initial state we want to exclude the function that will be called from the number of arguments provided to lua_resume.

Alternatively, just remove the assertion and don't decrement nargs if it's 0. If we've been given a dead coroutine or a running coroutine then lua_resume() will return a descriptive error, right?

daurnimator · Sep 9, 2016

I think the test should be status == LUA_OK && lua_getstack(...) <= 0, or even better status == LUA_OK && lua_getstack(...) != 1

Why? Does > 0 vs <= 0 matter?
==> What I have is the same as what the Lua coroutine library has.

This branch is what to do if the coroutine is already running; e.g.:

local cqueues = require"cqueues"
local cq = cqueues.new()
local co = coroutine.create(function()
	print(cq:step())
end)
cq:attach(co)
coroutine.resume(co)

If we've been given a dead coroutine or a running coroutine then lua_resume() will return a descriptive error, right?

A running coroutine, yes.
However, a dead coroutine is not handled well. Note that Lua itself checks for this: http://www.lua.org/source/5.3/lcorolib.c.html#auxresume

wahern · Sep 9, 2016

Why? Does > 0 vs <= 0 matter?
==> What I have is the same as what the Lua coroutine library has.

The original code was assuming that if status != LUA_YIELD then it must be a newly created thread. That's why it was decrementing nargs: to exclude the function at the bottom of the stack (on the first invocation, lua_resume is like lua_pcall). lua_getstack returns 1 if the thread is running. We want to know whether the thread is in the initial state. If you look at the code in lcorolib.c, the non-yielding, suspended condition is only reached if status == LUA_OK && lua_getstack() <= 0. But using a condition of != 1 better matches the official documentation for lua_getstack.
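
As a rough illustration of the distinction drawn above, here is a minimal C sketch of the decision logic, modelled on luaB_costatus in Lua 5.3's lcorolib.c. It is a standalone model rather than cqueues code: the three inputs stand in for the results of lua_status, lua_getstack at level 0 and lua_gettop, and the names co_state, classify_thread and the MODEL_ constants are hypothetical.

```c
/* Standalone model of the thread-state decision; no Lua headers needed.
 * MODEL_LUA_OK and MODEL_LUA_YIELD mirror LUA_OK (0) and LUA_YIELD (1). */
enum { MODEL_LUA_OK = 0, MODEL_LUA_YIELD = 1 };

enum co_state { CO_YIELDED, CO_RUNNING, CO_INITIAL, CO_DEAD };

/* status:   what lua_status(co) would return
 * getstack: what lua_getstack(co, 0, &ar) would return (1 if a frame exists)
 * top:      what lua_gettop(co) would return */
enum co_state classify_thread(int status, int getstack, int top) {
	if (status == MODEL_LUA_YIELD)
		return CO_YIELDED;  /* suspended inside a yield */
	if (status != MODEL_LUA_OK)
		return CO_DEAD;     /* finished with an error */
	if (getstack == 1)
		return CO_RUNNING;  /* has an active call frame */
	if (top == 0)
		return CO_DEAD;     /* empty stack: treated as finished, as lcorolib.c does */
	return CO_INITIAL;      /* function (plus any arguments) awaiting first resume */
}
```

Under this model only CO_INITIAL has a function at the bottom of the stack that must be excluded from nargs. The dead coroutine from the report (status LUA_OK, no frames, empty stack) lands in CO_DEAD, which is the case the original status != LUA_YIELD test mistook for a fresh thread.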

As requested in wahern#145 (comment)

daurnimator · 2016-09-09T05:16:57Z

Changed.

wahern · 2016-09-09T23:52:37Z

The issue is triggering the assert. The simplest solution is to not decrement nargs below 0. For everything else, lua_resume will report the error for us. Any reason why we're literally copying and reimplementing the code from ldo.c:resume? lua_resume will push the exact same error message onto the stack, and AFAICT we'll create the exact same error context (see the default: case for the switch on the status returned from lua_resume) afterward.

If it's intended for improved diagnostic messages, shouldn't we put the check into cqueue_wrap so that it returns an error immediately? That way one would know what code was attaching the bogus coroutine.
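
The "do not decrement nargs below 0" option can be sketched as follows. This is an illustrative model in plain C, not the code that was eventually merged; resume_nargs and the MODEL_ constants are hypothetical names, and top stands in for the value returned by lua_gettop.

```c
/* Standalone model of the argument count passed to lua_resume.
 * MODEL_LUA_YIELD mirrors LUA_YIELD (1); any other status is treated
 * as "not yielded", as in the original condition. */
enum { MODEL_LUA_OK = 0, MODEL_LUA_YIELD = 1 };

int resume_nargs(int status, int top) {
	int nargs = top;

	/* A thread that has not yielded may still hold its entry function
	 * at the bottom of the stack; exclude it, but never go below zero.
	 * An empty stack (e.g. a dead coroutine) is passed through as 0 so
	 * that lua_resume can report the problem itself. */
	if (status != MODEL_LUA_YIELD && nargs > 0)
		nargs--;

	return nargs;
}
```

With the reproduction from the report (status LUA_OK, empty stack) this returns 0 instead of -1, so an assertion of nargs >= 0 would no longer fire.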

daurnimator · 2016-09-11T05:36:29Z

The issue is triggering the assert. The simplest solution is to not decrement nargs below 0. For everything else, lua_resume will report the error for us. Any reason why we're literally copying and reimplementing the code from ldo.c:resume? lua_resume will push the exact same error message onto the stack, and AFAICT we'll create the exact same error context (see the default: case for the switch on the status returned from lua_resume) afterward.

It just felt really "icky" doing that. With if (nargs > 0) nargs--; it seems like so much could go wrong.

If it's intended for improved diagnostic messages, shouldn't we put the check into cqueue_wrap so that it returns an error immediately? That way one would know what code was attaching the bogus coroutine.

That wouldn't catch all cases: someone could :attach a coroutine, then resume it outside of cqueues, then :step(). Therefore I don't think it's worth adding a check there.

wahern · 2016-10-19T09:24:42Z

Merged a different approach which removed the assertion. I was uncomfortable duplicating so much internal logic from the Lua VM.

But leaving open because it's still unresolved whether it's worthwhile to specifically detect and report a dead coroutine (i.e. one that has successfully terminated) for the diagnostic/debugging benefit.

Note 1: Cases other than a dead coroutine should be reported by lua_resume with the same messages as this pull request.
Note 2: The next release of Lua 5.3 will include a lua_resume which does dead coroutine detection, making the behavior more symmetric with coroutine.resume. Currently calling lua_resume on a dead coroutine generates an error message about calling a nil value, which is admittedly quite unhelpful.

daurnimator added bug component: core labels May 20, 2016

daurnimator added 2 commits September 7, 2016 00:29

cqueues:wrap: Remember to checkstack

8d6c5fe

- Instead of re-pushing all arguments, just use lua_insert to get thread below args
- Need to check the new stack to ensure we have enough slots

src/cqueues: Detect if trying to resume a running or dead coroutine

b8041a6

daurnimator reviewed Sep 6, 2016

src/cqueues.c: Use != 0 instead of > 0

80aef12

As requested in wahern#145 (comment)

wahern added a commit that referenced this pull request Oct 17, 2016

add daurnimator's regression test for issue #145

0912cc6

wahern added a commit that referenced this pull request Oct 17, 2016

interim fix for issue #145

163d50c
