Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock when executing request permission #25

Open
arberg opened this issue Dec 6, 2022 · 10 comments
Open

Deadlock when executing request permission #25

arberg opened this issue Dec 6, 2022 · 10 comments

Comments

@arberg
Copy link

arberg commented Dec 6, 2022

PEKO 3.0.1

If I request permission on a fresh app startup where permissions has not been requested previously, then the app is stuck in a deadlock style ANR. It never proceeds. The same code I run works when I execute it via button press later

private val permissionRequester = PermissionRequester.instance() // Creates new instance

    fun register() {
        CoroutineScope(Dispatchers.Main).launch {
            permissionRequester.request(android.Manifest.permission.ACCESS_FINE_LOCATION)
                .collect {
                    // MyStuff - It never arrives to collect when deadlocked
                }
        }
    }

Debugging indicates that it hangs after executing PermissionRequester:75 which is this line

val requester = requesterFactory.getRequesterAsync(requireContext()).await()

And the PekoActivity executes onPostCreate (with non-null requesterDeferred) but neither requestPermissions, onRequestPermissionsResult nor finish got executed on PekoActivity according to my breakpoints. Apparantly your maven upload included the source-code, so IntelliJ could debug it which was pretty cool.

If I change the code so it performs the request on another thread, then it works

    fun register() {
        CoroutineScope(Dispatchers.Main).launch {
            // do my stuff on main (if needed, otherwise we could just create the above coroutine on IO or elsewhere
            withContext(Dispatchers.IO) { // this works, no deadlock
                permissionRequester.request(android.Manifest.permission.ACCESS_FINE_LOCATION)
                    .collect {
                        withContext(Dispatchers.Main) {
                            // do my stuff back on mair
                        }
                    }
            }
        }
    }

After changing to the above code, the coroutine proceeded past the PermissionRequester:75 await line, performed the permission-requests and then things ran along fine from there.

For some reason the main-thread code only hung when I executed during my startup work (but long after onCreate), but did not hang when executed from user button click. But then again, that just indicates deadlocks can be tricky.

Its quite unclear to me how it can be a deadlock, as your coroutines seem fine. They suspend and should allow the main to do its work.

I have no idea if you can make any sense of that. I'll probably roll back to v2 though that had a resume rare bug, but for now I'll skip requesting permission on startup and let the user trigger it.

if you changed your library to the following I suspect it would work:

					val asyncRequester = requesterFactory.getRequesterAsync(requireContext())
					val requester = 
					withContext(Dispaters.IO) {
					    asyncRequester.await()
					}

Best Alex

@deva666
Copy link
Owner

deva666 commented Dec 7, 2022

Thanks for the detailed explanation and possible solution. I am looking into this.

@deva666
Copy link
Owner

deva666 commented Dec 7, 2022

I tried to reproduce this with the included examples project. I added to MainActivity onCreate call this

		CoroutineScope(Dispatchers.Main).launch {
			PermissionRequester.instance().request(Manifest.permission.CALL_PHONE)
				.collect {
					Log.d("PEKO", "RESULT: $it")
				}
		}

And I got the result. No Deadlock
Can you provide me some more info please? Where exactly are you calling this? What Android version?

@arberg
Copy link
Author

arberg commented Dec 7, 2022

Yeah I would expect that would happen. As I wasn't calling it from onCreate but later on after user had already closed another dialog it seemed obvious that its not just any call that would trigger the behaviour.

I'm using kotlin 1.7.20, with coroutineVersion = "1.6.4", I've seen the behaviour completely (seemingly 100%) reproducible on OnePLus 7Pro+10Pro runningr android 11 and 12 respetively. I've seen this ANR from crashlytics from production which I have learned from my own tests was caused by this, and the ANR only started coming after releasing with the new peko version

main (native):tid=1 systid=11113 
#00 pc 0xcad6c libc.so (__epoll_pwait + 12)
#01 pc 0x17ea8 libutils.so (android::Looper::pollInner(int) + 184)
#02 pc 0x17d84 libutils.so (android::Looper::pollOnce(int, int*, int*, void**) + 116)
#03 pc 0x154f04 libandroid_runtime.so (android::android_os_MessageQueue_nativePollOnce(_JNIEnv*, _jobject*, long, int) + 48)
       at android.os.MessageQueue.nativePollOnce(Native method)
       at android.os.MessageQueue.next(MessageQueue.java:339)
       at android.os.Looper.loopOnce(Looper.java:179)
       at android.os.Looper.loop(Looper.java:344)
       at android.app.ActivityThread.main(ActivityThread.java:8212)
       at java.lang.reflect.Method.invoke(Native method)
       at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:584)
       at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1034)

I tried searching for info on above ANR, and didn't really get a solution. Though that was before I discovered the exact place that PEKO hung. Also it kind of looks like its not just peko but everything in my app. But that would fit if somehow we have introduced a deadlock in the main-thread. Though I cannot see how. Could be a kotlin bug, but I am always very cautious to blame the compiler.

I could also show you via screen-share, if we could find the time, but I cannot give you the code. I would probably have to work some hours to strip everything else and send you a stripped down version. That does seem possible. I'll think about it.

Hmmm, interesting. I just checked the devices from production. There were 8 user devices with this bug, and every single one of them were also a OnePlus. Naaah, I just tested on a samsung, its also stuck on the samsung.

I tried reproducing in your sample, but to no avail. I'll have to strip down the codebase I already have to reproduce it.

@arberg
Copy link
Author

arberg commented Dec 7, 2022

I managed to reproduce it now in a tiny sample.

Insert this at the end of MainActivity.onCreate (or probably anywhere, but that's where I tested it) in your sample of Peko.

        val context = applicationContext
        val permissionRequester = PermissionRequester.instance() // Creates new instance
        activityLifecycleScope.launchWhenResumed { // Fails
            permissionRequester.request(android.Manifest.permission.ACCESS_FINE_LOCATION)
                .collect { result ->
                    Toast.makeText(context, "result: $result", Toast.LENGTH_LONG).apply { show() }
                }
        }

Interestingly its not a just question of switching to withContext(Dispatchers.IO). Because if I do a withContext(Dispatchers.Main) {} around the request, then it still works, even though I'm switching to main. So I'm pretty sure its the activityLifecycleScope.launchWhenResumed. I also had to use this lifeCycleScope in order to reproduce it, instead of a basic CoroutineScope. The withContext probably brings it out of it.

You can also see my branch reproduced_deadlock on the fork here, to see various interesting cases:

https://github.com/arberg/Peko/

I suspect we should report this to the kotlin guys and see what they have to say about it.

AHHH, now I think I know the problem. It did seem a bit fishy to me to have the PekoActivity run code that was created in another activity, but I couldn't see why that actually should be a problem. I'm pretty sure the problem is that I use launchWhenResumed,
and you 'carelessly' just use that scope I as the user gives to the channel, and expects it to continue working. But my scope only runs when my activity is resumed, and your second activity thus never proceeds. I don't understand why its an ANR, instead of being an invisible top PekoActivity. I suspect we should show it to the kotlin guys anyway, because its weird we get an ANR.

I would strongly suggest if possible, that you also see if its possible to make the PekoActivity run the code. Like if it gets the permissions in a channel it creates, and then makes those results flood to the subscriber in the original activity / PermissionRequester.

If you want to chat about it via voice that can probably be arranged.

@deva666
Copy link
Owner

deva666 commented Dec 8, 2022

Yes, I can reproduce it. For some reason, multiple request calls (at the same time) are causing the hang. But with this latest example, if I wrap it in withContext(Dispatchers.IO) it is still hanging. I am looking into this ....

@deva666
Copy link
Owner

deva666 commented Dec 8, 2022

An easy fix would be to not allow multiple requests at the same time. As also the native request dialog can only be shown for one request at a time. Peko v2 worked like this. But not sure about it

@deva666
Copy link
Owner

deva666 commented Dec 9, 2022

I think it is fixed in this branch https://github.com/deva666/Peko/tree/deadlock_fix
I replaced Deferred with Channel and it seems to be working.
@arberg Before I make a release can you maybe try that branch to see if it is working in your code?

@arberg
Copy link
Author

arberg commented Dec 10, 2022

Hi,

It still dead-locks. I think the problem is that the channel in PekoPermissionRequester, is collected and thus runs in the scope defined by the client and the calling activity.

I suggest moving these lines

					requester.requestPermissions(request.denied.toTypedArray())
					for (result in requester.resultsChannel) {
						trySend(result)
					}

To a channel collected in the PEKO activity, and that the peko activity channel then sends the results to the channel from above lines in PekoPermissionRequester. That way it should work save and sound. You have two channels, the PekoActivity controls the execution scope of the channel which does the actual requests of permission, and that fits perfectly in the android style. It then sends the results, to something which may or may not be running at the time we are requesting the permissions in PekoActivity, and then PekoActivity, when done, can finish itself.

I don't understand the code well enough to change it.

I think the '?: return' you have in PekoActivity.onCreate has a bug, I suspect you have to finish the activity, to pop this left over activity. I suspect this can happen when android rebuilds the process, and create a new activity, before you have Peko initiated. Probably you also need to finish the activity if channel is null (maybe for same reason?).

I have merged your changes into my reproduced_deadlock branch, but I kept my own test-code.

@arberg
Copy link
Author

arberg commented Dec 11, 2022

I missed this comment before:

An easy fix would be to not allow multiple requests at the same time. As also the native request dialog can only be shown for one request at a time. Peko v2 worked like this. But not sure about it

I don't think I in my example makes multiple request at a time. However I agree that the framework has a problem with this, due to the static behaviour. One option might be to add an ID, that is bumped on each attempt, and invoke the pekoActivity with the id. Then you could have a static map from id to channel, or perhaps more sane, just a static 'lastId' plus the channel, and then quick-fail in the PekoActivity if the ID given in the intent, is not the ID registered in the static data as 'lastId'. These are just thoughts, it might not be useful.

@deva666
Copy link
Owner

deva666 commented Dec 13, 2022

I've been playing with some solutions and it seems that your initial suggestion, wrapping getting the native requester with IO dispatcher seems like the easiest solution. I cannot test with your code, but I pushed this to branch https://github.com/deva666/Peko/tree/deadlock_fix_2 . Can you please tell me if this works with your code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants