-
Notifications
You must be signed in to change notification settings - Fork 538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Mono.Android] Prevent ObjectDisposedException while reading HTTP response from InputStreamInvoker #9789
base: main
Are you sure you want to change the base?
[Mono.Android] Prevent ObjectDisposedException while reading HTTP response from InputStreamInvoker #9789
Conversation
@simonrozsival FYI CI is complaining about branch name being too long:
We have seen it on other PRs and just shortened the name... |
@simonrozsival Can you try placing a |
@AaronRobinsonMSFT adding
|
My problem is that this throws my entire understanding of how the GC and GC bridge work into doubt. (HHOS 😓) My understanding is that the GC bridge should only process objects which are not otherwise referenced by managed code. If managed code holds a GC rooted reference: static Java.Lang.Object globalInstance = new Java.Lang.Object(); then that instance should never reach the GC bridge. From #9039, we have a call stack of:
Something is holding a reference to Which is to say, the
Right? Which brings us to 2(i): Which brings us to 3(ii): why does 3(iii) thus doesn't make sense to me. Why is Updating |
This, again, is the part that confuses and scares me: This feels like something that should not happen, yet it is. Why is it happening? |
That is mostly correct. The GC Bridge only looks at bridge objects that have been marked as unreachable, but I believe they could point at an object that is still reachable. I'm basing this on the fact that since each bridge object is walked it could be in a field. However, this would be a monumental flaw in the GC Bridge that should have manifested long ago. |
Yay! (I think.)
Sounds good…
And this deeply concerns me. My problem is that #9039 feels like a GC bug, and the fix in this PR is, at best, a workaround. If this isn't a GC bug, then I need to understand what the bug is, so that we can audit the codebase to find any other scenarios which "rhyme" with this bug. Right now, I don't understand what the bug is (other than "clearly a GC bug!"), and thus have no idea what such an audit would even look like. If we go with your belief that "[the GC bridge could process] an object that is still reachable," then I don't know how to fix that. I don't know what that means. Consider: /* 1 */ partial class MyActivity : Activity {
/* 2 */ public override void OnCreate(Bundle? b)
/* 3 */ {
/* 4 */ base.OnCreate(b);
/* 5 */ var v = new Android.Widget.Button(this);
/* 6 */ SetContentView(v);
/* 7 */ }
/* 8 */ }
Is the only reason this doesn't happen because there's not enough time between lines 5 and 6 for the GC to stop the world and intervene? Does this not happen because we've been lucky? Would that change if we throw a Or will the above always work, but if you have a "large enough object graph" that, while everything is reachable, it's "complicated enough" that the GC has second thoughts…? The prospect of the GC bridge being able to process reachable objects is horrifying. |
I'm sorry, I was wrong. I didn't look deeply enough into the "push" operation for object fields. It skips over objects that are alive. |
Thank you, I've now avoided a minor heart attack. Which just re-raises my original question: how is the instance referenced by Continuing to beat a dead horse, this really feels like a GC bug to me. |
It also doesn't make sense to me, but it's what I observe (based on the It almost seems like we're not holding a
I 100% agree. The same problem probably affects
I wonder what else could cause the |
I don't think that will be true. We are the one creating the Which is why you were seeing 3(ii): because nothing on the Java side references the I still don't see why collecting the
That should be in the GREF logs. Do you have any to share? |
@jonpryor this is the logs I'm getting when running the repro app with I can confirm that the GC bridge collects the
|
I also enabled
It's interesting that there is no SCC more than 1 item in the whole log and also there are always 0 xrefs. |
@filipnavara do you have an idea what could be going on here? |
@simonrozsival Checking the SCC graph might be interesting here. You can collect that using |
@AaronRobinsonMSFT I think that should be what I posted in #9789 (comment), or should I send the whole logs? |
I experimented with the repro some more and I realized I simplified the code from the customer's project in #9039 too much. In the original repro, progress is not reported via Console.WriteLine, but it updates a text in a label: var request = new HttpRequestMessage(HttpMethod.Get, $"/images/{imageId}/download");
var response = await _client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, cancellationToken);
+ProgressLabel.Text = "Downloading...";
response.EnsureSuccessStatusCode();
await using var stream = await response.Content.ReadAsStreamAsync(cancellationToken);
await using var fileStream = File.Create(path);
int bytesRead = -1;
var buffer = new byte[512].AsMemory(); // Small buffer on purpose.
while (bytesRead != 0)
{
bytesRead = await stream.ReadAsync(buffer, cancellationToken);
await fileStream.WriteAsync(buffer[..bytesRead], cancellationToken);
Console.WriteLine($"Downloaded {(int)(100 * fileStream.Length / size)}%");
} With this single line, the GC bug manifests. Without updating the label text, it does not. I will update the code in the main description. EDIT: I don't know what's so special about updating the label text (this is MAUI Label, but it boils down to I briefly explored the idea that making changes to the UI via JNI calls would somehow switch the JNI thread to a UI thread and reading from the network input stream would somehow cause a Java exception because networking on the UI thread is not allowed. If that were the case, I would expect the app to crash or at least print something to logcat. So I did not explore this idea any further. I think I will try to find a better/smaller repro than what we got from the customer in #9039 on Monday without the additional dependencies such as MAUI. |
So, here's what I think is happening. The sample uses a
Presumably the (Technically, you can probably argue that there's a reference from |
Also, here's a smaller version of the repro: Use the HTTP server from the original repro, plug the address in the source code (ie. replace Side-note: If you modify the sample to add |
Here's even smaller repro, no HTTP involved: // Run this on UI thread (eg. in MainActivity.OnCreate)
Task.Run(GCLoop); // <== Repeat this line up to 4 times depending on HW/emulator to make the repro more reliable
_ = AsyncStreamWriter();
public static async Task GCLoop()
{
while (true)
{
GC.Collect();
await Task.Delay(10);
}
}
public static async Task AsyncStreamWriter()
{
var bs = new ByteArrayOutputStream();
var osi = new Android.Runtime.OutputStreamInvoker(bs);
try
{
while (true)
await osi.WriteAsync(new byte[2]);
}
catch (ObjectDisposedException ex)
{
System.Environment.FailFast(ex.ToString());
}
} |
@filipnavara Great analysis.
I would agree this seems like the issue. However, if this is true it seems odd we've not observed this before. We should also be able to easily confirm this. |
It's not exactly that it doesn't process delegates correctly, but it definitely does produce incorrect SCC/XREF set for this case. Now, the Also, seemingly running with the "new" bridge (which is older than the "tarjan" bridge) I don't get the crash. I enabled it using
|
Fixes #9039
TODO:
The issue manifests in the following code snippet, where we're downloading a large file from the server:
My best attempt at explaining the bug is the following:
AndroidMessageHandler
returns an instance ofAndroidHttpResponseMessage
which keeps a reference to theHttpURLConnection
MCW which internally keeps a reference to theInputStream
(wrapped byAndroid.Runtime.InputStreamInvoker
)stream = await response.Content.ReadAsStreamAsync(cancellationToken)
...stream
is referencing the response'sInputStreamInvoker
which keeps a reference to theInputStream
response
can be collected by GCresponse
is collected on the .NET sideHttpURLConnection
is collected on the Java sideHandle
inInputStream
to a weak ref and initiate Java GCInputStream
since there are no more references to it from the Java sideIntPtr.Zero
InputStream
object won't be collected though, becauseInputStreamInvoker
is still holding a reference to itInputStreamInvoker
, it's internalInputStream
is disposed and reading from it will throw theObjectDisposedException
we are observingThe problem can be prevented by keeping a gref to the underlying
InputStream
inInputStreamInvoker
. This way, the gc bridge cannot dispose theInputStream
while the only reference to it is on the .NET side in an object which isn't a Java class wrapper. Without this patch, the app will fail every single time when using a local debug build. With the patch applied, the exception is not thrown anymore./cc @grendello @jonathanpeppers @jonpryor @AaronRobinsonMSFT