Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use maybe_async that supports async traits for breakpoint #161

Closed

Conversation

SidongYang
Copy link

@SidongYang SidongYang commented Jan 2, 2025

Description

This PR is exprimental codes that uses maybe_async for supporting fully async traits. Adding simple maybe_async macro for trait and it generates async code when is_sync feature is disabled. For this there is new feature is_sync for gdbstub which is default feature. So it doesn't affacts original sync code.

issue: #159

API Stability

  • This PR does not require a breaking API change

Checklist

  • Documentation
    • Ensured any public-facing rustdoc formatting looks good (via cargo doc)
    • (if appropriate) Added feature to "Debugging Features" in README.md
  • Validation
    • Included output of running examples/armv4t with RUST_LOG=trace + any relevant GDB output under the "Validation" section below
    • Included output of running ./example_no_std/check_size.sh before/after changes under the "Validation" section below
  • If implementing a new protocol extension IDET
    • Included a basic sample implementation in examples/armv4t
    • IDET can be optimized out (confirmed via ./example_no_std/check_size.sh)
    • OR implementation requires introducing non-optional binary bloat (please elaborate under "Description")
  • If upstreaming an Arch implementation
    • I have tested this code in my project, and to the best of my knowledge, it is working as intended.

Validation

  • Supports async and no change for sync codegen

@SidongYang SidongYang mentioned this pull request Jan 2, 2025
Copy link
Owner

@daniel5151 daniel5151 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending in this PR!

At first blush, maybe-async certainly seems to tick many of the boxes I'm interested in, particularly regarding having a single unified codebase for both sync and async code.

There are, however, a few things that concern me...

Given that async-trait requires boxing futures... I would like to better understand and profile how that extra indirection affects the dead-code-elimination guarantees that the IDET pattern otherwise guarantees in release mode.

At minimum, as a next-step in this exploration: can you sprinkle some maybe-async annotations into the armv4t code, such that we can do some side-by-side comparisons between the sync and async APIs via the in-tree /scripts/test_dead_code_elim.sh script?

Alternatively - it seems that maybe_async(AFIT) exists... but using it would be a bit annoying, and when gdbstub is configured in async mode, all current IDET methods would need to return impl Trait instead of dyn Trait, since Rust doesn't have native dyn async traits yet. Maybe that sort of transformation could be done automatically by forking / extending maybe_async... unlear.

Oh, and of course - boxing futures isn't particularly no_std friendly... and it would certainly be interesting to have gdbstub's async APIs play nice with no_std projects using async infrastructure (e.g: via embassy). I wouldn't say that "perfect" async no_std support is a hard requirement for any v0 of async gdbstub... but I would like to at least understand what sorts of gaps / implications will result once we introduce async stuff into the gdbstub codebase.


P.S: don't worry about those clippy lints. If you rebase your PR on top of latest master, I've gone ahead and fixed those.

@@ -35,6 +37,7 @@ std = ["alloc"]
trace-pkt = ["alloc"]
paranoid_unsafe = []
core_error = []
sync = ["maybe-async/is_sync"]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not gdbstub should present a sync vs. async API by default is an interesting question. I suppose that's a judgement call we can make later, if we decide that maybe-async is the basis of the approach we take.

@@ -154,7 +155,8 @@ impl<'a, T: Target, C: Connection> GdbStub<'a, T, C> {
/// etc...) you will need to interface with the underlying
/// [`GdbStubStateMachine`](state_machine::GdbStubStateMachine) API
/// directly.
pub fn run_blocking<E>(
#[maybe_async]
pub async fn run_blocking<E>(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just experimental code... but having a run_blocking method that is actually async seems funky. I suppose we'll have to think a bit harder about how to model / rename this API if we go down this route

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, It needs to be renamed

@SidongYang SidongYang force-pushed the feature/use-maybe-async branch 3 times, most recently from b81abcd to 70063c9 Compare January 6, 2025 03:45
@SidongYang SidongYang force-pushed the feature/use-maybe-async branch from 70063c9 to a91ab2e Compare January 6, 2025 03:48
@SidongYang
Copy link
Author

It seems that CI failed because in compiling no std example, it cannot find Box for maybe_async macro.

@SidongYang
Copy link
Author

@daniel5151 I don't want to break default features for compatibility. but I can't find good way to choose sync traits as default.
It's little tricky that sync feature is default. User who wants async will disable default features.

@daniel5151
Copy link
Owner

Thanks for iterating here!

Now that the armv4t example is working alongside async, lets run some of those experiments I mentioned above, regarding dead code elimination. i.e: comment out the supports_breakpoints (and/or its sub-IDETs), and double-check that the breakpoint code does indeed still get eliminated in the async case.

There are other, arguably more interesting experiments that I'd like to run before committing to this approach. Namely: I'd like to explore the use of maybe_async(AFIT), and whether dropping dyn Target support when using async is a viable approach. The allocations per handler method make me a bit queasy, given the otherwise aggressively allocation-optimized API of gdbstub as a whole...

More ambitiously, as I mull things over even more, I'm wondering if it might make sense to just go all-in on async APIs without relying on maybe_async, and manually using AFITs, and maybe using some blanket impls / helper structs / helper traits to support sync targets? I'm not certain if the Rust compiler will even support that sort of thing, so getting a POC working in the rust playground might be worthwhile before trying to bring the technique back into the main gdbtsub repo.

All that is to say - while it's really neat to see that maybe_async does seem to work in theory, I'm not sure I can commit to this approach until we've nailed down answers to some of these other design considerations that gdbstub adheres to.

Let me know if you think you can help dig into some of these experiments here. If not, I'll try to allocate some time at some point in the next few weeks to see if I can play around here. There are a few different gdbstub related things that I'd like to hack on, and I'm starting to feel like I might be close to the tipping point where enough things have piled up that I'll need to allocate some time towards actively developing / refactoring gdbstub myself, vs. purely acting as a shephard for other folks' PRs 😅


@daniel5151 I don't want to break default features for compatibility. but I can't find good way to choose sync traits as default. It's little tricky that sync feature is default. User who wants async will disable default features.

Indeed, I think there's no good way to get around this issue when using maybe-async. It would be nice if maybe-async had more flexibility here, wrt. having a sync_by_default + is_async set of feature flags. I wouldn't call it a blocker or anything... but yeah, unfortunate.

@SidongYang
Copy link
Author

I agree that AFIT options is best for using maybe_async. I pushed a commit that use maybe_async(AFIT)
And I've checked the deadcode elimination test script. I can't pass the test without any commenting out. but it's same as master branch. and it passed the test with returning None in support_sw_breakpoint.
And I want to work for this in this PR. But if there is better way to support async, It's okay to close this PR. You can do some experiments for this issue also it would be thanks to share that.

@SidongYang SidongYang force-pushed the feature/use-maybe-async branch from 32613d3 to 9fea11d Compare January 13, 2025 11:59
@daniel5151
Copy link
Owner

Hey, really sorry for the lack of activity here. A ton of personal and work stuff has come up over the past few weeks, and I've yet to find a moment of focus time to give feedback here... My apologies :(

I'm going to try and set aside some time this weekend to give this a review, and collect my thoughts on this PR.

@daniel5151
Copy link
Owner

I've cloned the branch locally, and when I go to run cargo run --example armv4t --no-default-features --features=std, I'm hit with a ton of errors similar to:

error[E0038]: the trait `SingleThreadResume` cannot be made into an object
  --> src/stub/core_impl/resume.rs:40:65
   |
40 | ...                   ResumeOps::SingleThread(ops) => ops.support_single_step().is_some(),
   |                                                           ^^^^^^^^^^^^^^^^^^^ `SingleThreadResume` cannot be made into an object
   |
note: for a trait to be "dyn-compatible" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
  --> src/target/ext/base/singlethread.rs:97:14
   |
76 | pub trait SingleThreadResume: Target {
   |           ------------------ this trait cannot be made into an object...
...
97 |     async fn resume(&mut self, signal: Option<Signal>) -> Result<(), Self::Error>;
   |              ^^^^^^ ...because method `resume` is `async`

...which totally makes sense, given that we've switched to using non-allocating monomorphized AFIT, instead of allocating dyn-based async_trait.

Working around these errors will require diving deeper into the nuts-and-bolts logistics of what it'd take to migrate gdbstub away from using &mut dyn ExtTrait directly, and instead using &mut impl ExtTrait wherever possible. Though, to reiterate - if we go down this route, I'd still want to preserve the existing use-cases where supports_ methods actually do return a &mut dyn T (which can work with &mut impl T), which may require implementing many new impl T for &mut dyn T blocks on each extension trait.


And I want to work for this in this PR. But if there is better way to support async, It's okay to close this PR. You can do some experiments for this issue also it would be thanks to share that.

As we keep digging into this space, I am starting to suspect that introducing async support to gdbstub is going to prove harder than trying to integrate maybe-async. e.g: in much the same way that supporting non-blocking Connection interactions culminated in a substantial rewrite of gdbstub's core architecture in #88, it may be the case that to efficiently support async Target handlers, we will need a similarly "radical" approach which re-thinks some of gdbstub core architectural assumptions.

Namely, gdbstub's current architecture revolves around the basic principle of "GdbStub accepts a &mut Target, and invokes methods on it", which makes weaving through async quite difficult, as async annotation permeate deep into the core of the GdbStub implementation. This is not only annoying to deal with on a syntactic level (i.e: needing to use something like maybe-async to keep things ergonomic), but (I suspect) will also affect LLVM's ability to optimize the resulting code, substantially ballooning it in size.

Instead, the paradigm that I've been mulling over which could serve as the "next generation" of gdbstub's architecture is a different one:

What is GdbStub was purely a state machine, and instead of invoking Target methods directly, it simply transitioned into distinct Target states, punting the details of how/when to transition to the next state to the implementor?

Here's a sketch of what I mean by this:

// Instead of having a method that GdbStub calls, as we do today...
impl SingleThreadResume for MyTarget {
    fn resume(&mut self, signal: Option<Signal>) -> Result<(), Self::Error> { todo!() }
}

// lift the target handlers _outside_ GdbStub's call-stack, having users drive 
// GdbStub's state machine themselves
async fn my_gdbstub_loop() {
    let mut target = MyTarget::new().await;
    let mut conn = MyConn::new().await;
    let mut gdb = GdbStubState::new(...)?; // no Connect, no Target - GdbStub only contains protocol state
    loop {
        gdb = match gdb {
            // still have the same top-level "lifetime" methods as the current state machine API
            GdbStubState::Idle(gdb) => { gdb.incoming_data(conn.get_byte().await) },

            // introduce new target-handler states
            GdbStubState::OnResume(gdb, signal) => { target.handle_resume(signal).await; gdb.handled_resume() },
            // ...and so on, with whatever other states you opt-in to handling

           // and also have new error states
           GdbStubState::OnError(gdb, e) => { /* handle e, maybe recover */ }

            // with a generic "fallback" case, to account for unimplemented protocol extensions
            _ => gdb.on_unsupported(),
        }
    }
}

As you can see, by fully leaning into the state machine model that was originally introduced in #88, it becomes trivial to do async operations, without having to plumb through async/await into the Target traits themselves.

Moreover, it should be entirely possible to write a "user-friendly" wrapper around this state machine architecture, such that existing users of gdbstub's trait Target APIs don't need to completely rewrite their code, by having the existing run_blocking API automatically implement those various state transitions by querying a user-provided Target.

Of course, this is just a sketch, and there are still many unanswered questions I have with this new architecture. Notably: how do we preserve the dead-code-elimination properties that the IDET technique guarantees? i.e: how can we allow users to only implement certain GdbStubState::OnX handlers, and mark all other arms as guaranteed unreachable? I'm not sure.


All that is to say - I really appreciate you digging in here, and running some of these experiences. That said - I'm not sure I'll be able to definitively answer the question of "what is the best async model in gdbstub" until I set aside some time to dive into this question myself.

I think we should close this PR for now, and I'll summarize some of these ideas I've sketched out in a more focused manner, either on the #159 issue itself, or (more likely), via a new dedicated tracking issue for evaluating the feasibility and challenges inherent behind this new gdbstub architecture.

@daniel5151 daniel5151 mentioned this pull request Jan 26, 2025
9 tasks
@SidongYang
Copy link
Author

Thanks for very detailed explanation. It helps me to understand async rust deeper. Actually I didn't understood 100% for now. So I need to think deeply about this.

I think we should close this PR for now, and I'll summarize some of these ideas I've sketched out in a more focused manner, either on the #159 issue itself, or (more likely), via a new dedicated tracking issue for evaluating the feasibility and challenges inherent behind this new gdbstub architecture.

Agreed, Anyway this PR is just experimental and just for idea. I'll close this PR with this comment. Also I would like to continue discussing on the issue #159

@SidongYang SidongYang closed this Jan 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants