Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raw 802.15.4 support #551

Merged
merged 3 commits into from
Sep 27, 2024
Merged

Raw 802.15.4 support #551

merged 3 commits into from
Sep 27, 2024

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Jun 18, 2024

I built this upon libtock-c implementation.

I focused on providing safe abstractions.

  • Tx part is quite bare, i.e. just passing a byte slice. Should it be more structured?
  • Rx part is sophisticated. There are:
    • Frame, RxRingBuffer - these build up the ring buffer that is shared with kernel;
    • RxSingleBufferOperator - this encapsulates the buffer and provides safe operations on it. It uses a single buffer, which makes it possible to lose some frames when the buffer is not allowed to the kernel. An alternative for that, RxBufferAlternatingOperator that alternates between two buffers, which guarantees no frame loss at cost of doubled memory usage, was deemed unsound by Miri and hence withdrawn from this PR.

Feedback much welcome.

Example is complete.
Unit tests are done.
Conducted basic tests on hardware as well.

bradjc
bradjc previously approved these changes Jun 19, 2024
Copy link
Contributor

@bradjc bradjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at this in depth, but going through it this looks great.

Also it might be good for @tyler-potyondy to take a look.

@alevy
Copy link
Member

alevy commented Jun 20, 2024

Looks good!

@alevy
Copy link
Member

alevy commented Jun 20, 2024

Tx part is quite bare, i.e. just passing a byte slice. Should it be more structured?

I do not think it should be more structured.

@tyler-potyondy
Copy link

Overall looks good to me (from the 15.4 side of the world; I'm less familiar with libtock-rs and can't speak too much on that front)! A few comments:

  • The ring buffer size should be the 15.4 frame size + some meta data (this is explained in the 15.4 capsule driver). Looking through this, it seems your ring buffer is just 127 bytes (15.4 frame size).
  • It appears you are not implementing the ring buffer (just swapping two buffers). This is entirely fine (and will work for most applications). You may need to implement handling multiple packets within each ring buffer if you face workloads with heavy/frequent packet reception (e.g. OpenThread). Swapping buffers is better than just using one buffer, but can still experience dropped/overwritten packets in the case of multiple packets being received prior to the swap.

Exciting to see some 15.4 support making its way into libtock-rs!

@wprzytula
Copy link
Collaborator Author

wprzytula commented Jun 21, 2024

Overall looks good to me (from the 15.4 side of the world; I'm less familiar with libtock-rs and can't speak too much on that front)! A few comments:

  • The ring buffer size should be the 15.4 frame size + some meta data (this is explained in the 15.4 capsule driver). Looking through this, it seems your ring buffer is just 127 bytes (15.4 frame size).

According to phy_driver.rs:

//! The ring buffer provided by the process must be of the form:
//!
//! ```text
//! | read index | write index | user_frame 0 | user_frame 1 | ... | user_frame n |
//! ```
//!
//! `user_frame` denotes the 15.4 frame in addition to the relevant 3 bytes of
//! metadata (offset to data payload, length of data payload, and the MIC len).
//! The capsule assumes that this is the form of the buffer. Errors or deviation
//! in the form of the provided buffer will likely result in incomplete or
//! dropped packets.

I created the following structs:

#[repr(C)]
pub struct Frame {
    pub header_len: u8,
    pub payload_len: u8,
    pub mic_len: u8,
    pub body: [u8; MAX_MTU],
}

#[repr(C)]
pub struct RxRingBuffer<const N: usize> {
    /// From where the next frame will be read by process.
    /// Updated by process only.
    read_index: u8,
    /// Where the next frame will be written by kernel.
    /// Updated by kernel only.
    write_index: u8,
    /// Slots for received frames.
    frames: [Frame; N],
}

I can't see any deviations from the kernel docs. Could you point them out?

  • It appears you are not implementing the ring buffer (just swapping two buffers).

I do not understand why you believe so. RxRingBuffer has both read and write indices, which are used to establish ring buffer semantics on the buffer. Both Operators are written in a way that makes them return frames one by one, using ring buffer pop operation. Or am I mistaken?

Exciting to see some 15.4 support making its way into libtock-rs!

Hooray!

@tyler-potyondy
Copy link

@wprzytula my mistake. I missed the RxRingBuffer fields when skimming through this earlier and did not see the popping from the ring buffer etc. Thanks for the clarification!

@wprzytula wprzytula marked this pull request as ready for review June 22, 2024 17:41
@wprzytula
Copy link
Collaborator Author

Finished implementation. I did some minor fixes to the logic, wrote a very decent test suite and polished documentation.

@wprzytula
Copy link
Collaborator Author

Applied clippy fixes.

@alevy
Copy link
Member

alevy commented Jun 23, 2024

@wprzytula what's the state of this?

It says still

Haven't tested this on hardware as well.

Is this still accurate?

@wprzytula
Copy link
Collaborator Author

@wprzytula what's the state of this?

It says still

Haven't tested this on hardware as well.

Is this still accurate?

I still haven't succeeded in implementing HIL on cc2650 (this is WIP), so yes.
I'd be very grateful if someone tried this out on another piece of hardware, e.g. Nordic boards.

@wprzytula
Copy link
Collaborator Author

I need help with understanding the Miri check failure. I'm not sure what is exactly believed to be UB and why.

Copy link
Collaborator

@jrvanwhy jrvanwhy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the slow review; workload and TockWorld combined to make this take a couple weeks.


pub(super) fn share_initial(buf: &mut RxRingBuffer<N>) -> Result<Self, ErrorCode> {
Self::share_unscoped(buf.as_mut_byte_slice())?;
Ok(Self::new(buf))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need help with understanding the Miri check failure. I'm not sure what is exactly believed to be UB and why.

share_unscoped is unsound (see further comments below), which allows share_initial to violate Rust's memory aliasing rules.

The first line of share_initial:

Self::share_unscoped(buf.as_mut_byte_slice())?;

creates a temporary borrow of buf, and passes that borrow to share_unscoped. Then, on the second line:

Ok(Self::new(buf))

buf is used again, which invalidates the temporary borrow of buf. This results in two mutable pointers pointing at the same memory: the returned RxRingBufferInKernel and the pointer that was given to the kernel by share_unscoped.

Because of the aliasing, when the kernel tries to access the buffer, it causes undefined behavior (this is the UB reported by Miri in the CI failure).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm trying to achieve is to split &mut [u8] (the mutable reference to the buffer) into:

  1. *mut [u8] - the mutable pointer to the buffer used in share_unscoped, with manual lifetime management;
  2. `<PhantomData<&mut [u8]> - a phantom to take advantage of covariance and thus preserve lifetime of the original mutable reference to the buffer.

This way, at swap, I want to unsplit those two parts and restore the original mutable reference again.

I have been told that existence of two mutable references pointing to the same memory is unsound, but I don't see how this is the case.


// Safety: The buffer being allowed here is going to be enclosed in an opaque type
// until it's unallowed again. This prevents concurrent access to the buffer by process and kernel.
let allow_rw_handle = unsafe { share::Handle::new(&allow_rw) };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle::new's safety invariant is:

impl<'handle, L: List> Handle<'handle, L> {
    /// Constructs a new `Handle`, typing its lifetime to the provided list.
    ///
    /// # Safety
    /// The calling code must guarantee that `Drop::drop` is called on `_list`'s
    /// pointee before `_list`'s pointee becomes invalid. In other words,
    /// `_list`'s pointee may not be forgotten or leaked.
    pub unsafe fn new(_list: &'handle L) -> Self {
        Handle { _list: PhantomData }
    }

In particular, it says "_list's pointee may not be forgotten or leaked". In this case, _list is &allow_rw, so its pointee is allow_rw, which is explicitly forgotten later in the function (core::mem::forget(allow_rw)).

share_unscoped does not follow this requirement, making it unsound. In general: share_unscoped does not do anything to guarantee that the buffer will be unallowed before it is used again by the application.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general: share_unscoped does not do anything to guarantee that the buffer will be unallowed before it is used again by the application.

That's true. It's only intended to be used as a helper function, not as a safe API. If I mark it unsafe, will it be enough for it?

Handle::new's safety invariant is:

impl<'handle, L: List> Handle<'handle, L> {
    /// Constructs a new `Handle`, typing its lifetime to the provided list.
    ///
    /// # Safety
    /// The calling code must guarantee that `Drop::drop` is called on `_list`'s
    /// pointee before `_list`'s pointee becomes invalid. In other words,
    /// `_list`'s pointee may not be forgotten or leaked.
    pub unsafe fn new(_list: &'handle L) -> Self {
        Handle { _list: PhantomData }
    }

In particular, it says "_list's pointee may not be forgotten or leaked". In this case, _list is &allow_rw, so its pointee is allow_rw, which is explicitly forgotten later in the function (core::mem::forget(allow_rw)).

share_unscoped does not follow this requirement, making it unsound.

OK, share_unscoped itself allow memory violation (concurrent mutation). But again, it's not public safe API. The idea is that share_initial is sound, because:

  1. as long as RxRingBufferInKernel exists, it quasi-borrows the buffer (by means of covariance_phantom), thus preventing accesing the buffer from userspace;
  2. when RxRingBufferInKernel is dropped, kernel is un-allowed the buffer, so now only userspace can access the buffer.

Why isn't the above true? I can easily convince myself this is sound, and I'm sad I can't convince Rust about the same.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect there's a relatively-easy fix to avoid the aliasing issue, but unfortunately I don't think there is an easy fix to make RxRingBufferInKernel sound.

A buffer that has been shared with the kernel via Read-Write Allow MUST be un-Allowed before its memory is deallocated. RxRingBufferInKernel does not currently guarantee this: if the caller calls core::mem::forget on it, then Drop::drop will never be called, and the shared buffer will be deallocated without being first un-Allowed (resulting in UB).

I'm only aware of a handful of ways to write a sound API that uses Read-Write Allow:

  1. Use the same pattern as share::scope (have a function call a closure and guarantee that the buffer is un-allowed when the closure returns).
  2. Use Pin.
  3. Require the buffer to be 'static.
  4. Use dynamic memory allocation (note: we don't currently have dynamic memory allocation implemented in libtock-rs, but we want to have it as an option).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect there's a relatively-easy fix to avoid the aliasing issue, but unfortunately I don't think there is an easy fix to make RxRingBufferInKernel sound.

@jrvanwhy Could you please share the idea how to avoid the aliasing issue?

@alevy
Copy link
Member

alevy commented Jul 17, 2024

@wprzytula, sorry for the extended silence on this PR. A bunch of us are extensively discussing how to resolve the soundness issues correctly (e.g. https://github.com/tock/tock/blob/master/doc/wg/network/notes/network-notes-2024-07-15.md#154-libtock-rs-driver). Sorry this is not getting more attention while that's happening.

It turns out not to be obvious what the right fix would be, so no specific actionable recommendation just yet.

@wprzytula
Copy link
Collaborator Author

@wprzytula, sorry for the extended silence on this PR. A bunch of us are extensively discussing how to resolve the soundness issues correctly (e.g. https://github.com/tock/tock/blob/master/doc/wg/network/notes/network-notes-2024-07-15.md#154-libtock-rs-driver). Sorry this is not getting more attention while that's happening.

It turns out not to be obvious what the right fix would be, so no specific actionable recommendation just yet.

I have read the discussion you refer to. I understand your doubts and the current point now, thank you.

@wprzytula
Copy link
Collaborator Author

@wprzytula what's the state of this?
It says still

Haven't tested this on hardware as well.

Is this still accurate?

I still haven't succeeded in implementing HIL on cc2650 (this is WIP), so yes. I'd be very grateful if someone tried this out on another piece of hardware, e.g. Nordic boards.

I finally succeeded in writing HIL for cc2650 802.15.4 radio, and so I ran tests of this PR on it. Everything seems to work correctly, both for RxSingleOperator and RxBufferAlternatingOperator.

@wprzytula
Copy link
Collaborator Author

When I think of it, it's only the RxRingBufferInKernel, required for RxBufferAlternatingOperator to work, that causes UB according to Miri.
RxSingleBufferOperator is accepted. As it is good enough (RxBufferAlternatingOperator is only better in the way that it does not miss frames if a frame comes when user is examining the buffer, preventing it from being accessed by kernel), perhaps we could merge this PR without RxBufferAlternatingOperator-related code? This would give libtock-rs's users access to basic networking!

WDYT @alevy @jrvanwhy @bradjc @tyler-potyondy ?

@lschuermann
Copy link
Member

As it is good enough (RxBufferAlternatingOperator is only better in the way that it does not miss frames if a frame comes when user is examining the buffer, preventing it from being accessed by kernel), perhaps we could merge this PR without RxBufferAlternatingOperator-related code? This would give libtock-rs's users access to basic networking!

I think that is a great move forward! Nonetheless, we do want to figure out a way to support Tock's buffer swapping semantics for "lossless" kernel -- userspace communication in the long term; it's one of the primary advantages of this kernel API. That being said, I agree that having something is better than nothing, and in particular if the potential loss of packets is sufficiently small to not cause havoc or be recoverable in practice.

@alevy
Copy link
Member

alevy commented Aug 27, 2024

@wprzytula yes, do this.

@wprzytula
Copy link
Collaborator Author

I've removed RxBufferAlternatingOperator and its controversial code. Please run CI, it should pass now.

@wprzytula wprzytula requested a review from jrvanwhy August 28, 2024 05:41
@lschuermann

This comment was marked as off-topic.

@wprzytula

This comment was marked as off-topic.

@lschuermann

This comment was marked as off-topic.

@wprzytula

This comment was marked as off-topic.

@brghena
Copy link
Contributor

brghena commented Aug 28, 2024

It is very unclear why it keeps requiring approval to run actions. The repo and org are set to require it for first-time contributors, but I really thought that was once-only.

To try to "fix" it, I have added @wprzytula as a read-only contributor. The should result in zero gained permissions they didn't have before since the repo is public, but I'm hoping that listing them as a collaborator maybe lets CI run? Maybe?

@wprzytula
Copy link
Collaborator Author

To try to "fix" it, I have added @wprzytula as a read-only contributor. The should result in zero gained permissions they didn't have before since the repo is public, but I'm hoping that listing them as a collaborator maybe lets CI run? Maybe?

Hooray! This works.

@lschuermann
Copy link
Member

lschuermann commented Aug 28, 2024

The should result in zero gained permissions they didn't have before since the repo is public, but I'm hoping that listing them as a collaborator maybe lets CI run?

It's great that this works, but I believe this is a bug on GitHub's end. It used to only require approval once. I don't like adding contributors to repositories as a permanent fix; we'll for sure forget to maintain / clean that list up.

@wprzytula
Copy link
Collaborator Author

The current CI failure seems to be about MSRV. Can someone confirm? What are we doing about that?

@jrvanwhy
Copy link
Collaborator

The current CI failure seems to be about MSRV. Can someone confirm? What are we doing about that?

Our policy is to bump the MSRV whenever doing so is convenient. You may need to increase the nightly toolchain version as well. See #541 for an example of how to do so.

@jrvanwhy
Copy link
Collaborator

Oh, and I suggest increasing the toolchain versions in a separate PR, just in case the toolchain bump causes compilation failures for somebody else' PR. If the toolchain version increase passes CI we can merge it quickly.

@wprzytula wprzytula force-pushed the raw-802.15.4 branch 2 times, most recently from 50bcdc4 to 201aeaa Compare August 28, 2024 20:21
@wprzytula
Copy link
Collaborator Author

@jrvanwhy Ready.

lschuermann
lschuermann previously approved these changes Aug 29, 2024
Copy link
Member

@lschuermann lschuermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good! I'm neither an expert on libtock-rs, nor on 15.4. However, I do believe that the unsafe buffer cast seems sound, and the rest of the good looks pretty good as well.

version = "0.1.0"
authors = [
"Tock Project Developers <tock-dev@googlegroups.com>",
"Wojciech Przytuła <woj.przytula@gmail.com>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether any other crate in the Tock ecosystem is attributed to a particular person. I'm not necessarily opposed to this, but I'm also not aware of any potential implications of it. We should perhaps talk about whether we want this, in a spirit similar to tock/tock#3318.

That being said, this document should also apply to libtock-rs, and adding copyright statement to comment headers in individual source files is definitely fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with removing this altogether.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several other crates in the libtock-rs repository name an individual in authors. I agree that having license headers and documenting attribution there is better, but I'm fine with this as-is (particularly because we don't currently have license headers).

apis/net/ieee802154/src/lib.rs Outdated Show resolved Hide resolved
Comment on lines +66 to +68
// SAFETY: any byte value is valid for any byte of Self,
// as well as for any byte of [u8], so casts back and forth
// cannot break the type system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an actionable comment, but it'd be really nice if Rust had a way of asserting that a given type is unconditionally valid; i.e., a type that is always a valid instance of itself if constructed over arbitrary, initialized memory.

From my understanding of Rust's type system and safety invariants, I do believe that this operation is sound. I would be slightly worried if this was not a tightly packed struct, and there might be niches in structs that aren't exposed of any given field where Rust might assume them to be initialized to a given value.

However, all of Frame::body, Frame, and RxRingBuffer should be tightly packed and are composed solely of types (u8s, arrays of u8s) for which all underlying values in memory correspond to a valid instance of that type, and where those types do not carry any higher-level invariants (as, e.g., references do).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an actionable comment, but it'd be really nice if Rust had a way of asserting that a given type is unconditionally valid; i.e., a type that is always a valid instance of itself if constructed over arbitrary, initialized memory.

I believe you're looking for zerocopy::FromBytes. You're right it's not in the base language or standard libraries though (but zerocopy is extremely well tested).

@wprzytula wprzytula dismissed lschuermann’s stale review August 29, 2024 14:22

The merge-base changed after approval.

The driver provides bare yet correct support for all operations but rx.
Rx support is in turn sophisticated, consisting of many higher-level
abstractions that ensure memory safety and prevent losing frames.

This commit introduces the driver, next commits add tests and
a usage example.
Two things are worth noticing:
1. I discovered that due to a flaw in ring buffer design,
   only N-1 slots are usable there. This is space-inefficient.
2. In order to inject upcalls at a small moment between subscribing
   to the RECEIVED_FRAME upcall and calling yield_wait, a hacky way
   is employed with custom FakeSyscalls implementation.
   Such upcall injection is necessary, because:
   - before subscription, scheduling an upcall is a no-op;
   - if we don't schedule an upcall before yield_wait, the test
     framework purposefully panics to prevent deadlock.
@wprzytula
Copy link
Collaborator Author

@lschuermann @jrvanwhy Why is this still not merged? Are there still some action points to be resolved?

Copy link
Contributor

@hudson-ayers hudson-ayers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as a proxy for Leon's approval that was wiped by the merge-base change.

@alevy alevy added this pull request to the merge queue Sep 27, 2024
Merged via the queue into tock:master with commit 8bacc47 Sep 27, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants