Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Add DirectreadNorFlash trait #22

Closed
wants to merge 1 commit into from

Conversation

ia0
Copy link
Contributor

@ia0 ia0 commented Jan 9, 2022

Some (most?) flash provide direct read access to the flash, like the nRF52840. This PR adds a trait for them.

Compared to ReadNorFlash:

  • This enables zero-cost reads (in particular there is no copy of the slice).
  • This prevents mutable operations until the slices that have been read are dropped (they could be copied to hold the content longer resulting in the same as using ReadNorFlash::read in the first place).

@MathiasKoch
Copy link
Collaborator

My fear around this would be that we end up seeing a whole set of library crates that requires DirectreadNorFlash, without supporting the slightly more expensive implementation of just ReadNorFlash. Wonder if we could somehow do a default implementation of one of them or similar support, to make it easier for library authors to do implementations that works with the best available one.

Already we have the same issue in NorFlash & MultiwriteNorFlash, where we end up with a segmented library eco system, and this would probably introduce the same issue on the read-end.

Just my 2 cents, otherwise I am all for the zero-copy versions, and I personally think we should have support for this! Just a bit worried that we end up with so many traits that we break the clean API abstration we are actually trying to create with this.

Thoughts @Dirbaio @eldruin ? This feels like the same discussion you guys have had with E-H as well?

@ia0
Copy link
Contributor Author

ia0 commented Jan 9, 2022

Good point. I didn't think about enforcing users to write portable code as an API objective. What I could suggest to help in this direction would be:

  1. Make it clear in the documentation that ReadNorFlash (resp. NorFlash) is the default trait for read (resp. write) access.
  2. To make the point above clearer, all other traits are moved to a separate module with a clear name, e.g. non_portable.
  3. To make the point above clearer, the module is feature-gated. The feature would be disabled by default and would have a clear name too (probably the same name, something that suggests that using those traits is not portable).

This way, users who write portable libraries or binaries, should be incentivized to only use ReadNorFlash and NorFlash.

Regarding a smart interface that would encompass ReadNorFlash and DirectreadNorFlash, let me try some ideas. I initially thought it wouldn't be possible, but it might.

@ia0
Copy link
Contributor Author

ia0 commented Jan 9, 2022

Ok so there's a solution but there's a few drawbacks:

  • Users need to write flash.read(offset, buffer.into()) instead of flash.read(offset, buffer).
  • The new Bytes struct is twice bigger as &mut [u8]: 16 bytes versus 8 bytes on 32-bits architecture.
  • To benefit from the no-copy, users need to call no_copy when they're done using a buffer.
  • To be able to hold multiple read slices at the same time, read must take &self instead of &mut self.

The last drawback is probably the biggest, because implementations that actually need to modify something to read (are there any?) would probably need to have a mutex to protect those modifications.

There's also the drawback that trying to get no-copy code means understanding lifetimes. But that problem is already present anyway with the DirectreadNorFlash trait. It's just under one more layer with this solution.

@MathiasKoch
Copy link
Collaborator

Cool! I will take a look at this in depth tomorrow, but at quick glance,

Users need to write flash.read(offset, buffer.into()) instead of flash.read(offset, buffer).

Might be solvable by

fn read<'i, 'o, B>(&'o self, offset: u32, bytes: B) -> Result<Bytes<'i, 'o>, Self::Error>
where
    B: Into<Bytes<'i, 'o>>;

@ia0
Copy link
Contributor Author

ia0 commented Jan 9, 2022

Nice! I've updated the solution to use Into<Bytes>. It indeed removes the first issue.

@ia0
Copy link
Contributor Author

ia0 commented Jan 10, 2022

Actually another drawback I forgot:

  • Users still need to pass a buffer even when they only want to support direct-read flash only.

@eldruin
Copy link
Member

eldruin commented Jan 14, 2022

Interesting idea! Do you have some PoC implementations for this?
I see the problem with ecosystem libraries requiring DirectreadNorFlash for performance reasons or due to premature optimization and then splitting the ecosystem a bit. Over at embedded-hal we are actually going through a trait merge process.
However, the situation here is different, and I think both traits make sense.

As for default implementations, off the top of my head I think it should be possible to define a default implementation for ReadNorFlash::read given DirectreadNorFlash::direct_read. The other way around would not be possible, though.

@ia0
Copy link
Contributor Author

ia0 commented Jan 14, 2022

Do you have some PoC implementations for this?

I'm not sure what you mean. If you mean chips with direct read, then the internal flash of nRF52840 is like that, but most probably more than 80% of the internal flash out there are like that. If you mean some application using this property, then there is this persistent storage library I wrote: https://github.com/google/OpenSK/tree/stable/libraries/persistent_store. You can see that I designed a similar trait than the NorFlash with direct reads: https://github.com/google/OpenSK/blob/f2496a8e6d71a4e838884996a1c9b62121f87df2/libraries/persistent_store/src/storage.rs#L37-L75. The store on top of it is here: https://github.com/google/OpenSK/blob/stable/libraries/persistent_store/src/store.rs.

I see the problem with ecosystem libraries requiring DirectreadNorFlash for performance reasons or due to premature optimization and then splitting the ecosystem a bit.

I totally agree with that and I didn't think about it until @MathiasKoch brought it to my attention. I'm actually now rather against having such a DirectreadNorFlash trait (as well as the MultiwriteNorFlash trait) directly exposed without portability warnings.

it should be possible to define a default implementation for ReadNorFlash::read given DirectreadNorFlash::direct_read

Yes, but would that play well with Rust trait system? Since DirectreadNorFlash would inherit from ReadNorFlash, which would in turn depend on the first for its default implementation.

we are actually going through a trait merge process

Is there some documentation? Also, I'm less and less convinced adding this trait to be a good idea. I would be more in favor of tracking demand in an issue, close this PR proposal, and reassess when the demand is growing. What do you think?

@Dirbaio
Copy link
Contributor

Dirbaio commented Jan 17, 2022

Memory-mapped is not necessarily faster for external flashes. Reading from the memory-mapped area will cause a cache miss for each cache line you read, which will trigger many small SPI transactions, each with its own overhead (like sending the address, etc). Also, the core is stalled while the cache gets filled. Normal reads will do a single long SPI read transaction, and if you do them async the core can go do something else while the read is being done.

IMO memory-mapping is mainly only useful for executing code from the external flash, if you're gong to use it as data storage memory-mapping is not faster and reduces portability.

Also, may I bikeshed on the name? MemoryMapped sounds better to me, "direct" read is much less specific.

@ia0
Copy link
Contributor Author

ia0 commented Jan 17, 2022

Ok so let me recap why we believe this is a bad idea to have a trait/feature for memory-mapped flashes:

  • Users may depend on it for simplicity or as premature optimization. We would like to enforce users to write portable libraries (i.e. that can run on as many flashes as possible).
  • The design is not without drawbacks if we want to support both types of flash in a unified way:
    • The usability is not great (the user needs to explicitly drop read slices).
    • read(&mut self) must be read(&self) to permit reading multiple slices at the same time.
    • The user still needs to provide a buffer in case the flash is not memory-mapped.
    • This would most probably be a breaking change.
  • The need for such feature is not obvious or well-motivated.

So unless there are objections, I would close this feature request as "Benefit is too low compared to cost".

@ia0 ia0 closed this Jan 22, 2022
@ia0 ia0 deleted the direct branch January 22, 2022 15:35
@chrysn
Copy link

chrysn commented Feb 26, 2022

The link in https://github.com/rust-embedded-community/embedded-storage/compare/master...ia0:direct_combined?expand=1 doesn't resolve any more.

(I'd expect to read something like fn read_maybe_memmapped(&self, addr: usize, buffer: &mut MaybeUninit<[u8]>) -> &[u8] there, where the output may point into the populated buffer or into the mapped memory -- but maybe it's something else).

Has this been taken up in a continued issue yet, or is that part of #23?

@ia0
Copy link
Contributor Author

ia0 commented Feb 26, 2022

The link still works for me, could you try again? But indeed, it's better to copy the solution here for history in case I ever delete the branch or fork.

pub struct Bytes<'i, 'o> {
	input: &'i mut [u8],
	output: Option<&'o [u8]>,
}

impl<'i, 'o> Bytes<'i, 'o> {
	/// Prevents Drop to copy the content.
	pub fn no_copy(mut self) {
		self.output = None;
	}
}

impl<'i, 'o> core::ops::Deref for Bytes<'i, 'o> {
	type Target = [u8];

	fn deref<'a>(&'a self) -> &'a [u8] {
		match self.output {
			None => self.input,
			Some(x) => x,
		}
	}
}

impl<'i, 'o> From<&'i mut [u8]> for Bytes<'i, 'o> {
	fn from(input: &'i mut [u8]) -> Bytes<'i, 'o> {
		Bytes {
			input,
			output: None,
		}
	}
}

impl<'i, 'o> Drop for Bytes<'i, 'o> {
	fn drop(&mut self) {
		if let Some(output) = self.output {
			self.input.copy_from_slice(output);
		}
	}
}

#[test]
fn bytes_size() {
	assert_eq!(
		core::mem::size_of::<Bytes>(),
		2 * core::mem::size_of::<&mut [u8]>()
	);
}

#[test]
fn bytes_usage() {
	struct Flash<const N: usize> {
		storage: [u8; N],
	}
	const DIRECT: u32 = 0;
	const COPY: u32 = 1;
	impl<const N: usize> ReadNorFlash for Flash<N> {
		type Error = ();
		const READ_SIZE: usize = 1;
		fn read<'i, 'o>(
			&'o self,
			mode: u32,
			bytes: impl Into<Bytes<'i, 'o>>,
		) -> Result<Bytes<'i, 'o>, ()> {
			let mut bytes = bytes.into();
			if bytes.len() > N {
				return Err(());
			}
			match mode {
				DIRECT => bytes.output = Some(&self.storage[..bytes.len()]),
				COPY => bytes.input.copy_from_slice(&self.storage[..bytes.len()]),
				_ => unreachable!(),
			}
			Ok(bytes)
		}
		fn capacity(&self) -> usize {
			N
		}
	}

	const N: usize = 3;
	let storage = [0x53; N];
	let flash = Flash { storage };

	// Direct read and drop: copy to buffer.
	let mut buffer = [0; N];
	flash.read(DIRECT, &mut buffer[..]).unwrap();
	assert_eq!(buffer, storage);

	// Direct read and keep: no copy.
	let mut buffer = [0; N];
	let result = flash.read(DIRECT, &mut buffer[..]).unwrap();
	assert_eq!(*result, storage);
	result.no_copy();
	assert_eq!(buffer, [0; N]);

	// Copy read.
	let mut buffer = [0; N];
	flash.read(COPY, &mut buffer[..]).unwrap();
	assert_eq!(buffer, storage);

	// Multiple copy read.
	let mut buffer_a = [0; N];
	let mut buffer_b = [0; N];
	flash.read(COPY, &mut buffer_a[..]).unwrap();
	flash.read(COPY, &mut buffer_b[..]).unwrap();
	assert_eq!(buffer_a, storage);
	assert_eq!(buffer_b, storage);

	// Multiple direct read: requires read(&self) instead of read(&mut self).
	let mut buffer_a = [0; N];
	let mut buffer_b = [0; N];
	let result_a = flash.read(DIRECT, &mut buffer_a[..]).unwrap();
	let result_b = flash.read(DIRECT, &mut buffer_b[..]).unwrap();
	assert_eq!(*result_a, storage);
	assert_eq!(*result_b, storage);
	result_a.no_copy();
	result_b.no_copy();
	assert_eq!(buffer_a, [0; N]);
	assert_eq!(buffer_b, [0; N]);
}

The read function becomes:

        fn read<'i, 'o>(
		&'o self,
		offset: u32,
		bytes: impl Into<Bytes<'i, 'o>>,
	) -> Result<Bytes<'i, 'o>, Self::Error>;

Has this been taken up in a continued issue yet, or is that part of #23?

This has been abandoned as described in #22 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants