Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wp/ updated delete file to use pk as filename #33

Open
wants to merge 54 commits into
base: main
Choose a base branch
from

Conversation

rustchain64
Copy link
Collaborator

Implements, getting the hash just after upload to then use a fetch get from the client to fetch that under RwLock

@rustchain64
Copy link
Collaborator Author

At this stage, most of the features are working and tested,
However, even though get_slices was working last Friday, there is some kind of bug happening.
Also, mulitple different mime-types of many sizes needs to be uploaded and tested, and compared.

Copy link
Member

@cryptoquick cryptoquick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clean up the "GOT HERE", "AM I HERE??", extra spaces, etc. Less git noise to review, and it makes the code less readable.

src/backend/fs.rs Outdated Show resolved Hide resolved
Cargo.toml Outdated Show resolved Hide resolved
Comment on lines 171 to 176
let mut file = fs::File::create(&path)?;

// let mut file = OpenOptions::new()
// .create_new(true)
// .write(true)
// .open(&path)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you'd go with this approach. This overwrites the catalog. We should be checking if the catalog already exists in this code, but it seems this code also got commented out. Please put the other code back in, and let's use the old OpenOptions code here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solved an Internal Error 500 while using multi-part.
Good to point out it was over-writing the catalog.
This change has no value since we are not even using multi-part.

assert_eq!(std::io::ErrorKind::InvalidData, err.kind());

Ok(decoded)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code here? Can't we just use extract_slice from carbonado core?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this section, as I was using it for A/B testing on the results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have not yet removed it. Only request my review once everything has been addressed.

src/frontend/http.rs Outdated Show resolved Hide resolved
src/frontend/http.rs Outdated Show resolved Hide resolved
src/frontend/http.rs Outdated Show resolved Hide resolved
src/backend/fs.rs Show resolved Hide resolved
src/frontend/http.rs Outdated Show resolved Hide resolved
tests/file.rs Outdated Show resolved Hide resolved
@rustchain64 rustchain64 requested a review from cryptoquick May 2, 2023 15:34
@rustchain64
Copy link
Collaborator Author

This branch has been merged and lifted to accommodate
Streaming

cryptoquick and others added 2 commits May 4, 2023 08:25
* Stream `post_file` (#34)

* Implement HTTP body streaming version of post_file.

* Refactor streaming interface format. Fix tests.

* Reduce number of files created in test by increasing chunk size to 1MB.

* Fix check catalog_exists test. Cleanup unused libraries.

* Log node public key in config. Correct Error Message in fs backend. Refactor file_stream to use explicit FileStream type alias (#35)

* Update carbonado to 0.3.0-rc.7 and secp256k1 to 0.27.0. Fix misnamed variable in tests.

* Add drive_redundancy option to config.

* Update directories crate. Default volume directory is now the Carbonado folder in the home directory. Configurable drive redundancy.

* Fixes in response to review items.

---------

Co-authored-by: Wesley Parr <rustchain64@gmail.com>

* Merge main.

* Streaming improvements (#37)

* Update unfold in write method to produce stream chunks of uniform segment size.

* Parallelize write_file. Add config node_shared_secret method and /key endpoint. Improve tests.

* Debugging...

* Fixes.

* Stream read file using axum response StreamBody. Parallelized decoding using par_then.

* Cleanup.

* Cleanup.

---------

Co-authored-by: Wesley Parr <rustchain64@gmail.com>
Copy link
Member

@cryptoquick cryptoquick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still a number of things to address. Also, be sure there are no unused variables present.

.DS_Store Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not add this file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not add this file, please remove

assert_eq!(std::io::ErrorKind::InvalidData, err.kind());

Ok(decoded)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have not yet removed it. Only request my review once everything has been addressed.

@@ -277,9 +327,8 @@ pub fn delete_file(pk: Secp256k1PubKey, file_bytes: &[u8]) -> Result<()> {
Ok(())
}

#[allow(unused_variables)]
// TODO: These aren't worth breaking out into their own functions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read this TODO item. Once addressed, remove it.

src/backend/fs.rs Show resolved Hide resolved
@@ -58,6 +59,15 @@ async fn post_file(
Ok((StatusCode::OK, hash.to_hex().to_string()))
}

/* TODO: This seems unnecessary, and could introduce concurrency and security issues
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see this TODO. Once addressed, remove it.

@rustchain64 rustchain64 requested a review from cryptoquick May 5, 2023 18:21
@rustchain64
Copy link
Collaborator Author

Test write_read() while file streaming doesn't finish before
read_write_delete_file()
This causes a count the show twice as many files since they haven't cleared before the second test.
To see accurate count numbers, you need to run the tests seperately.

Copy link
Member

@cryptoquick cryptoquick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work so far on delete_file. Not so good work on read_slices.

Please update the description to annotate the issues this closes. The RwLock comment seems outdated.

.DS_Store Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file still needs to be removed. You need to add that to your global gitignore (no need to add it to the project gitignore.)

Copy link
Collaborator Author

@rustchain64 rustchain64 May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the (global) .gitignore for Mac

Copy link
Collaborator Author

@rustchain64 rustchain64 May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ran
sudo find / -name ".DS_Store" -depth -exec rm {} ;
ran cargo test, no .DS_Store present

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does not appear to be used, and besides, it's in the wrong place. Please remove this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, not used for testing at this anymore

file_bytes: &[u8],
//slice_start: u64, //let slice_start = 65536;
//slice_len: u16, // let slice_len = 8192;
) -> Result<Vec<u8>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes no sense to provide the entire file to this method. This is supposed to be for retrieving just a slice of the file. It needs to return the range of the raw bytes on disk. It can make the disk IO calls by reading the catalog and then the appropriate segment.

I've updated the issue so hopefully the task is more clear:
#18

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those values where for initial testing only.
actual slice data comes from the params.

Comment on lines 265 to 303
use std::io::prelude::*;

// Start by encoding some input.
let (encoded_input, hash) = bao::encode::encode(file_bytes);
debug!(">>> encoded INPUT LENGTH:: {:?}", &encoded_input.len());
debug!(">>> fs hash hash:: {:?}", &hash);

// Slice the encoding. These parameters are multiples of the chunk size, which avoids
// unnecessary overhead.
let slice_start = 65536;
let slice_len = 8192;
let encoded_cursor = std::io::Cursor::new(&encoded_input);
let mut extractor = bao::encode::SliceExtractor::new(encoded_cursor, slice_start, slice_len);
let mut slice = Vec::new();
extractor.read_to_end(&mut slice)?;

// Decode the slice. The result should be the same as the part of the input that the slice
// represents. Note that we're using the same hash that encoding produced, which is
// independent of the slice parameters. That's the whole point; if we just wanted to re-encode
// a portion of the input and wind up with a different hash, we wouldn't need slicing.
let mut decoded = Vec::new();
let mut decoder = bao::decode::SliceDecoder::new(&*slice, &hash, slice_start, slice_len);
decoder.read_to_end(&mut decoded)?;

debug!(
"usize vs length: {:?}",
&encoded_input[slice_start as usize..][..slice_len as usize].len()
);

// Like regular decoding, slice decoding will fail if the hash doesn't match.
// let mut bad_slice = slice.clone();
// debug!("BAD SLICE");
// let last_index = bad_slice.len() - 1;
// bad_slice[last_index] ^= 1;
// let mut decoder = bao::decode::SliceDecoder::new(&*bad_slice, &hash, slice_start, slice_len);
// let err = decoder.read_to_end(&mut Vec::new()).unwrap_err();
// assert_eq!(std::io::ErrorKind::InvalidData, err.kind());

Ok(decoded)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of this makes any sense. Do not submit this code again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworking slices by reading catalog, not files as you stated.

for entry in fs::read_dir(path)? {
trace!("Delete CATALOG File at {:?}", entry);
fs::remove_file(entry?.path())?;
pub async fn delete_file(hash: &String) -> Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should pass &str, not &String, this is not a pattern we use.

@@ -89,9 +86,10 @@ async fn key(Path(pk): Path<String>) -> Result<impl IntoResponse, AppError> {

pub async fn start() -> Result<()> {
let app = Router::new()
.route("/remove/:pk/:blake3_hash", delete(remove_file))
.route("/remove/:blake3_hash", delete(remove_file))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the pk... We'll need this to be passed for the changes I'm making in #39. I've trimmed that PR down so we can get that merged and you can use that to make sure file deletion is namespaced by pk.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your changes address that.

@rustchain64 rustchain64 changed the title Wp/upload get last hash Wp/ updated delete file to use pk as filename May 8, 2023
@rustchain64
Copy link
Collaborator Author

rustchain64 commented May 8, 2023

While Testing Locally, test write_read ... FAILED
but passes remotely

test write_read ... FAILED

failures:

---- write_read stdout ----
Error: No such file or directory (os error 2)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants