Add ability to request checksum in an S3 HeadObject request (#1083)
* Add option to retrieve additional checksums with HeadObject

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Add changelog entry and comment

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Remove import condition for s3express_tests

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Appease clippy

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

* Appease clippy

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>

---------

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
dannycjones authored Oct 28, 2024
1 parent e72d7ac commit 8f2770b
Showing 15 changed files with 253 additions and 45 deletions.
7 changes: 6 additions & 1 deletion mountpoint-s3-client/CHANGELOG.md
@@ -2,7 +2,9 @@

### Other changes

-* No other changes.
+* Add parameter to request checksum information as part of a `HeadObject` request.
+If specified, the result should contain the checksum for the object if available in the S3 response.
+([#1083](https://github.com/awslabs/mountpoint-s3/pull/1083))

### Breaking changes

@@ -12,6 +14,9 @@
([#1058](https://github.com/awslabs/mountpoint-s3/pull/1058))
* `HeadObjectResult` no longer provides the bucket and key used in the original request.
([#1058](https://github.com/awslabs/mountpoint-s3/pull/1058))
+* `head_object` method now requires a `HeadObjectParams` parameter.
+The structure itself is not required to specify anything to achieve the existing behavior.
+([#1083](https://github.com/awslabs/mountpoint-s3/pull/1083))

## v0.11.0 (October 17, 2024)

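For callers, the breaking change described in the changelog amounts to passing one extra argument. The sketch below re-declares `HeadObjectParams` and `ChecksumMode` with the same shape as in this commit so that it compiles without the crate (the `#[non_exhaustive]` attributes are left out). The functions `legacy_params`, `checksum_params`, and `requests_checksum` are hypothetical helpers for illustration and are not part of `mountpoint-s3-client`.

```rust
// Stand-in copies of the types added in this commit, so the sketch compiles
// without the mountpoint-s3-client crate.
#[derive(Clone, Debug)]
pub enum ChecksumMode {
    Enabled,
}

#[derive(Debug, Default, Clone)]
pub struct HeadObjectParams {
    pub checksum_mode: Option<ChecksumMode>,
}

impl HeadObjectParams {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn checksum_mode(mut self, value: Option<ChecksumMode>) -> Self {
        self.checksum_mode = value;
        self
    }
}

/// Hypothetical helper: params that keep the previous behavior,
/// i.e. no checksum information is requested.
pub fn legacy_params() -> HeadObjectParams {
    HeadObjectParams::new()
}

/// Hypothetical helper: params that opt in to checksum retrieval.
pub fn checksum_params() -> HeadObjectParams {
    HeadObjectParams::new().checksum_mode(Some(ChecksumMode::Enabled))
}

/// Hypothetical helper: reports whether `params` asks for checksums.
pub fn requests_checksum(params: &HeadObjectParams) -> bool {
    matches!(params.checksum_mode, Some(ChecksumMode::Enabled))
}
```

With the real crate, an existing call such as `client.head_object(bucket, key)` would become `client.head_object(bucket, key, &HeadObjectParams::new())`, as the updated tests in this commit do. Because the real struct is marked `#[non_exhaustive]`, code outside the crate cannot build it with a struct literal and has to go through `new()` and the builder method.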
9 changes: 5 additions & 4 deletions mountpoint-s3-client/src/failure_client.rs
@@ -17,9 +17,9 @@ use pin_project::pin_project;
use crate::object_client::{
CopyObjectError, CopyObjectParams, CopyObjectResult, DeleteObjectError, DeleteObjectResult, ETag, GetBodyPart,
GetObjectAttributesError, GetObjectAttributesResult, GetObjectError, GetObjectRequest, HeadObjectError,
-HeadObjectResult, ListObjectsError, ListObjectsResult, ObjectAttribute, ObjectClient, ObjectClientError,
-ObjectClientResult, PutObjectError, PutObjectParams, PutObjectRequest, PutObjectResult, PutObjectSingleParams,
-UploadReview,
+HeadObjectParams, HeadObjectResult, ListObjectsError, ListObjectsResult, ObjectAttribute, ObjectClient,
+ObjectClientError, ObjectClientResult, PutObjectError, PutObjectParams, PutObjectRequest, PutObjectResult,
+PutObjectSingleParams, UploadReview,
};

// Wrapper for injecting failures into a get stream or a put request
@@ -167,9 +167,10 @@ where
&self,
bucket: &str,
key: &str,
+params: &HeadObjectParams,
) -> ObjectClientResult<HeadObjectResult, HeadObjectError, Self::ClientError> {
(self.head_object_cb)(&mut *self.state.lock().unwrap(), bucket, key)?;
-self.client.head_object(bucket, key).await
+self.client.head_object(bucket, key, params).await
}

async fn put_object(
10 changes: 5 additions & 5 deletions mountpoint-s3-client/src/lib.rs
@@ -72,11 +72,11 @@ pub mod config {
/// Types used by all object clients
pub mod types {
pub use super::object_client::{
-Checksum, ChecksumAlgorithm, CopyObjectParams, CopyObjectResult, DeleteObjectResult, ETag, GetBodyPart,
-GetObjectAttributesParts, GetObjectAttributesResult, GetObjectRequest, HeadObjectResult, ListObjectsResult,
-ObjectAttribute, ObjectClientResult, ObjectInfo, ObjectPart, PutObjectParams, PutObjectResult,
-PutObjectSingleParams, PutObjectTrailingChecksums, RestoreStatus, UploadChecksum, UploadReview,
-UploadReviewPart,
+Checksum, ChecksumAlgorithm, ChecksumMode, CopyObjectParams, CopyObjectResult, DeleteObjectResult, ETag,
+GetBodyPart, GetObjectAttributesParts, GetObjectAttributesResult, GetObjectRequest, HeadObjectParams,
+HeadObjectResult, ListObjectsResult, ObjectAttribute, ObjectClientResult, ObjectInfo, ObjectPart,
+PutObjectParams, PutObjectResult, PutObjectSingleParams, PutObjectTrailingChecksums, RestoreStatus,
+UploadChecksum, UploadReview, UploadReviewPart,
};
}

33 changes: 26 additions & 7 deletions mountpoint-s3-client/src/mock_client.rs
@@ -25,11 +25,11 @@ use tracing::trace;
use crate::checksums::crc32c_to_base64;
use crate::error_metadata::{ClientErrorMetadata, ProvideErrorMetadata};
use crate::object_client::{
-Checksum, ChecksumAlgorithm, CopyObjectError, CopyObjectParams, CopyObjectResult, DeleteObjectError,
+Checksum, ChecksumAlgorithm, ChecksumMode, CopyObjectError, CopyObjectParams, CopyObjectResult, DeleteObjectError,
DeleteObjectResult, ETag, GetBodyPart, GetObjectAttributesError, GetObjectAttributesParts,
-GetObjectAttributesResult, GetObjectError, GetObjectRequest, HeadObjectError, HeadObjectResult, ListObjectsError,
-ListObjectsResult, ObjectAttribute, ObjectClient, ObjectClientError, ObjectClientResult, ObjectInfo, ObjectPart,
-PutObjectError, PutObjectParams, PutObjectRequest, PutObjectResult, PutObjectSingleParams,
+GetObjectAttributesResult, GetObjectError, GetObjectRequest, HeadObjectError, HeadObjectParams, HeadObjectResult,
+ListObjectsError, ListObjectsResult, ObjectAttribute, ObjectClient, ObjectClientError, ObjectClientResult,
+ObjectInfo, ObjectPart, PutObjectError, PutObjectParams, PutObjectRequest, PutObjectResult, PutObjectSingleParams,
PutObjectTrailingChecksums, RestoreStatus, UploadReview, UploadReviewPart,
};

@@ -376,6 +376,10 @@ pub struct MockObject {
etag: ETag,
parts: Option<MockObjectParts>,
object_metadata: HashMap<String, String>,
+/// S3 checksums associated with the object.
+///
+/// Typically, at most one of the checksums should be set.
+checksum: Checksum,
}

impl MockObject {
@@ -395,6 +399,7 @@ impl MockObject {
etag,
parts: None,
object_metadata: HashMap::new(),
+checksum: Checksum::empty(),
}
}

@@ -408,6 +413,7 @@ impl MockObject {
etag,
parts: None,
object_metadata: HashMap::new(),
+checksum: Checksum::empty(),
}
}

@@ -431,6 +437,7 @@ impl MockObject {
etag,
parts: None,
object_metadata: HashMap::new(),
+checksum: Checksum::empty(),
}
}

@@ -450,6 +457,10 @@ impl MockObject {
self.restore_status = restore_status;
}

+pub fn set_checksum(&mut self, checksum: Checksum) {
+self.checksum = checksum;
+}
+
pub fn len(&self) -> usize {
self.size
}
@@ -676,6 +687,7 @@ impl ObjectClient for MockClient {
&self,
bucket: &str,
key: &str,
+params: &HeadObjectParams,
) -> ObjectClientResult<HeadObjectResult, HeadObjectError, Self::ClientError> {
trace!(bucket, key, "HeadObject");
self.inc_op_count(Operation::HeadObject);
@@ -686,12 +698,19 @@ impl ObjectClient for MockClient {

let objects = self.objects.read().unwrap();
if let Some(object) = objects.get(key) {
+// Checksum information is opt-in
+let checksum = match params.checksum_mode {
+Some(ChecksumMode::Enabled) => object.checksum.clone(),
+None => Checksum::empty(),
+};
+
Ok(HeadObjectResult {
size: object.size as u64,
last_modified: object.last_modified,
etag: object.etag.clone(),
storage_class: object.storage_class.clone(),
restore_status: object.restore_status,
+checksum,
})
} else {
Err(ObjectClientError::ServiceError(HeadObjectError::NotFound))
@@ -1676,7 +1695,7 @@ mod tests {
put_request.complete().await.unwrap();

// head_object returns storage class
-let head_result = client.head_object(bucket, key).await.unwrap();
+let head_result = client.head_object(bucket, key, &HeadObjectParams::new()).await.unwrap();
assert_eq!(head_result.storage_class.as_deref(), storage_class);

// list_objects returns storage class
@@ -1699,14 +1718,14 @@ mod tests {
let head_counter_1 = client.new_counter(Operation::HeadObject);
let delete_counter_1 = client.new_counter(Operation::DeleteObject);

-let _result = client.head_object(bucket, "key").await;
+let _result = client.head_object(bucket, "key", &HeadObjectParams::new()).await;
assert_eq!(1, head_counter_1.count());
assert_eq!(0, delete_counter_1.count());

let head_counter_2 = client.new_counter(Operation::HeadObject);
assert_eq!(0, head_counter_2.count());

-let _result = client.head_object(bucket, "key").await;
+let _result = client.head_object(bucket, "key", &HeadObjectParams::new()).await;
let _result = client.delete_object(bucket, "key").await;
let _result = client.delete_object(bucket, "key").await;
let _result = client.delete_object(bucket, "key").await;
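The opt-in behavior added to `MockClient::head_object` can be looked at in isolation. The following is a reduced, synchronous sketch and not the crate's mock: `TinyMock`, `put`, and `head_checksum` are hypothetical names, the stand-in types copy only the fields shown in this commit, and `PartialEq` is derived on `Checksum` purely so that values can be compared here.

```rust
use std::collections::HashMap;

// Stand-in for the crate's `Checksum`. `PartialEq` is an addition for this
// sketch only; the commit derives `Clone` and `Debug`.
#[derive(Clone, Debug, PartialEq)]
pub struct Checksum {
    pub checksum_crc32: Option<String>,
    pub checksum_crc32c: Option<String>,
    pub checksum_sha1: Option<String>,
    pub checksum_sha256: Option<String>,
}

impl Checksum {
    pub fn empty() -> Self {
        Self {
            checksum_crc32: None,
            checksum_crc32c: None,
            checksum_sha1: None,
            checksum_sha256: None,
        }
    }
}

#[derive(Clone, Debug)]
pub enum ChecksumMode {
    Enabled,
}

#[derive(Debug, Default, Clone)]
pub struct HeadObjectParams {
    pub checksum_mode: Option<ChecksumMode>,
}

/// Hypothetical, heavily reduced mock that stores only a checksum per key.
#[derive(Default)]
pub struct TinyMock {
    objects: HashMap<String, Checksum>,
}

impl TinyMock {
    pub fn put(&mut self, key: &str, checksum: Checksum) {
        self.objects.insert(key.to_owned(), checksum);
    }

    /// Returns `None` for a missing key. For an existing key, the stored
    /// checksum is returned only when the caller opted in; otherwise an
    /// empty `Checksum` is returned, mirroring the match in the diff.
    pub fn head_checksum(&self, key: &str, params: &HeadObjectParams) -> Option<Checksum> {
        let stored = self.objects.get(key)?;
        Some(match params.checksum_mode {
            Some(ChecksumMode::Enabled) => stored.clone(),
            None => Checksum::empty(),
        })
    }
}
```

The point of returning `Checksum::empty()` rather than the stored value by default is that a test written against the mock sees the same shape of result as a request that never asked for checksums.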
7 changes: 4 additions & 3 deletions mountpoint-s3-client/src/mock_client/throughput_client.rs
@@ -16,8 +16,8 @@ use crate::mock_client::{
use crate::object_client::{
CopyObjectError, CopyObjectParams, CopyObjectResult, DeleteObjectError, DeleteObjectResult, ETag, GetBodyPart,
GetObjectAttributesError, GetObjectAttributesResult, GetObjectError, GetObjectRequest, HeadObjectError,
-HeadObjectResult, ListObjectsError, ListObjectsResult, ObjectAttribute, ObjectClient, ObjectClientResult,
-PutObjectError, PutObjectParams, PutObjectResult, PutObjectSingleParams,
+HeadObjectParams, HeadObjectResult, ListObjectsError, ListObjectsResult, ObjectAttribute, ObjectClient,
+ObjectClientResult, PutObjectError, PutObjectParams, PutObjectResult, PutObjectSingleParams,
};

/// A [MockClient] that rate limits overall download throughput to simulate a target network
@@ -168,8 +168,9 @@ impl ObjectClient for ThroughputMockClient {
&self,
bucket: &str,
key: &str,
+params: &HeadObjectParams,
) -> ObjectClientResult<HeadObjectResult, HeadObjectError, Self::ClientError> {
-self.inner.head_object(bucket, key).await
+self.inner.head_object(bucket, key, params).await
}

async fn put_object(
49 changes: 48 additions & 1 deletion mountpoint-s3-client/src/object_client.rs
@@ -97,6 +97,7 @@ pub trait ObjectClient {
&self,
bucket: &str,
key: &str,
+params: &HeadObjectParams,
) -> ObjectClientResult<HeadObjectResult, HeadObjectError, Self::ClientError>;

/// Put an object into the object store. Returns a [PutObjectRequest] for callers
@@ -206,6 +207,35 @@ pub enum ListObjectsError {
NoSuchBucket,
}

+/// Parameters to a [`head_object`](ObjectClient::head_object) request
+#[derive(Debug, Default, Clone)]
+#[non_exhaustive]
+pub struct HeadObjectParams {
+/// Enable to retrieve checksum as part of the HeadObject request
+pub checksum_mode: Option<ChecksumMode>,
+}
+
+impl HeadObjectParams {
+/// Create a default [HeadObjectParams].
+pub fn new() -> Self {
+Self::default()
+}
+
+/// Set option to retrieve checksum as part of the HeadObject request
+pub fn checksum_mode(mut self, value: Option<ChecksumMode>) -> Self {
+self.checksum_mode = value;
+self
+}
+}
+
+/// Enable [ChecksumMode] to retrieve object checksums
+#[non_exhaustive]
+#[derive(Clone, Debug)]
+pub enum ChecksumMode {
+/// Retrieve checksums
+Enabled,
+}
+
/// Result of a [`head_object`](ObjectClient::head_object) request
#[derive(Debug)]
#[non_exhaustive]
@@ -232,6 +262,11 @@ pub struct HeadObjectResult {
/// Objects in flexible retrieval storage classes (such as GLACIER and DEEP_ARCHIVE) are only
/// accessible after restoration
pub restore_status: Option<RestoreStatus>,
+/// Checksum of the object.
+///
+/// HeadObject must explicitly request for this field to be included,
+/// otherwise the values will be empty.
+pub checksum: Checksum,
}

/// Errors returned by a [`head_object`](ObjectClient::head_object) request
@@ -647,7 +682,7 @@ impl fmt::Display for ObjectAttribute {
///
/// See [Checksum](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Checksum.html) in the *Amazon
/// S3 API Reference* for more details.
-#[derive(Debug)]
+#[derive(Clone, Debug)]
pub struct Checksum {
/// Base64-encoded, 32-bit CRC32 checksum of the object
pub checksum_crc32: Option<String>,
@@ -662,6 +697,18 @@ pub struct Checksum {
pub checksum_sha256: Option<String>,
}

+impl Checksum {
+/// Construct an empty [Checksum]
+pub fn empty() -> Self {
+Self {
+checksum_crc32: None,
+checksum_crc32c: None,
+checksum_sha1: None,
+checksum_sha256: None,
+}
+}
+}
+
/// Metadata about object parts from GetObjectAttributes API.
///
/// See [GetObjectAttributesParts](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectAttributesParts.html)
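Since `HeadObjectResult::checksum` is a plain `Checksum` whose fields are all `None` unless checksums were requested and returned, a consumer has to probe the fields to find out what it received. The sketch below shows one way to do that against a stand-in `Checksum` with the fields from this commit. `first_present` and its algorithm labels are hypothetical and not part of the crate, and the field order it checks is an arbitrary choice.

```rust
// Stand-in for the crate's `Checksum`, with the fields shown in this commit.
#[derive(Clone, Debug)]
pub struct Checksum {
    pub checksum_crc32: Option<String>,
    pub checksum_crc32c: Option<String>,
    pub checksum_sha1: Option<String>,
    pub checksum_sha256: Option<String>,
}

impl Checksum {
    pub fn empty() -> Self {
        Self {
            checksum_crc32: None,
            checksum_crc32c: None,
            checksum_sha1: None,
            checksum_sha256: None,
        }
    }
}

/// Hypothetical helper: returns a label and the base64 value of the first
/// checksum field that is set, or `None` if the checksum is empty.
pub fn first_present(checksum: &Checksum) -> Option<(&'static str, &str)> {
    let candidates = [
        ("CRC32", checksum.checksum_crc32.as_deref()),
        ("CRC32C", checksum.checksum_crc32c.as_deref()),
        ("SHA1", checksum.checksum_sha1.as_deref()),
        ("SHA256", checksum.checksum_sha256.as_deref()),
    ];
    for (name, value) in candidates.iter() {
        if let Some(v) = value {
            return Some((*name, *v));
        }
    }
    None
}
```

A caller that did not pass `ChecksumMode::Enabled` should expect `first_present` to return `None`, which is indistinguishable from an object that has no stored checksum; the request parameters are the only place that difference is recorded.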
3 changes: 2 additions & 1 deletion mountpoint-s3-client/src/s3_crt_client.rs
@@ -1305,8 +1305,9 @@ impl ObjectClient for S3CrtClient {
&self,
bucket: &str,
key: &str,
+params: &HeadObjectParams,
) -> ObjectClientResult<HeadObjectResult, HeadObjectError, Self::ClientError> {
-self.head_object(bucket, key).await
+self.head_object(bucket, key, params).await
}

async fn put_object(