Support HTTP responses containing multiple ZSTD frames #2583

Andrey36652 · 2025-03-05T18:55:09Z

proper tests and fix for #2574

yanns

This is looking good to me.
Of course, we can multiply the tests, like

multiple frames in multiple chunks, where one frame = one chunk
multiple frames in multiple chunks, where frames and chunks do not align

Here is the one I was working on:

#[tokio::test]
async fn test_chunked_fragmented_multiple_frames_in_multiple_chunks() {
    const DELAY_BETWEEN_RESPONSE_PARTS: tokio::time::Duration =
        tokio::time::Duration::from_millis(1000);
    const DELAY_MARGIN: tokio::time::Duration = tokio::time::Duration::from_millis(50);
    const RESPONSE_CONTENT_2: &str = "another message here";

    let server = server::low_level_with_response(|_raw_request, client_socket| {
        Box::new(async move {
            let zstded_content_1 = zstd_compress(RESPONSE_CONTENT.as_bytes());
            let zstded_content_2 = zstd_compress(RESPONSE_CONTENT_2.as_bytes());

            let response_first_part = [
                COMPRESSED_RESPONSE_HEADERS,
                format!(
                    "Transfer-Encoding: chunked\r\n\r\n{:x}\r\n",
                    zstded_content_1.len() + zstded_content_2.len()
                )
                .as_bytes(),
                &zstded_content_1,
            ]
            .concat();

            client_socket
                .write_all(response_first_part.as_slice())
                .await
                .expect("response_first_part write_all failed");
            client_socket
                .flush()
                .await
                .expect("response_first_part flush failed");

            tokio::time::sleep(DELAY_BETWEEN_RESPONSE_PARTS).await;

            client_socket
                .write_all(zstded_content_2.as_slice())
                .await
                .expect("response_second_part write_all failed");
            client_socket
                .flush()
                .await
                .expect("response_second_part flush failed");

            client_socket
                .write_all(b"\r\n0\r\n\r\n")
                .await
                .expect("write_all failed");
            client_socket.flush().await.expect("flush failed");
        })
    });

    let start = tokio::time::Instant::now();
    let res = reqwest::Client::new()
        .get(format!("http://{}/", server.addr()))
        .send()
        .await
        .expect("response");

    assert_eq!(
        res.text().await.expect("text"),
        format!("{RESPONSE_CONTENT}{RESPONSE_CONTENT_2}")
    );
    assert!(start.elapsed() >= DELAY_BETWEEN_RESPONSE_PARTS - DELAY_MARGIN);
}

But I guess we cannot test all combinations here. The code is generic enough.

tests/zstd.rs

seanmonstar · 2025-03-06T13:16:33Z

Thanks! Something I'd like to know before we merge:

My question is, is that the right place to do this? Is it expected in zstd to have multiple members? I believe in other algorithms that's supported but atypical. If you want to investigate that, it'd be good to have a solid answer before we blindly flip that switch.

yanns · 2025-03-06T13:25:38Z

Thanks! Something I'd like to know before we merge:

My question is, is that the right place to do this? Is it expected in zstd to have multiple members? I believe in other algorithms that's supported but atypical. If you want to investigate that, it'd be good to have a solid answer before we blindly flip that switch.

I've tried to find more information about it, but could not really find something good.
What I found:

zstd can use multiple frames: https://datatracker.ietf.org/doc/html/rfc8878#name-frames

Zstandard compressed data is made up of one or more frames. Each frame is independent and can be decompressed independently of other frames. The decompressed content of multiple concatenated frames is the concatenation of each frame's decompressed content.

Files with multiple frames have frames skipped? indygreg/python-zstandard#59

Since the official tools transparently handle multiple frames, I'm leaning towards that being the default for python-zstandard - at least for streaming operations

I could not find anything in the HTTP spec specifying if a client must handle multiple zstd frames.

Andrey36652 · 2025-03-06T14:15:45Z

Is it expected in zstd to have multiple members?

Yes. Per https://www.rfc-editor.org/rfc/rfc8878.html#section-3.1-1

There are examples in the past when lib not supporting multi framed ZSTD decompression by default leads to confusion:

There are cases when developers want fine control over compressed ZSTD content (like indygreg/python-zstandard#59 and https://docs.rs/zstd-framed/latest/zstd_framed/ ) but I don't think that's the intended usecase for reqwest

Co-authored-by: Yann Simon <yann.simon.fr@gmail.com>

yanns · 2025-03-07T14:02:47Z

I could try this patch on an internal service.
We could not activate zstd before because of the issue.
Now it's working perfectly.

seanmonstar

Thanks for the fix!

Andrey36652 added 2 commits March 5, 2025 19:48

add tests for multi framed ZSTD compressed body

55f4ed9

support multiple ZSTD frames in compressed HTTP response

47d9f8b

Andrey36652 mentioned this pull request Mar 5, 2025

issue when decompressing zstd-compressed response bodies #2574

Closed

yanns approved these changes Mar 6, 2025

View reviewed changes

tests/zstd.rs Outdated Show resolved Hide resolved

tests/zstd.rs Outdated Show resolved Hide resolved

Apply suggestions from code review

3ed3fae

Co-authored-by: Yann Simon <yann.simon.fr@gmail.com>

seanmonstar approved these changes Mar 7, 2025

View reviewed changes

seanmonstar merged commit 6f9d0ee into seanmonstar:master Mar 7, 2025
36 checks passed

Andrey36652 mentioned this pull request Mar 8, 2025

test HTTP connection reuse with new zstd fix #2587

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support HTTP responses containing multiple ZSTD frames #2583

Support HTTP responses containing multiple ZSTD frames #2583

Andrey36652 commented Mar 5, 2025

yanns left a comment

seanmonstar commented Mar 6, 2025

yanns commented Mar 6, 2025

Andrey36652 commented Mar 6, 2025

yanns commented Mar 7, 2025

seanmonstar left a comment

Support HTTP responses containing multiple ZSTD frames #2583

Support HTTP responses containing multiple ZSTD frames #2583

Conversation

Andrey36652 commented Mar 5, 2025

yanns left a comment

Choose a reason for hiding this comment

seanmonstar commented Mar 6, 2025

yanns commented Mar 6, 2025

Andrey36652 commented Mar 6, 2025

yanns commented Mar 7, 2025

seanmonstar left a comment

Choose a reason for hiding this comment