Add optional gzip compression #1389

Open
wants to merge 2 commits into base: master

Conversation

kannibalox
Contributor

This introduces a single new value command, network.gzip_response_min_size, which determines whether a response should be gzipped (I'm very open to better names for that variable). The SCGI client must also announce support for gzip responses in the ACCEPT_ENCODING header. Since this is of dubious benefit for most use cases, it's disabled by default by setting it to a value below 0. This also technically makes zlib a new dependency for rtorrent, but since it's already a hard dependency of libtorrent, that didn't seem unreasonable.

A fun little interaction is that calling network.gzip_response_min_size over RPC impacts the response immediately.
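As a rough illustration of the condition described above, here is a minimal sketch of the decision. The helper names `accepts_gzip` and `should_gzip` are hypothetical and not taken from the PR, and the ACCEPT_ENCODING check is deliberately simplified (it ignores q-values).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: true if the client's ACCEPT_ENCODING value mentions gzip.
// Simplified on purpose: a value such as "gzip;q=0" is not handled.
inline bool accepts_gzip(std::string_view accept_encoding) {
  return accept_encoding.find("gzip") != std::string_view::npos;
}

// Hypothetical helper: a negative min_size disables compression, mirroring
// the default described in the PR text.
inline bool should_gzip(long min_size, std::size_t response_length,
                        std::string_view accept_encoding) {
  if (min_size < 0)
    return false;
  return response_length > static_cast<std::size_t>(min_size) &&
         accepts_gzip(accept_encoding);
}
```

With this shape, a client that does not send gzip in ACCEPT_ENCODING always gets a plaintext response, whatever the configured threshold is.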

@rakshasa
Owner

network.scgi.use_gzip and network.scgi.gzip.min_size

src/rpc/scgi_task.cc
should_compress = false;
}
}
header += "\r\n";
Owner

This use of std::strings causes unnecessary copying of buffers.

Rewrite both the gzip compressor and this to write directly to m_buffer, pass e.g. a lambda function to gzip_compressor that does the writing.

Contributor Author

This causes a chicken-and-egg problem: we need to know the Content-Length to work out where the write cursor in m_buffer should start (there's a more-than-decent chance that compression drops a digit from the size string), but we don't know the length until the compression itself is complete.

The only workaround I can think of is to first write the compressed output a little way into m_buffer, then std::memmove it back to the correct position after the headers have been written. I'll go in that direction unless you tell me otherwise.

Owner

I don't see why you need to do it in such a convoluted way. Just pass a lambda function that does the writing and have it resize m_buffer if more space is needed.

You can change m_buffer to std::vector to make it cleaner.
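A minimal sketch of the writer-callback idea, assuming a growable `std::vector<char>` as the destination. Everything here is hypothetical: `chunk_sink`, `emit_in_chunks`, and `collect_output` are illustrative names, and the "compressor" is only a stand-in that forwards its input unchanged so the example stays free of zlib.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sink type: receives each chunk of output as it is produced.
using chunk_sink = std::function<void(const char*, std::size_t)>;

// Stand-in for a compressor: forwards the input unchanged in chunks of at
// most chunk_size bytes. A real version would pass zlib's output to the sink.
inline void emit_in_chunks(const char* data, std::size_t length,
                           std::size_t chunk_size, const chunk_sink& sink) {
  if (chunk_size == 0)
    return;
  for (std::size_t offset = 0; offset < length; offset += chunk_size) {
    const std::size_t n = std::min(chunk_size, length - offset);
    sink(data + offset, n);
  }
}

// The sink appends to a std::vector<char>, which grows as needed, so the
// producer never has to know the final output size up front.
inline std::vector<char> collect_output(const char* data, std::size_t length) {
  std::vector<char> buffer;
  emit_in_chunks(data, length, 4, [&buffer](const char* chunk, std::size_t n) {
    buffer.insert(buffer.end(), chunk, chunk + n);
  });
  return buffer;
}
```

The point of the shape is that the producer owns no output buffer of its own, so the extra intermediate copy through a std::string goes away.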

@github-actions github-actions bot left a comment

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

src/rpc/scgi_task.h
@kannibalox kannibalox force-pushed the feature/gzip-response branch from e74b4fe to 8de5eca Compare January 27, 2025 05:47
static const int max_content_size = (2 << 23);
static constexpr unsigned int default_buffer_size = 2047;
static constexpr int max_header_size = 2000;
static constexpr int max_content_size = (2 << 23);
Owner

unsigned int

static constexpr int max_content_size = (2 << 23);

static int gzip_min_size() { return m_min_compress_response_size; }
static void set_gzip_min_size(int size) { m_min_compress_response_size = size; }
Owner

Should throw input_error on bad size.
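One possible shape for a validating setter, as a sketch only. It assumes that a "bad size" means a negative value, which the review comment does not spell out, and it uses `std::invalid_argument` as a stand-in for libtorrent's `input_error` so the example compiles on its own. The `sketch` namespace and `g_gzip_min_size` variable are hypothetical.

```cpp
#include <stdexcept>

namespace sketch {

// Hypothetical storage for the threshold (C++17 inline variable).
inline int g_gzip_min_size = 0;

inline int gzip_min_size() { return g_gzip_min_size; }

// Assumption: negative sizes are rejected. std::invalid_argument stands in
// for the project's own input_error exception type.
inline void set_gzip_min_size(int size) {
  if (size < 0)
    throw std::invalid_argument("gzip min size must not be negative");
  g_gzip_min_size = size;
}

}  // namespace sketch
```

A rejected value leaves the previous setting untouched, because the check runs before the assignment.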

unsigned int m_buffer_size;

ContentType m_content_type{XML};
bool m_client_accepts_compressed_response = false;
Owner

{false} to be consistent.

// indeterminate state, but m_position and m_bufferSize remain the
// same.
bool
SCgiTask::gzip_compress_response(const char* buffer, uint32_t length, std::string_view header_template) {
Owner

@rakshasa rakshasa Feb 6, 2025

This is a bit too ugly a solution, not what I had in mind.

Add to utils a function like this:

gzip_compress(const char* buffer, uint32_t length, std::function<bool(char*,uint32_t,uint32_t)>)

The lambda function will write from m_buffer+max_header_size (where max_header_size is calculated from the current header template plus max length of the printed response size).

It will receive deflateBound value so it can resize on the first call.

Once done writing, copy the bytes from the header and response size to m_buffer so they end at the start of the written gzip'ed data. Then event_write starts sending from m_buffer+m_header_offset.

Don't use snprintf for a whole template, instead the header should be two static const strings, and you copy them and the response size.
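The layout described in this comment could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the header strings are placeholders, `frame_response` is a hypothetical name, and an already-produced body stands in for the gzip output. The body is written at a fixed offset first, then the two constant header pieces and the printed size are copied so that they end exactly where the body begins.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Illustrative header pieces; the actual strings in rtorrent may differ.
static constexpr std::string_view header_prefix =
    "Status: 200 OK\r\nContent-Type: text/xml\r\n"
    "Content-Encoding: gzip\r\nContent-Length: ";
static constexpr std::string_view header_suffix = "\r\n\r\n";

// Room for both constant pieces plus up to 20 decimal digits, the longest
// value a 64-bit size can print.
static constexpr std::size_t max_header_size =
    header_prefix.size() + 20 + header_suffix.size();

// Places the body at buffer + max_header_size, then writes the header so it
// ends where the body starts. Returns the offset to start sending from.
inline std::size_t frame_response(std::vector<char>& buffer,
                                  std::string_view body) {
  buffer.resize(max_header_size + body.size());
  std::memcpy(buffer.data() + max_header_size, body.data(), body.size());

  const std::string digits = std::to_string(body.size());
  const std::size_t offset = max_header_size - header_suffix.size() -
                             digits.size() - header_prefix.size();

  char* out = buffer.data() + offset;
  std::memcpy(out, header_prefix.data(), header_prefix.size());
  out += header_prefix.size();
  std::memcpy(out, digits.data(), digits.size());
  out += digits.size();
  std::memcpy(out, header_suffix.data(), header_suffix.size());
  return offset;
}
```

Because only the short header is positioned after the fact, the body never has to be moved once it is written, whatever the number of digits in its length.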

if (m_client_accepts_compressed_response &&
gzip_enabled() &&
length > gzip_min_size() &&
gzip_compress_response(buffer, length, header)) {
Owner

I don't like how a failure to compress falls back to plaintext. That should only happen if there's a serious bug, so fail it completely.
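A control-flow sketch of "fail completely" as opposed to falling back. The `compressor` type and `build_body` function are hypothetical, `std::runtime_error` is a stand-in for whatever exception the project would use, and std::string is used only to keep the example short (it does not address the buffer-copying concern raised earlier).

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical compressor interface: returns false on failure.
using compressor =
    std::function<bool(const std::string& input, std::string& output)>;

// If compression was requested and fails, raise an error; never silently
// send the plaintext body in its place.
inline std::string build_body(const std::string& plain, bool use_gzip,
                              const compressor& compress) {
  if (!use_gzip)
    return plain;
  std::string compressed;
  if (!compress(plain, compressed))
    throw std::runtime_error("gzip compression failed");
  return compressed;
}
```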

m_buffer_size = length + header_size;

snprintf(m_buffer, m_buffer_size, header.c_str(), length);
std::memcpy(m_buffer + header_size, buffer, length);
Owner

If we have a compressed path in a separate function, also put the plaintext in one.
