HTTP (127KB) and HTTPS (766KB) #16

Open
polarathene opened this issue Oct 16, 2024 · 4 comments

@polarathene

polarathene commented Oct 16, 2024

You can build httpget with nightly for some improvements in size reduction: neonmoe/minreq#111

Reproduction

I've adapted the example from that issue to build httpget. I've omitted some of the extra optimizations from the linked issue, which would make the binary approx 20KB smaller (UPX compression can then cut the size by a further 50%+).

# Reproduction environment via Docker:
docker run --rm -it --workdir /build fedora:41

# - `lld` is optional, it's used with `-C link-arg=-fuse-ld=lld`,
#   slight improvement over the internal LLD linker the rust toolchain bundles
# - `musl-gcc` is required for `--features tls` since we're building from a glibc host:
dnf install -y lld git gcc musl-gcc rustup-init

# Nightly rust with the musl target (for static build) and `rust-src` (for `-Z build-std`)
# rustc 1.84.0-nightly (e7c0d2750 2024-10-15)
rustup-init -y \
  --profile minimal \
  --component rust-src \
  --target x86_64-unknown-linux-musl \
  --default-toolchain nightly
. "$HOME/.cargo/env"

git clone --depth 1 https://github.com/cryptaliagy/httpget .

HTTP only (127.5KB)

$ cargo +nightly build --release --target x86_64-unknown-linux-musl
$ du --bytes target/x86_64-unknown-linux-musl/release/httpget
534760  target/x86_64-unknown-linux-musl/release/httpget

$ RUSTFLAGS='-C link-arg=-fuse-ld=lld -C relocation-model=static' \
  cargo +nightly build --release \
  --target x86_64-unknown-linux-musl \
  -Z build-std=std,panic_abort \
  -Z build-std-features=panic_immediate_abort

$ du --bytes target/x86_64-unknown-linux-musl/release/httpget
127488  target/x86_64-unknown-linux-musl/release/httpget

HTTPS (766KB)

$ cargo +nightly build --release --target x86_64-unknown-linux-musl --features tls

$ du --bytes target/x86_64-unknown-linux-musl/release/httpget
1509720 target/x86_64-unknown-linux-musl/release/httpget

$ RUSTFLAGS='-C link-arg=-fuse-ld=lld -C relocation-model=static' \
  cargo +nightly build --release \
  --target x86_64-unknown-linux-musl \
  --features tls \
  -Z build-std=std,panic_abort \
  -Z build-std-features=panic_immediate_abort

$ du --bytes target/x86_64-unknown-linux-musl/release/httpget
995024  target/x86_64-unknown-linux-musl/release/httpget

Size improvement cargo update + UPX

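For the --features tls results, if you update Cargo.lock with cargo update, the sizes drop to 1,268,056 bytes (default release build) vs 766,320 bytes (with the extra nightly flags), down from 1,509,720 and 995,024, which is a nice additional saving.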

  • No improvement observed in size for an HTTP only build.
  • Your README states 1.2MB for rustls, that was true with the Cargo.lock before update dependencies #15
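To put the byte counts in perspective, the relative savings can be worked out with a bit of shell arithmetic. This is only a sketch: `pct_saved` is a hypothetical helper name, the inputs are the `du --bytes` figures quoted in this comment, and the last pair assumes the two post-`cargo update` figures are the default and optimized builds respectively.

```shell
# Percentage saved going from one size to another (integer arithmetic, rounds down).
pct_saved() {
  before=$1
  after=$2
  echo $(( (before - after) * 100 / before ))
}

# Byte counts copied from the `du --bytes` output in this comment.
echo "HTTP:  $(pct_saved 534760 127488)% smaller"                      # 76
echo "HTTPS: $(pct_saved 1509720 995024)% smaller"                     # 34
echo "HTTPS after cargo update: $(pct_saved 1268056 766320)% smaller"  # 39
```

The HTTP-only build benefits most because a larger share of its size is std and panic machinery rather than TLS code.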

Use UPX to bring the HTTPS build down to just 432KB 😎 (for an HTTP-only build, this reduces down to about 70KB)

$ dnf install -y upx
$ upx --lzma target/x86_64-unknown-linux-musl/release/httpget
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2024
UPX 4.2.4       Markus Oberhumer, Laszlo Molnar & John Reiser    May 9th 2024

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
    763216 ->    432356   56.65%   linux/amd64   httpget
@polarathene
Author

polarathene commented Oct 17, 2024

Dynamically linked gnu target (glibc + OpenSSL): 112KB (HTTPS) / 84.6KB (HTTP)

For base images which need glibc + openssl for other binaries, a dynamically linked httpget could make sense for HTTPS if you'd like to keep the added weight minimal.

This would require adding another feature to Cargo.toml to use minreq/native-tls instead of minreq/https-rustls.

# Additional build flags from `neonmoe/minreq/issues/111` used, only provides a 10KB additional reduction:
# NOTE: Unlike the original issue message, these results are from a slightly modified `httpget`,
# but it should be roughly equivalent.
RUSTFLAGS='-Z location-detail=none -Z fmt-debug=none -C link-arg=-fuse-ld=lld -C relocation-model=static' \
  cargo build --release --target x86_64-unknown-linux-gnu --features tls \
  -Z build-std=std,panic_abort -Z build-std-features=panic_immediate_abort,optimize_for_size
# NOTE: ldd is not exactly accurate for knowing which externally linked dependencies are required,
# `libz` is actually from `libcrypto` (part of OpenSSL 3)
$ ldd target/x86_64-unknown-linux-gnu/release/httpget
        linux-vdso.so.1 (0x00007fff311dc000)
        libssl.so.3 => /lib64/libssl.so.3 (0x00007f3cfb0f5000)
        libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007f3cfac44000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f3cfaa52000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f3cfaa31000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3cfb1d2000)

$ dnf install -y patchelf
# Direct dependencies only (libraries linked by the binary itself)
$ patchelf --print-needed target/x86_64-unknown-linux-gnu/release/httpget
libssl.so.3
libcrypto.so.3
libc.so.6

# `libz` isn't available on Google Distroless,
# but since this isn't actually a directly linked library for `httpget`, this is not a concern:
$ patchelf --print-needed /lib64/libcrypto.so.3
libz.so.1
libc.so.6

If you do consider dynamic linking against glibc, keep in mind that the glibc version on the build host becomes the minimum version the binary requires at runtime. One way around that is cargo-zigbuild, which builds with Zig and lets you target an older, more widely available glibc baseline.
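As a rough illustration of the baseline problem, the check below compares the glibc version a binary requires against the one available on the target system. It is a sketch under assumptions: `glibc_satisfies` is a hypothetical helper, the version numbers are illustrative rather than measured, and the required version would have to be read from the binary separately (for example from the `GLIBC_` symbol versions that `objdump -T` lists).

```shell
# Succeeds when the glibc available at runtime is at least the version the binary requires.
# Relies on `sort -V` (version sort), a GNU coreutils extension.
glibc_satisfies() {
  required=$1
  available=$2
  lowest=$(printf '%s\n%s\n' "$required" "$available" | sort -V | head -n 1)
  [ "$lowest" = "$required" ]
}

# Illustrative version numbers only, not measurements from this issue.
if glibc_satisfies 2.17 2.28; then
  echo "a 2.17 baseline runs on a glibc 2.28 system"
fi
if ! glibc_satisfies 2.39 2.28; then
  echo "built against 2.39: too new for a glibc 2.28 system"
fi
```

With cargo-zigbuild the baseline is chosen at build time; its documentation describes appending the glibc version to the target triple, so the check above would mostly serve as a sanity test in CI.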

UPX compression (HTTPS => 53KB, HTTP => 43KB)

With a compressed executable via UPX:

$ upx --lzma target/x86_64-unknown-linux-gnu/release/httpget

                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2024
UPX 4.2.4       Markus Oberhumer, Laszlo Molnar & John Reiser    May 9th 2024

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
    112024 ->     53088   47.39%   linux/amd64   httpget


# For reference, dynamically linked HTTP only build with UPX is 43KB:
upx --lzma target/x86_64-unknown-linux-gnu/release/httpget
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2024
UPX 4.2.4       Markus Oberhumer, Laszlo Molnar & John Reiser    May 9th 2024

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
     84600 ->     43256   51.13%   linux/amd64   httpget

NOTE: UPX prevents inspection of linked libraries; the compressed binary appears to be a static one:

$ patchelf --print-needed target/x86_64-unknown-linux-gnu/release/httpget
patchelf: no section headers. The input file is probably a statically linked, self-decompressing binary

$ ldd target/x86_64-unknown-linux-gnu/release/httpget
        not a dynamic executable

Upstream reductions for minreq (70-100KB less possible for HTTPS before UPX)

NOTE: This isn't too relevant if the image is FROM scratch with no existing trust store or private CA certs.

If neonmoe/minreq#111 gets resolved, a separate feature that uses rustls-native-certs for static builds would also work well. It would save about 70KB (no webpki bundled) when the image already has ca-certificates available, and it would suit cases where you'd like support for self-signed certs from your private CA :)
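Whether an image "already has ca-certificates available" can be checked with a small probe like the one below. This is a sketch, not part of httpget: `find_ca_bundle` is a hypothetical helper, and the two default paths are the usual bundle locations on Debian/Alpine-style and Fedora/RHEL-style systems, so other distributions may need different candidates.

```shell
# Print the first non-empty CA bundle among the given paths
# (or among two common default locations when no paths are given).
find_ca_bundle() {
  if [ "$#" -eq 0 ]; then
    set -- /etc/ssl/certs/ca-certificates.crt /etc/pki/tls/certs/ca-bundle.crt
  fi
  for candidate in "$@"; do
    if [ -s "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

if bundle=$(find_ca_bundle); then
  echo "system trust store found at $bundle"
else
  echo "no trust store found; a build with bundled roots would be needed"
fi
```

In a FROM scratch image the probe fails, which matches the note above that this variant is not too relevant there.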

If the other improvements from that issue were also tackled, it'd be possible to get a static HTTPS build that is 376KB with UPX 😎

@cryptaliagy
Owner

Thank you for the comprehensive breakdown! I'll investigate this on my end and see what the implications of those nightly features are.

I think I'm more partial to the idea of publishing a compressed and an uncompressed binary side-by-side so folks can choose their own threat model. I'm not super familiar with UPX or with binary compression in that way so I don't want to force that on folks.

I've already opened #17 to start updating the cargo file. To start with, I will see if I can update the docs with the new binary sizes and publish a new version.

@polarathene
Author

what the implications of those nightly features are.

A bit verbose, but this information might assist you with that:

RUSTFLAGS='-C link-arg=-fuse-ld=lld -C relocation-model=static'

  • -fuse-ld=lld is a linker arg that allows you to use the system LLD instead of the one in the current rust toolchain. You will need to have the lld package installed. The size reduction from this seems rather minimal in this case (around 10KB on average), so you could ignore it.
  • relocation-model=static is for a truly static binary. It can give quite a reduction in size. I've seen MB reductions on much larger binaries, but in this case it's only around 10-30KB.
    • This is similar to the -Wl,--no-pie linker arg: if you run the file command on the binary, it will report statically linked instead of static-pie linked. I think it also makes a few other changes beyond disabling PIE.
    • Relocations / PIE relate to a security feature that makes memory locations less predictable to an attacker, IIRC. Some of these benefits are effectively lost when you produce a static binary, and for this use case (a fixed HTTP/HTTPS request for a healthcheck in a minimal image) I would imagine any such risks aren't all that applicable. I'm not an expert on the subject, but I can link to some related resources about this setting that I've come across in the past if you'd like.
-Z build-std=std,panic_abort \
-Z build-std-features=panic_immediate_abort

When using -Z build-std with panic = "abort", you need to specifically add panic_abort. There is also core and alloc, but these aren't relevant here when we add std. At least std or core is required to be paired with panic_abort AFAIK to avoid a build failure.

-Z build-std-features is complementary; in this case we're using panic_immediate_abort.

  • This makes the panic!() message you have redundant, as it strips away the extra functionality. You get Illegal Instruction output instead of the panic message.
  • If you place a panic!() call at the start of main() so it always triggers, you'll get a binary that's only 20KB, because everything after it is now unreachable. Add a println!() before the panic!() and you'll see your message still prints before the panic triggers. The exit status from this panic becomes 132 (aka SIGILL, Illegal Instruction).
  • This addition does bring notable size reduction when this change in behaviour is acceptable.
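The 132 exit status follows the usual shell convention of 128 plus the signal number (SIGILL is 4 on Linux). The sketch below reproduces that without building anything: it stands in for the panicking binary with a child shell that sends itself SIGILL, which is an assumption made purely for illustration.

```shell
# Simulate a process dying from SIGILL, as a panic_immediate_abort build does on panic.
status=0
sh -c 'kill -ILL $$' || status=$?

echo "exit status: $status"               # 128 + 4 (SIGILL) = 132
echo "signal name: $(kill -l "$status")"  # maps the status back to the signal name (ILL)
```

A healthcheck that only distinguishes zero from non-zero is unaffected, but anything that logs or matches on the panic message would need to look at the exit status instead.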

For size impacts of these tweaks, I did make some notes about their individual impact here.

AFAIK, these optimizations are all fine for httpget, but building on nightly sometimes breaks, in which case CI needs to pin a specific nightly release. That's probably not a concern for httpget, since releases are not frequent and the dependency tree isn't that big.


I think I'm more partial to the idea of publishing a compressed and an uncompressed binary side-by-side so folks can choose their own threat model.
I'm not super familiar with UPX or with binary compression in that way so I don't want to force that on folks.

Oh that's perfectly ok, it was just added for reference, note that the issue title doesn't reference the UPX size reductions. I share some insights here as to caveats to keep in mind when deciding if UPX is appropriate.

In this case for static builds I think it's fine, but it's quite easy to use a separate stage in a Dockerfile that grabs the binary and runs UPX on it before copying that over to the final stage if someone wants the added compression.

At the current size without UPX involved, I think most users have no need to push it down further with compression. The image is already compressed when it's pulled over the network, and in constrained environments a UPX-packed binary would use more CPU and memory, which are more valuable than disk.

If you do decide to publish a UPX-compressed binary, it should be clear that UPX was used, especially since in other projects chasing disk size improvements can unintentionally increase runtime costs.
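One way to publish both variants side-by-side while keeping the UPX one clearly labelled might look like the sketch below. `publish_variants`, the `dist` directory, and the `-upx` suffix are all hypothetical names chosen for illustration; the only UPX options used are `--lzma` (as in the commands earlier in this thread) and `-o` to write to a separate output file.

```shell
# Hypothetical helper: copy the untouched binary into an output directory and,
# when possible, add a clearly named UPX-compressed copy next to it.
publish_variants() {
  src=$1
  out_dir=$2
  name=$(basename "$src")

  mkdir -p "$out_dir"
  # Always ship the uncompressed binary.
  cp "$src" "$out_dir/$name"

  # The "-upx" suffix makes it obvious which artifact was packed.
  if command -v upx >/dev/null 2>&1; then
    upx --lzma -o "$out_dir/$name-upx" "$src" >/dev/null 2>&1 || {
      rm -f "$out_dir/$name-upx"
      echo "upx could not pack $src; shipping uncompressed only" >&2
    }
  else
    echo "upx not installed; shipping uncompressed only" >&2
  fi
}
```

It could be called as `publish_variants target/x86_64-unknown-linux-musl/release/httpget dist`, for example from a release job or a dedicated Dockerfile stage.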

@cryptaliagy
Owner

Thank you so much for this detailed follow-up! I'll try to schedule the work sometime in the next couple of months, between my work-work and class work.
