New audit: cache poisoning #261

woodruffw · 2024-12-09T16:25:52Z

Cache poisoning is difficult/impossible for us to detect in the general case, since it's a runtime aspect.

However, we can flag certain patterns that we know are likely to result in exploitable cache poisoning, e.g.:

The workflow has a release/tag trigger, indicating that it's a release process
The workflow has one or more steps that include caching, such as setup-python with the cache: key or setup-node with similar. There are probably many of these, and we need to figure them out/enumerate them.

The text was updated successfully, but these errors were encountered:

ubiratansoares · 2024-12-11T19:48:32Z

@woodruffw If you don't have plans to take this one, I'd love to give it a try 🙂

woodruffw · 2024-12-11T19:59:18Z

@woodruffw If you don't have plans to take this one, I'd love to give it a try 🙂

Go for it! This one has some interesting edge cases (e.g. there might be a top-level release trigger but then an if: that limits it on a job with a cache), but we can figure those out as you iterate 🙂

Off the top of my head, here's an initial set of actions we want to catch here:

actions/setup-python: has cache:
actions/setup-node: has cache:
ruby/setup-ruby: has bundler-cache:
astral-sh/setup-uv has enable-cache:
actions-rust-lang/setup-rust-toolchain: has cache: plus some others: https://github.com/actions-rust-lang/setup-rust-toolchain#inputs
- cache appears to be on by default for this one!
any use of Swatinem/rust-cache at all (in a release workflow)?
any use of actions/cache or actions/cache/restore at all (in a release workflow)?

ubiratansoares · 2024-12-11T22:58:22Z

Thanks for sharing your thoughts. Those considerations definitely help 🚀

I'm particular interested having some upfront chat around the set of Actions we want to check.

My first take would be having the Actions provided by Github as the hardcoded ones to go for. These are not going anywhere (hopefully 😂), and they look like a good place to start.

Those Actions would be :

However, I really like the idea of dynamically retrieve a list of Actions we want to inspect for cacheability, using a schema that exposes at least

the GHA coordinate (e.g. astral-sh/setup-uv)
the input we want to evaluate (e.g. enable-cache) along with a hint on the default value (if any)

That would allow an user running a zizmor version a behind the latest to also benefit from new Actions being listed as suitable for a cache-poisoning attack. Perhaps we can maintain such a list as a .json file along with the website assets and deploy it over Github Pages, hence retrieving it easily without any Github PAT.

In addition, adding a new GHA to the list of ones we want to check is a matter of reviewing/merging a PR without any code changes (in general). We could even have a CI check to validate the .json we'll consume from the CLI. 😄

This approach could be a second iteration, since there is some design to consider for the data schema (e.g., should we track versions for those Actions as well?), or we can for it right away.

In the end of the, either having a static or dynamic list of suitable Actions to check, it should be straightforward add them, because the ecosystem is huge, and new additions surely will come! For example, as a long term JVM user, I'm aware that gradle/setup-gradle also caches and restores on top of actions/toolkit by default 🙈

I'd love to learn your thoughts on this 🙂

Should we be pedantic for this audit ? If yes, only for persona=auditor?
Do we want to this check to be 100% offline for all Actions we support?
On the other way around: do we want to be 100% online for all Actions we support, without any hardcoded coordinates?
Alternatively in between: do we want to have at least some verifications purely based on harcoded GHA coordinates?

woodruffw · 2024-12-12T00:33:39Z

Re: data files: you read my mind! This is something I've been thinking about more generally, as a way to improve zizmor's online audit performance (since repeated paginated calls to GH's APIs are not fast) and avoid having to release new versions to release data changes (like you said).

The big challenge there IMO is making sure that things gracefully degrade in an offline setting. In practice, I think that means:

Each zizmor version should ship some version of the data it needs
If the user is online, we should opportunistically try to fetch a newer copy, but fail gracefully (with a warning) if the fetch fails
If the local version of the data gets "too" stale (>48h?), we should still use it but with a warning encouraging the user to go online to get a newer copy

TL;DR: I think this is a great idea, and that we should do it. However, I think it's a separate set of tasks than this audit itself, so I think I'd like to start things here with just a hard-coded list, and then we can do follow-ups for designing a static API.

Should we be pedantic for this audit ? If yes, only for persona=auditor?

IMO this should be non-pedantic, since it'll be pretty filtered by default (there should be a pretty low noise floor on "has caching enabled and is a release workflow).

Do we want to this check to be 100% offline for all Actions we support?

100% offline, yep 🙂

Alternatively in between: do we want to have at least some verifications purely based on harcoded GHA coordinates?

Long term I think we'll want to collapse the distinction between the two: the audit itself will always be offline, but the age of the data it uses will depend on whether the user is online or offline 🙂

woodruffw · 2024-12-12T00:58:25Z

I've opened #278 as a separate tracker for designing a static API.

ubiratansoares · 2024-12-12T09:12:36Z

Thanks @woodruffw ! I fully agree with having a separated task for that, meanwhile let's get started with the offline take 🙂

ubiratansoares · 2024-12-13T22:25:18Z

@woodruffw Just raised a first draft!

woodruffw · 2024-12-19T15:42:24Z

#294

woodruffw added the new-audit New audits label Dec 9, 2024

woodruffw mentioned this issue Dec 9, 2024

Analysis roadmap #1

Closed

28 tasks

woodruffw assigned ubiratansoares Dec 11, 2024

woodruffw added this to the 1.0 milestone Dec 11, 2024

ubiratansoares mentioned this issue Dec 13, 2024

feat: initial cache-poisoning audit #294

Merged

woodruffw closed this as completed Dec 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New audit: cache poisoning #261

New audit: cache poisoning #261

woodruffw commented Dec 9, 2024

ubiratansoares commented Dec 11, 2024

woodruffw commented Dec 11, 2024 •

edited

Loading

ubiratansoares commented Dec 11, 2024

woodruffw commented Dec 12, 2024

woodruffw commented Dec 12, 2024

ubiratansoares commented Dec 12, 2024

ubiratansoares commented Dec 13, 2024

woodruffw commented Dec 19, 2024

New audit: cache poisoning #261

New audit: cache poisoning #261

Comments

woodruffw commented Dec 9, 2024

ubiratansoares commented Dec 11, 2024

woodruffw commented Dec 11, 2024 • edited Loading

ubiratansoares commented Dec 11, 2024

woodruffw commented Dec 12, 2024

woodruffw commented Dec 12, 2024

ubiratansoares commented Dec 12, 2024

ubiratansoares commented Dec 13, 2024

woodruffw commented Dec 19, 2024

woodruffw commented Dec 11, 2024 •

edited

Loading