[networks]: add prune command #914

saehejkang · 2025-11-22T00:33:01Z

Type of Change

Bug fix
New feature
Breaking change
Documentation update

Motivation and Context

Closes #893

Testing

Tested locally
Added/updated tests
Added/updated docs

…check
- Implement server-side prune() in NetworksService with atomic operations
- Only prune networks in .running state (preserves non-running networks)
- Add comprehensive test coverage (5 tests covering all edge cases)
- Follow existing patterns (similar to VolumesService.prune())
- Add networkPrune XPC route and client method

This merges the architectural improvements from PR apple#914 with the correct logic from PR apple#906, addressing jlogan's request to combine the best of both implementations.

jglogan · 2025-12-02T21:52:48Z

@saehejkang @suhasramanand

Is doing prune in the server needless complexity for network prune?

I'm asking this as a design question, not as an explicit criticism of the approach.

Our principal concern for these prune operations is consistency – if we run some arbitrary collection of operations on a set of related resources, we don't want to leave the system in an inconsistent state. Principally I think that's dangling references; as an example, after a prune we should never have a case where a container refers to a non-existent network.

The network delete operation is already consistent on its own – a naive but functional implementation of prune would just try deleting each network in turn regardless of what refers to it. Ideally there'd be a distinct EBUSY-like error so one could differentiate those failures (which could be silently ignored) from other conditions.
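A minimal sketch of that naive approach, using an in-memory stand-in rather than the real server. `NetworkDeleteError`, `FakeNetworkStore`, and `naivePrune` are hypothetical names invented for illustration; the thread only wishes for an EBUSY-like error and does not say one exists.

```swift
// Hypothetical error type: the thread asks for an "EBUSY-like" error,
// it does not state that the project defines one.
enum NetworkDeleteError: Error, Equatable {
    case busy(String)      // network is still referenced by a container
    case notFound(String)  // network no longer exists
}

// In-memory stand-in for the server's delete operation (an assumption,
// not the real NetworksService).
struct FakeNetworkStore {
    var networks: Set<String>
    var referenced: Set<String>

    mutating func delete(_ id: String) throws {
        guard networks.contains(id) else { throw NetworkDeleteError.notFound(id) }
        guard !referenced.contains(id) else { throw NetworkDeleteError.busy(id) }
        networks.remove(id)
    }
}

// Naive prune: attempt every delete, silently skip "busy" results,
// and let every other error propagate to the caller.
func naivePrune(_ store: inout FakeNetworkStore) throws -> [String] {
    var pruned: [String] = []
    for id in store.networks.sorted() {
        do {
            try store.delete(id)
            pruned.append(id)
        } catch NetworkDeleteError.busy(_) {
            continue
        }
    }
    return pruned
}
```

Because each delete is consistent on its own, the loop cannot leave a container pointing at a missing network; the cost is one server round trip per network, including the ones that come back busy.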

I think the first, client-side-only, PR precomputed the set of unreferenced networks up front which would reduce the number of trips to the server that would return EBUSY – a simple and nice optimization.

Is there something that a server-side implementation of prune can do that can't be done by combining existing operations in ContainerClient? The guiding principle here is to keep the server side as slender and simple as possible, so that it's easier to reliably maintain.

It's probably worthwhile looking at what we need/want to do for other resources. For the image prune stuff there was a prune() operation for imagerefs all the way down in containerization, but we extracted that part out to container based on the same principle – the ImageStore only needs to ensure consistent pruning of content blobs, while pruning imagerefs can be dealt with in container.

Sources/Services/ContainerAPIService/Networks/NetworksService.swift

suhasramanand · 2025-12-02T22:01:49Z

@jglogan I agree that a client-side approach would align better with the existing design pattern.

Looking at ImagePrune (in Sources/ContainerCommands/Image/ImagePrune.swift), it:

Lists images and containers on the client
Filters to determine which images to delete
Calls ClientImage.delete() for each one
Only uses server-side for cleanupOrphanedBlobs() (blob cleanup, not imageref pruning)

A client-side network prune would follow the same pattern:

ClientNetwork.list() to get all networks
ClientContainer.list() to determine which networks are in use
Filter on the client to identify unreferenced networks
Call ClientNetwork.delete() for each one, handling EBUSY-like errors gracefully

This keeps the server minimal and consistent with how image prune works. The network delete operation already ensures consistency, so the client-side approach should be sufficient.
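The filtering steps above could look roughly like the following. This is a sketch under assumptions: `NetworkSummary` and `ContainerSummary` are simplified stand-ins for whatever `ClientNetwork.list()` and `ClientContainer.list()` actually return, and `unreferencedNetworks` is a hypothetical helper name.

```swift
// Simplified stand-in models; the real client types carry more state.
struct NetworkSummary {
    let id: String
}

struct ContainerSummary {
    let id: String
    let networkIDs: [String]
}

// Given both listings, compute the networks that no container refers to.
// `protected` holds names that must never be pruned (for example the
// default network).
func unreferencedNetworks(
    networks: [NetworkSummary],
    containers: [ContainerSummary],
    protected: Set<String>
) -> [String] {
    // Collect every network ID that at least one container is attached to.
    let inUse = Set(containers.flatMap { $0.networkIDs })
    return networks
        .map { $0.id }
        .filter { !protected.contains($0) && !inUse.contains($0) }
}
```

The result is only a snapshot: a container may attach to one of these networks before the delete call arrives, which is why the delete step still needs to tolerate an EBUSY-like failure.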

The main benefit of server-side would be atomicity via withContainerList, but if network delete already handles consistency correctly, that may not be necessary.

saehejkang · 2025-12-03T05:37:26Z

Our principal concern for these prune operations is consistency

This may have been where the confusion first occurred, as a single design pattern does not seem to be used consistently. I used the volume prune command as a reference, which takes more of a server-side approach. By contrast, the image prune command is mainly client-side.

Is there something that a server-side implementation of prune can do that can't be done by combining existing operations in ContainerClient?

There is nothing on the server-side implementation that can't be done by combining operations in the ContainerClient.

It's probably worthwhile looking at what we need/want to do for other resources.

Below is a list of the prune commands and the current/future implementation approach.

image prune - client-side approach
volume prune - server-side approach
container prune - server-side approach (in progress: Implement container prune #904)
network prune - approach dependent on this discussion

Is there any reason why any of these resources would EVER need to use a server-side approach (besides image prune with the content blobs)?

Regarding the network prune command, we have two PRs, each proposing a valid approach. A decision on which one to merge is needed, and I will defer that to the maintainers. Finally, if we decide on the client-side approach, it wouldn’t be fair for me to simply make the changes here and then merge my PR.

jglogan · 2025-12-03T13:44:56Z

single design pattern does not seem to be used consistently

This. I've blocked out some time today to look at this (and the error handling bit I mentioned) from a broader perspective, and then will follow up here and we can discuss how to move forward.

jglogan · 2025-12-04T01:05:38Z

OK, here are my thoughts after a little reflection:

Let's keep the server APIs minimal, where we do the basic resource management operations reliably and consistently (focusing on getting contending operations on a single resource correct).
The client is where we'll do composite operations, and operations on collections. The client shall be responsible for dealing with partial success - we should consistently report errors where operations could be performed on some resources and not others (making exceptions for things like prune where an operation may not be able to be performed because of a state change race).
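One way the client could track partial success across a collection is sketched below. `DeleteOutcome`, `BatchReport`, and `deleteEach` are hypothetical names for illustration only; the actual client reports failures through Swift errors, and the thread does not prescribe a reporting type.

```swift
// Hypothetical result of a single delete call.
enum DeleteOutcome {
    case deleted
    case skippedStateChanged      // e.g. the resource became referenced after listing
    case failed(reason: String)   // a genuine error that should be reported
}

// Aggregated result of running one operation over a collection.
struct BatchReport {
    var deleted: [String] = []
    var skipped: [String] = []
    var failures: [(id: String, reason: String)] = []

    // True when some resources were deleted and others genuinely failed.
    var isPartialSuccess: Bool { !deleted.isEmpty && !failures.isEmpty }
}

// Apply `delete` to every ID and record each outcome instead of
// stopping at the first failure.
func deleteEach(_ ids: [String], using delete: (String) -> DeleteOutcome) -> BatchReport {
    var report = BatchReport()
    for id in ids {
        switch delete(id) {
        case .deleted:
            report.deleted.append(id)
        case .skippedStateChanged:
            report.skipped.append(id)
        case .failed(let reason):
            report.failures.append((id: id, reason: reason))
        }
    }
    return report
}
```

Keeping race-induced skips separate from real failures lets a prune command stay quiet about the former while still reporting the latter.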

The principal disadvantage that I see right now is performance. Our APIs operate on a single resource at once today, each one locking the container collection. This isn't really a new issue though: container volume rm --all would incur the same penalties.

@saehejkang and @suhasramanand What do you think?

saehejkang · 2025-12-04T03:19:08Z

Let's keep the server APIs minimal, where we do the basic resource management operations

This makes sense to me. Just to be clear, this is for get, create, and delete operations on a single resource?

The client is where we'll do composite operations, and operations on collections

This also makes sense to me. Again, to be clear, this is for operations like network prune, as it would be going through the collection of containers/networks, and deleting unused networks?

Are we revisiting any commands (refer to my note here) and making the proper updates?
Question here

Our APIs operate on a single resource at once today, each one locking the container collection.

Is this going to also be revisited in the future? Are we ever going to build APIs that operate on more than a single resource?

jglogan · 2025-12-04T12:58:15Z

Are we revisiting any commands (refer to my note here) and making the proper updates?

Yes. Let's get network prune working this way, and then use it as a template for the container prune PR review, and we can rework volume prune so that all use the same pattern.

Question here

Some reasons we might need a server-side approach:

Performance, as mentioned before - if we're operating on a lot of resources, it could be that coalescing all the work under a single call could yield a decent speedup.
Transactional operations - if we find that we need a compound operation that needs "all or nothing" semantics across some set of resources, perhaps that's better done in the server? I don't have a concrete example in mind for this though.
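The performance point can be illustrated with a toy model that counts lock acquisitions. `LockedCollection` and its `removeAll` batch method are hypothetical and only model the trade-off; they say nothing about how the API server actually locks the container collection.

```swift
import Foundation

// Toy collection guarded by a single lock, counting how often it is taken.
final class LockedCollection {
    private let lock = NSLock()
    private var items: Set<String>
    private(set) var lockAcquisitions = 0

    init(_ items: Set<String>) {
        self.items = items
    }

    // Single-resource operation: one lock acquisition per call.
    func remove(_ id: String) {
        lock.lock()
        defer { lock.unlock() }
        lockAcquisitions += 1
        items.remove(id)
    }

    // Hypothetical coalesced operation: one lock acquisition for the whole set.
    func removeAll(_ ids: [String]) {
        lock.lock()
        defer { lock.unlock() }
        lockAcquisitions += 1
        for id in ids {
            items.remove(id)
        }
    }

    var count: Int {
        lock.lock()
        defer { lock.unlock() }
        return items.count
    }
}
```

Removing N items one at a time takes the lock N times, while the coalesced call takes it once. In the real system each single-resource call also pays for a trip to the server, so the gap would likely be larger than this model suggests.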

Our APIs operate on a single resource at once today, each one locking the container collection.

Is this going to also be revisited in the future? Are we ever going to build APIs that operate on more than a single resource?

There's nothing stopping us from doing that, or from moving compound operations like prune into the API server. The principal motivation for moving in the current direction (keeping things simple, and as you pointed out, consistent) for now is to start cleaning up the client code and get it in a good state for developers. Once that's done we can let time and experience guide how we enhance the client API and the underlying API server.

The thing that gets under my skin at present is that our client (the APIs and the core types) is dispersed across ContainerClient and Services. The dependencies between targets within the project are messier than they need to be. I want to change this so that ContainerClient contains all of the SDK material, such that you or I can look at the docc just for ContainerClient and find what we need to code against container.

There's a lot more developer documentation that could be done to support this. We still lack docs for our extension mechanisms (the plugin system in particular), for example.

saehejkang · 2025-12-05T04:46:45Z

Yes. Let's get network prune working this way, and then use it as a template for the container prune PR review, and we can rework volume prune so that all use the same pattern.

I would like to spearhead this initiative and see it through to completion. Once network prune is wrapped up, I can go back and help review the container prune PR, and then work on updates to volume prune.

cleaning up the client code and get it in a good state for developers

I want to change this so that ContainerClient contains all of the SDK material, such that you or I can look at the docc just for ContainerClient and find what we need to code against container.

There's a lot more developer documentation that could be done to support this.

I completely agree that cleanup is important and keeping things simple/consistent is important for quality. It is all coming full circle, as I remember asking a question in the discussions about how the SDK/ContainerClient works. I am sure that working on these updates will help me wrap my head around everything and I can work on adding some docs, hopefully in the near future.

Awesome points above, and thank you for all the explanations about the design and your thought process!

If we decide on the client-side approach, it wouldn’t be fair for me to simply make the changes here and then merge my PR.

How do we want to proceed with network prune?

jglogan · 2025-12-05T21:17:10Z

@saehejkang Consider yourself signed up for these...I'll watch for an update on this PR, we can merge it and yep, go ahead and move to the next! Thank you.

saehejkang · 2025-12-06T20:57:58Z

Sources/ContainerCommands/Network/NetworkPrune.swift

+            }
+
+            let networksToPrune = allNetworks.filter { network in
+                network.id != ClientNetwork.defaultNetworkName && !networksInUse.contains(network.id)


I purposefully left out the check for whether a network is running because the issue did not call for it. Furthermore, if the network is not being used by any containers, shouldn't it be pruned regardless of its state?

saehejkang force-pushed the add-network-prune-command branch from aa9d622 to fd0e2af Compare November 22, 2025 18:44

jglogan requested changes Dec 2, 2025

saehejkang marked this pull request as draft December 6, 2025 19:50

add network prune command + tests

3dbc333

saehejkang force-pushed the add-network-prune-command branch from f9961dc to 3dbc333 Compare December 6, 2025 20:53

saehejkang marked this pull request as ready for review December 6, 2025 20:56
