import RFC-13

lsf37 · lsf37 · commit 565cb3801991 · 2024-06-14T10:02:45.000+10:00
See https://sel4.atlassian.net/browse/RFC-13 Signed-off-by: Gerwin Klein <gerwin.klein@proofcraft.systems>
diff --git a/src/active/0130-mcs-grant-reply.md b/src/active/0130-mcs-grant-reply.md
@@ -0,0 +1,243 @@
+<!--
+  SPDX-License-Identifier: CC-BY-SA-4.0
+  Copyright 2022 Kry10
+-->
+
+# MCS: Improve constraints on grant via reply
+
+- Author: Matthew Brecknell
+- Proposed: 2022-11-04
+
+## Summary
+
+In MCS configurations, the ability to grant capabilities when replying to a call
+is determined entirely by the grant right on the reply capability. If a server
+can create reply objects by retyping untyped memory, then it can also grant
+capabilities to its clients. Consequently, many realistic systems would not have
+the authority confinement property that we would expect.
+
+This RFC proposes two alternatives:
+
+- Both alternatives restrict the server's ability to grant via reply according
+  to whether or not the grant right is present on the receive endpoint
+  capability at the time the server called seL4_Recv.
+
+- The [first alternative] keeps the grant bit on reply object capabilities.
+  Servers that want to grant via reply must have grant rights on both the
+  receive endpoint capability and the reply object capability.
+
+- The [second alternative] removes the grant bit from reply object capabilities.
+  Servers that want to grant via reply only need a grant right on the receive
+  endpoint, but can't make reply object capabilities that mask those grant
+  rights.
+
+[first alternative]: https://github.com/seL4/seL4/pull/874
+[second alternative]: https://github.com/seL4/seL4/pull/945
+
+## Motivation
+
+The [seL4 integrity theorems] show that seL4, in non-MCS configurations, can
+enforce access control policies. For example, one can construct a system in
+which untrusted client and server components can communicate through an
+endpoint, but cannot write to each other's private memory.
+
+The theorems are conditional: They require system setups that constrain the
+propagation of authority. In particular, grant rights are too powerful for the
+access control model, so one of the conditions is that untrusted components
+cannot grant or be granted capabilities.
+
+We would like to prove similar integrity theorems for MCS configurations, and
+therefore need appropriate mechanisms to limit capability grant rights between
+untrusted components. Currently, the only way to prevent an MCS server granting
+to its clients is to prevent it from retyping untyped memory. That's too severe
+a limitation. It's also an indirect mechanism, which would be difficult to
+express in an access control model.
+
+This RFC proposes a more direct mechanism for limiting an MCS server's ability
+to grant to its clients, and should provide a clearer path to useful integrity
+theorems for MCS configurations.
+
+[seL4 integrity theorems]: https://trustworthy.systems/publications/nictaabstracts/Sewell_WGMAK_11.abstract
+
+## Guide-level explanation
+
+We give a brief account of the history that led to the current situation.
+
+In the non-MCS kernel, there are no reply objects. Reply capabilities are
+generated by the kernel when a call is received through an endpoint, and
+inserted into a slot in the receiver's TCB. The kernel sets the grant right on
+the reply capability iff the grant right was set on the endpoint cap the
+receiver used when it called `seL4_Recv`.
+
+The intention of the design is that the receiver's endpoint capability controls
+the receiver's ability to grant capabilities when it replies. The implementation
+is the grant right on the reply capability, but the kernel ensures that the
+grant right on the generated reply capability is determined by the grant right
+on the receiver's endpoint capability. The intention is important, because
+endpoint rights are crucial to the ability to construct systems with useful
+access control properties.
+
+MCS changed the object-capability model, introducing reply objects and
+scheduling contexts. Reply objects make protected procedure call stacks more
+explicit, allowing scheduling contexts to be passed from caller to receiver and
+back again, potentially recursively. Reply object capabilities are supplied by
+the receiver, and the kernel internally links the reply object with the calling
+thread when a call is received.
+
+With respect to grant rights for replies, MCS reused parts of the implementation
+from the non-MCS kernel, but did not capture its intention. In both MCS and
+non-MCS, the ability to grant when replying is controlled by the grant bit on
+the reply capability. However, the way that bit is set is very different. In
+MCS, the grant right on a reply object capability is typically controlled by the
+receiver, if it can create reply objects. The grant right on the receiver's
+endpoint capability has no effect. Consequently, it is harder to construct MCS
+systems with useful access control properties.
+
+In this RFC, we propose changes that limit an MCS receiver's ability to grant
+when replying, according to the rights on the receiver's endpoint capability.
+The kernel propagates the endpoint grant right via the reply object.
+
+## Reference-level explanation
+
+We propose to add a boolean `canGrant` field to the MCS reply object. It is
+controlled by the kernel, and is only meaningful when the reply object is
+associated with either a receiver waiting for a caller, or a caller waiting for
+a reply. In those cases, the `canGrant` field indicates whether the
+`capcanGrant` bit was set on the receiver's endpoint capability when it called
+`seL4_Recv` or `seL4_ReplyRecv`.
+
+Regardless of whether a caller is waiting on the endpoint, the `canGrant` field
+of the reply object is set in receiveIPC, which is part of either `seL4_Recv` or
+`seL4_ReplyRecv`. In `receiveIPC`, the kernel has access to both the receiver's
+endpoint capability and its reply object capability, so it can directly copy the
+`capCanGrant` flag from the receive endpoint capability to the `canGrant` field
+of the reply object.
+
+If there is already a caller waiting, then the send phase of the call completes
+immediately, and the caller becomes blocked on the reply object.
+
+If the receiver has to block waiting for a caller, then either:
+
+- The receiver's IPC is cancelled before a caller arrives, and the reply object
+  becomes dissociated from the receiver. In this case, the `canGrant` value that
+  was set on the reply object is never used.
+
+- The receiver remains blocked until a caller arrives, and the `canGrant` field
+  of the reply object still reflects the capCanGrant bit that was set on the
+  receiver's endpoint capability. The send phase completes, and the caller
+  becomes blocked on the reply object.
+
+Therefore, regardless of how the caller becomes blocked on the reply object, the
+`canGrant` field of the reply object reflects the capCanGrant bit on the endpoint
+capability used to receive that call. This is the property we'll need to prove a
+useful integrity theorem.
+
+This RFC includes two alternatives:
+
+- The [first alternative] only makes the change just described. The grant bit on
+  the reply object capability (which is separate from the canGrant field of the
+  reply object) is retained, and acts as a mask against the grant right on the
+  receiver's endpoint capability. Therefore, to grant via a reply, a receiver
+  would need grant rights on both the receive endpoint capability and the reply
+  object capability.
+
+- The [second alternative] makes the change just described, and additionally
+  removes the grant bit on reply object capabilities. To grant via a reply, a
+  receiver only needs a grant right on the receive endpoint, but can't make a
+  reply object capability that masks a grant right conferred by the receive
+  endpoint.
+
+## Drawbacks
+
+The two alternatives are compared in the next section.
+
+Both alternatives have the potential to break existing MCS systems that include
+threads that:
+
+- receive calls on endpoint capabilities without grant rights, yet
+
+- expect to be able to grant capabilities when replying.
+
+Any such systems would need to be updated to add grant rights to the receive
+endpoint capabilities.
+
+Unfortunately, the breakage will only become evident when the caller tries to
+the use the capabilities it should have been granted.
+
+A mitigating factor is that non-MCS systems already require the grant bit to be
+set appropriately on receive endpoint capabilities, so systems ported from
+non-MCS to MCS should not be affected.
+
+## Rationale and alternatives
+
+We've already covered the rationale, so it only remains to compare the two
+alternatives.
+
+### Retaining the grant bit in reply object capabilities
+
+The [first alternative] retains the grant bit in the reply object capability.
+The code change is minimal. Any existing MCS systems that make use of the
+ability to mask grant rights via the reply object capability will continue to
+enjoy that ability, so the change does not give any thread permissions it did
+not already have.
+
+### Removing the grant bit in reply object capabilities
+
+The [second alternative] removes the grant bit in the reply object capability.
+This results in a simpler implementation and API semantics, but removes a
+feature that might be used in some MCS systems. There is [no corresponding
+feature in the non-MCS kernel][1], so the feature removal is only potentially
+relevant to MCS systems.
+
+Unfortunately, the only reasonable way to remove the feature is silently. This
+means that an existing MCS system that currently uses the feature to mask the
+grant rights of reply object capabilities will continue to work (assuming the
+corresponding receive endpoint capabilities have grant rights), but might
+include threads which gain permissions that the system designer did not intend.
+That is, there might be threads that did not previously have the right to grant
+capabilities when replying, that could gain that right when this alternative is
+implemented.
+
+The rationale for this alternative is that we think it is unlikely that there
+are any such systems. We are not aware of any real use cases, and have not been
+able to construct a hypothetical use case that is not [more easily implemented
+without this feature][2].
+
+If there are any use cases, we would like to hear about them.
+
+[1]: https://github.com/seL4/seL4/pull/945#issuecomment-1302684909
+[2]: https://github.com/seL4/seL4/pull/945#issuecomment-1301609528
+
+
+## Prior art
+
+The main inspiration for this change is the non-MCS version of seL4, which:
+
+- already uses the grant bit on a receiver's endpoint capability to govern the
+  receiver's ability to grant capabilities when replying.
+
+- [does not provide any facility to remove the grant right from a reply
+  capability][1].
+
+## Unresolved questions
+
+Is there a realistic use case for a grant bit on the reply object capability,
+that can be used to mask a grant right conferred by the receive endpoint
+capability?
+
+This is the main factor in choosing between the two alternatives. So far, the
+discussion on this issue has only shown a [hypothetical use case][3], which [did
+not survive scrutiny][2], and which for non-MCS systems, was based on a [mistaken
+understanding of the API semantics][1].
+
+If you do use `seL4_CNode_Mint` to mask the grant rights of MCS reply object
+capabilities, we would like to hear from you.
+
+[3]: https://github.com/seL4/seL4/pull/874#issuecomment-1174670248
+
+## Disposition
+
+The TSC as approved the option of the RFC that removes the grant right on the
+reply cap (the [second alternative]). Should there be appear a particular use
+case for it after all, it should be easy enough to consider another RFC to add
+it.