From 0bf62f2df201c3aaab787e7fdc8fcd2513699b29 Mon Sep 17 00:00:00 2001 From: Wojciech Ozga Date: Tue, 12 Mar 2024 10:22:00 -0500 Subject: [PATCH 1/7] Pass over the release candidate version of the spec Signed-off-by: Wojciech Ozga --- specification/contributors.adoc | 3 +- specification/glossary.adoc | 128 ++++++++++++----------- specification/intro.adoc | 7 +- specification/overview.adoc | 47 +++++---- specification/refarch.adoc | 80 ++++++++------- specification/sbi_cove.adoc | 176 ++++++++++++++++---------------- specification/swlifecycle.adoc | 105 +++++++++---------- specification/threatmodel.adoc | 90 ++++++++-------- 8 files changed, 322 insertions(+), 314 deletions(-) diff --git a/specification/contributors.adoc b/specification/contributors.adoc index cdc8fe1..637cea6 100644 --- a/specification/contributors.adoc +++ b/specification/contributors.adoc @@ -8,4 +8,5 @@ Andrew Bresticker, Andy Dellow, Atish Patra, Atul Khare, Beeman Strong, Christian Bollis, Dingji Li, Dong Du, Dylan Reid, Eckhard Delfs, Fabrice Marinet, Guerney Hunt, Jiewen Yao, Kailun Qin, Manuel Offenberg, Nicholas Wood, Nick Kossifidis, Osman Koyuncu, Qing Li, Rajnesh Kanwal, -Ravi Sahita (Editor), Rob Bradford, Samuel Ortiz, Vedvyas Shanbhogue, Yann Loisel +Ravi Sahita (Editor), Rob Bradford, Samuel Ortiz, Vedvyas Shanbhogue, +Wojciech Ozga, Yann Loisel diff --git a/specification/glossary.adoc b/specification/glossary.adoc index f638628..e84c322 100644 --- a/specification/glossary.adoc +++ b/specification/glossary.adoc @@ -2,76 +2,78 @@ == Glossary |=== -| Hypervisor or Virtual Machine Monitor (VMM) | HS mode software -that manages Virtual Machines by virtualizing hart, guest physical memory and -IO resources. This document uses the term VMM and hypervisor interchangeably -for this software entity. -| VM | Virtual Machines hosted by a VMM +| AIA | Advanced interrupt architecture (AIA) is an architecture for handling interrupts. 
-| Host software | All software elements including type-1 or type-2 HS-mode VMM -and OS; U mode user-space VMM tools; ordinary VMs hosted by the VMM that -emulate devices. The hosting platform is typically a multi-tenant platform -that hosts multiple mutually distrusting Tenants. - -| Tenant software | All software elements including VS-mode guest kernel -software, and guest user-space software (in VU-mode) that are deployed -by the workload owner (in a multi-tenant hosting environment). - -| Trusted Computing Base (TCB); Also, System/Platform TCB | The hardware, -software and firmware elements that are trusted by a relying party to -protect the confidentiality and integrity of the relying parties' workload -data and execution against a defined adversary model. In a system with -separate processing elements within a package on a socket, the TCB -boundary is the package. In a multi-socket system the TCB extends across -the socket-to-socket interface, and is managed as one system TCB. +| ABI | Application binary interface (ABI). -| Application Processor (AP) | APs can support commodity operating systems, +| AP | Application processors (APs) can support commodity operating systems, hypervisors/VMMs and applications software workloads. The AP subsystem may contain several processing units, on-chip caches, and other controllers for interfacing with memory, accelerators, and other fixed-function logic. Multiple APs may be used within a logical system. -| RISC-V Supervisor Domains | RISC-V privileged architecture <> defines -the S-mode for execution of supervisor software. S-mode software may optionally -enable Hypervisor extension to host virtual machines. Typically, there is a -single supervisor domain of execution with access to all physical memory. -*Supervisor Domains* <> is a RISC-V privileged architecture extension to -support physical address space (memory and devices) isolation for more than one -supervisor domain. 
Supervisor domains enable the reduction of the supervisor -Trusted Computing Base (TCB), with differential access to memory and other -platform resources e.g. as used in this Confidential VM Extension (CoVE) spec. +| Attestation | The process by which a relying party can assess the +security posture of the confidential workload based on verifying a set of +HW-rooted cryptographically-protected evidence. + +| CDI | Compound device identifier (CDI) is the value that represents the hardware, +software and firmware combination measured by the TCB elements transitively. +A CDI is the output of a DICE <> and is passed to the entity which is +measured by the previous TCB layer. The CDI is a secret that may be +certified to use for attestation protocols. -| Confidential Computing | The protection of data in use by performing -computation in a Hardware-based and Attestable Trusted Execution Environment. +| Confidential Computing | A computing paradigm that protects data in use by performing +computation in a hardware-based TEE. -| Confidential VM Extension (CoVE)| The set of non-ISA RISC-V ABI extensions +| CoVE | Confidential VM extension (CoVE) is the set of non-ISA RISC-V ABI extensions defined in this specification that enables confidential computing on RISC-V platforms. In some deployment models, the CoVE ABI leverages the RISC-V ISA extensions specified in the RISC-V Supervisor Domains specification <>. -CoVE is a Trusted Execution Environment ABI for Application Processors. A +CoVE is a Trusted Execution Environment (TEE) ABI for Application Processors (APs). A supervisor domain that provides HW-isolation for workload data assets when in use (user/supervisor code/data) and provides HW-attestable confidentiality and integrity protection against specific attack vectors per a specified adversary and threat model. 
-| TVM | TEE or Confidential VM - A VM instantiation of an confidential workload - | Confidential application or library | A user-mode application or library instantiation in a TVM. The user-mode application may be supported via a trusted runtime. The user-mode library may be hosted by a surrogate process runtime. -| Attestation | The process by which a relying party can assess the -security posture of the confidential workload based on verifying a set of -HW-rooted cryptographically-protected evidence. +| Confidential memory | Memory that is subject to access-control, +confidentiality and integrity mechanisms per the threat model for use in the +CoVE system. Confidential memory may also be used by non-TCB/ +hosting software with appropriate TCB controls on the configuration, +e.g., a separate key used for TCB and non-TCB elements. -| TEE Security Manager (TSM) | HS-mode software module that acts as -the trusted (in TCB) intermediary between the VMM and the TVM. This -module extends the TCB chain on the CoVE platform. +| Host software | All software elements including type-1 or type-2 HS-mode VMM +and OS; U-mode user-space VMM tools; ordinary VMs hosted by the VMM that +emulate devices. The hosting platform is typically a multi-tenant platform +that hosts mutually distrusting software owned by different tenants. + +| Hypervisor | Software running in HS-mode that manages virtual machines (VMs) by virtualizing hart, guest physical memory and input/output (IO) resources. + +| IMSIC | Incoming message signaled interrupt controller (IMSIC). + +| MMIO | Memory mapped I/O (MMIO). + +| MMU | Memory management unit (MMU). + +| MTT | Memory Tracking Table (MTT). + +| RISC-V Supervisor Domains | RISC-V privileged architecture <> defines +the S-mode for execution of supervisor software. S-mode software may optionally +enable the Hypervisor extension to host virtual machines. 
Typically, there is a +single supervisor domain of execution with access to all physical memory. +*Supervisor Domains* <> is a RISC-V privileged architecture extension to +support physical address space (memory and devices) isolation for more than one +supervisor domain. Supervisor domains enable the reduction of the supervisor +Trusted Computing Base (TCB), with differential access to memory and other +platform resources, e.g., as used in this specification. -| RoT | Isolated HW/SW subsystem with an immutable ROM firmware and -isolated compute and memory elements that form the Trusted Compute Base +| RoT | Root of trust (RoT) is the isolated hardware/software subsystem with an immutable ROM firmware and +isolated compute and memory elements that form the Trusted Compute Base (TCB) of a TEE system. The RoT manages cryptographic keys and other security critical functions such as system lifecycle and debug authorization. The RoT provides trusted services to other software on the platform such @@ -81,31 +83,33 @@ attestation etc. The RoT may be an integrated or discrete element <>, and may take on the role of a Device Identification Composition Engine (DICE) as defined in <>. -| Confidential memory | Memory that is subject to access-control, -confidentiality and integrity mechanisms per the threat model for use in the -CoVE system. Confidential memory may also be used by non-TCB/ -hosting software with appropriate TCB controls on the configuration, -e.g a separate key used for TCB and non-TCB elements. - -| SVN | Security Version Number - Meta-data about the TCB components +| SVN | Security version number (SVN) is the meta-data about the Trusted Compute Base (TCB) components that conveys the security posture of the TCB. The SVN is a monotonically -increasing version number updated when security changes must be reflected in -the attestation. The SVN is hence provided as part of the attestation +increasing number that represents TCB's version. 
It is incremented with TCB updates, so that these updates are reflected in the attestation. The SVN is hence provided as part of the attestation information as part of the evidence of the TCB in use. The SVN is typically combined with other meta-data elements when evaluating the attestation information. -| CDI | Compound Device Identifier - This value represents the hardware, -software and firmware combination measured by the TCB elements transitively. -A CDI is the output of a DICE <> and is passed to the entity which is -measured by the previous TCB layer. The CDI is a secret that may be -certified to use for attestation protocols. +| TSM | TEE security manager (TSM) is a software module that enforces TEE security guarantees on a platform. It acts as +the trusted intermediary between the VMM and the TVM. TSM extends the TCB chain on the CoVE platform and is therefore subject to attestation. + +| Tenant software | All software elements owned and deployed by a tenant in a multi-tenant hosting environment. These elements include VS-mode guest kernel and VU-mode guest user-space software. + +| TCB; Also, System/Platform TCB | Trusted computing base (TCB) is the hardware, +software, and firmware elements that are trusted by a relying party to +protect the confidentiality and integrity of the relying party's workload +data and execution against a defined adversary model. In a system with +separate processing elements within a package on a socket, the TCB +boundary is the package. In a multi-socket system the TCB extends across +the socket-to-socket interface, and is managed as one system TCB. + +| TEE | Trusted execution environment (TEE) is a set of hardware and software mechanisms for creating attestable and isolated execution environments. -| AIA | Advanced Interrupt Architecture +| TVM | TEE VM (TVM), also known as a Confidential VM, is a VM instantiation of a confidential workload. 
-| IMSIC | Incoming Message Signaled Interrupt Controller +| Virtual Machine (VM) | Guest operating system hosted by a VMM. -| MMIO | Memory Mapped I/O +| VMM | Virtual machine monitor (VMM) is used interchangeably with the term hypervisor in this document. |=== diff --git a/specification/intro.adoc b/specification/intro.adoc index e57316c..4b27c9d 100644 --- a/specification/intro.adoc +++ b/specification/intro.adoc @@ -3,10 +3,13 @@ == Introduction This document describes the Confidential VM Extension (CoVE) interface for -a scalable Trusted Execution Environment(TEE) for hardware virtual-machine-based +a scalable Trusted Execution Environment (TEE) for hardware virtual-machine-based workloads on RISC-V-based platforms. This CoVE interface specification enables application workloads that require confidentiality to reduce the Trusted Computing Base (TCB) to a minimal TCB, specifically, keeping the host OS/VMM -and other software outside the TCB. The proposed specification supports an +and other software outside the TCB. +// Do we want to talk here about IO devices as well? +The proposed specification supports an architecture that can be used for Application and Virtual Machine workloads, while minimizing changes to the RISC-V ISA and privilege modes. +// What is the meaning of "Application" here? When I read "Application and Virtual Machine" I think of "process-based" and "VM-based" TEEs, i.e., SGX-like and TDX-like. But this contradicts the initial sentence in this paragraph, which says that CoVE provides a TEE for VM-based workloads. \ No newline at end of file diff --git a/specification/overview.adoc b/specification/overview.adoc index e59eb66..2606601 100644 --- a/specification/overview.adoc +++ b/specification/overview.adoc @@ -4,12 +4,12 @@ == Architecture Overview and Threat Model Virtualization platforms are typically comprised of several components including -platform firmware, host OS, VMM, and the actual payloads that run on them -(typically in a VM). 
A monolith Supervisor Domain exists with the host OS/VMM -including device drivers and services forming the TCB. This model is well +platform firmware, host OS, VMM, and the actual workload (typically in a VM) that runs on them. +A monolithic Supervisor Domain exists with the host OS/VMM +including device drivers and services forming the Trusted Computing Base (TCB). This model is well established, but the downside is that most platform components are in the TCB. -This aspect is ill-suited for Confidential Computing workloads that rely on -HW-Attested Trusted Execution Environments, and strive to minimize the software +This aspect is ill-suited for confidential computing workloads that rely on +hardware-attested Trusted Execution Environments (TEEs), and strive to minimize the software and hardware TCB. This specification describes the CoVE architecture which enables a new class @@ -25,7 +25,7 @@ role as the resource manager (for both legacy VMs and TVMs). The resources managed by the hosting supervisor domain (OS/VMM) include memory, CPU, I/O resources and platform capabilities to host the TVM workload. The terms hosting supervisor domain and OS/VMM are used interchangeably in this -specification. The underlying isolation mechanisms for supervisor domains +specification. The underlying memory isolation mechanism for supervisor domains (Smmtt) is agnostic of the number of supervisor domains. [id=dep1] @@ -38,8 +38,8 @@ that operates in HS-mode and manages resources granted to it by the Hosting Supervisor Domain Manager (the OS/VMM). The Confidential Supervisor Domain Manager is called the " *TEE Security Manager* " or *(TSM)* - it acts as the trusted intermediary between TEE and non-TEE workloads on the same platform. -The TSM should have a minimal HW-attested footprint. 
The TCB (which includes -the TSM and HW) enforces strict confidentiality and integrity security +The TSM should have a minimal hardware-attested footprint. The TCB (which includes +the TSM and hardware) enforces strict confidentiality and integrity security properties for workloads in this supervisor domain. The Root Security Manager is an M-mode software module (called the " *TSM-driver* ") which isolates the Confidential Supervisor Domain from all other Supervisor domains and other @@ -48,12 +48,12 @@ confidential). The responsibility of the TSM is to enforce the security objectives accorded to TEE workloads assigned to that supervisor domain. The VMM is expected to continue to manage the security for non-confidential workloads, and importantly the resource-assignment and scheduling management -functions for all workloads (confidential and non-confidential). +functions for all confidential and non-confidential workloads. -In this scheme, compute resources like memory start off as traditional +In this scheme, compute resources, such as memory, start off as traditional untrusted resources owned by the non-confidential/hosting supervisor domain, and are expected to be donated/transitioned to the confidential supervisor domain -via ABI supported by the TSM. Once the conversion process is complete, +via the application binary interface (ABI) supported by the TSM. Once the conversion process is complete, confidential memory may be assigned to one or more TVMs by the TSM. A converted confidential resource may be freely assigned to another TVM within the same supervisor domain when it is no longer in use. However, an @@ -71,7 +71,7 @@ are pages that are demand-paged in and are expected to be zero'ed by the TSM to prevent attacks from the host software on the TVM. The TSM also enforces that the host does not overlap them with existing (present) G-stage mappings for the TVM. The non-confidential TVM-defined regions include those for shared-pages and -MMIO. +memory-mapped I/O (MMIO). 
The TSM implements ABI that are accessed by the OS/VMM in the Hosting Supervisor Domain Manager via a *Trusted Execution Environment Interface (TEEI)*. This ABI @@ -112,7 +112,7 @@ as *COVH* that includes functions to manage the lifecycle of the TVM, such as creating, adding pages to a TVM, scheduling a TVM for execution, etc., in an OS/platform agnostic manner. The TSM also provides an ABI to the TVM contexts: A set of guest ABIs known as *COVG* that enables the TVM workload to request -attestation functions, memory management functions or paravirtualized IO. +attestation functions, memory management functions, or paravirtualized IO. In order to isolate the TVMs from the host OS/VMM and non-confidential VMs, the supervisor domains (that contain the TSM state) must be isolated first - @@ -120,17 +120,17 @@ this is achieved by enforcing isolation for memory assigned to the supervisor domain that the TSM occupies - this is called the *TSM-memory-region.* The TSM-memory-region is expected to be a static region of memory that holds the TSM code and data. This region must be access-controlled from all software outside -the TCB (e.g. using Smmtt), and may be additionally protected against physical +the TCB (e.g., using Smmtt), and may be additionally protected against physical access via cryptographic mechanisms. -Access to the TSM- memory-region and execution of code from the +Access to the TSM-memory-region and execution of code from the TSM-memory-region (for the TSM ABIs) is enforced in hardware via the maintenance of the execution context (ASID, VMID and SDID) maintained per hart. This context is enabled per-hart via the TEECALL interface to context switch into the confidential supervisor domain context via the TSM-driver and disabled via the TEERET interface to context restore to the hosting supervisor domain. 
Access to TEE-assigned memory is allowed for the hart when the access is -permitted as per the active permissions enforced by the MMU for the supervisor +permitted as per the active permissions enforced by the memory management unit (MMU) for the supervisor domain active on the hart (enforced through Sv and Smmtt for CoVE). This per-hart execution context is used by the processor to enforce access-control properties on memory accessed by TEE workloads managed by the TSM. The @@ -141,18 +141,17 @@ TSM functionality should be explicitly limited to support only the security primitives to ensure that the OS/VMM and non-confidential VMs do not violate the security of the TVMs through the resource management actions of the OS/VMM. These security primitives require the TSM to enforce TVM virtual-hart -state save and restore, as well as enforcing invariants for memory assigned -to the TVM (including G-stage translation). The host OS/VMM provides the -typical VM resource management functionality for memory, IO etc. +state save and restore, as well as enforcing invariants for memory assigned +to the TVM, including G-stage translation. The host OS/VMM provides the +typical VM resource management functionality for memory, IO, etc. -Confidential VMs (managed by a VMM) are shown in figure 1 and Confidential -applications (managed by an untrusted host OS) are shown in the -architecture <>. As evident from the architecture, the difference +<> shows Confidential VMs managed by a VMM and <> shows Confidential +applications managed by an untrusted host OS. As evident from the architecture, the difference between these two scenarios is the software TCB (owned by the tenant within the TVM) for the tenant workload - in the application TEE case, a minimal guest runtime may be used; whereas in the VM TEE case, an enlightened -guest OS is expected in the TVM TCB. Other SW models that map to the VU/VS -modes of operation are also possible as TEE workloads. 
Importantly, the HW +guest OS is expected in the TVM TCB. Other software models that map to the VU/VS +modes of operation are also possible as TEE workloads. Importantly, the hardware mechanisms needed for both cases are identical, and can be supported with the CoVE ABI. diff --git a/specification/refarch.adoc b/specification/refarch.adoc index 85ff254..dda2d38 100644 --- a/specification/refarch.adoc +++ b/specification/refarch.adoc @@ -16,14 +16,14 @@ phases: the conversion of memory to confidential memory and the assignment of confidential memory (alongwith the enforcement of properties on use) to TVMs. To enforce isolation across Host and Confidential supervisor domains, CoVE requires isolation of physical memory (that supports paging when enabled). There -are two deployment models described below (1,2). CoVE ABI is applicable for both +are two deployment models described below (1 and 2). CoVE ABI is applicable for both modes - this specification focuses on the first deployment model (1) where a primary host supervisor domain is used to host confidential workloads in a secondary confidential domain. . The TSM operates in S/HS mode as a peer supervisor domain manager to the hosting supervisor domain which operates in S/HS mode as well. This model uses -the MTT along with G-stage PT for confidential TVM isolation (where 1st +the Memory Tracking Table (MTT) along with G-stage page tables (PT) for confidential TVM isolation (where the 1st stage PT is used by the Guest OS normally). The MTT is used to assign physical memory to the Confidential supervisor domain called *Confidential* memory and memory accessible to the hosting supervisor domain called *Non-Confidential*. @@ -31,7 +31,7 @@ MTT allows dynamic programming of the per-domain access permissions. This model is shown in <> . 
The TSM is the only root HS mode component on the platform, hence, G-stage -page tables can be used to enforce isolation between confidential TVMs and +page tables (PT) can be used to enforce isolation between confidential TVMs and ordinary VMs. In this model the host VMM must execute in the de-privileged VS mode and the TSM must provide nested virtualization of the H-extension controls. This model may be suitable for client/embedded systems and is shown in <>. @@ -40,7 +40,7 @@ A TVM and/or TSM needs to access both types of memory: * Confidential memory - used for TVM/TSM code and security-sensitive data; including state such as 1st-stage, G-stage page tables. -* Non-confidential memory - used only for shared data, e.g. communication +* Non-confidential memory - used only for shared data, e.g., communication between the TVM/TSM and the non-TCB host software and/or non-TCB IO devices. The TSM COVH ABI provides interfaces to the OS/VMM to convert / donate @@ -80,19 +80,19 @@ unique memory encryption key. These additional protection aspects are platform and implementation dependent. ==== -Confidential and non-confidential memory are both always assigned by the VMM -i.e. the hosting supervisor domain - the TSM-driver is expected to manage the +Confidential and non-confidential memory are both always assigned by the VMM, +i.e., the hosting supervisor domain - the TSM-driver is expected to manage the isolation for confidential memory assigned to any of the secondary supervisor domains by programming the Memory Tracking Table (MTT). The desired security properties of memory tracking are discussed below. The TSM (within a supervisor domain) manages page-based allocation using the G-stage page table from the set -of confidential memory regions that are enforced by the memory tracking table. +of confidential memory regions that are enforced by the MTT. 
Four aspects of memory isolation are impacted due to this dynamic configurable property of the MTT: ==== Address Translation/Page Walk -The figure 2 below describes a reference model for memory tracking lookup where +Figure 2 describes a reference model for memory tracking lookup where the physical address derived from the two-stage address translation and protection mechanism is looked up via the MTT configured for the active supervisor domain to get the access permissions for the physical address. This @@ -105,27 +105,27 @@ image::https://github.com/riscv/riscv-smmtt/blob/main/images/fig2.png?raw=true[] ==== Management of isolation for Confidential Physical Memory -The SW TCB (TSM) manages the assignment of physical memory to the Confidential -supervisor domain, while the HW TCB (hart MMU including virtual memory system, +The software TCB (specifically TSM) manages the assignment of physical memory to the Confidential +supervisor domain, while the hardware TCB (specifically the hart MMU including virtual memory system, MTT Extensions) enforces the access-control for confidential memory against other supervisor domains. The region sizes at which the memory tracking enforces isolation may be multiples of the architectural page sizes supported by the hart MMU. The IOMMU is expected to support a similar memory tracking lookup to enable a device/function trusted by the TVM to directly access TVM confidential memory regions. For the CoVE reference architecture this TCB -consists of the HW (e.g. MMU, IOMMU, Memory Controller) and the SW/FW elements - +consists of the hardware (e.g., MMU, IOMMU, Memory Controller) and the software/firmware elements - TSM-driver and the TSM. 
The TSM-driver is responsible for enforcing isolation of confidential memory regions (consisting of multiple pages via MTT) and the TSM is responsible for enforcing isolation of confidential memory pages among TVMs (via G-stage translation) - pages assigned to the TVM may be exclusively accessible to the condidential supervisor domain or may be shared with the -hosting supervisor domain (e.g. to allow for paravirtualized IO access). +hosting supervisor domain (e.g., to allow for paravirtualized IO access). [NOTE] ==== The TSM may manage additional attributes on TVM-assigned pages such as: -TVM-owner, Page-sub-type, TLB versioning information, Locking semaphore and -additional metadata etc. This extended memory tracking information managed by +TVM-owner, Page-sub-type, Translation Lookaside Buffer (TLB) versioning information, Locking semaphore and +additional metadata, etc. This extended memory tracking information managed by the TSM software is referred to as the Extended Memory Tracking Table (EMTT). ==== @@ -142,15 +142,15 @@ relax data accesses to non-confidential memory (via MTT) to allow for IO accesses. ==== Cached translations/TLB management -During confidential memory conversion or reclamation, the HW TCB -and SW TCB (TSM) must enforce via memory-management fences +During confidential memory conversion or reclamation, the hardware TCB +and software TCB (TSM) must enforce via memory-management fences that stale data is not accessible to the TVM (or the hosting OS/VMM). During confidential memory assignment to a TVM (or during conversion of confidential memory to shared), the TCB must enforce that stale translations may not be held to memory yielded by a TVM (and used by the host for another TVM or VM or the host). These properties are implemented by the TSM in conjunction with -the HW (e.g. MTT cache invalidations) via the proposed COVH interface. +the hardware (e.g., MTT cache invalidations) via the proposed COVH interface. 
[NOTE] ==== @@ -165,19 +165,19 @@ memory being returned to the host via `sbi_covg_share_memory_region`. === TSM initialization -The CoVE architecture requires a hardware Root-of-trust for supporting -TCB measurement, reporting and storage <>. The Root-of-trust for -Measurement (RTM) is defined as the TCB component that performs a +The CoVE architecture requires a hardware root-of-trust (RoT) for supporting +TCB measurement, reporting and storage <>. The root-of-trust for +measurement (RTM) is defined as the TCB component that performs a measurement of an entity and cryptographically signs it as attestation evidence subsequently reported to a relying party. The -Root-of-trust for Reporting (RTR) is typically a HW RoT that reliably +root-of-trust for reporting (RTR) is typically a hardware RoT that reliably provides authenticity and non-repudiation services for the purposes of attesting to the origin, integrity and security version of platform TCB components. Each TCB layer should have associated security version numbers (SVN) to allow for TCB recovery in the event of security vulnerabilities discovered in a prior version of the TCB layer. -During platform initialization, HW/FW elements form the RTM that measure the +During platform initialization, hardware and firmware elements form the RTM that measure the TSM-driver. The TSM-driver acts as the RTM for the TSM loaded on the platform. The TSM-driver initializes the TSM-memory-region for the TSM - this TSM-memory-region must be in confidential memory. The TSM binary may be @@ -189,13 +189,13 @@ binary via the TSM-driver. In both cases, the TSM binary loaded must be measured and may be authenticated (per cryptographic signature mechanisms) by the TSM-driver during the loading process, so that the TSM used is reflected in the -attestation rooted in a HW RoT. The authentication process provides +attestation rooted in a hardware RoT. 
The authentication process provides additional control to restrict TSM binaries that can be loaded on the -platform based on policies such as version, vendor etc. In addition to the +platform based on policies such as version, vendor, etc. In addition to the measurements, a security version number (SVN) of the TSM should be recorded by the TSM-driver into the firmware measurement registers accessible only to the TSM-driver and higher privilege components. The measurements and -versions of the HW RoT, the TSM-driver and the TSM will subsequently be +versions of the hardware RoT, the TSM-driver and the TSM will subsequently be provided as evidence of a specific TSM being loaded on a specific platform. During initialization, the TSM-driver will initialize a TSM-data region @@ -220,7 +220,7 @@ assigned to the TVM by the VMM. === TSM operation and properties The TSM implements COVH APIs that are invoked by the OS/VMM or by -the TVMs, e.g. by the VMM to grant a TVM a confidential memory page and +the TVMs, e.g., by the VMM to grant a TVM a confidential memory page and setup second-stage mapping, activate a TVM virtual hart on a physical hart etc. The TSM security routines are invoked by the OS/VMM via an ECALL with the service call specified via registers. These service calls trap to the @@ -236,8 +236,8 @@ medeleg). The TSM saves the TVM state and invokes the TSM-driver via an ECALL (TEERET with reason) to initiate the return of execution control to the OS/VMM if required. The TSM-driver restores the context for the OS/VMM via the -per-hart control sub-structure THCS.hssa (See <>).This canonical -flow is shown in figure 3. +per-hart control sub-structure THCS.hssa (See <>). Figure 3 shows this canonical +flow. Beyond the basic operation described above, the following different operational models of the TSM may be supported by an implementation: @@ -384,10 +384,10 @@ sstatus.sie. Under these circumstances the saving of the TVM state is the TSM responsibility. 
When TVM is executing, hideleg will only delegate VS-mode external -interrupt, VS-mode SW interrupt, and VS-mode timer interrupts to the TVM. -S-mode SW/Timer/External interrupts are delegated to the TSM (with the +interrupt, VS-mode software interrupt, and VS-mode timer interrupts to the TVM. +S-mode Software/Timer/External interrupts are delegated to the TSM (with the behavior described above). _All other interrupts_ , M-mode -SW/Timer/External, bus error, high temp, RAS etc. are not delegated and +Software/Timer/External, bus error, high temp, RAS etc. are not delegated and delivered to M-mode/TSM-driver. Under these circumstances the saving of the state is the TSM-driver responsibility. Also since scrubbing the TVM state is the TSM responsibility, the TSM-driver may pend an S-mode interrupt to @@ -409,8 +409,8 @@ TVMs are prevented from execution after that point. === TSM and TVM Isolation TSM (and all TVMs) memory is granted by the host OS/VMM but is isolated -(via access-control and/or confidentiality-protection) by the HW and TCB -elements. The TSM, TVM and HW isolation methods used must be evident in the +(via access-control and/or confidentiality-protection) by the hardware and TCB +elements. The TSM, TVM and hardware isolation methods used must be evident in the attestation evidence provided for the TVM since it identifies the hardware and the TSM-driver. @@ -428,11 +428,13 @@ and G-stage paging hardware, the root security manager (TSM-driver) must use MTT to isolate supervisor domain memory. In this deployment model, TEE and TVM address spaces are identified by supervisor domain identifiers (Smsdid) to maintain the isolation during access and in internal -address translation caches, e.g. Hart TLB lookup may be extended with the +address translation caches, e.g., Hart TLB lookup may be extended with the SDID in addition to the ASID, VMID for workloads in the Confidential supervisor domain. 
TVM memory isolation must support sparse memory management models and architectural page-sizes of 4KB, 64K, 2MB, 1GB (and optionally -512GB). The hardware may implement the MTT as specified in the Smmtt +512GB). +% Should 64K be 64KB? Is there RISC-V MMU spec for 64KB pages? +The hardware may implement the MTT as specified in the Smmtt privileged ISA extension, or other approaches may be used such as a flat table. The memory tracking table may be enforced at the memory controller, or in a page table walker. @@ -446,7 +448,7 @@ example, The hardware may use the Supervisor Domain Identifier during execution (and memory access) to cryptographically isolate memory associated with a TEE which may be encrypted and additionally cryptographically integrity-protected using a MAC on the memory contents. The MAC may be -maintained at various granularity - e.g. cache block size or in multiples +maintained at various granularity, e.g., cache block size or in multiples of cache blocks. *TVM isolation* is the responsibility of the TSM via the G-stage @@ -462,7 +464,7 @@ management>>. As described above, TVMs can access both classes of memory - isolated memory - which has confidentiality and access-control properties for memory exclusive to the TVM, and non-confidential memory which is memory accessible to the host -OS/VMM and is used for untrusted operations (e.g. virtio, gRPC communication +OS/VMM and is used for untrusted operations (e.g., virtio, gRPC communication with the host). If the confidential memory is access-controlled only, the TSM and TSM-driver are the authority over the access-control enforcement. If the confidential memory is using memory encryption (instead or in addition), the @@ -489,7 +491,7 @@ monitoring: In order to support probe-mode debugging of the TSM, the RoT must support an authorized debug of the platform. 
The authentication mechanism used for debug authorization is implementation-specific, but must support the -security properties described in the Section 3.12 of the RISC-V Debug +security properties described in Section 3.12 of the RISC-V Debug Support specification version 1.0.0-STABLE <>. The RoT may support multiple levels of debug authorization depending on access granted. For probe-based debugging of the hardware, the RoT performing debug @@ -536,7 +538,7 @@ virtual and physical counters as well. It must not delegate the LCOFI interrupt defined in the Advanced Interrupt Architecture (AIA) to inject the LCOFI interrupt when the physical counter corresponding to the virtual counter overflows. The physical counters naturally inhibit counting in S/HS and M. The -TSM must save and clear counter/event selector values as control transitions to +TSM must save and clear counter/event selector values as control transitions to the VMM or a different TVM that is using hpm. On a transition back to the host OS/VMM, the TSM must restore the saved hardware performance monitoring event triggers and counter enables. If the TSM uses the SBI PMU extension instead of diff --git a/specification/sbi_cove.adoc b/specification/sbi_cove.adoc index ea5a90e..fdde08e 100644 --- a/specification/sbi_cove.adoc +++ b/specification/sbi_cove.adoc @@ -33,10 +33,11 @@ as allocated in <>. ], config:{lanes: 1, hspace:1024}} .... -Other future specifications (e.g. CoVE-IO) may need to extend one of the three +Other future specifications (e.g., CoVE-IO) may need to extend one of the three CoVE SBI extensions with domain specific functions. In order to support that requirement each one of the CoVE extensions SBI function IDs (`FID`) in the availabe 64K range is split into separate namespaces. +% what 64K above means? 64KB? The main CoVE specification uses FIDs from 0 to 1023 (inclusive), and other specifications can extend the CoVE SBI by reserving a FID range after 1024. 
@@ -55,34 +56,34 @@ Below are the reserved CoVE FID namespaces: |=== === TEEI - COVH runtime interface -ECALL invocation from VS (guest OS) causes traps that are handled by the +ECALL invocation from VS-mode (guest OS) causes traps that are handled by the TSM module (enforced via `medeleg` configuration). The TSM then may provide intrinsics via the COVG (CoVE-Guest ABI) to the TVM to provide attestation and other trusted services. The TSM may allow the TEE (application or VM) to request host (untrusted) services via the COVH (CoVE host-ABI). ==== Operational model for the CoVE Host Extension -Executing confidential workloads in a CoVE requires a sequence of one or more of +Executing confidential workloads in a CoVE-enabled system requires a sequence of one or more of the steps detailed below. These steps are performed by the non-TCB hosting entity like the OS/VMM (host) in conjunction with the TSM. -. Platform TSM detection and capability enumeration -. Conversion of non-confidential memory to confidential memory -. Trusted VM (TVM) creation -. Donating confidential memory to the TSM for TVM page management -. Defining TVM confidential memory regions -. Mapping TVM code and data payload to confidential-memory regions -. Creating TVM VCPUs -. Finalizing TVM creation -. Scheduling TVM execution -. Management of TVM secure interrupts -. Handling and servicing TVM faults and exits -. Mapping TVM demand-zero confidential memory regions -. Mapping TVM non-confidential shared pages on demand -. Processing TVM-access to MMIO regions -. Tearing down TVMs -. Reassignment of confidential memory for other TVMs -. Reclaiming confidential memory for non-confidential VMs +. Platform TSM detection and capability enumeration. +. Conversion of non-confidential memory to confidential memory. +. Trusted VM (TVM) creation. +. Donating confidential memory to the TSM for TVM page management. +. Defining TVM confidential memory regions. +. 
Mapping TVM code and data payload to confidential-memory regions. +. Creating TVM vCPUs. +. Finalizing TVM creation. +. Scheduling TVM execution. +. Management of TVM secure interrupts. +. Handling and servicing TVM faults and exits. +. Mapping TVM demand-zero confidential memory regions. +. Mapping TVM non-confidential shared pages on demand. +. Processing TVM-access to MMIO regions. +. Tearing down TVMs. +. Reassignment of confidential memory for other TVMs. +. Reclaiming confidential memory for non-confidential VMs. ===== Platform TSM detection and capability enumeration Platform support for the TSM can be detected by probing for the EXT_COVE and @@ -96,18 +97,17 @@ process further ECALLs. TVMs are created using the sbi_covh_create_tvm(). This creates a TVM with state set to `TVM_INITIALIZING`. The host must assign confidential memory for page tables, payload mapping, and -VCPUs before it can be -transitioned into a `TVM_RUNNABLE` state. +vCPUs before it can be transitioned into the `TVM_RUNNABLE` state. ===== TVM memory management The host is responsible for the following memory management functions: -. Converting non-confidential memory to confidential memory -. Donating confidential memory for the TVM page-table pool -. Defining confidential memory regions -. Mapping TVM code and data payload to confidential TVM-pages -. Mapping zero-page confidential pages to the TVM regions -. Mapping non-confidential pages TVM-defined regions for shared-pages / MMIO +. Converting non-confidential memory to confidential memory. +. Donating confidential memory for the TVM page-table pool. +. Defining confidential memory regions. +. Mapping TVM code and data payload to confidential TVM-pages. +. Mapping zero-page confidential pages to the TVM regions. +. Mapping non-confidential pages to TVM-defined regions for shared-pages / MMIO. 
===== Converting non-confidential memory to confidential memory Platform memory is non-confidential by default, and must be converted to @@ -158,24 +158,24 @@ The region can be sparsely populated, and since the host cannot directly access confidential memory, it must copy the TVM code and data payload from non-confidential memory to confidential memory by calling `sbi_covh_add_tvm_measured_pages()`. This operation requires the host to convert -a sufficient number of non-confidential pages to confidential (by calling -`sbi_covh_convert_pages()`, or by using converted pages that aren't currently +a sufficient number of non-confidential pages to confidential by calling +`sbi_covh_convert_pages()` or by using converted pages that aren't currently assigned to a TVM. The TSM copies the payload for the TVM from non-confidential -pages to confidential pages, and extends the corresponding measurements for the +pages to confidential pages and extends the corresponding measurements for the TVM. -===== VCPU shared state -Host needs access to some of the TVM CSRS and GPRs to handle TVM exits. For +===== vCPU shared state +Host needs access to some of the TVM CSRs and GPRs to handle TVM exits. For example, the host needs `htval` to determine the fault address, `a0`-`a7` GPRs -are needed to handle forwarded ECALLs and so on. For this purpose, the host and -TSM use NACL Extension based shared memory interface <>, from now on called +to handle forwarded ECALLs and so on. For this purpose, the host and +TSM use the Nested Acceleration (NACL) extension based shared memory interface <>, from now on called NACL shared memory to avoid confusion with shared memory pages between TVM and the host. The NACL shared memory interface is between TSM and the host and TSM is responsible for writing any trap-related CSRs and GPRs needed by the host to handle the exception. TSM is also responsible for reading the returned result -and forwarding it to the TVM. 
Further details about which CSRs and GPRS are used +and forwarding it to the TVM. Further details about which CSRs and GPRs are used by the TSM and the host can be found in <>. The layout of NACL shared memory is shown below as `struct nacl_shmem` and `scratch` space layout for TSM is shown as @@ -235,10 +235,9 @@ are supposed to use from NACL shared memory. It also describes the operation allowed for each entity in terms of `R` (read) and `W` (write) permissions. Note that the TSM and the host can read/write to any of the fields without any faults but the -permissions depict the expected use case. For write only -CSRs or GPRs TSM is supposed to ignore any modifications by the host. TSM is -only supposed to take modifications from CSRs or GPRs -with read permission such as `a0` and `a1` GPRs. +permissions depict the expected use case. For write only accesses to +CSRs or GPRs, TSM is supposed to ignore any modifications by the host. TSM should only take modifications from CSRs or GPRs, e.g., `a0` and `a1` GPRs, +when it has the read permission. [#table_tsm_csr_updates_in_nacl] .TSM NACL CSRs and GPRs @@ -285,19 +284,19 @@ interrupt ticking. [TIP] ==== It's recommended that the TSM should transform the load or store instruction -to/from `a0` before writing to the htinst CSR. +to/from `a0` before writing to the `htinst` CSR. So that `a0` will be the only GPR used for MMIO emulation reducing the GPRs accessible to the host. ==== -===== VCPU creation +===== vCPU creation The host must register CPUs/harts with the TSM before they can be used for TVM execution by calling `sbi_covh_create_tvm_vcpu()`. The NACL shared memory interface is used between the host and the TSM for processing TVM exits from `sbi_covh_run_tvm_vcpu()`. 
===== TVM execution -Following the assignment of memory and VCPU resources, the host can transition +Following the assignment of memory and vCPU resources, the host can transition the guest into a `TVM_RUNNABLE` state by calling `sbi_covh_finalize_tvm()`. The host must set up TVM Boot vCPU execution parameters like the entrypoint (`ENTRY_PC`) and boot argument (`ENTRY_ARG`) using arguments to @@ -346,7 +345,7 @@ decoding the contents of the NACL shared memory region. ===== Management of secure interrupts -The host can use the Tee Interrupt Extension (EXT_COVI) to manage secure TVM +The host can use the TEE Interrupt Extension (EXT_COVI) to manage secure TVM interrupts on platforms with AIA support. @@ -416,7 +415,7 @@ Also the reclamation is of the confidential pages, and the shared memory pages provided by the host may be unique from those pages so that host has the option to service the request on the TVM synchronously or asynchronously. -Both sharing and unsharing operations are destructive, i.e. the contents of +Both sharing and unsharing operations are destructive, i.e., the contents of memory in the range to be converted are lost. [caption="Figure {counter:image}: ", reftext="Figure {image}"] @@ -437,7 +436,7 @@ image::tvm_runtime_execution.svg[] This common extension enumerates capabilities for supervisor domains such as number of active supervisor domains and capabilities of each supervisor domain, -e.g. used for CoVE. +e.g., used for CoVE. [#sbi_supd_get_active_domains] === Function: Enumerate active supervisor domains (FID #0) @@ -447,12 +446,12 @@ struct sbiret sbi_supd_get_active_domains(unsigned long active_domains); ----- -Returns a 64 bit vector with bits set for supervisor domains that are active. +Returns a 64-bit vector with bits set for supervisor domains that are active. Default value is 1 since supervisor domain 0 is always required (the hosting domain). 
For each non-0 position bit set, the SDID with the value of that bit position may be used per the <> convention to invoke functions -supported for that domain e.g. COVH. For active domains, other extensions -may be invoked to get capabilities specific to that domain e.g. the +supported for that domain, e.g., COVH. For active domains, other extensions +may be invoked to get capabilities specific to that domain, e.g., the `sbi_covh_get_tsm_info` must be invoked to get information from a supervisor domain supporting CoVE TSM capabilities. @@ -474,7 +473,7 @@ The following enums are referenced by several functions described below. [source, C] ------------------- enum tsm_page_type { - /* 4KiB */ + /* 4 KiB */ PAGE_4K = 0, /* 2 MiB */ PAGE_2MB = 1, @@ -535,11 +534,11 @@ struct tsm_info { * state in sbi_covh_create_tvm_vcpu(). */ unsigned long tvm_state_pages; - /* The maximum number of VCPUs a TVM can support. */ + /* The maximum number of vCPUs a TVM can support. */ unsigned long tvm_max_vcpus; /* - * The number of 4kB pages which must be donated to the TSM when - * creating a new VCPU. + * The number of 4KB pages which must be donated to the TSM when + * creating a new vCPU. */ unsigned long tvm_vcpu_state_pages; }; @@ -585,7 +584,7 @@ Begins the process of converting `num_pages` of non-confidential memory starting at `base_page_address` to confidential-memory. On success, pages can be assigned to TVMs only following subsequent calls to `sbi_covh_global_fence()` and `sbi_covh_local_fence()` that complete the conversion process. The implied -page size is 4KiB. +page size is 4KB. The `base_page_address` must be page-aligned. @@ -613,7 +612,7 @@ struct sbiret sbi_covh_reclaim_pages(unsigned long base_page_address, ------- Reclaims `num_pages` of confidential memory starting at `base_page_address`. The pages must not be currently assigned to an active TVM. The implied page -size is 4KiB. +size is 4KB. The possible error codes returned in `sbiret.error` are shown below. 
@@ -696,8 +695,8 @@ information about the parameters that should be used to populate ---- struct tvm_create_params { /* - * The base physical address of the 16KiB confidential memory region - * that should be used for the TVM's page directory. Must be 16KiB-aligned. + * The base physical address of the 16KB confidential memory region + * that should be used for the TVM's page directory. Must be 16KB-aligned. */ unsigned long tvm_page_directory_addr; /* @@ -757,17 +756,17 @@ Transitions the TVM specified by `tvm_guest_id` from the `TVM_INITIALIZING` state to a `TVM_RUNNABLE` state. Also, sets the entry point (`ENTRY_PC`) using `entry_sepc` and boot argument (`ENTRY_ARG`) -using `entry_arg` for the boot VCPU. Both `entry_sepc` and `entry_arg` are +using `entry_arg` for the boot vCPU. Both `entry_sepc` and `entry_arg` are included in the measurement -of the TVM. `entry_sepc` is the address in TVM binary to start the boot VCPU +of the TVM. `entry_sepc` is the address in TVM binary to start the boot vCPU from and `entry_arg` is -the address of guest fdt and is passed as an argument to the boot VCPU in `a1` +the address of guest flattened device tree (FDT) and is passed as an argument to the boot vCPU in `a1` GPR. -`tvm_identity_addr` points to a 64 bytes buffer containing a host-defined TVM +`tvm_identity_addr` points to a 64-byte buffer containing a host-defined TVM identity. This piece of data can be used to bind TVMs to a host-defined identity -(e.g. an attestation service public key, a guest configuration file hash, an -attestation policy description, etc). Although this piece of data is included in +(e.g., an attestation service public key, a guest configuration file hash, an +attestation policy description, etc.). Although this piece of data is included in the TVM attestation certificate as a dedicated TVM claim (`tvm-identity`), it is *not* included in the TVM measurements. 
That allows for the host to optionally personalize cryptographically identical @@ -777,8 +776,8 @@ The semantics of this piece of data is defined by the host and can be ignored by both the guest and the attestation services. However, when being used, the TVM identity can be leveraged as follows: -1. The host passes some information to the guest through e.g. some out-of-band -VM orchestration mechanisms. This could be e.g. the hash value for a policy +1. The host passes some information to the guest through, e.g., some out-of-band +VM orchestration mechanisms. This could be, e.g., the hash value for a policy file the guest is expected to apply at runtime. 2. The guest compares the passed host data with the `tvm-identity` attestation certificate claim and can decide to use it or not depending on this local @@ -792,8 +791,7 @@ this verifiable TVM identity. Giving TVMs an identity is optional and the TSM must not include a TVM identity claim in the TVM attestation token when `tvm_identity_addr` is set to 0. When a TVM identity is provided, the `tvm_identity_addr` must be different than -0 -and 64B-aligned. +0 and 64B-aligned. The TSM enforces that a TVM virtual harts cannot be entered unless the TVM measurement is committed @@ -856,7 +854,7 @@ Marks the range of TVM physical address space starting at `tvm_gpa_addr` as reserved for the mapping of confidential memory. The memory region length is specified by `region_len`. -Both `tvm_gpa_addr` and `region_len` must be 4kB-aligned, and the region must +Both `tvm_gpa_addr` and `region_len` must be 4KB-aligned, and the region must not overlap with a previously defined region. This call must not be made after calling `sbi_covh_finalize_tvm()`. @@ -884,7 +882,7 @@ struct sbiret sbi_covh_add_tvm_page_table_pages(unsigned long tvm_guest_id, unsigned long num_pages); ----- Adds `num_pages` confidential memory starting at `base_page_address` to the -TVM's page-table page-pool. The implied page size is 4KiB. 
+TVM's page-table page-pool. The implied page size is 4KB. Page table pages may be added at any time, and a typical use case is in response to a TVM page fault. @@ -1023,14 +1021,14 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_create_tvm_vcpu] -=== Function: COVE Host Create TVM VCPU (FID #13) +=== Function: COVE Host Create TVM vCPU (FID #13) [source, C] ----- struct sbiret sbi_covh_create_tvm_vcpu(unsigned long tvm_guest_id, unsigned long tvm_vcpu_id, unsigned long tvm_state_page_addr); ----- -Adds a VCPU with ID `vcpu_id` to the TVM specified by `tvm_guest_id`. +Adds a vCPU with ID `vcpu_id` to the TVM specified by `tvm_guest_id`. `tvm_state_page_addr` must be page-aligned and point to a confidential memory region used to hold the TVM's vCPU state, and must be `tsm_info::tvm_state_pages` pages in length. This call must not be made after @@ -1039,7 +1037,7 @@ calling `sbi_covh_finalize_tvm()`. The possible error codes returned in `sbiret.error` are shown below. [#table_sbi_covh_create_tvm_vcpu_errors] -.COVE Host Create TVM VCPU Errors +.COVE Host Create TVM vCPU Errors [cols="2,3", width=90%, align="center", options="header"] |=== | Error code | Description @@ -1051,25 +1049,25 @@ The possible error codes returned in `sbiret.error` are shown below. |=== [#sbi_covh_run_tvm_vcpu] -=== Function: COVE Host Run TVM VCPU (FID #14) +=== Function: COVE Host Run TVM vCPU (FID #14) [source, C] ----- struct sbiret sbi_covh_run_tvm_vcpu(unsigned long tvm_guest_id, unsigned long tvm_vcpu_id); ----- -Runs the VCPU specified by `tvm_vcpu_id` in the TVM specified by `tvm_guest_id`. +Runs the vCPU specified by `tvm_vcpu_id` in the TVM specified by `tvm_guest_id`. The `tvm_guest_id` must be in a "runnable" state (requires a prior call to `sbi_covh_finalize_tvm()`). The function does not return unless the TVM exits with a trap that cannot be handled by the TSM. 
-*Returns* SBI_SUCCESS in sbiret.value if the TVM exited with a resumable VCPU +*Returns* SBI_SUCCESS in sbiret.value if the TVM exited with a resumable vCPU interrupt or exception, and non-zero otherwise. In the latter case, attempts to call `sbi_covh_run_tvm_vcpu()` with the same `tvm_vcpu_id` will fail. The possible error codes returned in `sbiret.error` are shown below. [#table_sbi_covh_run_tvm_vcpu_errors] -.COVE Host Run TVM VCPU Errors +.COVE Host Run TVM vCPU Errors [cols="2,3", width=90%, align="center", options="header"] |=== | Error code | Description @@ -1081,7 +1079,7 @@ The possible error codes returned in `sbiret.error` are shown below. | SBI_ERR_FAILED | The operation failed for unknown reasons. |=== -The TSM updates the hosts `scause` CSR. The host should use the `scause` field +The TSM updates the host's `scause` CSR. The host should use the `scause` field to determine whether the exit was caused by an interrupt or exception, and then use the additional information in the NACL shared memory region to determine further course of action (if sbiret.value is 0). @@ -1284,7 +1282,7 @@ struct sbiret sbi_covh_tvm_remove_pages(unsigned long tvm_guest_id, Removes mappings for invalidated pages in the specified range of guest physical address space. The range to be unmapped must already have been invalidated and fenced, and must lie within a removable region of the guest's physical address -space. The TSM zeros out all PTEs within the specified range and returns the +space. The TSM zeros out all page table entries (PTEs) within the specified range and returns the ownership of the pages to the host if previously owned by the TVM. The possible error codes returned in `sbiret.error` are shown below. @@ -1350,7 +1348,7 @@ struct tvm_aia_params { */ uint32_t guest_index_bits; /* - * The number of guest interrupt files to be implemented per VCPU. + * The number of guest interrupt files to be implemented per vCPU. 
* Implementations may reject configurations with guests_per_hart > 0 if * nested IMSIC virtualization is not supported. */ @@ -1381,13 +1379,13 @@ struct sbiret sbi_covi_set_tvm_aia_cpu_imsic_addr(unsigned long tvm_guest_id, unsigned long tvm_vcpu_imsic_gpa); ------- -Sets the guest physical address of the specified VCPU’s virtualized IMSIC to +Sets the guest physical address of the specified vCPU’s virtualized IMSIC to `tvm_vcpu_imsic_gpa`. The `tvm_vcpu_imsic_gpa` must be valid for the AIA -configuration that was set by `sbi_covi_init_tvm_aia()`. No two VCPUs may share +configuration that was set by `sbi_covi_init_tvm_aia()`. No two vCPUs may share the same `tvm_vcpu_imsic_gpa`. This can be called only after `sbi_covi_init_tvm_aia()` and before -`sbi_covh_finalize_tvm()`. All VCPUs in an AIA-enabled TVM must have their +`sbi_covh_finalize_tvm()`. All vCPUs in an AIA-enabled TVM must have their IMSIC configuration set prior to calling `sbi_covh_finalize_tvm()`. The possible error codes returned in `sbiret.error` are shown below. @@ -1603,10 +1601,10 @@ struct sbiret sbi_covi_rebind_aia_imsic_clone(unsigned long tvm_guest_id, unsigned long tvm_vcpu_id); ------- -TSM clones the old guest interrupt file of the specified VCPU. The cloned copy -is maintained in VCPU specific structure visible to TSM only. The host must make +TSM clones the old guest interrupt file of the specified vCPU. The cloned copy +is maintained in vCPU specific structure visible to TSM only. The host must make sure to invoke this from the old physical CPU. The guest interrupt file after -this is free to be reclaimed or bound to another VCPU. +this is free to be reclaimed or bound to another vCPU. The possible error codes returned in `sbiret.error` are shown below. @@ -1664,7 +1662,7 @@ Marks the specified range of TVM physical address space starting at `tvm_gpa_addr` as used for emulated MMIO. Upon return, all accesses by the TVM within the range are trapped and may be emulated by the host. 
-Both `tvm_gpa_addr` and `region_len` must be 4kB-aligned, and the region must +Both `tvm_gpa_addr` and `region_len` must be 4KB-aligned, and the region must not overlap with a previously defined region. This call will result in an exit to the host on success. @@ -1691,7 +1689,7 @@ Removes the specified range of TVM physical address space starting at `tvm_gpa_addr` from the emulated MMIO regions. Upon return, all accesses by the TVM within the range will result in a page fault. -Both `tvm_gpa_addr` and `region_len` must be 4kB-aligned, and the region must +Both `tvm_gpa_addr` and `region_len` must be 4KB-aligned, and the region must not overlap with a previously defined region. This call will result in an exit to the host on success. @@ -1727,7 +1725,7 @@ completed. Attempts to run it with `sbi_covh_run_tvm_vcpu()` will fail. Any guest page faults taken by other TVM vCPUs in the invalidated pages continue to be reported to the host. -Both `tvm_gpa_addr` and `region_len` must be 4kB-aligned. +Both `tvm_gpa_addr` and `region_len` must be 4KB-aligned. The possible error codes returned in sbiret.error are: @@ -1769,7 +1767,7 @@ with `sbi_covh_run_tvm_vcpu()` will fail. Any guest page faults taken by other TVM vCPUs in the invalidated pages continue to be reported to the host. -Both `tvm_gpa_addr` and `region_len` must be 4kB-aligned. +Both `tvm_gpa_addr` and `region_len` must be 4KB-aligned. [#table_sbi_covg_unshare_memory_region_errors] .COVE Guest Unshare Memory Region diff --git a/specification/swlifecycle.adoc b/specification/swlifecycle.adoc index 5a3ddf0..4283438 100644 --- a/specification/swlifecycle.adoc +++ b/specification/swlifecycle.adoc @@ -34,7 +34,7 @@ function. A TVM context may be created and initialized by using the `sbi_covh_create_tvm()` function - this global init function allocates a set of pages for the TVM global control structure and resets the control -fields that are immutable for the lifetime of the TVM e.g. 
configuration of +fields that are immutable for the lifetime of the TVM, e.g., configuration of which RISC-V CPU extensions the TVM is allowed to use, debug and pmon capabilities enabled etc. @@ -71,7 +71,7 @@ The VMM uses `sbi_covh_run_tvm_vcpu()` to (re)activate a virtual hart for a specific TVM (identified by the unique identifier). This TEECALL traps into the TSM-driver which affects the context switch to the TSM - The TSM then manages the activation of the virtual hart on the calling physical hart. During -this activation the TCB trusted firmware can enforce that +this activation the TCB's firmware can enforce that stale TLB entries that govern guest physical to system physical page access have been evicted across all hart TLBs. There may also be TLB flushes for the virtual-harts due to VS-stage translation changes (guest virtual to @@ -83,7 +83,7 @@ to ensure these IPIs are delivered through the IMSIC associated with the guest TVM. Each TVM is allocated a guest interrupt file during TVM initialization. -During TVM execution, the HW enforces TSM-driven policies for memory +During TVM execution, the hardware enforces TSM-driven policies for memory isolation for confidential memory accessed by the TVM software - the following hardware enforcement is recommended to address the threat model described in <>: @@ -125,8 +125,8 @@ extend the runtime measurement registers by invoking the of kernel or application modules that are loaded in the TVM. Also during execution, a remote relying party may challenge the TVM to -provide attestation evidence that the TVM is executing as a HW-rooted TEE. -The TVM code may in response request a TSM-signed (hence HW-measurement +provide attestation evidence that the TVM is executing as a hardware-rooted TEE. 
+The TVM code may in response request a TSM-signed (hence hardware-measurement rooted) attestation evidence via `sbi_covg_get_evidence()` - this evidence structure contains signed hash of the TVM measurements (including the runtime and initial measurements) and is replay-protected via a TVM @@ -176,10 +176,10 @@ memory access-control for memory assigned to the TVMs. These rules are enforced by the TSM and the CPU MMU: . Contents of a TVM page assigned (initially measured or lazy-initialized) -to the TVM is bound to the Guest PA assigned to the TVM during TVM operation. +to the TVM is bound to the Guest physical address (GPA) assigned to the TVM during TVM operation. . A TVM page can only be assigned to a single TVM, and mapped via a single GPA unless aliases are allowed in which case, such aliases must be tracked -by the TSM). Aliases in the virtual address space are under the purview of +by the TSM. Aliases in the virtual address space are under the purview of the TVM OS. . VS-stage address translation - A TVM page mapping must be translated only via VS-stage translation structures which are contained in pages @@ -195,7 +195,7 @@ non-confidential pages that are not assigned to any TVM or the TSM - this is for example for untrusted IO. .. Circular mappings in the G-stage paging structures are disallowed. . Access to shared memory pages must be explicitly signaled by the TVM via -the GPA and enforced for memory access for the TVM by the HW. +the GPA and enforced for memory access for the TVM by the hardware. ==== Information tracked per physical page @@ -208,14 +208,14 @@ Actual page sizes supported are implementation-specified. 
|=== | *Memory Type* | *Confidential or Non-confidential (enforced via MTT)* | Page-Type | Reserved - page that may not be assigned to any TEE entity -If the Memory type is Confidential, the following page types may be used: +If the Memory Type is Confidential, the following page types may be used: * Unassigned - page not assigned to any TEE (TSM or TVM) * TVM - page assigned to a TVM (mapped via HGAT). * TSM - page used by the TSM (for MTT and other control structures) | Page Owner | If the Memory Type is Confidential and Page-Type is TVM, -this value holds the identifier (e.g. PPN) for the TVM control page (4KB TEE- +this value holds the identifier (e.g., PPN) for the TVM control page (4KB TEE- TSM-TVM page); else it is 0. -| Page sub-type | Following types apply If Memory Type is Confidential and +| Page sub-type | Following types apply if Memory Type is Confidential and Page-Type is TVM: * HGATP - pages used for HGATP structures * Data - pages used for TVM content @@ -225,10 +225,11 @@ Following types apply If Memory Type is Confidential and Page-Type is TSM: * VHCS - pages used for TVM VHCS (virtual hart control structures) | Page TLB version | TLB version in which the page mapping was invalidated to allow for VMM memory management. If the page is Unassigned, the TLB version is -per the global TLB mgmt. If the page is assigned to a TVM, it is versioned per -the TVM-local TLB mgmt. -| Additional meta-data | Locking state e.g. +per the global TLB management. If the page is assigned to a TVM, it is versioned per +the TVM-local TLB management. +| Additional meta-data | Locking state |=== +% HGAT above what does it stand for, hypervisor guest address translation? should it be HGATP, or HGATP should be HGAT? ==== Page walk and Translation caching considerations @@ -250,7 +251,7 @@ is transferred between TEE and non-TEE domains via sbi_covh_convert_pages. 
Post measured boot, the system memory map must be available to the TSM on load (accessed as part of initialization of the TSM). This memory map structure may -be placed in the memory that is accessible only to the HW and SW TCB. VMM-chosen +be placed in the memory that is accessible only to the hardware and software TCB. VMM-chosen memory regions must be a strict subset of this set of memory regions. Memory regions used for the TSM are marked as reserved by the TSM-driver in this memory map - the TSM uses its memory space to host an Extended MTT (EMTT). @@ -278,7 +279,7 @@ page_tlb_version. Page conversion involves the following steps by the TSM: * Verify page(s) donated by the VMM is/are Non-Confidential page(s) * Initiates a new TLB version tracking cycle via `sbi_covh_convert_pages()` - invalidates MTT entries (synchronized) for the requested page(s) and size as -pages being converted to confidential (i.e. "in transition") +pages being converted to confidential (i.e., "in transition") * TSM enforces a TLB versioning scheme (described below) and using that enforces that the VMM performs the invalidation of the hart TLBs (via IPIs) to remove any cached mappings - VMM performs a local fence operation on each hart @@ -288,7 +289,7 @@ harts for the batch of pages selected for conversion, and marks those mappings as usable as confidential memory. * At this point non-TCB/hosting supervisor domain software cannot create new TLB entries to donated pages - since host software accesses to confidential -memory pages will fault (including implicit accesses) +memory pages will fault (including implicit accesses). ==== Global and per-TVM TLB management @@ -303,15 +304,15 @@ TLB version. A similar TLB version is managed associated with the physical address in the EMTT. If the VMM initiates memory conversion to confidential, or any change to an -assigned confidential and present GPA mapping for a TVM (e.g. remove, relocate, -promote etc.) 
- then it must execute the following sequence (enforced by TSM) to +assigned confidential and present GPA mapping for a TVM, e.g., remove, relocate, +promote etc., then it must execute the following sequence (enforced by TSM) to affect that change: * Invalidate the mapping it wants to modify (page or range of pages). This step -prevents new cached mappings from being populated in the TLB +prevents new cached mappings from being populated in the TLB. * In the PA metadata maintained by the TSM (EMTT), captures into the per-page metadata, the TLB version at which the conversion was initiated or the mapping -was invalidated +was invalidated. * Initiate global or per-TVM fence/increment the TLB version for the platform or the TVM (this operation needs to be performed only on any one hart). * Issue an IPI to each hart (for global operations like conversion), or the TVM @@ -325,13 +326,13 @@ invalidated and updated to the new TLB version - the TVM exit is reported to the VMM. * Migration of a virtual-hart to a different hart is checked by the TSM to compare the TVM TLB version with the hart TLB version and is fenced by the TSM -during vcpu run. +during the vcpu run. * -----No active/usable translations for converted memory or for TVM G-stage mappings exist at this point ----- -* Invoke the specific mapping change operation (remove, relocate, promote, -migrate etc.) +* Invoke the specific mapping change operation, such as remove, relocate, promote, +migrate etc. * Checks that the affected mapping(s) are invalidated in the MTT and/or g-stage -mapping and validate the mapping +mapping and validate the mapping. * Subsequent page walks may create cached mappings from this point onwards. ==== Page Mapping Page Assignment @@ -343,17 +344,17 @@ page(s) to be used for the hgatp structure entries *Page Mapping Assignment Operation*: -* Verify that the TVM has been created successfully +* Verify that the TVM has been created successfully. 
* Verify that the PPN(s) for the new page(s) to be used for TVM hgatp is/are -Unassigned-Confidential per the MTT +Unassigned-Confidential per the MTT. * For the GPA to be mapped, perform a TVM-hgatp walk to locate the non-leaf entry that should refer to the new page being added (to hold the next level of the mapping for the GPA). If the mapping already exists, the operation is aborted. -* Initialize the new hgatp page to zero (no hgatp page table entries are valid) +* Initialize the new hgatp page to zero (no hgatp page table entries are valid). * Update the parent hgatp entry to refer to the new hgatp page (mark non-lead -as valid) -* Update the hgatp page EMTT entry with the TVM owner-id and page-type +as valid). +* Update the hgatp page EMTT entry with the TVM owner-id and page-type. ==== Measured page assignment into a TVM memory map @@ -378,10 +379,10 @@ and page size to be used for the guest mapping to be added. *Page Assignment operation*: -* Verify that the TVM has been created successfully +* Verify that the TVM has been created successfully. * If the source page is provided, this operation can only be performed if the TVM measurement has not been finalized. -* Verify that the PFN for the new page to be used for TVM is free in the MTT +* Verify that the PFN for the new page to be used for TVM is free in the MTT. * For the GPA to be mapped, perform a TVM-hgatp walk to locate the leaf entry that should refer to the new page being added. If the mapping does not exist OR exists but is not in the unmapped state, the operation is aborted. @@ -391,7 +392,7 @@ initialization of memory will be performed by the TSM in the context of the confidential supervisor domain and via the TSMs paging structure of the PA assigned to the TVM - hence the memory will be treated as confidential. * The measurement of the TVM is extended with the GPA used to map to the page. 
-* Update the TVM page MTT entry with the TVM owner PPN and page type as TEE-TVM +* Update the TVM page MTT entry with the TVM owner PPN and page type as TEE-TVM. * Update the leaf hgatp page table entry to refer to the new page (mark leaf as valid) to allow TLB mappings to be created when the TVM vcpu is executing subsequently. @@ -439,11 +440,11 @@ The AIA supports two mechanisms for tracking of interrupts at VS-level: IMSIC guest interrupt files, of which there are a fixed number per physical hart. These allow delivery of external interrupts directly to VS-level as a Virtual -Supervisor External Interrupt. Guest interrupt files occupy a single 4kB page +Supervisor External Interrupt. Guest interrupt files occupy a single 4KB page of physical address space. Memory-resident interrupt files (MRIFs), which track pending and enabled -interrupts in a 4kB page of DRAM. While the RISC-V IOMMU supports automatically +interrupts in a 4KB page of DRAM. While the RISC-V IOMMU supports automatically updating an MRIF's pending bits and delivering a notice interrupt to the host when an MSI is targeted at an MRIF, the hypervisor is still responsible for injection of the VSIE to the guest. IPI emulation must be provided by the @@ -487,11 +488,11 @@ Initializes the AIA state for a virtual hart. Must be called after the virtual hart has been added but before the TVM is run for the first time. The OS/VMM supplies: -The guest physical address of the IMSIC for the virtual hart -The supervisor physical address of a page of confidential memory that is to be +(1) The guest physical address of the IMSIC for the virtual hart. +(2) The supervisor physical address of a page of confidential memory that is to be used as an MRIF for the virtual hart. The page is available to be reclaimed upon destruction of the virtual hart. 
-An MSI address + data pair that is to be signaled when an MSI is delivered to +(3) An MSI address + data pair that is to be signaled when an MSI is delivered to a virtual hart's MRIF. *tvm_vhart_imsic_bind* @@ -500,10 +501,10 @@ Binds a virtual hart to a guest interrupt file on the current physical hart. The guest interrupt file number is supplied by the OS/VMM. The TSM is then responsible for: -Converting the guest interrupt file page to confidential memory. -Updating IOMMU MSI page tables with the address of the interrupt file. -Migrating MRIF state (if any) to the guest interrupt file. -Mapping the guest interrupt file at the previously-specified address in the +(1) Converting the guest interrupt file page to confidential memory. +(2) Updating IOMMU MSI page tables with the address of the interrupt file. +(3) Migrating MRIF state (if any) to the guest interrupt file. +(4) Mapping the guest interrupt file at the previously-specified address in the TVM's guest physical address space. Upon success the virtual hart is considered "bound" to the current physical @@ -526,11 +527,11 @@ the guest interrupt file in the TVM's guest physical address space using the invalidate + fence procedure described in <>. The TSM is then responsible for: -Verifying that TLB invalidation of the guest interrupt file is complete. -Updating IOMMU MSI page tables. -Copying interrupt state from the guest interrupt file to the virtual hart's +(1) Verifying that TLB invalidation of the guest interrupt file is complete. +(2) Updating IOMMU MSI page tables. +(3) Copying interrupt state from the guest interrupt file to the virtual hart's MRIF. -Converting the guest interrupt file back to a non-confidential state. +(4) Converting the guest interrupt file back to a non-confidential state. Upon success the virtual hart is considered "unbound" and the guest interrupt file it was using is available for OS/VMM use. @@ -538,7 +539,7 @@ file it was using is available for OS/VMM use. 
While a TVM virtual hart is unbound, MSIs directed at the virtual hart shall trigger the notice interrupt registered in tvm_vhart_aia_init. Attempts by other TVM virtual harts to write the virtual hart's IMSIC in the guest physical -address space (e.g. for the purposes of generating an IPI) shall generate a +address space (e.g., for the purposes of generating an IPI) shall generate a guest page fault exit on the virtual hart which initiated the write. *tvm_vhart_imsic_rebind* @@ -621,12 +622,12 @@ a VMM-available page to grant to a non-confidential VM. *Reclaim TSM operation*: * Verifies that the PAs referenced are either Non-confidential (No-operation) or -Confidential-Unassigned state -* TSM takes exclusive lock over the MTT tracker entry for the PA -* TSM scrubs page contents +Confidential-Unassigned state. +* TSM takes exclusive lock over the MTT tracker entry for the PA. +* TSM scrubs page contents. * TSM updates MTT tracker entry (synchronized) for the page as Non-confidential -and returns the PA as an Non-Conf page to the VMM -* VMM translations to the PA (via 1st or G stage mappings) may be created now +and returns the PA as a Non-Conf page to the VMM. +* VMM translations to the PA (via 1st or G stage mappings) may be created now. === RAS interaction diff --git a/specification/threatmodel.adoc b/specification/threatmodel.adoc index 541f49a..e83d389 100644 --- a/specification/threatmodel.adoc +++ b/specification/threatmodel.adoc @@ -1,40 +1,40 @@ [[threatmodel]] === Adversary Model -_Unprivileged Software adversary -_ This includes software executing in +_Unprivileged Software Adversary_ - This includes software executing in U-mode managed by S/HS/M-mode system software. This adversary can access U-mode CSRs, process/task memory, CPU registers in the process context -managed by system software.
With user space I/O an Unprivileged Software +Adversary may also have ability to submit requests to I/O devices made available by system software for U-mode access. -_System Software adversary_ - This includes system software executing in +_System Software Adversary_ - This includes system software executing in S/HS/VS modes. Such an adversary can access S/HS/VS privileged CSRs, assigned system memory, CPU registers, IOMMU(s) and IO devices. -_Startup Software adversary_ - This includes system software executing in +_Startup Software Adversary_ - This includes system software executing in early/boot phases of the system (in M-mode), including BIOS, memory -configuration code, device option ROM/firmware that can access system +configuration code, device option read-only memory (ROM) / firmware that can access system memory, CPU registers, IOMMU(s), IO devices and platform configuration -registers (e.g., address range decoders, SoC fabric configuration, etc.). +registers (e.g., address range decoders, system-on-chip (SoC) fabric configuration, etc.). -_Non-invasive Hardware adversary_ - This includes adversaries that can use +_Non-invasive Hardware Adversary_ - This includes adversaries that can use non-invasive (requiring no physical change to the target hardware) attacks such as bus interposers to snoop on memory and/or device interfaces, voltage and/or clock glitching, observe electromagnetic and other radiation, analyze power usage through instrumentation/tapping of power rails, etc. which may then give the adversary the ability to tamper with data in use. 
-_Invasive Hardware adversary_ - This includes adversaries that can use +_Invasive Hardware Adversary_ - This includes adversaries that can use invasive hardware attacks, with unlimited physical access to the devices, -and use mechanisms to tamper-with/reverse-engineer the hardware TCB e.g., +and use mechanisms to tamper-with/reverse-engineer the hardware TCB, e.g., extract keys from hardware, using capabilities such as scanning electron microscopes, fib attacks etc. _Side/Covert Channel Adversary_ - This includes adversaries that may leverage any explicit/implicit shared state (architectural or micro-architectural) to leak information across privilege boundaries via -inference of characteristics from the shared resources (e.g. caches, branch +inference of characteristics from the shared resources (e.g., caches, branch prediction state, internal micro-architectural buffers, queues). Some attacks may require use of high-precision timers to leak information. A combination of system software and hardware adversarial approaches may be @@ -43,52 +43,52 @@ utilized by this adversary. === Threat Model T1: Loss of confidentiality of TVMs and TSM confidential memory via in-scope -adversaries that may read TSM/TVM confidential memory via CPU +adversaries that may read TSM/TVM confidential memory via CPU. T2: Tamper/content-injection to TVM and TSM memory from in-scope -adversaries that may modify TSM/TVM memory via CPU side accesses +adversaries that may modify TSM/TVM memory via CPU side accesses. T3: Tamper of TVM/TSM memory from in-scope adversaries via software-induced -row-hammer attacks on memory +row-hammer attacks on memory. T4: Malicious injection of content into TSM/TVM execution context using -physical memory aliasing attacks via system firmware adversary +physical memory aliasing attacks via system firmware adversary. 
-T5: Information leakage of workload data via CPU registers, CSRs via -in-scope adversaries +T5: Information leakage of workload data via CPU registers, control and status registers (CSRs) via +in-scope adversaries. T6: Incorrect execution of workload via runtime modification of CPU -registers, CSRs, mode switches via in-scope adversaries +registers, CSRs, mode switches via in-scope adversaries. T7: Invalid code execution or data injection/replacement via G-stage -paging remap attacks via system software adversary +paging remap attacks via system software adversary. T8: Malicious asynchronous interrupt injection or dropped leading to -information leakage or incorrect execution of the TEE +information leakage or incorrect execution of the TEE. T9: Malicious manipulation of time read from the virtualized time CSRs -causing invalid execution of TVM workload +causing invalid execution of TVM workload. T10: Loss of Confidentiality via DMA access from devices under adversary -control e.g. via manipulation of IOMMU programming +control, e.g., via manipulation of IOMMU programming. T11: Loss of Confidentiality from devices assigned to a TVM. Devices bound to a TVM must enforce similar properties as the TEE hosted on the platform. T12: Content injection, exfiltration or replay (within and across TEE memory) via hardware approaches, including via exposed interface/links to -other CPU sockets, memory and/or devices assigned to a TVM +other CPU sockets, memory and/or devices assigned to a TVM. T13: Downgrading TEE TCB elements (example TSM-driver, TSM) to older versions or loading Invalid TEE TCB elements on the platform to enable -confidentiality, integrity attacks +confidentiality, integrity attacks. T14: Leveraging transient execution side-channel attacks in TSM-driver, TSM, TVM, host OS/VMM or non-confidential workloads to leak confidential -data e.g. via shared caches, branch predictor poisoning, page-faults.
+data, e.g., via shared caches, branch predictor poisoning, page-faults. T15: Leveraging architectural side-channel attacks due to shared cache and -other shared resources e.g. via prime/probe, flush/reload approaches +other shared resources, e.g., via prime/probe, flush/reload approaches. T16: Malicious access to ciphertext with known plaintext to launch a dictionary attack on TCB components to extract confidential data. @@ -100,13 +100,13 @@ T18: Forging of attestation evidence and sealed data associated with a TVM. T19: Stale TLB translations (for U/HS mode or for VU/VS) created during TSM or TVM operations are used to execute non-TCB code in the TVM (or consume -stale/invalid data) +stale/invalid data). T20: Isolation of performance monitoring and/or debug state for a TVM leading to information loss via performance monitoring events/counters and debug mode accessible information. -T21: A TVM causes a denial of service on the platform +T21: A TVM causes a denial of service on the platform. [NOTE] ==== @@ -119,14 +119,14 @@ Security Model specification <>) on a regular basis as attacks evolve. This specification describes the threats that a system implementing CoVE should address, however, it does not prescribe the scope of mitigations; instead it focusses on mitigations enabled via the COVH/G interface and the use -of the RISC-V ISA (and extensions such as Smmtt). This specification also +of the RISC-V ISA and its extensions, such as Smmtt. This specification also provides recommendations that implementations of this reference CoVE architecture must address per their chosen scope of adversaries from the list of adversaries discussed above, and what performance/security trade-offs they accept. For threats from any adversaries, implementations may choose to mitigate threats using additional platform capabilities as needed. For all scenarios though, denial of service by TVMs must be prevented. At the same time, denial of -service by non-TCB software (e.g. 
in a hosting supervisor domain) is considered +service by non-TCB software (e.g., in a hosting supervisor domain) is considered out of scope. [[design_survey]] @@ -168,12 +168,12 @@ Supervisor Domains | Memory Confidentiality | Number of encryption keys | Implementation-specific | cryptography | Number of TEE keys supported | Security Model -| Memory Integrity | Memory integrity against SW attacks | Required | MMU, xPMP, -MTT | Prevent SW attacks such as remapping aliasing replay corruption etc. | +| Memory Integrity | Memory integrity against software attacks | Required | MMU, xPMP, +MTT | Prevent software attacks such as remapping aliasing replay corruption etc. | CoVE ABI -| Memory Integrity | Memory integrity against HW attacks | Implementation -specific | cryptography and/or MMU, xPMP, MTT extension | Prevent HW attacks +| Memory Integrity | Memory integrity against hardware attacks | Implementation +specific | cryptography and/or MMU, xPMP, MTT extension | Prevent hardware attacks DRAM-bus attacks and physical attacks that replace TEE memory with tampered / old data | Security Model @@ -194,7 +194,7 @@ specific | cryptography and/or MMU, xPMP, MTT | Ability to securely share memory with another TEE | Supervisor Domains | I/O Protection | DMA protection from non-TCB-admitted devices | Required | DMA -access-control e.g. 
IOPMP, IOMTT, IOMMU | Prevent non-TCB peripheral devices +access-control, e.g., IOPMP, IOMTT, IOMMU | Prevent non-TCB peripheral devices from accessing TEE memory | See CoVE-IO <>, IOMMU, Supervisor Domains (IOMTT) @@ -232,34 +232,34 @@ for CoVE | Prevent non-TCB hosting components from denying service to a TVM | Not in scope | Side Channel | Address mapping caches (controlled side channel) | Required -| Supervisor domain Id, MMU, xPMP, MTT | HW/SW TCB should use +| Supervisor domain Id, MMU, xPMP, MTT | hardware/software TCB should use tagging/ partitioning/ flushing techniques to address those types of side channels due to temporal/spatial shared resources | Supervisor Domains, Security Model | Side Channel | Transient-execution attack (TEA) side channels | Implementation-specific | * Bounds check bypass TEA and variants - should be -addressed by TVM software using apropos synchronization. TCB SW should use +addressed by TVM software using apropos synchronization. Software TCB should use synchronization to isolate TCB code from non-TCB code. -* Branch target injection TEA and variants - should be addressed by TCB SW via +* Branch target injection TEA and variants - should be addressed by software TCB via flushing across privilege boundaries to remove untrusted state injected by non-TCB software -* Speculative store bypass TEA and variants - should be addressed by TCB HW +* Speculative store bypass TEA and variants - should be addressed by TCB hardware via synchronization/barriers to prevent speculative execution of memory reads which may allow unauthorized disclosure of information. | Implementations should mitigate attacks such as these spectre variants (In practice, it is difficult to defend against such attacks in advance) | -Supervisor Domain Id, Addtl. 
Recommendations in Security Model +Supervisor Domain Id, additional recommendations in Security Model | Side Channel | Control channels, single-step/zero-step attacks | Required | -leverage HW/SW TCB mechanisms to enforce restrictions on single-stepping +leverage hardware/software TCB mechanisms to enforce restrictions on single-stepping or zero-stepping via use of state flushing/barriers, entropy defenses and detection mechanisms. | Prevent interrupt/exception injection (combined with cache side channel to leak sensitive data) | Security Model | Side Channel | Architectural cache side channel | Implementation-specific | cache partitioning-based defenses | Prevent shared resource contention, -e.g. attacks such as prime probe | Security Model +e.g., attacks such as prime probe | Security Model | Side Channel | Architectural timing side channel | Implementation-specific | data independent execution latency (DIEL) operations, uArch state flushing | @@ -269,7 +269,7 @@ Leveraging data dependency timing channels | Security Model | Required | RoT unique trust chain for TEE TCB | Enforcing initial firmware authorization and versioning | CoVE ABI, Security Model -| Attestation | Remote attestation | Required | HW-RoT-rooted PKI (trust +| Attestation | Remote attestation | Required | hardware RoT-rooted PKI (trust assertions) via Internet | Prevent fake hardware and software TCB; Prevent non-TCB hardware debugging in production. | CoVE ABI, Security Model @@ -285,7 +285,7 @@ Verification of attestation by TCB | Future CoVE ABI, Security Model | Attestation | TCB versioning (and updates) | Required | Mutable firmware where TVM has to opt-in at startup if TCB updates are allowed while the TVM is - executing - HW TCB then enforces lower TCB elements are updatable + executing - hardware TCB then enforces lower TCB elements are updatable (with apropos controls like security version enforced) to enforce the opt-in policy. 
| Allow TCB updates - Prevent TCB rollback | CoVE ABI, Security Model @@ -303,7 +303,7 @@ chain | CoVE ABI, Security Model | Attestation | TCB transparency (and auditability) | Implementation-specific | Mutable firmware | TCB elements reviewable | CoVE ABI, Security Model -| Attestation | Sealing | Implementation-specific | HW Rot sealing keys per TVM +| Attestation | Sealing | Implementation-specific | Hardware RoT sealing keys per TVM | Binding of secrets to TEEs | CoVE ABI, Security Model | Operational Features | TVM Migration | Implementation-specific | Secure From 48cbc65c931719f57956f26d8406cee71329d40a Mon Sep 17 00:00:00 2001 From: Wojciech Ozga Date: Fri, 15 Mar 2024 15:05:07 +0100 Subject: [PATCH 2/7] Update specification/glossary.adoc Co-authored-by: Ravi Sahita Signed-off-by: Wojciech Ozga --- specification/glossary.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/glossary.adoc b/specification/glossary.adoc index e84c322..fe74d8a 100644 --- a/specification/glossary.adoc +++ b/specification/glossary.adoc @@ -24,7 +24,7 @@ measured by the previous TCB layer. The CDI is a secret that may be certified to use for attestation protocols. | Confidential Computing | A computing paradigm that protects data in use by performing -computation in a hardware-based TEE. +computation in a hardware-based, attested TEE. 
| CoVE | Confidential VM extension (CoVE) is the set of non-ISA RISC-V ABI extensions defined in this specification that enables confidential computing on RISC-V From 486397b96d7cc816fb9215d96f71641ae3adfe27 Mon Sep 17 00:00:00 2001 From: Ravi Sahita Date: Tue, 19 Mar 2024 10:27:03 -0700 Subject: [PATCH 3/7] Apply suggestions from spec review Signed-off-by: Ravi Sahita --- specification/intro.adoc | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/specification/intro.adoc b/specification/intro.adoc index 4b27c9d..dadd333 100644 --- a/specification/intro.adoc +++ b/specification/intro.adoc @@ -6,9 +6,10 @@ This document describes the Confidential VM Extension (CoVE) interface for a scalable Trusted Execution Environment (TEE) for hardware virtual-machine-based workloads on RISC-V-based platforms. This CoVE interface specification enables application workloads that require confidentiality to reduce the Trusted -Computing Base (TCB) to a minimal TCB, specifically, keeping the host OS/VMM -and other software outside the TCB. -% Do we want to talk here about IO devices as well? +Computing Base (TCB) to a minimal TCB, specifically, keeping the host OS/VMM, +devices and other software outside the TCB. Admitting devices into the TCB of CoVE +TEE VMs is outside the scope of this specification and is described in the CoVE-IO +specification. The proposed specification supports an architecture that can be used for Application and Virtual Machine workloads, while minimizing changes to the RISC-V ISA and privilege modes. 
From 7546dbdcfd973ba9a181ec239ebb20c10ce445a1 Mon Sep 17 00:00:00 2001 From: Ravi Sahita Date: Tue, 19 Mar 2024 10:30:08 -0700 Subject: [PATCH 4/7] Apply suggestions from spec review Signed-off-by: Ravi Sahita --- specification/intro.adoc | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/specification/intro.adoc b/specification/intro.adoc index dadd333..5221821 100644 --- a/specification/intro.adoc +++ b/specification/intro.adoc @@ -12,5 +12,4 @@ TEE VMs is outside the scope of this specification and is described in the CoVE- specification. The proposed specification supports an architecture that can be used for Application and Virtual Machine workloads, -while minimizing changes to the RISC-V ISA and privilege modes. -% What is the meaning of "Application" here? When I read "Application and Virtual Machine" I think of a "process-based" and "VM-based" TEEs, i.e., SGX and TDX like. But this contradicts the initial sentence in this paragraph that says that CoVE provides TEE for VM-based workloads. \ No newline at end of file +while minimizing changes to the RISC-V ISA and privilege modes. \ No newline at end of file From 9678eaef45cce9a1daf8bddcb337afdab8097f1e Mon Sep 17 00:00:00 2001 From: Ravi Sahita Date: Tue, 19 Mar 2024 10:34:26 -0700 Subject: [PATCH 5/7] Apply suggestions from spec review Signed-off-by: Ravi Sahita --- specification/swlifecycle.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/swlifecycle.adoc b/specification/swlifecycle.adoc index 4283438..bae2b5b 100644 --- a/specification/swlifecycle.adoc +++ b/specification/swlifecycle.adoc @@ -210,7 +210,7 @@ Actual page sizes supported are implementation-specified. | Page-Type | Reserved - page that may not be assigned to any TEE entity If the Memory Type is Confidential, the following page types may be used: * Unassigned - page not assigned to any TEE (TSM or TVM) -* TVM - page assigned to a TVM (mapped via HGAT). 
+* TVM - page assigned to a TVM (mapped via G-stage page table). * TSM - page used by the TSM (for MTT and other control structures) | Page Owner | If the Memory Type is Confidential and Page-Type is TVM, this value holds the identifier (e.g., PPN) for the TVM control page (4KB TEE- From 741c42d3efda39d790c28bba1a11116db0ee773b Mon Sep 17 00:00:00 2001 From: Ravi Sahita Date: Tue, 19 Mar 2024 11:26:40 -0700 Subject: [PATCH 6/7] Apply suggestions from code review Signed-off-by: Ravi Sahita --- specification/refarch.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specification/refarch.adoc b/specification/refarch.adoc index dda2d38..623e1fc 100644 --- a/specification/refarch.adoc +++ b/specification/refarch.adoc @@ -431,7 +431,7 @@ TEE and TVM address spaces are identified by supervisor domain identifiers address translation caches, e.g., Hart TLB lookup may be extended with the SDID in addition to the ASID, VMID for workloads in the Confidential supervisor domain. TVM memory isolation must support sparse memory management -models and architectural page-sizes of 4KB, 64K, 2MB, 1GB (and optionally +models and architectural page-sizes of 4KB, 64KB (with Svnapot), 2MB, 1GB (and optionally 512GB). % Should 64K be 64KB? Is there RISC-V MMU spec for 64KB pages? The hardware may implement the MTT as specified in the Smmtt From 2d08786d302cbe7ccd5895531645afa54819d06c Mon Sep 17 00:00:00 2001 From: Ravi Sahita Date: Tue, 19 Mar 2024 11:27:27 -0700 Subject: [PATCH 7/7] Apply suggestions from spec review Signed-off-by: Ravi Sahita --- specification/refarch.adoc | 1 - 1 file changed, 1 deletion(-) diff --git a/specification/refarch.adoc b/specification/refarch.adoc index 623e1fc..151dd22 100644 --- a/specification/refarch.adoc +++ b/specification/refarch.adoc @@ -433,7 +433,6 @@ SDID in addition to the ASID, VMID for workloads in the Confidential supervisor domain. 
TVM memory isolation must support sparse memory management models and architectural page-sizes of 4KB, 64KB (with Svnapot), 2MB, 1GB (and optionally 512GB). -% Should 64K be 64KB? Is there RISC-V MMU spec for 64KB pages? The hardware may implement the MTT as specified in the Smmtt privileged ISA extension, or other approaches may be used such as a flat table. The memory tracking table may be enforced at the memory controller,