SGX Explained

Notes on Intel SGX Explained

This long paper by Victor Costan and Srinivas Devadas of MIT describes much of the hardware of Intel’s x86 that is relevant to security. It also describes several attack categories with extensive references.

Typo: “equal the most recent value written”/“equal the most value written”
Translation IaaS: “Infrastructure as a Service”
The flow diagram in Figure 6 suggests that when an instruction stores on itself the CPU executes the new instruction. That would surprise me but who knows. I know that the hardware fetches instructions ahead and notices mods to prefetched instructions, at least by the same virtual address.
Page 8: “s 0xFFFFF000 - 0xFFFFFFFF (the 64 KB of memory right below the 4 GB mark)” The address range describes 2¹² = 4K bytes, not 64KB.
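
The arithmetic can be checked directly. This is only a sanity check of the quoted range; the variable names are mine, not the paper's.

```python
start = 0xFFFFF000
end = 0xFFFFFFFF                 # inclusive upper bound, as quoted

size = end - start + 1
print(size, size == 2**12)       # 4096 True
print(size // 1024, "KB")        # 4 KB

# A 64 KB region ending at the 4 GB mark would have to start here instead:
start_64k = 2**32 - 64 * 1024
print(hex(start_64k))            # 0xffff0000
```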

“For example, applications can read the time-stamp counter via the RDTSC and RDTSCP instructions, which are very useful for benchmarking and optimizing software.” Well, that depends on the “time stamp disable” bit, CR4.TSD (Section 2.5 of System Architecture Overview).
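
A minimal sketch of the rule, as a model rather than anything that touches real hardware: TSD is bit 2 of CR4, and when it is set RDTSC raises a general-protection fault unless the current privilege level is 0. The function name `rdtsc_allowed` is hypothetical.

```python
CR4_TSD = 1 << 2  # Time Stamp Disable: bit 2 of CR4

def rdtsc_allowed(cr4: int, cpl: int) -> bool:
    """Return True if RDTSC would execute, False if it would fault (#GP).

    cr4 is the value of control register CR4; cpl is the current
    privilege level (0 = kernel, 3 = ordinary application).
    """
    if cr4 & CR4_TSD:
        return cpl == 0      # with TSD set, only ring 0 may read the counter
    return True              # with TSD clear, any ring may read it

print(rdtsc_allowed(cr4=0, cpl=3))         # True
print(rdtsc_allowed(cr4=CR4_TSD, cpl=3))   # False
print(rdtsc_allowed(cr4=CR4_TSD, cpl=0))   # True
```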

Page 11 suggests that hardware page table consultation in virtual mode does the naïve thing of consulting the map for the guest as it consults the map built by the guest for the guest’s guest. That might well be an inferior solution to how IBM did virtual-virtual with a kernel built map that was the composition of the two maps which is described in detail here.
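
To make "the composition of the two maps" concrete, here is a toy model. It assumes single-level maps represented as dictionaries, which is far simpler than either the x86 or the IBM tables, and the names `compose`, `inner_map` and `outer_map` are mine. The kernel builds one composed map ahead of time, so a reference needs a single lookup instead of a walk through both maps.

```python
def compose(inner_map, outer_map):
    """Compose two page maps into one.

    inner_map: built by the guest for its own guest
               (innermost virtual page -> guest "real" page)
    outer_map: built by the kernel for the guest
               (guest "real" page -> actual frame)
    Pages that either map leaves unmapped are omitted, so touching
    them would fault to the kernel.
    """
    composed = {}
    for virt_page, guest_page in inner_map.items():
        if guest_page in outer_map:
            composed[virt_page] = outer_map[guest_page]
    return composed

inner_map = {0: 5, 1: 7, 2: 9}
outer_map = {5: 40, 7: 41}          # guest page 9 is not resident

composed = compose(inner_map, outer_map)
print(composed)                     # {0: 40, 1: 41}
```

The cost of this design is that the kernel must rebuild or invalidate the composed map whenever either underlying map changes.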

Page 11: “from application software running above ring 0”/“from application software running at ring 3”

“which would make it impossible for the verifier to reason about the security properties of the software inside the container.”/“which would make it impossible to reason about the security properties of the software inside the container.”

You speak somewhere of architectures that were designed to be virtualizable. I believe that when the IBM 360/67 was designed, as a mod to the standard IBM 360, the concept of virtualization had not been invented. Yet the 67 was fully virtualizable. Part of the reason was that the designers saw no reason to let the user mode program learn the content of privileged registers, despite the fact that there was no harm in doing so. ‘Harm’ here meant damage to system security rather than ruining an illusion. Letting user mode code read real privileged registers that locate tables in memory prevents presenting a virtual address to a guest OS designed to run in privileged mode.

Mixing user mode architected state, such as ‘Sign Flag’ with system state ‘IOPL’ or privileged mode in one register EFLAGS, would have seemed unseemly to the 360 architects. I don’t know if they could have explained why. It led Intel to recover both values from the stack in the IRET command. This presumed that the privileged code shared the stack with the user which was always a fatal plan in case of hostile user code that could put an invalid address in the stack pointer and then immediately cause a fault.

Popek & Goldberg describe what it is about an ISA that makes it natively virtualizable. 360/67 had those properties as well as several other architectures that I have seen. Before VM became popular Wang built a machine that extended the 370 and allowed these ‘safe’ operations and many other ‘convenient instructions’. They could not produce a VM for their variant.

IBM added several privileged commands to the 370 to optimize the VM. Then the question arose: “Did the virtual machine have these features too?” I have not seen a list of the features that the various virtual x86’s have. Nor, for that matter, what various features the different VM supervisors require or exploit.

P 16: See my PCIe misgivings here.

The ME is part of Intel’s Active Management Technology (AMT), which is marketed as a convenient way for IT administrators to troubleshoot and fix situations such as failing hardware, or a corrupted OS installation, without having to gain physical access to the impacted computer. This reminds me of an IBM plan for ACS.

The good news is that the PCH, which includes the ME, is an optional part of today’s x86: a separate chip. The bad news is that just about all x86 systems include it as of about 2015.

“SPI”?

“The ME accesses the flash memory chip via an embedded SPI controller.”/“The ME accesses the flash memory chip an embedded SPI controller.”??

P 33: “The random data is produced by a cryptographically strong pseudo-random number generator (CSPRNG) that expands a small amount of random seed data into a much larger amount of data,”
Well, not too ‘small’. Unless there is a substantial work factor in the CSPRNG the seed must be unguessable. The CSPRNG expansion works especially well for generating private RSA keys.
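
A toy illustration of why the seed must be unguessable. The expansion below is SHA-256 in counter mode, which I chose only for brevity; it is not the construction the paper describes and not a vetted generator. With a two-byte seed, an observer who sees a few output bytes recovers the seed by trying all 65536 candidates. The names `expand` and `recover_seed` are hypothetical.

```python
import hashlib

def expand(seed: bytes, nbytes: int) -> bytes:
    """Deterministically expand a seed into nbytes of output."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:nbytes])

def recover_seed(observed: bytes, seed_len: int):
    """Brute-force every seed of seed_len bytes against observed output."""
    for guess in range(256 ** seed_len):
        seed = guess.to_bytes(seed_len, "big")
        if expand(seed, len(observed)) == observed:
            return seed
    return None

stream = expand(b"\x2a\x07", 64)        # far too small a seed
print(recover_seed(stream[:16], 2))     # b'*\x07'
```

Once the seed is known, every later output byte, including any key derived from the stream, is known as well.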

P 38: The major issue in ‘the CA system’ is trusting the CA, actually trusting all the CA’s in the world.

P 41: I think that the authors distinguish between a Certificate Authority and the x509 CA system. I have no objections to how CA’s are used here.
My note on attestation following section 3.3. I should note here that attestation seems designed to protect the interests of parties other than the owner-operator of the machine—DRM, in other words.

“However, all the designs share the principle that each step taken to build a secure container contributes data to its measurement hash.” It seems to me that you need to hash the result of the steps, not the steps. The ‘semantics’ of those steps are likely to be unstable over time.
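
A small sketch of the distinction, under my own simplifying assumptions: a "step" is just loading a page at an address, and the "result" is the final memory image. Two step sequences that build the same image give different step hashes but the same result hash. All function names here are hypothetical and none of this is the SGX measurement algorithm.

```python
import hashlib

def build(steps):
    """Apply (address, page_bytes) load steps and return the final image."""
    memory = {}
    for addr, page in steps:
        memory[addr] = page
    return memory

def measure_steps(steps) -> str:
    """Hash the sequence of steps, in the order they were taken."""
    h = hashlib.sha256()
    for addr, page in steps:
        h.update(addr.to_bytes(8, "big") + len(page).to_bytes(4, "big") + page)
    return h.hexdigest()

def measure_result(memory) -> str:
    """Hash the final image in a canonical (address-sorted) order."""
    h = hashlib.sha256()
    for addr in sorted(memory):
        page = memory[addr]
        h.update(addr.to_bytes(8, "big") + len(page).to_bytes(4, "big") + page)
    return h.hexdigest()

a = [(0x1000, b"code"), (0x2000, b"data")]
b = [(0x2000, b"data"), (0x1000, b"code")]   # same pages, other order

print(build(a) == build(b))                                  # True
print(measure_steps(a) == measure_steps(b))                  # False
print(measure_result(build(a)) == measure_result(build(b)))  # True
```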

P 42: An easy strategy for Intel is not to create certs for machines that they deliver with debug ports, or create certs with a lesser root cert.

P 45: “PCI Express Attacks”: See this for additional vulnerabilities, and potential fixes.

P 46: “§ 3.5 argues that today’s system software is virtually guaranteed to have security vulnerabilities.”
Section 3.5 does not explore small kernels on hardware without SMM, TXT, ME, etc.

P 47: “However, potentially malicious system software can still infer partial information about the application’s memory access patterns, by observing the application’s page faults and page table attributes.”
Worse, it can read the ciphertext, noting changes, and deduce much information flow of the tenant.

I think that the paper does not describe how the hostile code is kept out of the enclave’s (container’s) memory. I recall (or jumped to the conclusion) that it worked like encrypted disks.

I had thought that the first attack described in section 3.7.2 was prevented by SGX hardware by encrypting cache lines in DRAM with symmetric keys that depended on the virtual address. If so the page swap will fail with very high probability. Perhaps the authors know more about SGX than I. Perhaps I have heard details of a more recent SGX design which assumes encrypted and authenticated DRAM. The feasibility of adequate authentication is still an issue: the replay attack against freshness. These ideas come from rumored logic for encrypted disks, which are to DRAM as DRAM is to the cache. The difference is that encrypted disk pages can afford much better replay protection than cache lines. It is an arithmetic problem.
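
One way to read the arithmetic, with numbers that are purely my assumptions: suppose each protected unit carries 16 bytes of freshness metadata (say an 8-byte version counter plus an 8-byte authentication tag). Spread over a 4096-byte disk page that is negligible; spread over a 64-byte cache line it is a quarter of the data.

```python
def overhead(unit_bytes: int, metadata_bytes: int) -> float:
    """Fraction of extra storage needed per protected unit."""
    return metadata_bytes / unit_bytes

METADATA = 16       # assumed: 8-byte counter + 8-byte tag
CACHE_LINE = 64     # assumed cache line size in bytes
DISK_PAGE = 4096    # assumed disk page size in bytes

print(overhead(CACHE_LINE, METADATA))   # 0.25
print(overhead(DISK_PAGE, METADATA))    # 0.00390625
```

Under these assumed sizes the per-line cost is 64 times the per-page cost, which is just the ratio of the unit sizes.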

I had assumed that no enclave code would attempt MP.

P 48: “These instructions [RDTSC & RDTSCP] have been designed for benchmarking and optimizing software, so they are available to ring 3 software.” Access to the real time clock can be denied to ring 3 programs (CR4). They can be made available when there are no secrets in the TLB.

P 50: In general there is a problem of extracting key bits thru timing. If programs with such secrets can decide when to take special precautions when using such secrets, then they can invoke a privilege of denying precise clock read access to all ring 3 programs, including themselves. Programs that legitimately attempt these clock read instructions are merely delayed until the end of the privileged period. This is an engineering rather than a mathematical solution.
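
A sketch of that policy as a simulation on a single timeline. The class `ClockGate` and its methods are hypothetical names of mine; no existing kernel interface is implied. A clock read that arrives during a quiet period does not fail, it simply completes when the period ends, so the reader learns nothing about timing inside the period.

```python
class ClockGate:
    """Simulated clock whose ring 3 reads stall during quiet periods."""

    def __init__(self):
        self.now = 0            # simulated time in ticks
        self.quiet_until = 0    # end of the current quiet period

    def tick(self, n: int = 1) -> None:
        """Advance simulated time by n ticks."""
        self.now += n

    def begin_quiet(self, duration: int) -> None:
        """A program about to handle secrets requests a quiet period."""
        self.quiet_until = max(self.quiet_until, self.now + duration)

    def read_clock(self) -> int:
        """Ring 3 clock read: stall until the quiet period is over."""
        if self.now < self.quiet_until:
            self.now = self.quiet_until   # the caller waits out the period
        return self.now

gate = ClockGate()
gate.begin_quiet(100)     # secrets in use for the next 100 ticks
gate.tick(10)
print(gate.read_clock())  # 100
gate.tick(5)
print(gate.read_clock())  # 105
```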

P 57:

The MEE is informally described in an ISCA 2015 tutorial [103], and appears to lack a formal specification. In the absence of further information, we assume that SGX provides the same protection against physical DRAM attacks that Aegis and Bastion provide.

P 58: The physical memory devoted to enclaves is defined as a range of addresses whose size is a power of 2 and that begins on a multiple of its size. This range is defined by the “PRMRR” registers.
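
The two constraints are what make the range cheap to test in hardware: with a power-of-2 size and a naturally aligned base, membership is a single mask and compare. A sketch, with example values and function names that are my own and not taken from the paper:

```python
def is_valid_range(base: int, size: int) -> bool:
    """Size must be a power of 2 and base must be a multiple of size."""
    return size > 0 and (size & (size - 1)) == 0 and base % size == 0

def in_range(addr: int, base: int, size: int) -> bool:
    """Membership test for a valid range: clear the low bits, compare."""
    return (addr & ~(size - 1)) == base

BASE = 0x8000_0000          # hypothetical base address
SIZE = 128 * 1024 * 1024    # hypothetical size: 128 MB = 2**27

print(is_valid_range(BASE, SIZE))           # True
print(is_valid_range(0x8000_1000, SIZE))    # False (misaligned base)
print(in_range(0x8000_1234, BASE, SIZE))    # True
print(in_range(0x8800_0000, BASE, SIZE))    # False (one byte past the end)
```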

Within this area (one area per CPU) is a set of pages. The hardware has a notion of separate enclaves which share these pages. The “EPCM” is a mechanism used by the hardware to prevent two enclaves from using the same page.

I feel tempted to quit this examination of SGX, for they are painting themselves into so many corners that they will never achieve a useful protected platform.

Footnote 80: Intel Manageability Firmware Recovery Agent
Intel® 100 Series and Intel® C230 Series Chipset Family Platform Controller Hub (PCH)
Wikipedia
Intel’s SGX root

Pausing at page 50 to read section 5.1, P 58. “Software Guard Extensions Programming Reference”

55/110