TLS sessions from within TEEs

Posted on 8/19/24 by Arnaud, Founding Engineer at Turnkey (follow on X)

Part 0: But why?

We’ve been working with TEEs (Trusted Execution Environments) aka secure enclaves at Turnkey for a couple of years now. While building our new OAuth feature we had to solve an interesting problem: Enclaves do not have network access (no NIC!), yet we have to fetch a list of public keys to verify OIDC tokens securely (see the spec).

Fetching this list of published public keys is a critical step: If Turnkey verifies OIDC token signatures against the wrong list of public keys, forged OIDC tokens can be accepted as legitimate, or legitimate tokens can be rejected. We can’t hardcode OIDC signer public keys in our enclave application code because they rotate too frequently¹. We can’t provide these public keys to enclaves at runtime because our threat model would consider this insecure. We also can’t record a TLS session and provide it as proof, because TLS does not provide non-repudiation²!

This post explains our approach in three parts: in Part I, I’ll introduce TEEs and how we’re using them at Turnkey. We’ll see that TEEs provide verifiability of the computation they run, and that Turnkey’s operating system for running applications inside of secure enclaves (“QuorumOS”) provides application-to-application authentication primitives through the provisioning of stable Quorum Keys at boot time.

In Part II, I’ll dig into networking in more depth. To make a TLS connection to the outside world, our enclave application relies on a layer 4 proxy sitting on the host side: a TCP connection is established by the proxy at the request of the enclave application, and the TLS session can be driven from inside the enclave over that connection.

Finally, in Part III I’ll explain how we combined Quorum key signatures and layer 4 proxy to secure our OAuth flows. And as a bonus, I’ll outline other use cases where a TLS fetcher will come in handy—turns out: there are plenty!

Part I: What are TEEs? And how is Turnkey using them?

I won’t bore you with a generic definition and classification of TEEs here. If you need a general introduction to TEEs, I’d recommend checking out the Wikipedia page, and if you’re curious about AWS Nitro Enclaves in particular (we’re using them at Turnkey), check out this video.

Figure: secure enclaves

Think of a TEE, or “secure enclave”, as an isolated (virtual) machine provisioned with its own CPU and memory. The following properties are important to grasp:

  • A secure enclave is stateless and does not have the ability to write to a persistent disk or cache. Its only form of persistence is volatile memory (RAM), cleared on every restart.
  • A secure enclave is not connected to the network. The only networking element attached to a secure enclave is a VSOCK interface to enable communication with the enclave host.
  • A secure enclave has access to an independent secure source of entropy and time via the Nitro Security Module (“NSM”).
  • On boot, a secure enclave generates a brand new cryptographic key pair, called the enclave ephemeral key.
  • A secure enclave can provide attestations containing measurements (aka Platform Configuration Registers, or “PCRs”) about the contents of the image, boot RAM, and more. Attestations provide verifiability of the computation running inside of secure enclaves (a sketch of requesting one from the NSM follows this list).
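
To make that last bullet concrete, here’s a minimal sketch of requesting an attestation document from the NSM. The post doesn’t show Turnkey’s actual code; this uses AWS’s aws-nitro-enclaves-nsm-api crate purely for illustration, and leaves the optional binding fields empty.

use aws_nitro_enclaves_nsm_api::api::{Request, Response};
use aws_nitro_enclaves_nsm_api::driver::{nsm_exit, nsm_init, nsm_process_request};

// Ask the Nitro Security Module for an attestation document. The document is
// signed by AWS and contains the PCR measurements of the running enclave.
fn request_attestation_document() -> Option<Vec<u8>> {
    let fd = nsm_init();
    let response = nsm_process_request(
        fd,
        Request::Attestation {
            // In practice these fields bind the attestation to application data,
            // e.g. the enclave ephemeral public key.
            user_data: None,
            nonce: None,
            public_key: None,
        },
    );
    nsm_exit(fd);

    match response {
        Response::Attestation { document } => Some(document),
        _ => None,
    }
}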

In order to deploy applications inside of secure enclaves, we’ve built a new OS: QuorumOS. Among other things, QuorumOS adds a crucial provisioning mechanism. QuorumOS expects shares of a known key (“Quorum Key”) to be posted when an enclave boots. Once enough shares are posted and the Quorum Key is reconstructed, QuorumOS runs the application, and the application can use this Quorum Key to decrypt, encrypt, or sign data. Applications running within QuorumOS operate with a stable key across reboots: their Quorum Key. This Quorum Key can be used to authenticate the application’s output: If data is signed by the Quorum Key, it’s legitimate and originates from this application. Otherwise, it’s not. In Part III we’ll see how Quorum Keys are used to ensure TLS response authenticity. Before we get there, we need to talk about enclave networking.
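
As a simplified illustration of that last point, here’s what checking a Quorum Key signature on an application’s output could look like. The post doesn’t specify the signature scheme, so Ed25519 via the ed25519-dalek crate (2.x) is purely an assumption for the sake of the example.

use ed25519_dalek::{Signature, Verifier, VerifyingKey};

// Returns true only if `payload` carries a valid signature from the given
// Quorum public key, i.e. it really came from the enclave application that
// holds the corresponding Quorum Key.
// NOTE: Ed25519 is an assumption here; the post doesn't name the scheme.
fn is_from_enclave_app(
    quorum_public_key: &[u8; 32],
    payload: &[u8],
    signature: &[u8; 64],
) -> bool {
    let Ok(key) = VerifyingKey::from_bytes(quorum_public_key) else {
        return false;
    };
    let sig = Signature::from_bytes(signature);
    key.verify(payload, &sig).is_ok()
}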

Part II: TLS requests from inside enclaves

From the previous section we know secure enclaves do not have the ability to contact the outside world directly. We’ve also said that an enclave is connected to its host by a VSOCK interface, but have not explained what VSOCK is, really. Put simply, a VSOCK is similar to a UNIX domain socket (UDS) but is used to communicate between hosts and virtual machines. A VSOCK connection has a context ID and a port. The context ID is analogous to an IP address in TCP/IP, and ports work as you would expect.

Figure: classic VSOCK usage

Typically an enclave server binds to its context ID and a chosen port, listening for host connections and requests. The host client forwards requests it receives from the network to the enclave application by connecting to the right context ID and port.

To make outbound requests from inside an enclave application we need to do this in reverse: the host has to listen for requests made by the enclave application. When the enclave makes a request to establish a connection, the host-side proxy can connect to the right target because it has network access.
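
As a rough sketch of the enclave side of this, here’s a VSOCK connection to the host using the community vsock crate (exact function names vary between crate versions). The CID and port are examples only; on AWS Nitro Enclaves the parent instance is reachable at CID 3.

use vsock::VsockStream;

// Connect from the enclave to a proxy listening on the host. VSOCK addressing
// mirrors TCP/IP: a context ID (CID) plays the role of the IP address, plus a
// port. CID 3 is the parent instance on Nitro; port 8000 is an arbitrary example.
fn connect_to_host_proxy() -> std::io::Result<VsockStream> {
    VsockStream::connect_with_cid_port(3, 8000)
}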

Figure: VSOCK interface usage to enable proxying

We’ve explained how enclave applications can connect to a proxy application connected to the network. How do we use this to fetch content over TLS from “within” enclaves? And if the proxy server is the one doing the fetching, what have we actually accomplished? Is this secure?

An important point to grasp is that the proxy works at the TCP level only (aka “layer 4” or “transport layer”) and handles bare TCP read and write operations. More precisely, the proxy’s interface is composed of three operations (a hypothetical sketch of the message types follows this list):

  • Open connection: Open a new TCP connection to a target IP address
  • Read from connection: Read N bytes from an existing TCP connection
  • Write to connection: Write N bytes to an existing TCP connection
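
Here’s a hypothetical sketch of what the enclave-to-proxy message types could look like. The post doesn’t describe Turnkey’s actual wire format, so every name and field below is illustrative only.

// Requests the enclave application sends to the host-side proxy over VSOCK.
enum ProxyRequest {
    // Open a new TCP connection to a target IP address (and port)
    OpenConnection { target_ip: String, port: u16 },
    // Read up to `len` bytes from an existing TCP connection
    Read { connection_id: u64, len: usize },
    // Write bytes to an existing TCP connection
    Write { connection_id: u64, data: Vec<u8> },
}

// Responses the proxy sends back to the enclave application.
enum ProxyResponse {
    ConnectionOpened { connection_id: u64 },
    DataRead { data: Vec<u8> },
    BytesWritten { count: usize },
}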

All enclave applications at Turnkey are written in Rust. In Rust, Read and Write are standard library traits: std::io::Read, std::io::Write. The most popular pure-Rust TLS crate, Rustls, works with these traits to implement TLS: users of the library provide a connection object which implements the Read and Write traits (generally, a TCP socket), and Rustls uses that connection object to implement TLS handshakes, request encryption, and response decryption on top. This is explained in more detail in their documentation. We use this to our advantage by implementing these traits with a custom struct: upon receiving a call to “read” or “write”, our trait-implementing struct calls the host proxy to read from or write to an already-established TCP connection!

Here’s an abbreviated Rust code snippet to show what I mean by “implementing the Read and Write traits by calling the proxy”:

use std::io::{self, Read, Write};

struct RemoteConnection {
    // ID of the TCP connection held open by the host-side proxy on our behalf
    connection_id: u64,
}

impl RemoteConnection {
    fn new(/* target IP address, port, ... */) -> Self {
        // Make a request to the host Proxy to establish a new TCP connection
        // We keep a connection ID so we know where to read from & write to
        unimplemented!()
    }
}

impl Read for RemoteConnection {
    // This function signature is mandated by the standard library
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Make a request to our Proxy via VSOCK to read from our connection
        unimplemented!()
    }
}

impl Write for RemoteConnection {
    // This function signature is mandated by the standard library
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Make a request to our Proxy via VSOCK to write to our connection
        unimplemented!()
    }

    fn flush(&mut self) -> io::Result<()> {
        // Ask the Proxy to flush any bytes buffered for our connection
        unimplemented!()
    }
}

The picture is now complete and we’ve arrived at the end of Part II. Our proxy lets enclave applications establish TCP connections to remote hosts and perform bare TCP reads and writes. The enclave application is able to create and use these remote TCP connections to establish a TLS connection using Rustls. The TLS session keys are generated and kept inside of the enclave, and the TLS certificates are also verified there. The host proxy only ever sees encrypted packets!
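
To make that concrete, here’s a minimal sketch of driving a TLS session over a RemoteConnection with Rustls. It assumes rustls 0.23-style APIs plus the webpki-roots crate for trust anchors; the hostname and HTTP request are examples only, not Turnkey’s actual fetcher code.

use std::io::{Read, Write};
use std::sync::Arc;

// Establish a TLS session over the proxy-backed RemoteConnection and perform a
// simple HTTP GET. Certificate verification and session keys live entirely
// inside the enclave; the proxy only ever shuttles encrypted bytes.
fn fetch_over_tls(mut remote: RemoteConnection) -> std::io::Result<Vec<u8>> {
    // Trust the standard WebPKI root certificates so the server's certificate
    // chain can be verified inside the enclave.
    let root_store = rustls::RootCertStore::from_iter(
        webpki_roots::TLS_SERVER_ROOTS.iter().cloned(),
    );
    let config = Arc::new(
        rustls::ClientConfig::builder()
            .with_root_certificates(root_store)
            .with_no_client_auth(),
    );

    let server_name = "accounts.google.com".try_into().unwrap();
    let mut tls_conn = rustls::ClientConnection::new(config, server_name).unwrap();

    // rustls::Stream runs the handshake and encrypts/decrypts over any
    // Read + Write transport -- here, our VSOCK-proxied TCP connection.
    let mut tls = rustls::Stream::new(&mut tls_conn, &mut remote);
    tls.write_all(
        b"GET /.well-known/openid-configuration HTTP/1.1\r\n\
          Host: accounts.google.com\r\n\
          Connection: close\r\n\r\n",
    )?;

    let mut response = Vec::new();
    tls.read_to_end(&mut response)?;
    Ok(response)
}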

Important note on security: If the proxy isn’t honest, TCP packets could be routed to the wrong remote host, but that would cause TLS certificate verification to fail inside the enclave. The proxy can also choose to censor traffic and refuse to forward packets, but the enclave detects this too, because its reads and writes simply fail or time out. Finally, we’ve already seen that the proxy cannot decrypt the traffic, and it cannot tamper with it undetected, because TLS encrypts and authenticates records with session keys that are created and kept in our secure enclave. Ready for Part III?

Part III: Securing OIDC verification with TLS-fetching enclaves

If you’ve read and understood Parts I and II, congratulations: the hard work is done! Here I’ll explain how we use the TLS fetcher to verify OIDC tokens.

Remember from Part I: Enclaves have a stable Quorum Key to authenticate the data they return. Now that we have an enclave application to fetch content over TLS, we can provide non-repudiation of TLS responses by signing the fetched content with the TLS Fetcher Quorum Key.

In other words, the output of our TLS fetcher enclave is a proof that a given URL returned some content at a specific time. We call this new primitive a “verifiable TLS fetch” for brevity.
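
To give a rough idea of the shape of such a proof, here’s an illustrative struct. The post doesn’t describe Turnkey’s actual format, so every field name below is an assumption.

use serde::{Deserialize, Serialize};

// An illustrative shape for the output of a verifiable TLS fetch.
#[derive(Serialize, Deserialize)]
struct VerifiableTlsFetch {
    // The URL fetched from inside the TLS fetcher enclave
    url: String,
    // When the fetch happened, from the enclave's secure time source
    fetched_at_unix_seconds: u64,
    // The raw response body received over the enclave-terminated TLS session
    response_body: Vec<u8>,
    // Quorum Key signature over the fields above, providing non-repudiation
    quorum_key_signature: Vec<u8>,
}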

You can probably see where this is going: We use verifiable TLS fetching to guarantee the authenticity of the OIDC provider configuration, as well as the authenticity of the current list of JWK signers (list of public keys).

Putting this all together, verifying an OIDC token signature is done in five steps, all within enclave applications (a sketch in code follows the list):

  1. Parse the OIDC token and extract the “iss” property (issuer URI, e.g. “https://accounts.google.com”).
  2. Verifiably fetch the issuer’s OIDC configuration at “iss” + /.well-known/openid-configuration. For example: Google’s OIDC configuration.
  3. Parse the provider configuration and get the “jwks_uri” key. The value is a URI.
  4. Verifiably fetch jwks_uri and parse the response: We now have a list of public keys! For example, see Google’s current signers.
  5. Verify that the OIDC token is signed by one of the public keys we just fetched.
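
Here’s a sketch of those steps in Rust. It is not Turnkey’s actual code: verifiable_fetch is a hypothetical helper standing in for the verifiable TLS fetch described above, step 1 is assumed to have produced the issuer already, and the jsonwebtoken and serde_json crates are assumptions chosen for illustration.

use jsonwebtoken::{decode, decode_header, jwk::JwkSet, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

// Hypothetical helper standing in for the verifiable TLS fetch described above:
// it asks the TLS fetcher enclave for `url` and checks the Quorum Key signature
// on the result before returning the response body.
fn verifiable_fetch(url: &str) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    unimplemented!()
}

#[derive(Deserialize)]
struct OidcConfig {
    jwks_uri: String,
}

#[derive(Deserialize)]
struct Claims {
    iss: String,
    sub: String,
}

// `issuer` is assumed to have been extracted from the token's `iss` claim (step 1).
fn verify_oidc_token(token: &str, issuer: &str) -> Result<Claims, Box<dyn std::error::Error>> {
    // Steps 2 and 3: fetch the issuer's OIDC configuration and read `jwks_uri`.
    let config_url = format!("{issuer}/.well-known/openid-configuration");
    let config: OidcConfig = serde_json::from_slice(&verifiable_fetch(&config_url)?)?;

    // Step 4: fetch the current list of signing keys (JWKS).
    let jwks: JwkSet = serde_json::from_slice(&verifiable_fetch(&config.jwks_uri)?)?;

    // Step 5: pick the key referenced by the token header's `kid` and verify
    // the token's signature against it.
    let header = decode_header(token)?;
    let kid = header.kid.ok_or("token has no `kid` header")?;
    let jwk = jwks.find(&kid).ok_or("no JWK matches the token's `kid`")?;
    let key = DecodingKey::from_jwk(jwk)?;

    let mut validation = Validation::new(Algorithm::RS256);
    validation.set_issuer(&[issuer]);
    // A real implementation would also check the `aud` claim against the
    // expected OAuth client ID; skipped here to keep the sketch focused.
    validation.validate_aud = false;

    Ok(decode::<Claims>(token, &key, &validation)?.claims)
}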

Verifiable TLS fetch is a valuable primitive well beyond OIDC token signature verification, and we’re planning to use it for many other use cases:

  • We can securely fetch cryptocurrency prices from multiple sources to power Turnkey policies based on notional value.
  • We can secure third-party integrations and attest to them. Think: gas estimation or transaction construction APIs, transaction parsing, broadcasting, indexing requests, LLM APIs, and more.
  • Once it’s exposed through a standalone Turnkey activity, we think customers will build new onchain oracles to secure their smart contracts.

Zooming out, the TLS fetcher is a fundamental building block to use existing web APIs in a verifiable way. We’ve added non-repudiation on top of TLS, and this is no small feat. We can’t wait to see where this leads us. Get in touch if this triggered ideas or comments — we would love to build with you!


  1. Google, for example, rotates its signing keys roughly every 6 hours. ↩︎

  2. This answer has a long explanation, but I’ll provide a shorter one for convenience: when a TLS session is established, the client and the server both negotiate a shared TLS symmetric session key. This session key is the key used to encrypt and decrypt payloads sent over the network. Because this key is shared and identical on both sides, a client can pretend to have received a response it didn’t receive, and the server can pretend to have received a request it didn’t receive. In other words: it’s trivial for the client to impersonate the server, and vice-versa. ↩︎