Dirty COW, Dirty Pipe, CopyFail: Three Ways to Root via Page Cache
Three vulnerabilities, three different kernel subsystems, same result: an unprivileged user writes data to the page cache of a read-only file and becomes root. Dirty COW needed a race condition. Dirty Pipe was deterministic. CopyFail does the same in 732 bytes of Python.
Dirty COW (CVE-2016-5195)
When: in the kernel since 2.6.22 (2007), discovered in 2016. Sat there for 9 years.
Where: mm/gup.c, copy-on-write handling in get_user_pages().
How it works:
When writing to a private file mapping (MAP_PRIVATE), the kernel should first create a COW copy, then write. The problem: these two operations were not atomic.
The exploit runs two threads:
- First writes to the mapping via /proc/self/mem, triggering a COW break
- Second calls madvise(MADV_DONTNEED), telling the kernel to discard the just-created COW copy
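After breaking COW, the kernel retries the page lookup, and on that retry it no longer treats the access as a write. If madvise hits the window between the COW break and the retry, the private copy is already gone, the lookup returns the original page cache page, and the kernel writes the data directly to it. The window is narrow, so both threads loop; after millions of attempts, one eventually lands.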
Reliability: low, the race condition requires many iterations.
Discoverer: Phil Oester, who found the exploit in network traffic; the vulnerability was being exploited before disclosure.
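Fix: commit 19be0eaffa3a, which stops dropping the write flag on the retry and instead records the completed COW break in a new FOLL_COW flag, so a copy discarded by madvise forces a fresh COW rather than a write to the original page.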
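The race can be illustrated with a toy model in plain Python. This is a sketch of the logic only, not kernel code: PrivateMapping, buggy_write and fixed_write are hypothetical names, the page is a bytearray, and the `race` argument stands in for the second thread winning the window.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrivateMapping:
    """Toy model of one page of a MAP_PRIVATE file mapping (not kernel code)."""
    page_cache: bytearray                  # shared page backing the read-only file
    cow_copy: Optional[bytearray] = None   # private copy, exists only after a COW break

    def lookup(self, want_write: bool) -> Optional[bytearray]:
        """Return a usable page, or None when a COW break is required first."""
        if self.cow_copy is not None:
            return self.cow_copy
        return None if want_write else self.page_cache

    def break_cow(self) -> None:
        self.cow_copy = bytearray(self.page_cache)

    def madvise_dontneed(self) -> None:
        self.cow_copy = None               # the second thread discards the copy


def buggy_write(m, data, race=False):
    """Pre-fix behaviour: the retry after the COW break drops the write intent."""
    page = m.lookup(want_write=True)
    if page is None:
        m.break_cow()
        if race:
            m.madvise_dontneed()           # the race window
        page = m.lookup(want_write=False)  # retry as if this were a read
    page[:len(data)] = data


def fixed_write(m, data, race=False):
    """Post-fix behaviour (simplified): keep the write intent and break COW again."""
    while (page := m.lookup(want_write=True)) is None:
        m.break_cow()
        if race:
            m.madvise_dontneed()
            race = False                   # the discard happens only once here
    page[:len(data)] = data
```

Without the race, buggy_write only touches the private copy. With `race=True` it overwrites `page_cache`, while fixed_write simply breaks COW a second time and leaves the shared page alone.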
Dirty Pipe (CVE-2022-0847)
When: in the kernel since 5.8 (2020), discovered in 2022. Sat there for 2 years.
Where: lib/iov_iter.c, uninitialized flags field in struct pipe_buffer.
How it works:
In 2020, the PIPE_BUF_FLAG_CAN_MERGE flag was added, telling the kernel it can append data to an existing pipe buffer instead of allocating a new one. The problem: after a pipe is drained, its buffer slots are reused with whatever flags they last held, and the splice path fills a slot without initializing flags, so a stale CAN_MERGE survives.
Exploit:
- Create a pipe and fill it with data, so every buffer gets the CAN_MERGE flag
- Drain the pipe; buffers are freed, but the flag remains
- splice() one byte from the target file into the pipe, and page cache lands in a buffer with the stale CAN_MERGE flag
- Write to the pipe: the kernel sees CAN_MERGE and appends data directly to the page cache page
Deterministic, no race condition. A sequence of a few syscalls.
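The stale-flag logic can be sketched as a small simulation. This is a toy model under stated assumptions, not the kernel's pipe code: ToyPipe, PipeBuffer, the 16-byte PAGE size and the CAN_MERGE value are all illustrative, and the `init_flags_on_splice` switch models the behaviour after the fix.

```python
PAGE = 16          # toy page size, far smaller than a real page
CAN_MERGE = 0x10   # illustrative bit; the model does not rely on the kernel's value


class PipeBuffer:
    def __init__(self):
        self.page = None   # bytearray holding the data
        self.length = 0    # bytes used in the page
        self.flags = 0


class ToyPipe:
    def __init__(self, slots=2, init_flags_on_splice=False):
        self.ring = [PipeBuffer() for _ in range(slots)]
        self.head = 0      # index of the next slot to fill
        self.used = 0      # number of occupied slots
        self.init_flags_on_splice = init_flags_on_splice

    def _take_slot(self):
        assert self.used < len(self.ring), "toy pipe is full"
        buf = self.ring[self.head % len(self.ring)]
        self.head += 1
        self.used += 1
        return buf

    def write(self, data):
        if self.used:
            last = self.ring[(self.head - 1) % len(self.ring)]
            if last.flags & CAN_MERGE and last.length + len(data) <= PAGE:
                # Append into the existing page instead of taking a new slot.
                last.page[last.length:last.length + len(data)] = data
                last.length += len(data)
                return
        buf = self._take_slot()
        buf.page = bytearray(PAGE)
        buf.page[:len(data)] = data
        buf.length = len(data)
        buf.flags = CAN_MERGE

    def drain(self):
        for buf in self.ring:
            buf.page, buf.length = None, 0   # flags are deliberately left alone
        self.used = 0

    def splice(self, file_page, nbytes):
        buf = self._take_slot()
        buf.page = file_page                 # a reference to the page cache page
        buf.length = nbytes
        if self.init_flags_on_splice:
            buf.flags = 0                    # models the fix
```

Filling both slots with full pages, draining, splicing one byte from a 16-byte "file page" and then writing to the pipe changes that page from index 1 onward in the buggy configuration. With `init_flags_on_splice=True` the same sequence leaves it untouched.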
Constraints: can’t write at page offset 0, can’t cross page boundaries, can’t enlarge files.
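These three constraints reduce to simple arithmetic on the offset. The helper below is a hypothetical illustration (the function name and the default 4096-byte page size are assumptions) of which offset and length combinations the constraints allow.

```python
def dirty_pipe_write_allowed(offset, length, file_size, page_size=4096):
    """Check a write against the three constraints listed above (illustrative)."""
    if length <= 0:
        return False
    if offset % page_size == 0:
        return False   # the first byte of a page has to be spliced, not written
    if offset + length > file_size:
        return False   # the file cannot be enlarged
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return first_page == last_page   # the write must stay inside one page
```

For example, a 3-byte write at offset 4097 passes, while a 10-byte write at offset 4090 fails because it would cross into the next page.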
Discoverer: Max Kellermann (CM4all), who found the issue while diagnosing corrupted log files.
Fix: commit 9d2231c5d74e, which initializes flags to zero when creating new buffers.
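As a rough triage aid, an upstream version number can be classified against the affected range. The sketch below assumes the upstream stable releases that carried the fix (5.16.11, 5.15.25 and 5.10.102) and looks only at the version number; distribution kernels backport fixes without changing that number, so a "vulnerable" result for a vendor kernel needs checking against the vendor's advisory. The function name is hypothetical.

```python
import re

# First fixed patch level per upstream stable series (assumption stated above).
FIXED_STABLE = {(5, 10): 102, (5, 15): 25, (5, 16): 11}


def mainline_dirty_pipe_status(release):
    """Classify an upstream kernel version string such as '5.15.24'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    series = (int(m[1]), int(m[2]))
    patch = int(m[3] or 0)
    if series < (5, 8):
        return "not affected"   # CAN_MERGE flag not introduced yet
    if series >= (5, 17):
        return "fixed"          # fix merged before the 5.17 release
    fixed_patch = FIXED_STABLE.get(series)
    if fixed_patch is not None and patch >= fixed_patch:
        return "fixed"
    return "vulnerable"
```

On a live system the input could come from `platform.release()`, with the backport caveat above in mind.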
CopyFail (CVE-2026-31431)
When: in the kernel since 4.14 (2017), discovered in 2026. Sat there for 9 years.
Where: algif_aead.c, in-place optimization in the kernel’s crypto subsystem (AF_ALG).
How it works:
Three independent changes created the vulnerability:
- 2011: authencesn algorithm (IPsec ESN) writes 4 bytes of scratch data past the tag area
- 2015: AF_ALG gained AEAD with splice(), so page cache enters scatterlists
- 2017: optimization req->src = req->dst makes page cache pages end up in the writable destination list
Exploit:
- Open an AF_ALG socket, bind to authencesn(hmac(sha256),cbc(aes))
- splice() data from the target file (e.g., /usr/bin/su), placing page cache in the scatterlist
- sendmsg() with controlled AAD, where bytes 4–7 are the write value
- recv() triggers decryption, and authencesn writes 4 bytes into page cache
- Repeat for each shellcode chunk
- execve("/usr/bin/su") runs the modified binary from page cache, shellcode as root
The entire exploit: 732 bytes of Python, standard library, zero dependencies.
Discoverer: Theori / Xint Code.
Fix: commit a664bf3d603d, reverting to out-of-place operation, separating req->src and req->dst.
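The difference between in-place and out-of-place operation comes down to aliasing, which a few lines of Python can illustrate. This is a deliberately simplified model and not the kernel's crypto code: run_request is a hypothetical name, pages are bytearrays, and the position of the 4-byte scratch write (the end of the last page) is a placeholder rather than the real layout.

```python
def run_request(src_pages, in_place, scratch=b"\xde\xad\xbe\xef"):
    """Toy model: choose the destination list, then do a 4-byte scratch write."""
    if in_place:
        # Models req->src = req->dst: destination entries ARE the source pages.
        dst_pages = src_pages
    else:
        # Models out-of-place operation: destination gets its own buffers.
        dst_pages = [bytearray(p) for p in src_pages]
    # Placeholder location for the scratch write (assumption, see lead-in).
    dst_pages[-1][-4:] = scratch
    return dst_pages


page_cache_page = bytearray(b"A" * 16)   # stands in for a spliced file page

run_request([page_cache_page], in_place=False)
untouched = bytes(page_cache_page)       # still sixteen "A" bytes

run_request([page_cache_page], in_place=True)
corrupted = bytes(page_cache_page)       # last four bytes are now the scratch value
```

In the out-of-place case the scratch write lands in a private buffer; in the in-place case the same write reaches the shared page, which is the situation the fix removes.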
Comparison
| | Dirty COW | Dirty Pipe | CopyFail |
|---|---|---|---|
| CVE | CVE-2016-5195 | CVE-2022-0847 | CVE-2026-31431 |
| CVSS | 7.0 | 7.8 | 7.8 |
| Subsystem | mm (COW) | pipe | crypto (AF_ALG) |
| Mechanism | race condition in madvise/write | stale CAN_MERGE flag after splice | authencesn scratch write to page cache |
| Deterministic | no | yes | yes |
| Time in kernel | 9 years (2007–2016) | 2 years (2020–2022) | 9 years (2017–2026) |
| Exploit | C, two threads, millions of tries | C, syscall sequence | Python, 732 bytes |
| Modifies disk | yes | no (page cache only) | no (page cache only) |
| Container escape | N/A | yes | yes |
| Used in the wild | yes (before disclosure) | yes (CISA KEV) | unconfirmed |
Evolution
These three vulnerabilities demonstrate the same primitive (writing to the page cache of a read-only file), but with increasing elegance:
Dirty COW required winning a race between two threads. Unreliable, but worked on kernels from 2.6.22, meaning practically everything.
Dirty Pipe eliminated the race condition. A single uninitialized flag was enough for splice + write to land directly in the page cache. It didn’t modify the file on disk, which made it harder to detect.
CopyFail went further: the exploit fits in a single short Python script and works identically on Ubuntu, RHEL, Amazon Linux and SUSE. It also doesn’t touch disk, and it gives a controlled 4-byte write at any offset, repeatable without limit.
The common denominator: page cache as a shared resource between userspace and the kernel is a powerful mechanism, but any bug that allows unauthorized writes to it immediately grants privilege escalation.
Sources: dirtycow.ninja, dirtypipe.cm4all.com, copy.fail, NVD