  • The Copy Fail Vulnerability That Makes Your Docker Containers Basically Glass

    Your homelab is probably vulnerable right now.

    I know I should feel old saying that, but here we are. Copy Fail — officially CVE-2026-31431 — is a Linux kernel vulnerability that lets any local user gain root access in seconds. And because the kernel’s page cache is shared across containers, a compromised container can effectively escape to the host in ways that completely undermine the Docker security model.

    CVSS 7.8 (High). Discovered and publicly disclosed on April 29, 2026 by researchers at Theori, working with “Xint Code”, an AI system from the security firm MMM. A 732-byte Python script is enough to exploit it.

    What It Actually Is

    The bug lives in algif_aead, the Linux kernel’s AEAD (authenticated encryption with associated data) crypto module that’s exposed to userspace via AF_ALG sockets. The bug itself is absurdly simple: an in-place optimization introduced back in July 2017 (kernel 4.14, commit 72548b093ee3) sets req->src = req->dst, which creates a situation where tag pages from the source scatterlist get chained into the output scatterlist via sg_chain().

    Here’s what that means in practice: when you splice a file’s page cache into the crypto pipeline, those tag pages end up referencing file page cache data. The authencesn(hmac(sha256),cbc(aes)) algorithm writes 4 bytes as scratch space — but because of the bug, that write lands inside your spliced file’s cached data. Your file gets modified in memory, bypassing all permission checks.

    Think of it as Dirty Pipe (CVE-2022-0847), but without any race condition, without needing kernel offsets, and working reliably across every major Linux distribution.

    Why This Matters for Docker and Self-Hosters

    This is where it gets genuinely scary for anyone running containers:

    • Page cache is shared across containers and the host. Writing from one container corrupts the host page cache, affecting every single other container.
    • This breaks the container-as-security-boundary model entirely.
    • Anyone with unprivileged access to a shared Kubernetes cluster, a self-hosted CI/CD runner (GitHub Actions, GitLab shared runners, Jenkins), or a multi-tenant homelab is exposed.
    • A compromised container can escape to the host using just standard syscalls: socket, setsockopt, splice, sendmsg, recvmsg.

    For homelab runners like me who have containers on shared infrastructure, the attack surface is wide open.

    What’s Affected

    The range is staggering:
    • All kernels from 4.14 through 7.0-rc
    • Specifically: all 6.18.x before 6.18.22, and all 6.19.x before 6.19.12
    • Tested and confirmed on: Ubuntu 24.04, Amazon Linux 2023, RHEL 10.1, SUSE 16
    • Not patched in older LTS lines: 6.12.x, 6.6.x, 5.15.x, 5.10.x (distribution backports still vulnerable)
    • Fixed kernels: 7.0, 6.19.12, 6.18.22
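    To turn that list into something you can run on each box, here is a minimal Python sketch. It only encodes the upstream version numbers quoted above; the function names are made up for illustration, and it cannot see distribution backports, so a vendor kernel that reports an "affected" version may in fact carry the fix (and the reverse is possible too). Treat the result as a hint to go check your distribution's advisory, not as a verdict.

```python
import platform
import re

# First fixed patch level per stable series, taken from the list above.
FIXED_STABLE = {(6, 18): 22, (6, 19): 12}


def parse_release(release):
    """Extract (major, minor, patch) from a string like '6.8.0-31-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    return major, minor, patch


def looks_affected(release):
    """Best-effort check based purely on the upstream version number."""
    major, minor, patch = parse_release(release)
    if (major, minor) < (4, 14):
        return False  # older than the 2017 change that introduced the bug
    if (major, minor) >= (7, 0):
        # 7.0 final and later are listed as fixed; 7.0 release candidates are not.
        return (major, minor) == (7, 0) and "-rc" in release
    fixed_patch = FIXED_STABLE.get((major, minor))
    if fixed_patch is not None:
        return patch < fixed_patch
    return True  # every other series from 4.14 up, including the older LTS lines


if __name__ == "__main__":
    release = platform.release()
    print(release, "-> looks affected" if looks_affected(release) else "-> looks fixed")
```

    Run it on the host rather than inside a container if you can, although a container reports the host's kernel release anyway, since they share the kernel.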

    What You Can Do Right Now

    1. Patch your kernel. Upgrade to 6.18.22, 6.19.12, or 7.0+. If you’re on an older LTS line, check your distribution’s backport status.

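    2. Immediate mitigation: disable the module. Put the first line below in a file under /etc/modprobe.d/ so the module cannot be loaded again, then run the second as root to unload it if it is already loaded:

    install algif_aead /bin/false
    rmmod algif_aead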
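    A quick way to see whether the mitigation took effect is to try opening the interface as an unprivileged user. The sketch below is a harmless probe, not an exploit: it creates an AF_ALG socket and tries to bind it to an AEAD algorithm, nothing more. The function name and the default algorithm, gcm(aes), are my own choices for illustration. Two caveats: on an unmitigated system the bind may itself trigger autoloading of the module, and a False result can also mean the chosen algorithm simply isn't available, so it is a hint rather than proof. If your kernel has the AEAD interface built in rather than as a module, the modprobe line has nothing to block, and this probe will keep returning True.

```python
import socket


def aead_interface_reachable(alg_name="gcm(aes)"):
    """Return True if this process can bind an AF_ALG 'aead' socket."""
    af_alg = getattr(socket, "AF_ALG", None)
    if af_alg is None:
        return False  # not Linux, or Python was built without AF_ALG support
    try:
        with socket.socket(af_alg, socket.SOCK_SEQPACKET, 0) as sock:
            # For AF_ALG, bind() takes (algorithm type, algorithm name).
            sock.bind(("aead", alg_name))
        return True
    except OSError:
        # Covers a disabled module, a seccomp denial, or an unknown algorithm.
        return False


if __name__ == "__main__":
    if aead_interface_reachable():
        print("AF_ALG aead interface is reachable from this process")
    else:
        print("AF_ALG aead interface is not reachable from this process")
```

    Run it both on the host and inside a typical container, because a seccomp profile can make the answer differ between the two.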

    3. Apply seccomp profiles that block AF_ALG socket creation from untrusted processes.
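    For Docker, such a rule can be expressed in its JSON seccomp profile format. The Python sketch below generates a deliberately tiny profile that allows everything except socket() calls whose first argument is 38, the AF_ALG family number (the same value the Falco macro further down checks). This is an illustration under assumptions, not a drop-in profile: a profile with a default action of "allow" replaces Docker's much stricter default filter, so in practice you would want to merge the socket rule into the profile you already use. The function name is hypothetical.

```python
import json

AF_ALG = 38  # Linux address-family number for AF_ALG


def build_af_alg_deny_profile():
    """Build a minimal Docker-style seccomp profile that denies socket(AF_ALG, ...)."""
    return {
        # Allow-by-default keeps the example short; see the caveat above.
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": ["socket"],
                "action": "SCMP_ACT_ERRNO",
                # Match only when argument 0 (the address family) equals AF_ALG.
                "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_af_alg_deny_profile(), indent=2))
```

    Save the output to a file and pass it to a container with docker run --security-opt seccomp=<that file>. With the rule active, the socket call fails inside the container before the kernel's crypto interface is ever reached.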

    4. Runtime detection — the Sysdig team released a Falco rule that flags unexpected AF_ALG SEQPACKET socket creation from unprivileged processes:

    - list: known_af_alg_binaries
      items: [cryptsetup, "systemd-cryptse", "systemd-cryptsetup", veritysetup, integritysetup]

    - macro: successful_af_alg_socket
      condition: >
        evt.type = socket and
        evt.rawres >= 0 and
        (evt.arg.domain contains AF_ALG or evt.arg.domain = 38)

    5. If you run Firecracker microVMs, gVisor, or a dedicated host per tenant, you’re NOT affected: sharing one kernel between tenants is what makes this exploitable.

    The Bigger Picture

    What’s especially troubling here is how Theori found it. They used an AI system called “Xint Code” that discovered and analyzed the vulnerability in approximately one hour from a single prompt. This isn’t just a new vulnerability: it’s a demonstration that AI-driven kernel auditing is now fast enough to find critical bugs in nearly decade-old code.

    Theori’s AI approach found a bug that sat in the kernel for nine years without being caught by human auditors. And the fix — commit fafe0fa2995a from early April 2026 — essentially reverts that near-decade-old optimization.

    For the rest of us running Linux-based containers everywhere, the takeaway is simple: patch your kernels, audit your containers for local users with unnecessary privileges, and assume that any container with local code execution is already compromised if you haven’t patched.

    Sources: Sysdig Blog, Bugcrowd, OVHcloud Blog, Microsoft Security Blog