XML injection (XXE - XML External Entity)

Understanding XML Injection

What is XXE Injection?

XML External Entity (XXE) injection is a vulnerability that arises when an XML parser resolves external entity references in attacker-controlled XML. An attacker can use it to read local files, make server-side requests (SSRF), cause denial of service, and, in some configurations, execute code remotely.

Vulnerable Code Example

// PHP vulnerable XML processing
$xml_data = $_POST['xml'];
$dom = new DOMDocument();
// Vulnerable: LIBXML_NOENT substitutes entities, including external ones,
// and LIBXML_DTDLOAD loads the external DTD subset.
// With libxml2 2.9.0+ a plain loadXML($xml_data) does not load external entities.
$dom->loadXML($xml_data, LIBXML_NOENT | LIBXML_DTDLOAD);
$xpath = new DOMXPath($dom);
$result = $xpath->query('//user/name');
echo "User: " . $result->item(0)->nodeValue;

Normal XML Request:

<?xml version="1.0"?>
<user>
    <name>John</name>
    <email>john@example.com</email>
</user>

Malicious XXE Request:

How XXE Injection Works

XXE exploits the XML parser's ability to process Document Type Definitions (DTDs) and external entities. When external entity processing is enabled, attackers can define malicious entities that reference local files, network resources, or other XML documents.
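
The difference between an entity defined inside the DTD and one that points at an external resource can be seen with a small, hedged sketch. It uses Python's standard-library xml.etree.ElementTree, which expands internal entities but refuses external ones; the document contents, the placeholder file path, and the helper name extract_name are assumptions made for this illustration, not part of the original page.

```python
import xml.etree.ElementTree as ET

# Internal entity: the replacement text lives inside the DTD itself.
INTERNAL_DOC = """<?xml version="1.0"?>
<!DOCTYPE user [
  <!ENTITY greeting "John">
]>
<user><name>&greeting;</name></user>"""

# External entity: the replacement text would have to be fetched from a URI.
# The path is a harmless placeholder chosen for this sketch.
EXTERNAL_DOC = """<?xml version="1.0"?>
<!DOCTYPE user [
  <!ENTITY xxe SYSTEM "file:///nonexistent/placeholder.txt">
]>
<user><name>&xxe;</name></user>"""


def extract_name(xml_text):
    """Return the text of <name>, or None if the parser rejects the document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # ElementTree does not resolve external entities; it raises instead.
        return None
    return root.findtext("name")


if __name__ == "__main__":
    print(extract_name(INTERNAL_DOC))  # internal entity is expanded
    print(extract_name(EXTERNAL_DOC))  # external entity is rejected
```

A parser configured to resolve external entities would instead place the referenced resource's content into the name element, which is the behavior XXE relies on.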

XML Entity Types

Internal Entities:

External Entities:

Parameter Entities:

General vs Parameter Entities:

  • General entities: &entityname; - Used in document content

  • Parameter entities: %entityname; - Used in DTD definitions
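
As a rough illustration of the categories above, the following sketch classifies entity declarations by scope (general or parameter) and source (internal or external) using a regular expression. It is a simplification for reading DTD snippets, not a full DTD parser; the function name classify_entities and the sample declarations are hypothetical.

```python
import re

# Matches the start of: <!ENTITY [%] name ("value" | SYSTEM ... | PUBLIC ...)
ENTITY_DECL = re.compile(
    r"""<!ENTITY\s+
        (?P<param>%\s+)?                     # a leading '%' marks a parameter entity
        (?P<name>[^\s%]+)\s+
        (?:
            (?P<external>SYSTEM|PUBLIC)\s+   # external: content comes from a URI
          | (?P<quote>["'])                  # internal: literal value in quotes
        )""",
    re.VERBOSE,
)


def classify_entities(dtd_text):
    """Return (name, scope, source) tuples for each entity declaration found."""
    found = []
    for match in ENTITY_DECL.finditer(dtd_text):
        scope = "parameter" if match.group("param") else "general"
        source = "external" if match.group("external") else "internal"
        found.append((match.group("name"), scope, source))
    return found


if __name__ == "__main__":
    sample = (
        '<!ENTITY company "Example Corp">'
        '<!ENTITY ext SYSTEM "http://example.com/data.txt">'
        '<!ENTITY % common "value">'
    )
    for row in classify_entities(sample):
        print(row)
```

General entities found this way are referenced as &name; in document content, while parameter entities are referenced as %name; inside the DTD.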

Impact and Consequences

  • Local File Disclosure - Reading sensitive system files

  • Server-Side Request Forgery (SSRF) - Making requests to internal services

  • Denial of Service - Billion laughs attack, recursive entity expansion

  • Remote Code Execution - In specific configurations (expect://, PHP wrappers)

  • Information Disclosure - Extracting configuration files, source code

  • Port Scanning - Discovering internal network services

XML Parser Behavior and Detection

Common XML Parsers

PHP Parsers:

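  • DOMDocument - Default: External entities not loaded with libxml2 2.9.0+; loaded when LIBXML_NOENT is passed

  • SimpleXML - Default: External entities not loaded with libxml2 2.9.0+; loaded when LIBXML_NOENT is passed

  • XMLReader - Default: External entities not loaded with libxml2 2.9.0+; loaded when entity substitution is enabled (XMLReader::SUBST_ENTITIES)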

Java Parsers:

  • DocumentBuilderFactory - Default: External entities enabled

  • SAXParserFactory - Default: External entities enabled

  • XMLInputFactory (StAX) - Default: External entities enabled

  • TransformerFactory - Default: External entities enabled

Python Parsers:

  • xml.etree.ElementTree - Default: External entities disabled

  • xml.dom.minidom - Default: External entities disabled

  • lxml - Default: Entities resolved, with network access blocked, before lxml 5.0; only internal entities resolved since 5.0

  • xml.sax - Default: External entities disabled (Python 3.7.1+)

JavaScript/Node.js Parsers:

  • libxmljs - Default: External entities enabled

  • xmldom - Default: External entities enabled

  • xml2js - Default: External entities disabled

Detection Methodology

Basic Entity Detection:

File Reading Detection:

HTTP Request Detection:
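
The first detection step, checking whether the parser expands entities at all, can be sketched with a harmless internal entity that carries a unique marker. In this hedged example, build_entity_probe, entity_was_expanded, and local_echo are hypothetical helpers; local_echo stands in for the application under test and simply echoes the parsed name the way the PHP example does.

```python
import secrets
import xml.etree.ElementTree as ET


def build_entity_probe(marker):
    """Build a document whose <name> is filled by an internal entity."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE user [<!ENTITY probe "{marker}">]>\n'
        "<user><name>&probe;</name></user>"
    )


def entity_was_expanded(response_text, marker):
    """True if the response shows the marker instead of the raw reference."""
    return marker in response_text and "&probe;" not in response_text


def local_echo(xml_text):
    """Stand-in for the target (assumption): parse the XML and echo the name."""
    root = ET.fromstring(xml_text)
    return "User: " + (root.findtext("name") or "")


if __name__ == "__main__":
    marker = "probe-" + secrets.token_hex(4)
    response = local_echo(build_entity_probe(marker))
    print(entity_was_expanded(response, marker))
```

If the marker appears in the response, DTD processing is active and the file-reading and HTTP-request checks are worth attempting; if the document is rejected or the reference comes back unexpanded, entity processing is likely disabled.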


Basic XXE Exploitation Techniques

Local File Disclosure

Direct File Reading

Reading System Files:

Reading Configuration Files:

Reading Application Files:

Windows File System Access

Reading Windows System Files:

Reading IIS Configuration:

Reading Application Data:

Environment-Specific Paths

Common Linux Paths:

Common Windows Paths:

Server-Side Request Forgery (SSRF)

Internal Network Scanning

Port Scanning:

Service Discovery:

Internal API Access:

Cloud Metadata Access

AWS Metadata Service:

Google Cloud Metadata:

Azure Metadata Service:

Protocol Exploitation

FTP Protocol:

LDAP Protocol:

Gopher Protocol (if supported):


Advanced XXE Exploitation

Blind XXE Exploitation

Out-of-Band Data Exfiltration

Basic Out-of-Band XXE:

Parameter Entity Chaining:

External DTD (evil.dtd on attacker server):

Error-Based Blind XXE

XML Parse Error Exploitation:

Invalid URI Error:

Time-Based Blind XXE

Slow External Resource:

Conditional Time Delays:

XXE with Parameter Entities

Complex Parameter Entity Attacks

Nested Parameter Entities:

Parameter Entity with External DTD:

External DTD with Data Exfiltration:

UTF-16 Encoding Bypass

UTF-16BE Encoded XXE:

UTF-16LE Encoded XXE:
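
Why re-encoding helps against byte-level filters can be shown with a benign document. The sketch below, under the assumption that a filter scans the raw request body for ASCII strings such as <!ENTITY, encodes the same XML as UTF-16BE and UTF-16LE and shows that the filter no longer matches while the parser still reads the document. The names encode_utf16 and naive_filter_blocks are hypothetical, and the entity is internal and harmless.

```python
import codecs
import xml.etree.ElementTree as ET

DOC = (
    '<?xml version="1.0" encoding="UTF-16"?>'
    '<!DOCTYPE user [<!ENTITY who "John">]>'
    "<user><name>&who;</name></user>"
)


def encode_utf16(text, byte_order):
    """Encode text as UTF-16 with a byte order mark; byte_order is 'be' or 'le'."""
    if byte_order == "be":
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    if byte_order == "le":
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    raise ValueError("byte_order must be 'be' or 'le'")


def naive_filter_blocks(raw_bytes):
    """A simplistic filter (assumption) that looks for ASCII DTD keywords."""
    return b"<!ENTITY" in raw_bytes or b"<!DOCTYPE" in raw_bytes


if __name__ == "__main__":
    for order in ("be", "le"):
        body = encode_utf16(DOC, order)
        print(order, naive_filter_blocks(body), ET.fromstring(body).findtext("name"))
```

In UTF-16 every ASCII character is stored as two bytes, so the keyword bytes are interleaved with zero bytes and the ASCII pattern never appears contiguously. Filters therefore need to decode the body according to its declared or detected encoding before inspecting it.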

Protocol Handler Exploitation

PHP Wrapper Exploitation

PHP Filter for Base64 Encoding:

PHP Input Stream:

PHP Expect Wrapper (if enabled):

Data URI Exploitation

Data URI with Base64:

Data URI with URL Encoding:

Java-Specific Protocol Handlers

jar:// Protocol:

netdoc:// Protocol (Java):


Denial of Service Attacks

Billion Laughs Attack

Exponential Entity Expansion

Classic Billion Laughs:

Optimized Expansion Attack:
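
The scale of exponential expansion can be estimated without building or parsing any payload. This sketch assumes the commonly described structure, a base entity with a short value followed by several levels that each reference the previous level a fixed number of times; the function name and parameters are illustrative.

```python
def billion_laughs_size(base_value, fanout, levels):
    """Expanded length when each of `levels` entities repeats the previous one `fanout` times."""
    return len(base_value) * fanout ** levels


if __name__ == "__main__":
    # Base value "lol", ten references per level, nine nested levels.
    size = billion_laughs_size("lol", 10, 9)
    print(f"{size:,} characters after full expansion")
```

Each extra level multiplies the output by the fanout, so a document of well under a kilobyte can demand gigabytes of memory from a parser that expands entities without limits.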

Quadratic Blowup Attack

Large Entity Repetition: The quadratic blowup attack defines a single very large entity (thousands of characters) and references it many times within the XML document. Memory consumption grows with the product of the entity length and the number of references, so the expansion is quadratic rather than exponential and needs no nested entities, which lets it slip past limits on entity nesting depth. For example, define an entity of 10,000+ repeated characters and reference it 1,000+ times in the document content.
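
The arithmetic behind this can be checked with a short sketch that compares the size of the document sent with the size after expansion. It only builds the string and measures it; nothing is parsed. The function name and the element and entity names are assumptions for illustration.

```python
def quadratic_blowup_sizes(entity_length, reference_count):
    """Return (bytes sent, characters after expansion) for one large entity."""
    entity_value = "A" * entity_length
    prolog = '<!DOCTYPE d [<!ENTITY a "' + entity_value + '">]>'
    body = "<d>" + "&a;" * reference_count + "</d>"
    sent = len(prolog) + len(body)
    # Every reference is replaced by the full entity value.
    expanded = entity_length * reference_count
    return sent, expanded


if __name__ == "__main__":
    sent, expanded = quadratic_blowup_sizes(10_000, 1_000)
    print(f"sent: {sent:,} characters, expanded: {expanded:,} characters")
```

With a 10,000-character entity referenced 1,000 times, a document of roughly 13 KB expands to 10 million characters. Doubling both the entity length and the reference count roughly doubles the input but quadruples the expanded output.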

External Resource Exhaustion

Slow Loris Attack

Slow External Resource Loading:

Multiple Concurrent Requests:

Recursive Entity Loading

Infinite External Entity Loop:

Recursive DTD (recursive.dtd):


XXE in Different Application Contexts

Web Services and APIs

SOAP Web Services

SOAP Envelope XXE:

WSDL File XXE:

REST API with XML Content

XML Payload in REST:

RSS Feed Processing:

File Upload and Processing

Document Upload XXE

Microsoft Office Document XXE:

SVG File XXE:

Android APK Manifest XXE:

Configuration File Processing

XML Configuration Files:

Spring Bean Configuration:

Content Management Systems

WordPress XML-RPC

XML-RPC Method Call:

Drupal XML Processing

Drupal Feed Import:


XXE in Mobile Applications

Android Application XXE

XML Parsing in Android

Android XML Parser XXE:

Android Layout XML:

iOS Application XXE

iOS Plist Processing:


Advanced Exploitation Scenarios

Multi-Stage XXE Attacks

XXE to RCE Chain

Stage 1: File Discovery

Stage 2: Configuration Extraction

Stage 3: Remote Code Execution (if PHP expect enabled)

XXE to SSRF to Internal Network Compromise

Stage 1: Internal Network Discovery

Stage 2: Service Enumeration

Stage 3: Service Exploitation via Gopher
