May 2, 2025

HTML to PDF Renderer: A tale of local file access and shellcode execution

Authored by:

Alain

TL;DR

In a recent engagement, we found an HTML to PDF converter API endpoint that allowed us to list local directories and files on a remote server. One of the PDF files we created revealed that the converter was using a .NET renderer framework based on Chromium 62. With this, we were able to gain remote code execution by porting a Chromium 62 exploit to the particular version of the renderer.

Introduction and Server-Side XSS

In a recent engagement, we were reminded — once again — why in-depth manual penetration testing remains irreplaceable.

This particular case involved identifying and manually uncovering a series of classic vulnerabilities that no automated scanner would have flagged. What followed was a journey through misconfigurations, outdated components, a juicy Chrome 62 1-day, and some hands-on fun with V8 internals. So, grab a nice coffee!

Our approach in a nutshell: The service we tested allowed us to generate PDF invoices with dynamic content. The endpoint generating the invoice, reports/export/pdf, received multiple POST fields: a JWT token, a report title, and a massive string of URL-encoded HTML. Encoded HTML passed to a web server? That made our XSS senses tingle! Having worked with HTML to PDF renderers before, server-side XSS instantly came to my mind. In this particular setup, we can send malicious HTML tags or even JavaScript to the HTML renderer. This malicious code then gets evaluated on the remote server. Such attacks can lead to:

Server-side request forgery (SSRF) via iframes, images, or even JavaScript
Local file read via iframes
Internal data leak of cookies and application storage in general
JavaScript execution, if enabled in the browser
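To make the request shape more concrete, here is a minimal sketch of how such a form body could be assembled. The field names token, title, and html are assumptions for illustration only, as are the placeholder values; the real endpoint's field names are not given here.

```javascript
// Sketch only: "token", "title" and "html" are hypothetical field names.
function buildExportBody(jwt, title, html) {
  const params = new URLSearchParams();
  params.set("token", jwt);   // the JWT sent with the request
  params.set("title", title); // the report title
  params.set("html", html);   // the HTML that the server-side renderer will evaluate
  // toString() yields an application/x-www-form-urlencoded body.
  return params.toString();
}

// Placeholder values, not taken from the engagement.
const samplePayload = '<html><body><img src="https://example.invalid/ping"></body></html>';
const sampleBody = buildExportBody("JWT-PLACEHOLDER", "Invoice", samplePayload);
console.log(sampleBody);
```

The point is simply that whatever sits in the HTML field arrives at the renderer unchanged after URL decoding.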

So, let’s have a look at what we were able to do.

PDF Metadata

As we were testing a production environment and didn’t want to cause any alerts, we refrained from throwing payloads at the endpoint blindly. We needed a local testing setup. Luckily, the generated PDF had us covered:

The metadata indicated that the PDF stems from the EO.Pdf 18.3.46.0 module. A quick Google search revealed an EssentialObjects collection with the interesting component EO.Pdf. This library can convert HTML to PDF, which is precisely what we were looking for.

In our pentests, we always flag unnecessary metadata in PDFs, images, and documents: it is an avoidable information disclosure to potential attackers. In this case, it really helped us to identify the exact version we were dealing with, so we could develop our attack! (Hence, our warnings are actually relevant.)
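As a rough illustration of how such metadata can be pulled out of a file, the sketch below searches a PDF's raw text for literal-string entries such as /Producer or /Creator. Which field carried the EO.Pdf version is an assumption on my side, the sample string is made up, and the helper name readInfoField is hypothetical. It will not find metadata stored in compressed object streams or written as hex strings.

```javascript
// Sketch only: finds "/Key (value)" literal strings in uncompressed PDF text.
function readInfoField(pdfText, key) {
  // Value: any run of escaped characters or characters other than backslash and parentheses.
  const re = new RegExp("/" + key + "\\s*\\(((?:\\\\.|[^\\\\()])*)\\)");
  const match = re.exec(pdfText);
  return match ? match[1] : null;
}

// Made-up fragment of an Info dictionary for demonstration.
const sampleInfo = "<< /Title (Invoice) /Producer (EO.Pdf 18.3.46.0) >>";
console.log(readInfoField(sampleInfo, "Producer"));
```

In practice you would read the file as a latin1 string first, for example with Node's fs.readFileSync(path, "latin1"), and pass that string in.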

Server Side XSS in the EO.WebView framework

We quickly set up a local testing environment. Luckily enough, we could just download the EO.Pdf library from NuGet, a widely used .NET package manager. They even had the exact version we wanted available!

NuGet package manager can be used to obtain exact EO.Pdf library version

The documentation provides a short code example to convert an HTML file to a PDF:

EO.Pdf.HtmlToPdf.ConvertUrl("c:\\test.html", "c:\\result.pdf");

Ignoring the EO popup notifying us that we were using a trial license, the result.pdf was created successfully. Let’s test out some payloads! We started with SSRF, as we expected this to work even if JavaScript is not allowed in the renderer:

<html>
<body>
<img src="https://webhook.site/74db201f-292d-4197-b338-a468515182ef">
</body>
</html>

And we got a hit on our webhook! If the remote server uses this exact configuration, we already have an SSRF vulnerability.

Webhook called from the embedded image in the HTML

It looks like we got Chromium version 62 as the render engine. Boy, that’s old! It roughly matches the release date of the 18.3.46 NuGet package from early 2018. Chromium 62 was released in October 2017, so, naturally, the library shipped a Chrome release that was a few months old at the time.
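A webhook hit like this usually exposes the engine version through the User-Agent request header. Assuming that is where the version showed up, a small sketch for extracting it could look like the following. The User-Agent string is a made-up sample in the usual Chrome format, and chromeVersionFromUserAgent is a hypothetical helper name.

```javascript
// Sketch only: extracts the Chrome/Chromium version token from a User-Agent string.
function chromeVersionFromUserAgent(userAgent) {
  const match = /Chrome\/(\d+)\.(\d+)\.(\d+)\.(\d+)/.exec(userAgent);
  if (!match) return null;
  return {
    major: Number(match[1]),          // e.g. 62
    full: match.slice(1).join("."),   // e.g. "62.0.3202.9"
  };
}

// Made-up sample header value in the common Chrome format.
const sampleUserAgent =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/62.0.3202.9 Safari/537.36";
console.log(chromeVersionFromUserAgent(sampleUserAgent));
```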

So, we decided to fiddle around a bit more and try JavaScript and chrome://version to get more details. We tested JavaScript execution by dynamically appending an <iframe> element to the page, allowing us to embed “sub-pages” in our PDF:

<html>
<body>
<script>
  var iframe = document.createElement('iframe');
  iframe.height = 1000
  iframe.width = 1000
  iframe.src="chrome://version";

  document.body.appendChild(iframe)
</script>
</body>
</html>

Dynamic JavaScript execution and exact V8 version

Success! We could use iframes to embed pages and even execute JavaScript code! Nice. It looks like a full-blown Chromium browser. The screenshot above notes the exact V8 version 6.2.414.2 (V8 is the JavaScript engine used by Chromium), which will become important later.

Last but not least, we tried to read directories and files:

<html>
<body>
<script>
  var iframe = document.createElement('iframe');
  iframe.height = 300
  iframe.width = 1000
  iframe.src="file://C:/";

  document.body.appendChild(iframe);

  var iframe2 = document.createElement('iframe');
  iframe2.height = 300
  iframe2.width = 1000
  iframe2.src="file://C:/windows/win.ini";

  document.body.appendChild(iframe2);
</script>
</body>
</html>

Local directory listing and file read of C:/Windows/win.ini

The Chromium context allowed us to access the local file system. Neat, the severity of this bug just increased, as we could now browse the file system and access files!

Arbitrary file read is nice, but can we raise the severity of this finding even further, to critical? The browser is five years old. There must be a working RCE exploit for this version somewhere.

Exploiting the Chromium 62 engine

First, a disclaimer: I have minimal experience exploiting JavaScript engines. Although I knew the concepts in theory, I had never written a full V8 exploit before. Some of the concepts explained below might be imprecise or just incorrect. For a proper introduction to V8 exploitation, please consult one of the many V8 exploitation blog posts out there.

I suspected that the browser engine resides in the 51 MB (!) DLL file EO.WebEngine.dll. The .NET binary was heavily obfuscated and gave no clues on how the Chromium engine was embedded inside this DLL. Debugging the managed .NET DLL with x64dbg also didn’t hint at how the browser was embedded. Time for blind exploitation!

I was aiming for a bug that wasn’t caused by just-in-time (JIT) optimization, as I didn’t know if the TurboFan JIT was even enabled on the EO.Pdf Chromium build. Searching Google and the Chromium issue tracker for Chromium CVEs from recent years, I found CVE-2017-15428, which looked promising. Plugging in the proof of concept from the Chromium issue, we created a crash!

EO.Pdf crashing when converting HTML to PDF with CVE-2017-15428

Although this verified the bug’s existence, I had no idea how to exploit it. On 64-bit executables, the reg.lastIndex apparently controlled a register. In my case, I had no clue what was happening. So I decided to build a V8 development setup to test and triage further!

Building a simple V8 debug environment

There are many tutorials on how to build a V8 environment with depot_tools, the repository tool, and the build generator from Google. The steps are simple:

1. Get depot_tools and put it in your PATH (https://chromium.googlesource.com/chromium/tools/depot_tools.git)
2. Clone V8 into a folder: git clone https://chromium.googlesource.com/v8/v8
3. Fetch all tags: git fetch origin --tags
4. Check out the V8 version of our embedded Chromium: git reset --hard 62.0.3202.9
5. Run gclient sync to do magic stuff
6. Run hooks via gclient runhooks to do more magic stuff
7. Obtain all dependencies to build V8: ./build/install-build-deps.sh
8. Create 32-bit debug build: gm ia32.debug d8

Step 7 failed horribly: The 2017 V8 version uses OpenSSL 1.1, which is not supported on the recent Ubuntu 22.04 I was working on. So I grabbed an Ubuntu 16.04 Docker container, prayed that all the repository archives were still up, and eventually managed to build the exact V8 version 62.0.3202.9. Hurray!

Crashes and frustration

Running the RegEx CVE against the 32-bit debug build caused the same assert failure as described in the Chromium issue tracker:

root@32f21bbac6ef:/src/v8/out/ia32.debug# gdb --args ./d8 /tmp/CVE-2017-15428.js 
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
[...]
Reading symbols from ./d8...done.
gdb-peda$ r
Starting program: /src/v8/out/ia32.debug/d8 /tmp/CVE-2017-15428.js
[...]
[New Thread 0xf0f26b40 (LWP 93133)]
abort: CSA_ASSERT failed: IsFastRegExp(context, regexp) [../../src/builtins/builtins-regexp-gen.cc:2966]


==== JS stack trace =========================================

Security context: 0x43096655 <JSObject>#0#
    2: replace(this=0x23c8949d <Object map = 0x40d89a09>#1#,0x23c894b9 <JSRegExp <String[3]: abc>>#2#,0x23c8a179 <JSFunction (sfi = 0x43099d95)>#3#)
    3: tt [/tmp/CVE-2017-15428.js:22] [bytecode=0x43099e2d offset=79](this=0x23c849a5 <JSGlobal Object>#4#)
    4: /* anonymous */ [/tmp/CVE-2017-15428.js:26] [bytecode=0x430999c9 offset=43](this=0x23c849a5 <JSGlobal Object>#4#)

However, executing the proof of concept with the 32-bit release build only resulted in a NULL pointer dereference :(

NULL pointer dereference on the 32-bit release build with CVE-2017-15428

That was when I realized I had no clue what I was actually doing, and that it is a big step from a crashing PoC with a type confusion/use-after-free to actual remote code execution. It certainly wasn’t impossible to learn, but maybe the EO.Pdf-embedded Chromium wasn’t the best starting target. Back to square one!

Porting a Chrome 62 Exploit to EO.Pdf

I tried some more CVEs and PoCs, and most were only triggers of an out-of-bounds read/write that often didn’t work due to different heap layouts. Eventually, I found a full PoC for Chrome 62.0.3202.62 32-bit provided by a security researcher from the “Qihoo 360 Vulcan Team”. The exploit was planned to be used for Pwn2Own Mobile but was reported to Google instead. The bug itself resides in the WebAssembly part of the JavaScript engine and causes a use-after-free by growing a buffer while holding a reference to the old, non-grown, and freed buffer. The Chrome version used in the exploit was pretty close to the embedded Chromium version we are using. That looked promising!

But first, some more setup: Converting HTML to PDF wasn’t an ideal way to understand what was happening. I wanted to catch JavaScript exceptions, debugging output, and results from JavaScript calls during exploitation. Therefore, I changed the local setup to load a website in a dedicated, headless WebView and passed the JavaScript code directly to the browser engine:

ThreadRunner threadRunner = new ThreadRunner();

//Create a WebView through the ThreadRunner
WebView webView = threadRunner.CreateWebView();
threadRunner.Send(() =>
{
    //Load a page, otherwise we have no JS context for execution
    webView.LoadUrlAndWait("chrome://version");

    // Execute JS code
    var res = webView.EvalScript(File.ReadAllText("C:/pdftest/cve_test.js"));
    Console.WriteLine(res);
});

Furthermore, we didn’t want to be too noisy about our exploitation attempt. I thus disabled the crash reporting (automatically enabled, duh!) and caught the Chromium crash data in a dedicated CrashDataAvailable event handler:

static void Runtime_CrashDataAvailable(object sender, EO.Base.CrashDataEventArgs e)
{
    // Print the crash data locally, then terminate the process
    Console.WriteLine(e.ToString());
    System.Diagnostics.Process.GetCurrentProcess().Kill();
}
// [...]
EO.Base.Runtime.CrashDataAvailable += Runtime_CrashDataAvailable;
EO.Base.Runtime.EnableCrashReport = false;

Now I could use alert() to show debug messages, which were displayed in a .NET message box. This also paused the rendering process, which was very convenient for debugging. For example, I added the following code that shows the address of the (leaked) array buffer elements:

big_array_element_addr = (u2d(big_array[0x1f7e0]))[0] + 8 - 0x20000 * 8;
alert("big_array_element_addr: " + big_array_element_addr.toString(16));
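The exploit's own u2d helper is not shown here. Helpers of this kind typically reinterpret the 64 bits of a JavaScript double as two 32-bit integers and back, which is how an address hidden inside a double value becomes readable. The sketch below shows that generic conversion only; the names doubleToU32Pair and u32PairToDouble are hypothetical and not taken from the exploit.

```javascript
// Sketch only: reinterpret a double as [low, high] 32-bit halves (little-endian) and back.
function doubleToU32Pair(value) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, value, true);
  return [view.getUint32(0, true), view.getUint32(4, true)];
}

function u32PairToDouble(low, high) {
  const view = new DataView(new ArrayBuffer(8));
  view.setUint32(0, low, true);
  view.setUint32(4, high, true);
  return view.getFloat64(0, true);
}

// 1.0 is 0x3FF0000000000000 in IEEE 754, so the high half is 0x3ff00000.
const [lowHalf, highHalf] = doubleToU32Pair(1.0);
console.log(lowHalf.toString(16), highHalf.toString(16));
```

Using a DataView with an explicit little-endian flag keeps the result independent of the host's byte order, at the cost of a few more lines than the shared typed-array trick commonly seen in exploits.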

Opening the process list in x64dbg to inspect the memory held a surprise for us: The renderer was running with the --no-sandbox flag! Chromium usually sandboxes the renderer process, so a separate sandbox escape exploit is normally necessary to gain real remote code execution. It seemed like we could skip this step!

Chromium renderer process has the sandbox disabled!

If we inspect the memory at the leaked address, we see all the 2.2 double values as expected. Those 2.2 values are set in the exploit to initialize the buffer we abuse for our relative read/write primitives.

Address leak of array elements seems to work

When testing the exploit in the EO.Pdf renderer, unfortunately, the type-confused ArrayBuffer was not accepted as an argument for the DataView, although fake_ab instanceof ArrayBuffer was true. I assumed that some internal ArrayBuffer structures were wrong and eventually used my V8 debugging environment to get the memory layout of some internal V8 structures. When writing this blog post, I could not reproduce the issues, so let’s skip that less interesting part of exploitation. JavaScript is weird.

CALC!

Eventually, the shellcode was executed, and a calc popped up! Neat!

Luckily, the exploit also worked when using the single EO.Pdf.HtmlToPdf.ConvertUrl invocation, and this setup even increased the stability of the exploit to around 80%!

Closing Thoughts

This blog post shows how much impact the annoying finding “outdated version with known vulnerabilities”, as we often write in our reports, can have. And once more, we see how user input can’t and shouldn’t be trusted, even if it’s just some “harmless HTML rendering”. The customer could have disabled the JavaScript engine for the PDF conversion process, but often developers stop once the default and suggested method EO.Pdf.HtmlToPdf.ConvertUrl works. There is an overloaded variant of the ConvertUrl method which does accept HtmlToPdfOptions though. The HtmlToPdfOptions can deny local file access and disable the JavaScript engine altogether, preventing all the above exploits. Unfortunately, those are not set by default.
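As an extra layer on top of hardening the renderer itself, an application could also screen submitted HTML before rendering it. The sketch below is an assumption about what such a coarse check might look like, not something the customer or EO.Pdf provides, and the pattern list and the name screenHtml are hypothetical. Pattern matching on HTML is easy to bypass, so this can only complement disabling JavaScript and local file access in the renderer.

```javascript
// Sketch only: a coarse, bypassable screening of HTML before it reaches the renderer.
const RISKY_PATTERNS = [
  { name: "script element", re: /<\s*script\b/i },
  { name: "frame or embed element", re: /<\s*(iframe|frame|object|embed)\b/i },
  { name: "inline event handler", re: /\son[a-z]+\s*=/i },
  { name: "local or internal URL scheme", re: /\b(file|chrome|javascript):/i },
];

// Returns the names of all patterns found; an empty array means nothing was flagged.
function screenHtml(html) {
  return RISKY_PATTERNS.filter((pattern) => pattern.re.test(html)).map((pattern) => pattern.name);
}

const flagged = screenHtml('<script>var f = document.createElement("iframe"); f.src = "file://C:/";</script>');
console.log(flagged);
```

A request that gets flagged could be rejected or logged for review instead of being rendered.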

For me personally, it was fun to dive a little bit into Chromium/V8 exploitation. My initial concerns regarding a locked-down, custom Chromium build used by EO.Pdf were unfounded. According to the EO.Pdf documentation, they added a lot of .NET bindings for Chromium. Luckily, those didn’t hinder our exploitation. I still have no clue how to properly debug the embedded Chromium engine if it’s not being stopped with an alert box. Massive respect to all the V8 researchers: those exploits and skills are very advanced!

As said in the beginning, there is little to no chance that an automated security and vulnerability scan would have found such issues. So, if you need an in-depth pentest of your applications, write to us!
