
On Secure Boot, TPMs, SBAT, and downgrades -- Why Microsoft hasn't fixed BitLocker yet

This blog post is a follow-up to Windows BitLocker — Screwed without a Screwdriver. In that post, we looked at how easy it is to break into BitLocker’s default “Device Encryption” configuration on up-to-date Windows systems by simply downgrading the Windows bootloader.

One significant question that we haven’t yet addressed is: Why is this possible? Why hasn’t Microsoft fixed this yet? The answer is both simple and complicated. I fell into a rabbit hole investigating this and will share my findings here. I’ll first lay some groundwork on how Secure Boot and the TPM work, discuss PCRs and which ones you might use for BitLocker, and then explore the ecosystem’s future with SVN, SBAT, and key rotations that might brick your motherboard.

Rather watch a talk than read? Parts of this blog are talked about in my 38C3 Talk, linked at the bottom.

Secure Boot - How does it even work?

To understand the challenge of fixing the bootloader downgrade issue, we first need to explore how Secure Boot works and how it interacts with BitLocker. The first concept you need to know is the “measurement”. A measurement is a recorded statement about the boot, for example: “I am now booting a bootloader with hash XYZ”, or “The floppy drive is my default boot device.”

Using those, the secure booting process has two components: Measured Boot and Verified Boot.

Measured Boot records the integrity of the booted system in a way that others can verify. It records all boot components and enables features like remote attestation, or ensuring that only trusted Windows boots can unlock BitLocker.

Secure/Trusted/Verified Boot runs integrity checks, and makes decisions about what is allowed to boot. This runs on different levels, and can be called different names depending on the context. For example, with Secure Boot, only binaries signed by Microsoft are allowed to boot. In practice, two default certificates are installed on pretty much everything:

  • Microsoft 1st party certificate, which signs all Windows bootloaders
  • Microsoft 3rd party certificate, which signs everything else we commonly understand to boot under Secure Boot, like Linux shims.

Further, Secure Boot distinguishes two phases, the boot phase and the post-boot environment:

Boot phase: This includes everything from early boot up to running the OS — platform initialization, UEFI, early PCI-init, linux-shim, the bootloader. In the eyes of Secure Boot the integrity of these components is very important and nothing used in this phase should be malicious or vulnerable.

Post-boot environment, or runtime environment RT. It starts with the execution of the operating system kernel. While Secure Boot still applies protections here, this phase is considered less critical for the scope of Secure Boot. Operating system kernels are massive and inherently more vulnerable, and protections here are more “best effort.” For example, as we’ve seen during our lockdown bypass, the Linux kernel on Microsoft-signed distributions goes into lockdown mode automatically to protect against compromise, but this is bypassable.

To achieve these goals, Secure Boot relies on a “trusted” measurement device: the Trusted Platform Module (TPM). TPMs come in two variants: firmware-based (fTPM), which is integrated into the CPU’s management code, or discrete hardware-based (dTPM), which is a separate chip on the motherboard. Additionally, there are two major protocol variants, 1.2 and 2.0. They are similar-ish enough for our purposes, so we can ignore the differences here. The two most relevant functionalities for us are Platform Configuration Registers (PCRs), which keep track of the measurements, and the ability to perform cryptography based on established rules. Let’s look at those two more in detail:

Platform Configuration Registers (PCRs)

A Platform Configuration Register stores a hash, commonly sha256. There is no way to directly set a PCR to a specific value. Instead, we have to add “measurements” to them. This process involves combining the existing PCR content with a new measurement, hashing the result, and storing it back in the register. Additionally, each measurement is stored in an Event Log outside the TPM:

Image 1: Platform Configuration Register

By reading the Event Log and the PCR content, we can verify that all events in the event log were actually measured in the PCR and that no events are missing. Most platforms have 24 PCRs, with the first 8 used during the “early boot” phase. Notably, the first 16 cannot be reset except during a cold reboot. Once a measurement is hashed into one, it cannot be removed.
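To make the extend operation concrete, here is a minimal Python sketch of how a SHA-256 PCR evolves and how an Event Log can be replayed against it. The event contents are made up for illustration, and I assume the register starts at all zeros, which is typical but not guaranteed for every PCR on every platform.

```python
import hashlib

PCR_SIZE = 32  # bytes in one SHA-256 PCR


def extend(pcr: bytes, measurement_digest: bytes) -> bytes:
    """New PCR value = SHA-256(old PCR value || digest of the measured data)."""
    return hashlib.sha256(pcr + measurement_digest).digest()


def replay(event_digests: list[bytes]) -> bytes:
    """Recompute a PCR from the digests recorded in an event log."""
    pcr = bytes(PCR_SIZE)  # assumption: the PCR starts at all zeros after reset
    for digest in event_digests:
        pcr = extend(pcr, digest)
    return pcr


# Hypothetical events, not taken from a real log.
events = [
    hashlib.sha256(b"firmware blob").digest(),
    hashlib.sha256(b"bootloader v1").digest(),
]

pcr_value = replay(events)
# Dropping or reordering events yields a different register value,
# which is how a verifier detects a log that does not match the PCR.
print(pcr_value.hex())
print(replay(events[:1]) != pcr_value)
```

Because each new value depends on the previous one, the only way to reach a given PCR value is to extend the same digests in the same order.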

Here is what PCRs look like on a Windows 11 machine after boot:

Image 2: PCRs on Windows 11 after boot

As you can see, only the early-boot PCR0-7 and PCR11-14 are in use.

How does data get into PCRs? The boot process of modern computers is really quite involved. There are lots of different components from different manufacturers loading each other. A simplistic view might be:

  1. Platform
  2. UEFI
  3. Bootloader
  4. Kernel

Each of those components adds its own measurements. Simplified: There is a root of trust in the CPU. That is used to verify the first stage, the platform. The platform then hashes and/or signature-verifies the UEFI. The results are “measured” into PCR registers. It then adds a “separator” into all PCRs and hands execution off to the UEFI if verification succeeds. The UEFI now performs the same process for the bootloader, which in turn checks the kernel. This creates a trust chain, where each component verifies and measures the next one. As long as each component in the chain is trusted, everything is fine. But one component can never lie about itself, since the component before it took the hash.

Alright, that’s all fine and good, but what exactly goes into the PCRs?

What goes in which PCR?

Great question! It’s not easy to answer satisfactorily without devoting an entire blog post to this. So, let’s link some more detailed resources, in case you’d like to dig deeper. For the early-boot PCRs, there are specs: TCG_EFI_Platform_1_22_Final_-v15.pdf and TCG PC Client Platform Firmware Profile Specification (e.g., look for “Design Consideration and Distinctions Between PCR[0], PCR[2], and PCR[4]”). Another helpful high-level overview is provided in the Tianocore Documentation. While the exact contents differ from platform to platform, here’s a general breakdown of PCR contents:

  • PCR0 measures the manufacturer-provided firmware executable (e.g., UEFI, embedded UEFI drivers)
  • PCR1 measures the manufacturer-provided firmware data (e.g., microcode, ACPI Tables, Boot-Order)
  • PCR2 measures the user-configurable firmware code (e.g., PCI cards)
  • PCR3 measures the user-configurable firmware data (e.g., PCI cards)
  • PCR4 measures the bootloader code (usually from disk)
  • PCR5 measures the bootloader data (usually from disk)
  • PCR6 measures manufacturer-specific stuff (e.g., resume events from the S4 or S5 power states)
  • PCR7 measures the Secure Boot state
  • PCR11 is an OS-specific PCR. In Microsoft’s case, the bootloader uses it to “lock” VMK key derivation when booting the operating system. No process after the Microsoft bootloader can ever unseal a correct VMK via the TPM.

A great way to learn more about PCR measurements is to examine the specific values on your own system. Each measurement corresponds to an entry in the Event Log, which can be viewed using system tools:

  • Windows: Use tpmtool.exe.
  • Linux: Use tpm2-tools to run sudo tpm2_eventlog /sys/kernel/security/tpm0/binary_bios_measurements.

Let’s briefly look at some relevant events.

  1. The platform measures the firmware code into PCR0:
     - EventNum: 2
       PCRIndex: 0
       EventType: EV_EFI_PLATFORM_FIRMWARE_BLOB
     ...
  2. The platform measures that secure-boot is enabled into PCR7:
     - EventNum: 3
       PCRIndex: 7
       EventType: EV_EFI_VARIABLE_DRIVER_CONFIG
       DigestCount: 2
       Digests:
       - AlgorithmId: sha1
         Digest: "d4fdd1f14d4041494deb8fc990c45343d2277d08"
       - AlgorithmId: sha256
         Digest: "ccfc4bb32888a345bc8aeadaba552b627d99348c767681ab3141f5b01e40a40e"
       EventSize: 53
       Event:
         VariableName: 8be4df61-93ca-11d2-aa0d-00e098032b8c
         UnicodeNameLength: 10
         VariableDataLength: 1
         UnicodeName: SecureBoot
         VariableData: "01"
  3. EFI Boot drivers are measured into PCR2:
     - EventNum: 12
       PCRIndex: 2
       EventType: EV_EFI_BOOT_SERVICES_DRIVER
     ...
  4. UEFI application is measured into PCR4:
     - EventNum: 17
       PCRIndex: 4
       EventType: EV_EFI_BOOT_SERVICES_APPLICATION
     ...
  5. A “separator” is hashed into all PCR0-6, so no later component can “fake” more measurements and pretend an earlier component made them:
     - EventNum: 35
       PCRIndex: 0
       EventType: EV_SEPARATOR
     ...
  6. The Secure Boot Database is measured into PCR 7:
     - EventNum: 45
       PCRIndex: 7
       EventType: EV_EFI_VARIABLE_AUTHORITY
     ...
       Event:
     ...
         UnicodeName: db

In total, there are around 100 different measurements on my device.
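If you want to convince yourself that these digests are not magic, you can rebuild the measured data by hand. The sketch below reconstructs the SecureBoot variable event from the log excerpt above. It assumes the measured structure is laid out as GUID, name length, data length, UTF-16 name, data (my reading of the TCG firmware profile spec); the helper name uefi_variable_data is my own.

```python
import hashlib
import struct
import uuid


def uefi_variable_data(guid: str, name: str, data: bytes) -> bytes:
    """Serialize a UEFI variable the way it is (assumed to be) measured."""
    return (
        uuid.UUID(guid).bytes_le            # EFI GUIDs are stored mixed-endian
        + struct.pack("<Q", len(name))      # UnicodeNameLength, in characters
        + struct.pack("<Q", len(data))      # VariableDataLength, in bytes
        + name.encode("utf-16-le")          # name without a terminating NUL
        + data
    )


event = uefi_variable_data(
    "8be4df61-93ca-11d2-aa0d-00e098032b8c",  # VariableName GUID from the log
    "SecureBoot",
    b"\x01",                                 # VariableData: "01"
)

print(len(event))  # 53, the EventSize shown in the log
print(hashlib.sha256(event).hexdigest())
```

The length works out to 16 + 8 + 8 + 20 + 1 = 53 bytes, matching the EventSize field. If the layout assumption holds, the printed hash should equal the sha256 digest in the log entry; compare it against your own log rather than taking my word for it.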

TPM Cryptography

Alright, now that we have filled the PCRs with those magic numbers and roughly know what they represent, what can we actually do with them?

This is where the TPM’s cryptographic capabilities come into play. We can encrypt (or “seal”) values with the TPM, and apply a policy to only unlock those values when certain conditions are true. For instance, we could create a policy that only allows unsealing a key when PCR7 has a specific value or if the user provides a specific password. We can also combine multiple of those rules into a single policy.

BitLocker makes extensive use of that. The encrypted secrets themselves aren’t stored on the TPM, so they don’t take up precious high-security storage space. Instead, the TPM is used to validate the decryption request. You provide the encrypted key to the TPM, and if the policy conditions are met, the TPM unseals the data. If the policy doesn’t match, the request fails. In case of BitLocker, there is a great blogpost digging into how that looks exactly: A Deep Dive into TPM-based BitLocker Drive Encryption .
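As a mental model, sealing can be sketched in a few lines of Python. This is a toy, not the TPM 2.0 policy mechanism: the class ToyTPM, the blob layout, and the way the PCR values are combined are all invented for illustration, and a real TPM keeps the secret encrypted under a key that never leaves the chip.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SealedBlob:
    pcr_indices: tuple[int, ...]
    expected: bytes  # digest over the selected PCR values at seal time
    secret: bytes    # toy only: a real TPM would store this encrypted


class ToyTPM:
    def __init__(self) -> None:
        self.pcrs = {i: bytes(32) for i in range(24)}

    def extend(self, index: int, digest: bytes) -> None:
        self.pcrs[index] = hashlib.sha256(self.pcrs[index] + digest).digest()

    def _composite(self, indices: tuple[int, ...]) -> bytes:
        return hashlib.sha256(b"".join(self.pcrs[i] for i in indices)).digest()

    def seal(self, secret: bytes, indices: tuple[int, ...]) -> SealedBlob:
        return SealedBlob(indices, self._composite(indices), secret)

    def unseal(self, blob: SealedBlob) -> bytes:
        if self._composite(blob.pcr_indices) != blob.expected:
            raise PermissionError("PCR policy mismatch")
        return blob.secret


tpm = ToyTPM()
tpm.extend(7, hashlib.sha256(b"secure boot enabled").digest())
blob = tpm.seal(b"volume master key", (7, 11))
print(tpm.unseal(blob))  # PCRs unchanged since sealing, so this succeeds

# Extending PCR11 afterwards "locks" the blob for the rest of this boot.
tpm.extend(11, hashlib.sha256(b"lockout").digest())
try:
    tpm.unseal(blob)
except PermissionError as err:
    print("unseal refused:", err)
```

The last step mirrors the PCR11 trick mentioned earlier: once something has been extended into a PCR that the policy depends on, the same blob can no longer be unsealed until the next reboot.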

Selection of PCRs for Microsoft BitLocker

Great! We can use a TPM policy to unseal our VMK! But how do we determine the “correct” policy for this? Should we require a user password? Should we just use all available PCRs or select some? BitLocker gives you flexibility here. It can require a preboot password from the user or rely entirely on PCR values for automated unlocking. The latter is “easier” to use since users don’t have to remember and enter an additional password for each boot (very convenient). However, it is also less secure: If an attacker can manipulate the PCR values to match those of a legitimate environment, they could potentially derive the VMK.

On enterprise Windows, Microsoft BitLocker can be configured to use any PCR you like. It’s always a tradeoff between security and usability. Say, we seal the VMK to the value of PCR4, the bootloader code. Then, any bootloader update at any time will render the disk unable to automatically unseal, requiring the user to input the recovery password. To handle anticipated changes in the PCR values, we have some options. For example, we could:

  1. Disable BitLocker protection temporarily just before the reboot, then reboot once, and then reseal the VMK to the new PCR values. This will leave the disk unprotected for a short while.
  2. Simulate the expected PCR changes, then seal the VMK to the new values before rebooting. This method depends on the accuracy of the simulation :)
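The second option boils down to replaying the current Event Log with the digest of the updated component swapped in. A rough Python sketch follows. The log contents are hypothetical, and I use a plain SHA-256 over made-up bytes; real PCR4 entries for EFI binaries are, to my understanding, Authenticode-style image hashes rather than plain file hashes.

```python
import hashlib


def replay(digests: list[bytes]) -> bytes:
    """Recompute a SHA-256 PCR from a list of event digests."""
    pcr = bytes(32)
    for digest in digests:
        pcr = hashlib.sha256(pcr + digest).digest()
    return pcr


def predict_pcr(current_log: list[bytes], old: bytes, new: bytes) -> bytes:
    """Predict the PCR value after one measured component is replaced."""
    if old not in current_log:
        raise ValueError("component to be replaced is not in the log")
    return replay([new if digest == old else digest for digest in current_log])


# Hypothetical PCR4 log: a separator followed by one bootloader measurement.
separator = hashlib.sha256(bytes(4)).digest()
old_loader = hashlib.sha256(b"bootloader v1").digest()
new_loader = hashlib.sha256(b"bootloader v2").digest()
log = [separator, old_loader]

expected_after_update = predict_pcr(log, old_loader, new_loader)
print(expected_after_update.hex())
```

The prediction is only as good as the assumption that nothing else in the log changes, which is exactly what breaks when the firmware doing the measuring is updated as well.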

Luckily, most bootloader updates are handled via the Windows update function, giving Microsoft an opportunity to ensure the process goes smoothly. However, this has limits, as Microsoft doesn’t control much of the code that makes the measurements. Say, for instance, there is a UEFI update available. This will definitely cause PCR0 to change, as this contains the hash of the UEFI firmware. But the now updated UEFI is responsible for measurements itself. A firmware vendor could make changes to the measurement order or modify other PCR-related events in such an update. This makes predicting the outcome much more challenging.

Further complications arise when users manually update their BIOS or apply updates using third-party motherboard software. Enterprise-level threat detection tools can also interfere. Essentially, PCR values aren’t guaranteed to be stable across all systems. While there are systems where that works perfectly, if you are Microsoft and have millions of customers with devices, some will behave weirdly.

With that in mind, let’s consider the two variants of PCRs Microsoft used in the past:

  • Legacy configuration: PCR 0,2,4,11. This pins the code of the UEFI, PCI cards, and bootloader via their respective hashes. Every upgrade or downgrade causes a mismatch, forcing a recovery key prompt.
  • Current configuration: PCR 7,11 and Secure Boot-based. Relies on Secure Boot (PCR7) for stability and PCR11 for the custom “lockout” mechanism that ensures no one beyond the bootloader can unseal the VMK. While less secure than the legacy configuration, this approach is more user-friendly and far less likely to trigger recovery prompts.

Microsoft explains their reasoning in this documentation:

Modern Standby hardware is designed to reduce the likelihood that measurement values change and prompt the customer for the recovery key.

For software measurements, Device Encryption relies on measurements of the authority providing software components (based on code signing from manufacturers such as OEMs or Microsoft) instead of the precise hashes of the software components themselves. This permits servicing of components without changing the resulting measurement values. For configuration measurements, the values used are based on the boot security policy instead of the numerous other configuration settings recorded during startup. These values also change less frequently. The result is that Device Encryption is enabled on appropriate hardware in a user-friendly way while also protecting data.

For some additional reading on challenges related to PCR selection, I recommend a discussion around usage of PCR0 in fwupd, and some more writing by Lennart Poettering on the Future of Encryption in Fedora desktop variants.

A Wild Patch Appears - The PCR4 Scare

At one point during my research, when I had just gotten the BitPixie Exploit to work, I shared it with a coworker who wanted to reproduce the results. I have to admit - my documentation wasn’t great but even when discussing the issues, the exploit just wouldn’t work for him. Turns out: Microsoft had rolled out a patch to re-include PCR 4 into the default Device Encryption configuration! They had even migrated the TPM protectors on devices that applied the update, from PCR 7,11 to PCR 4,7,11. They had done this as a reaction to CVE-2024-38058, “BitLocker Security Feature Bypass Vulnerability”.

With this, they pinned the bootloader by both hash and Secure Boot status. Downgrades were still possible, but the TPM would refuse to unseal the VMK, since PCR4 would mismatch. But recall what we said earlier: sealing to PCR4 increases the likelihood of BitLocker recovery screens whenever the automated unlock fails for some reason. Well…

This has made a lot of people very angry and been widely regarded as a bad move

Image 3: Newspaper headings after 2024/07 update

So, just one month later, they undid the fix!

Why was the fix for this vulnerability disabled and how can I apply protections to address this issue?

When customers applied the fix for this vulnerability to their devices, we received feedback about firmware incompatibility issues that were causing BitLocker to go into recovery mode on some devices. As a result, with the release of the August 2024 security updates we are disabling this fix. Customers who want this protection can apply the mitigations described in KB5025885.

This is an excellent example of why it is so complicated for Microsoft to fix this class of issues: Anything they change will break at least some systems.

Secure Boot Certificates

One way in which Microsoft could make changes is through Secure Boot certificates. Secure Boot is in its own little ecosystem there.

Only specifically signed programs can boot when Secure Boot is enabled in UEFI. This is done with a certificate chain and a full key hierarchy. It starts off with a “Platform Key” that is embedded in the firmware:

  • PK - PlatformKey. Vendor controls this, usually rsa2048. Manages KEK.
  • KEK - Key Exchange Key, rsa2048.
    • Could directly sign bootable content, usually doesn’t.
    • Can update DB and DBX.
  • DB - Signature Database. List of sha256hash OR rsa2048 public keys of certs, that are ALLOWED to run.
  • DBX - Database of FORBIDDEN hashes/signatures.

On Linux, when using shim, we additionally have:

  • MOK - Machine Owner Key. List of hashes/pubkeys. User can provide this.
  • MOKX - Machine owner deny list.

On almost all devices, the default Microsoft certificates are enrolled:

  • 1x KEK from Microsoft: Microsoft Corporation KEK CA 2011
  • 2x DB key from Microsoft:
    • 1st party cert: Microsoft Windows Production PCA 2011 (used for signing Microsoft 1st party bootloaders)
    • 3rd party cert: Microsoft UEFI CA 2011 (used, e.g., for Linux shim, a Linux UEFI secure boot compatible bootloader)

This creates two “classes” of signed bootloaders: Windows native and “other”. PCR7 contains a measurement with the following information: (1) secure boot is enabled, (2) both the available secure boot keys and revocations, and (3) which certificate was actually used to sign the bootloader. This makes it easy for Microsoft to make the distinction that all legit Windows bootloaders (i.e., those signed with PCA2011) can unseal BitLocker disks, whereas no other bootloader is allowed to do so.
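The allow/deny decision itself can be sketched roughly like this in Python. Signature verification is abstracted away: the signer argument stands for the name of the certificate that validly signed the image, and SecureBootState and may_boot are names I made up, not UEFI APIs.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SecureBootState:
    db_hashes: set = field(default_factory=set)   # allowed image hashes
    db_certs: set = field(default_factory=set)    # allowed signing certs
    dbx_hashes: set = field(default_factory=set)  # forbidden image hashes
    dbx_certs: set = field(default_factory=set)   # forbidden signing certs


def may_boot(image: bytes, signer: Optional[str], sb: SecureBootState) -> bool:
    digest = hashlib.sha256(image).digest()
    # The deny list always wins over the allow list.
    if digest in sb.dbx_hashes or signer in sb.dbx_certs:
        return False
    return digest in sb.db_hashes or signer in sb.db_certs


PCA_2011 = "Microsoft Windows Production PCA 2011"
state = SecureBootState(db_certs={PCA_2011, "Microsoft UEFI CA 2011"})

old_bootmgr = b"hypothetical vulnerable bootloader"
print(may_boot(old_bootmgr, PCA_2011, state))  # True: signed by a trusted cert

# Revoking a single binary means adding its hash to DBX.
state.dbx_hashes.add(hashlib.sha256(old_bootmgr).digest())
print(may_boot(old_bootmgr, PCA_2011, state))  # False
```

Note that the check has no notion of versions: an old, validly signed bootloader is exactly as acceptable as the newest one until its hash or its signing certificate ends up in DBX.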

In case you want to customize the contents of your Secure Boot setup, there is a document by the NSA that goes a lot deeper here: UEFI Secure Boot Customization. It discusses the trust chain, what each signing component does, and how you might update PK/ KEK/ DB/ DBX/ MOK/ MOKX.

Revocations, They Scream!

We have a certificate-based system, right? Isn’t revocation a thing? If we have a known vulnerable bootloader, can’t we just revoke it? Shouldn’t this be solved? Well, yes! It would be great if we could revoke all vulnerable bootloaders. In fact, we could! That is precisely what the DBX database is for. Except… The standard doesn’t really consider those kinds of mass-revocations. The space available for revocations is quite limited, usually 32kB. All of the Microsoft bootloaders are signed by the same certificate, so we cannot just revoke some intermediate-cert. If we want to revoke every old Microsoft bootloader for the last 15 years (as they all have this bug), we would need to revoke every single signature. There is no version-based revocation or the like in the specification!

And that’s not all! Microsoft has to share that sparse revocation space with all 3rd party revocations, as we can see from a comment in the shim documentation:

As part of the recent “BootHole” security incident CVE-2020-10713, 3 certificates and 150 image hashes were added to the UEFI Secure Boot revocation database dbx on the popular x64 architecture. This single revocation event consumes 10kB of the 32kB, or roughly one-third, of revocation storage typically available on UEFI platforms. Due to the way that UEFI merges revocation lists, this plus prior revocation events can result in a dbx that is almost 15kB in size, approaching 50% capacity.
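A quick back-of-the-envelope calculation shows how tight this is. The sizes below are my reading of the UEFI signature list layout (a 28-byte list header, and per SHA-256 entry a 16-byte owner GUID plus the 32-byte hash); the estimate ignores authentication headers and any vendor-specific overhead, so treat it as an upper bound.

```python
DBX_BYTES = 32 * 1024          # typical revocation storage, per the quote above
SIGNATURE_LIST_HEADER = 28     # assumed: type GUID + three 32-bit size fields
SHA256_ENTRY = 16 + 32         # assumed: owner GUID + SHA-256 hash

max_hashes = (DBX_BYTES - SIGNATURE_LIST_HEADER) // SHA256_ENTRY
boothole_hash_bytes = 150 * SHA256_ENTRY

print(max_hashes)           # 682
print(boothole_hash_bytes)  # 7200
```

So even a completely empty DBX holds fewer than 700 image hashes under these assumptions, and the 150 hashes from BootHole alone account for about 7 kB of the 10 kB mentioned in the quote. Revoking every bootloader build Microsoft shipped over 15 years by hash is clearly not going to fit.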

All official Microsoft DB/DBX contents are available on GitHub, with an amazingly detailed writeup by auscitte of how they get rolled out.

But wait! What is stopping us from revoking the primary cert? As it turns out, nothing! (*well, actually, quite a lot, but in theory…)

A Way Forward: Rolling the Secure Boot Certificates (KB5025885)

Microsoft controls the KEK on most devices and could, as such, update both the DB and DBX databases at will. In fact, this is precisely what their recommended solution (KB5025885) to these issues does: Blacklist the old PCA2011 certificate, and add a new 2023 one to DB. Microsoft already offers bootloaders signed with the new key. If you opt-in, downgrades to older bootloaders are no longer possible.

This has some usability issues, though. Old boot media will fail to boot, making it more difficult to recover such a system. Also, DB/DBX updates on this scale have never been done. There are, for sure, scruffy vendors out there that will have bugs when the primary Microsoft keys are revoked. There is a section of known issues in the KB (they don’t go into details about the issues, though).

Making changes to Secure Boot, with a lot of vendor UEFIs in the mix that might not be entirely up-to-spec, and rolling Secure Boot certificates, which has never been done at scale before… I’m sure this will just work splendidly! Though, in all honesty, I wish them the best of luck.

This issue with dwindling revocation space coincides with the expiration of the old certificates in October 2026, so Microsoft has to roll them anyway. Old systems will likely continue to boot, even with expired certificates, but they might not get the latest security patches. More great info on this whole plan for rolling Secure Boot keys is available in the form of a presentation by Microsoft engineers held at ‘UEFI Fall Conference’ in October 2023: Evolving the Secure Boot Ecosystem. They have excellent quotes such as

Some OEM specific firmware implementations prevent updates to the Secure Boot variable store or brick a system entirely.

That’s Not Enough! The KEK Expires as Well!

Well, shit. DB and DBX updates of our signing keys aren’t enough: the KEK expires in 2026 as well. If Microsoft wants to keep the ability to make DB/DBX updates, they’ll need to roll the KEK too. However, the KEK can only be updated with the Platform Key (PK), which the vendors hold, so this requires the help of all vendors. If they have trashed their keys, better hope there is a way to replace them! Going back to the UEFI presentation linked earlier:

In fact some number of devices will likely become un-serviceable due to this event.

What Microsoft likely means here is that some platforms simply can’t receive DB/DBX updates — not that they won’t boot at all. It helps explain why the issue remains unresolved: Microsoft has neither announced a concrete timeline for a fix nor stuck to a single plan, and any course of action is bound to generate negative press in the coming years. The added security benefits are difficult to explain, and secure boot already faces resistance from many users.

How to Prevent This Going Forward: SVN

Rolling certs is all well and good, but how do we prevent the same issue of a lack of revocation space in the future? SVN! No, not subversion, Secure Version Number. This is a rollback protection that does not rely on certificate revocations, implemented in the bootloader itself. Little public information is available, as far as I know, but the plan seems similar to what Linux is planning with SBAT, described in the next section.

The main idea is that UEFI has a bit of storage in non-volatile memory that a bootloader can use to store data. Further, the bootloader can specify that only the boot environment is allowed to access/write that data. Those variables are locked down as soon as we exit the boot phase and enter the runtime environment. This makes it possible for a bootloader to carry a “Secure Version Number” that is compared to the one stored in NVRAM. When the bootloader’s own number is greater, it stores the greater number. When it is lower, it refuses to boot.
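In Python-flavoured pseudologic, the check described above might look like the following. Since little is public about Microsoft’s implementation, the variable name, the dict standing in for NVRAM, and the function check_svn are purely illustrative.

```python
class RollbackError(Exception):
    """Raised when a bootloader is older than what the platform has seen."""


def check_svn(own_svn: int, nvram: dict) -> None:
    """Compare the bootloader's SVN with the stored one and ratchet upwards."""
    stored = nvram.get("svn", 0)  # assumption: a missing value counts as 0
    if own_svn < stored:
        raise RollbackError(f"SVN {own_svn} is older than stored SVN {stored}")
    if own_svn > stored:
        nvram["svn"] = own_svn    # only writable while still in the boot phase


nvram = {}                 # stand-in for a boot-services-only UEFI variable
check_svn(3, nvram)        # first boot of a newer bootloader raises the floor
check_svn(3, nvram)        # same version boots fine
try:
    check_svn(2, nvram)    # downgraded bootloader
except RollbackError as err:
    print("refusing to boot:", err)
```

The stored number only ever goes up, so a single integer replaces what would otherwise be one DBX entry per revoked binary.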

This makes it possible to revoke a whole class of bootloaders at once, without having to deal with certificates or revocation lists. It does open up new attack surface: an attacker might be able to influence this NVRAM, reset its contents, and get old bootloaders to boot anyway. Yet, it is still strictly better than the old situation, in which known vulnerable bootloaders cannot be revoked at all. Bootloaders installed after applying KB5025885 already support SVN. Every time the SVN is bumped, all old boot media containing a bootloader with an older SVN will refuse to boot on updated systems. However, this is fully intended here.

The Linux Side of Things: SBAT

Let’s now look at the Linux side of things, just for giggles. Could changes be coming here as well?

The most notable change is the introduction of SBAT (Secure Boot Advanced Targeting). This is the Linux shim equivalent of Microsoft’s SVN, and is already enforced in most distributions. The best documentation on SBAT is available from the shim documentation. The rough idea is similar to the SVN: We want to be able to revoke much more easily.

At the time of this writing, revoking a Linux kernel with a lockdown compromise is not spelled out as a requirement for shim signing. In fact, with limited dbx space and the size of the attack surface for lockdown it would be impractical do so without SBAT. With SBAT it should be possible to raise the bar, and treat lockdown bugs that would allow a kexec of a tampered kernel as revocations.

SBAT has already been rolled out, so we can briefly examine how it looks in practice.

For example, the current Debian Shim includes the following (hardcoded) SBAT limits:

sbat,1,2022052400
grub,2
sbat,1,2023012900
shim,2
grub,3
grub.debian,4

When shim boots grub, it first checks the signature. If it checks out, it reads the metadata contained in grub itself to see what security level grub advertises. If it is lower than expected, it refuses to boot. In grub metadata, this looks like:

sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
grub,3,Free Software Foundation,grub,2.12~rc1,https://www.gnu.org/software/grub/
grub.debian,4,Debian,grub2,2.12~rc1-3,https://tracker.debian.org/pkg/grub2
grub.debian13,1,Debian,grub2,2.12~rc1-3,https://tracker.debian.org/pkg/grub2
grub.peimage,1,Canonical,grub2,2.12~rc1-3,https://salsa.debian.org/grub-team/grub/-/blob/master/debian/patches/secure-boot/efi-use-peimage-shim.patch

In this case, we have grub version 3 and grub.debian version 4, so we are allowed to boot. The SBAT levels that shim checks against are, again, stored in NVRAM:

Image 4: Contents of NVRAM showing the SbatLevel
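The comparison shim performs can be approximated in a few lines of Python. This is a simplified sketch, not shim’s actual code: I assume the second block of the Debian limits shown earlier is the applicable one, and that a component is acceptable when its generation number is at least the minimum listed for it.

```python
def parse_levels(csv_text: str) -> dict:
    """Map component name -> generation number from SBAT-style CSV lines."""
    levels = {}
    for line in csv_text.strip().splitlines():
        if line.strip():
            fields = line.split(",")
            levels[fields[0]] = int(fields[1])
    return levels


def sbat_allows(image_metadata: str, revocations: str) -> bool:
    """True if no component in the image is below its required generation."""
    minimums = parse_levels(revocations)
    image = parse_levels(image_metadata)
    return all(
        generation >= minimums[name]
        for name, generation in image.items()
        if name in minimums  # components without a listed minimum pass
    )


REVOCATIONS = """sbat,1,2023012900
shim,2
grub,3
grub.debian,4"""

GRUB_METADATA = """sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
grub,3,Free Software Foundation,grub,2.12~rc1,https://www.gnu.org/software/grub/
grub.debian,4,Debian,grub2,2.12~rc1-3,https://tracker.debian.org/pkg/grub2
grub.debian13,1,Debian,grub2,2.12~rc1-3,https://tracker.debian.org/pkg/grub2"""

print(sbat_allows(GRUB_METADATA, REVOCATIONS))  # True: grub 3 >= 3, grub.debian 4 >= 4
```

Bumping a single line in the revocation data, say grub,4, would reject every grub build that still advertises generation 3, regardless of who signed it.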

Note that these SBAT values can be written by any bootloader. Microsoft recently, in their 08/24 security update, updated the sbat values to revoke vulnerable grub bootloaders. This caused a lot of problems for folks with dual-boot setups. Their Linux, previously booting fine, suddenly refused to boot after applying a Windows background update, with an inscrutable error message: Verifying shim SBAT data failed: Security Policy Violation (Can’t boot Debian 12. Security Policy Violation).

One other issue that came up again with SBAT was the recent push towards eliminating bootloaders and directly booting into a Linux kernel. This concept, known as the “unified kernel image” (UKI), offers several advantages. However, it also exposes some challenges. The upstream Linux kernel community often takes the stance that only the latest version of each tree is secure. This led to an extensive discussion on mailing lists and LWN. If you are in need of some good quotes, go and read LWN: Much ado about SBAT. The issues surfaced there are indeed quite complex: Who determines this security version number? Patches might get downstreamed non-linearly, so it’s hard for a single number to capture the complexity of security. What bug is “severe” enough to warrant a bump in that number?

If you want to dig deeper into unified kernel images and Linux booting in general, a good starting point is the set of resources by Lennart Poettering: Brave New Trusted Boot World, a discussion on LWN about that, as well as some forum posts here.

Sidenote: Machine Owner Keys: When you read about Linux Secure Boot, sooner or later you’ll come across Machine Owner Keys, or MOKs. These keys enable users to add custom signing keys into a Linux Shim, allowing secure boot for custom kernels or kernel modules. The MOK itself is stored in this same NVRAM section as the SBAT and SVN data. Changes to the MOK can only be made through the MOKManager, which is executed before entering the runtime environment where the Linux kernel starts. This separation ensures the MOK is protected even from kernel-level changes. However, if you use a default Ubuntu signed shim and install your own MOK, you are still relying on the Microsoft 3rd party certificate.

Other Attacks Against BitLocker

To wrap up this research, let’s revisit BitLocker and explore other potential attack vectors. While the following list is not exhaustive, it highlights the complexity of achieving secure TPM-only encryption. Some more are detailed in Wack0’s excellent GitHub repository on BitLocker attacks, which served as inspiration for this exploration.

There are three main exploit categories:

  1. Attack the bootloader or Windows after it has unsealed the key. For example, consider CVE-2022-41099 – Analysis of a BitLocker Drive Encryption Bypass. This vulnerability exploited a flaw in the Windows recovery environment. By resetting the computer and choosing to “remove everything,” an attacker could reset the system at approximately 98% progress, bypassing encryption safeguards. Thankfully, this issue has been patched in all updated systems, as the flaw was relatively straightforward to fix.
  2. “Reset” PCR states while retaining code execution. There are both hardware and software attacks that allow resetting the PCRs, which could then be filled with bogus measurements.
  3. Break or compromise the TPM hardware. Directly attack the TPM chip to access key material, or unlock with wrong policies.

TPM hardware attacks

As far as hardware attacks go, you have to distinguish between discrete (dTPM) and firmware (fTPM) TPMs. Discrete TPMs are a separate chip on the board, and can be accessed much more easily.

dTPM: Just sniff the bus! It is unencrypted, and transfers the BitLocker VMK in plaintext.

dTPM: While the system is offline, power on just the TPM and put arbitrary values on the bus. You can unseal anything you know PCR measurement logs for.

dTPM: While the system is running, reset the TPM by pulling either power or reset pins low. This can be done with a set of tweezers.

fTPM: Has known fault-injection attacks on AMD (faulTPM)

Cold boot attack: Leak key from RAM after bootloader unsealed it, by resetting the PC and booting into something else.

  • Microsoft says: “To defend against malicious reset attacks, BitLocker uses the TCG Reset Attack Mitigation, also known as MOR bit (Memory Overwrite Request), before extracting keys into memory.”
  • This relies on UEFI implementing this overwrite request correctly, though.
  • Recent blog: Dumping Memory to Bypass BitLocker on Windows 11
  • HackerNews Discussion on that: https://news.ycombinator.com/item?id=42552227, with the blog-author adding: “what I’m guessing is that Microsoft is not accounting for every possible place the key can end up when they’re destroying secrets.”

TPM software attacks

dTPM: PCH can be reconfigured to assign the TPM reset pin as a generic GPIO. This allows software to reset the TPM via GPIO write.

dTPM & fTPM: Put system in S3 sleep mode. That fully turns off all peripherals, including TPM! -> TPM cannot remember PCR state.

  • TPM should save state to non-volatile memory. But that is done on command of the OS, and can simply be patched out (e.g., with Linux kprobes)
  • Specs don’t say what TPM should do on resume-from-boot if there is no PCR data saved in non volatile memory! -> some devices are broken (“secure” devices don’t reset PCRs to zero on wake from S3 without saved PCR values)
  • A Bad Dream: Subverting Trusted Platform Module While You Are Sleeping
  • Ready-made tool: bitleaker

dTPM & fTPM: Hardware debugger enabled too early without measuring anything -> get tpm-code-exec on clean slate, write any PCRs you want

Conclusion

This was a fun journey! Not only did we discuss Windows Secure Boot but also Linux and TPMs. Drawing a succinct conclusion here is hard, since so much depends on your concrete threat model. Getting encryption to the masses, in a user-friendly way and with “good enough” security is a great goal. But getting actual secure encryption is another.

I feel that discrete TPMs, as they are currently implemented on most x86 platforms, aren’t great. Historically, there were more attacks against dTPMs than fTPMs, even though dTPMs are, in theory, way better certified and resistant against the “big guns” of fault injections, decapping, and the like. From my research, it seems that a lot of the attacks were much broader and attacked the integration of the TPM in the wider ecosystem rather than TPM cryptography.

However, firmware TPMs had some of the same integration issues, just fewer of them. In any case, relying just on Secure Boot, the way it is right now, with the Microsoft keys enrolled, is likely quite insecure. Adding a simple PIN or password makes any attack way harder, even when an attacker has physical access.

Revocations are impossible right now, though Linux is ahead on that front since many non-SBAT-capable shims have already been revoked due to severe unrelated security issues. There isn’t really a prevention against downgrade attacks on the BitLocker front, except for Microsoft rolling their keys, which will likely, and unfortunately, be a bit painful.

Let’s pour one out for all the affected users and sysadmins of the future. If you run BitLocker in your environment, think about your threat model, and if you would rather have more recovery screens (locking on PCR4), more user hassle (preboot authentication), or more “unknowns” (upgrading to the 2023 certificates).

I hope you enjoyed this journey and learned as much as I did! I’d love to talk to you if you have any further remarks or questions. Hit me up via thomas (at) neodyme.io.

38C3 Talk - Windows BitLocker — Screwed without a Screwdriver

I presented this research at 38C3. You can find the presentation here: https://media.ccc.de/v/38c3-windows-bitlocker-screwed-without-a-screwdriver