Hacking Hardware Security Modules

Security researchers Gabriel Campana and Jean-Baptiste Bédrune are giving a hardware security module (HSM) talk at BlackHat in August:

This highly technical presentation targets an HSM manufactured by a vendor whose solutions are usually found in major banks and large cloud service providers. It will demonstrate several attack paths, some of them allowing unauthenticated attackers to take full control of the HSM. The presented attacks allow retrieving all HSM secrets remotely, including cryptographic keys and administrator credentials. Finally, we exploit a cryptographic bug in the firmware signature verification to upload a modified firmware to the HSM. This firmware includes a persistent backdoor that survives a firmware update.

They have an academic paper in French, and a presentation of the work. Here’s a summary in English.

There were plenty of technical challenges to solve along the way, in what was clearly a thorough and professional piece of vulnerability research:

  1. They started by using legitimate SDK access to their test HSM to upload a firmware module that would give them a shell inside the HSM. Note that this SDK access was used to discover the attacks, but is not necessary to exploit them.
  2. They then used the shell to run a fuzzer on the internal implementation of PKCS#11 commands to find reliable, exploitable buffer overflows.
  3. They checked that they could exploit these buffer overflows from outside the HSM, i.e. by just calling the PKCS#11 driver from the host machine.
  4. They then wrote a payload that would override access control and, via another issue in the HSM, allow them to upload arbitrary (unsigned) firmware. It’s important to note that this backdoor is persistent: a subsequent update will not fix it.
  5. They then wrote a module that would dump all the HSM secrets, and uploaded it to the HSM.
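The fuzzing step above (step 2) can be sketched in miniature. This is not the researchers' actual tooling — just a minimal mutation fuzzer in Python, run against a deliberately buggy stand-in for an internal command handler (all names here are illustrative):

```python
import random

def parse_command(buf: bytes) -> str:
    # Stand-in for an internal PKCS#11 command handler: a length-prefixed
    # payload "copied" into a fixed-size buffer without a bounds check.
    FIXED_BUF = 8
    if len(buf) < 1:
        raise ValueError("truncated")
    declared_len = buf[0]
    payload = buf[1:1 + declared_len]
    if len(payload) > FIXED_BUF:   # the missing check a fuzzer would find
        raise OverflowError(f"overflow: {len(payload)} > {FIXED_BUF}")
    return payload.hex()

def mutate(seed: bytes, rng: random.Random) -> bytes:
    # Flip a few random bytes of a well-formed seed input.
    buf = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(iterations: int = 10_000) -> list[bytes]:
    rng = random.Random(1234)
    seed = bytes([4, 1, 2, 3, 4] + [0] * 12)  # length byte 4, 4-byte payload
    crashes = []
    for _ in range(iterations):
        case = mutate(seed, rng)
        try:
            parse_command(case)
        except OverflowError:
            crashes.append(case)   # a reproducer worth triaging
        except ValueError:
            pass                   # malformed but safely rejected input
    return crashes

if __name__ == "__main__":
    print(f"{len(fuzz())} crashing inputs")
```

The real work was of course harder: the fuzzer ran *inside* the HSM via their shell, and each crash then had to be reproduced from the host-side PKCS#11 driver to confirm external exploitability (step 3).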

Posted on June 20, 2019 at 6:56 AM • 17 Comments

Comments

Who? June 20, 2019 8:17 AM

That is odd. I would never connect an HSM to the Internet or other large (i.e. untrusted) networks, including corporate-wide intranets.

An old computer running DOS would be enough, provided it has a good entropy source.

I understand large corporations have different requirements; they will not be able to buy an HSM for each small office. But we must accept that if our device is connected to an untrusted network, we should expect something bad to happen to it.

Cassandra June 20, 2019 8:42 AM

There is a mindset that ‘hardware’ is more secure than software, and that hardware whose vendor can point to important-sounding certifications is yet more ‘secure’*. There are plenty of people with low-emissivity/high-absorptivity headgear willing to take advantage of such naïveté. There was a time when some people thought that simply putting a firewall at the edge of your network was almost all that needed doing to provide adequate security (whatever that means), and decades of counter-examples have not fully removed that misconception.

This is good work, and helps to demonstrate to people that there is more to security than simply throwing special-sounding hardware at the problem.

*There is a current vogue for assuming that the use of hardware tokens assures invulnerability, which, as Yubikey is currently demonstrating, ain’t necessarily so.

scot June 20, 2019 10:12 AM

The Google translation of the first sentence:

“Hardware Security Modules (HSMs) are electronic devices used as trusted cryptographic bricks in environments requiring high security requirements.”

I think now I’m going to have to name our HSMs “brick1”, “brick2”, “brick3”, etc.

Z.Lozinski June 20, 2019 10:57 AM

@Who:
That is odd. I would never connect a HSM to the Internet or other large (i.e. untrusted) networks, including corporate-wide intranets.

Think about a public cloud, which needs somewhere to store master keys. You have to store the keys somewhere within the infrastructure. These can be the keys of the cloud provider, or keys provided by the users of the cloud (BYOK, or bring-your-own-key). Best practice when running applications in a public cloud is for an enterprise to use its own keys. AWS and IBM Cloud both have processes to allow BYOK. (You might choose to separate cloud keys from BYOKs; in fact, I think that has just become very important.)

We can use OpenStack as an example. The major cloud providers all have proprietary cloud platforms, but telco clouds often use OpenStack. In OpenStack, the Barbican component provides a key management API. Barbican expects to have a back-end secret store where the actual passwords, keys, etc. are stored. The secret store can be implemented as an encrypted database, but for high security an HSM is preferred. The HSM is typically attached to an internal network, and indeed there may be more than one HSM for high availability. The HSM is accessed from a server via PKCS#11 (which is just another crypto API).
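As an illustration of how this is wired up, a Barbican PKCS#11 back end is configured in `barbican.conf` roughly as follows. The paths, labels, and slot number are illustrative placeholders, not a working deployment; consult the Barbican documentation for the exact option names for your release:

```ini
[secretstore]
enabled_secretstore_plugins = store_crypto

[crypto]
enabled_crypto_plugins = p11_crypto

[p11_crypto_plugin]
# Vendor-supplied PKCS#11 library on the Barbican node (path illustrative)
library_path = /usr/lib/libCryptoki2_64.so
# Credential for logging in to the HSM partition (placeholder)
login = PARTITION_PASSWORD
# Labels of the master KEK and HMAC keys held inside the HSM
mkek_label = my_mkek
hmac_label = my_hmac
slot_id = 1
```

The point of the design is that Barbican's database only ever holds secrets wrapped under the MKEK, which itself lives in the HSM — which is exactly why a remotely exploitable PKCS#11 bug in the HSM undoes the whole arrangement.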

There are recommended deployment patterns for OpenStack which separate the customer-facing and internet-facing network(s) from the internal networks used for operations, administration, maintenance and provisioning. Now you are dependent on the SDN controller and the network design to separate these networks.

So in any system with a usable public API that allows access to the HSM, this attack is a problem. Bother. (Where’s RACF when you need it?)

Lisa June 20, 2019 11:08 AM

Nobody has mentioned it yet, either in Bruce’s post or in any of the comments above.

But the flawed HSM that was successfully compromised was from SafeNet.

Let’s hope that SafeNet is shamed into quickly providing its customers with free patches or, if necessary, free hardware replacements that fix the security holes found.

The point of an HSM is to protect private keys, and one of the main ways they achieve this is by having a minimal attack surface: providing only crypto services and nothing else. Having a shell on an HSM is beyond stupid, since that greatly increases the attack surface.

Rene Bastien June 20, 2019 4:08 PM

@Lisa, it was a SafeNet HSM, confirmed by Gemalto to one of my clients. The patch has been available for some time. Here is what bugs me. Gemalto claims that only a certain type of their HSMs is susceptible to this type of attack. Given that all SafeNet HSMs share the same crypto chip and OS, should clients just trust Gemalto’s statement that all is fine with the other HSMs? I know a bit about SafeNet HSMs and I am not sure I could give this assurance to any of my clients at this point.

Pau Amma June 20, 2019 11:23 PM

Hardware Security Modules (HSMs) are electronic devices used as trusted cryptographic bricks in environments requiring high security requirements.

In this context “brique” would be better rendered as “building block”. (“Brique” is also the French name for Lego blocks, IIRC.)

Dave June 20, 2019 11:55 PM

@Who: That is odd. I would never connect a HSM to the Internet or other large (i.e. untrusted) networks, including corporate-wide intranets.

HSMs are obscenely expensive – because they’re so highly secure – so companies want to maximize their return on investment by buying as few as possible and sharing their use out over the network. Since they’re highly secure – because they’re so expensive – this is fine.

Until it isn’t.

Dave June 21, 2019 12:01 AM

@Lisa: But the flawed HSM which was successfully compromised was from SafeNet.

This time. There have been plenty of other attacks on PKCS #11 APIs by pen-testers, or in some cases users with too much time on their hands, which never got published. Another variant is when there’s a higher-level wrapper like Java around PKCS #11: you can attack the HSM by going in via PKCS #11 rather than the Java API, which lets you do trickier things than the Java API allows.

The overall issue is that PKCS #11 is meant to be an internal programming API, not an attacker-facing API. It’s not secure because it was never meant to be attacker-facing, a bit like SCADA devices that are now suddenly exposed over the Internet.

Nate June 24, 2019 8:46 PM

I still don’t understand: What is it that an HSM actually provides, in terms of security in a cloud?

As I understand it, an HSM has very little storage in it and can only store keys. Let’s assume everything is working perfectly and you can open an encrypted channel to the HSM, from your personal physical kickable box that you know is secure, through an untrusted telco, through the untrusted Internet, through your untrusted cloud provider’s internal network, then verify that it is in fact a real HSM you’re talking to and not a hacked or emulated one that’s being spied on, and upload your secret private key.

Since the keys can’t leave the HSM (otherwise what’s the point of putting them in it?), there are surely only a few operations the HSM can actually do for you:

  1. Given an unencrypted data stream, encrypt it with your private key.
  2. Given an encrypted data stream, decrypt it with your private key.
  3. Sign something with your private key, granting some kind of organizational authority.

In case 1, your unencrypted data stream is going through the untrusted cloud’s network or host, so it can just be eavesdropped in its unencrypted form.

(Because even if you can open an encrypted channel to the HSM from, say, a cloud VM…. how are you doing that encryption? You’re doing it in the cloud VM’s RAM which is open and readable to the untrusted cloud host, surely? And if you avoid doing the encryption-to-the-HSM on a cloud host, and do it outside of the cloud, on your personal kickable machine… why the heck do you need an HSM at all?)

In case 2, the same thing. Unencrypted data is coming back – or data which is encrypted to the HSM, which in any case has to be decrypted in open, readable RAM on a cloud host or what’s the point?

Case 3 seems like the only one that gives any security guarantee at all? Sign something, ie, grant organizational approval. In this case you’re not emitting any unencrypted data and you’re fine.

So the use case for external HSMs inside an untrusted cloud’s private network seems… very, very slim. Only useful for signing things. Not for actual encryption/decryption of data, because otherwise it just becomes this whole pile of turtles on top of turtles. At some point, you gotta be processing data in the clear on an untrusted cloud VM.
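Case 3, signing without ever exposing the key, is the property an HSM actually delivers, and it can be modeled in a few lines. This is a toy sketch only, using stdlib HMAC as a stand-in for the real asymmetric signing done inside an HSM; all names are illustrative:

```python
import hashlib
import hmac
import secrets

class ToyHSM:
    """Toy model of an HSM: the key is generated inside and never exported."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # lives only inside the "device"

    def sign(self, message: bytes) -> bytes:
        # Case 3: data in, signature out -- the key itself never leaves.
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def verify(self, message: bytes, signature: bytes) -> bool:
        return hmac.compare_digest(self.sign(message), signature)

    # Deliberately no export_key() method: the narrow API *is* the security.

hsm = ToyHSM()
sig = hsm.sign(b"authorise payment #42")
assert hsm.verify(b"authorise payment #42", sig)
assert not hsm.verify(b"authorise payment #43", sig)
```

The attack described in the post breaks exactly this model: once an attacker gets code execution inside the device, the missing `export_key()` effectively exists after all.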

(And this is where presumably on-chip secure enclaves like Intel SGX come in as the only possible technology which might allow a cloud VM to do any kind of data processing on encrypted data, without revealing to the hypervisor the key being used to encrypt/decrypt the data, what that encrypted data is, or what operation is being done. And it seems like that’s very, very finicky and hard to manage; and how do you know that SGX is in fact implemented correctly and doesn’t itself have a secret backdoor from Intel built into it?)

So HSM seems like: useful for one very specific use case, which is basically ‘processing financial transactions, particularly authorising transactions’. Not useful for very much else.

I mean are you gonna stick all your Active Directory domain credentials in HSMs? Or are you?

What am I missing here?

Nate June 24, 2019 8:52 PM

After glancing briefly at some literature:

Okay, not just financial transactions, but, the real-world use cases for HSMs do seem to be mostly limited to signing certificates without revealing your root certificate private key.

Which is fine, but, doesn’t seem to have much to do with privacy of data. An HSM can stop an untrusted cloud host from pretending to be you and issuing certificates in your name. But can’t stop them from reading any other actual data – anything other than certificates – straight from the RAM of your VM.

Am I reading the situation wrong? Is there in fact any way to process arbitrary data, as opposed to just certificates, on a cloud VM, using HSMs?

Or using any other currently deployed technology, I mean, not some theoretical academic thing that’s not been implemented?

Nate June 24, 2019 9:07 PM

My big worry with cloud computation is with the category of ‘data in use’, because I think if you had a sufficiently motivated, sufficiently large, and sufficiently evil cloud provider, they could deploy ways of dumping RAM from cloud VMs. From all except perhaps the highest security, non-tenanted, physically inspected servers (ie what the US intelligence community uses for their Amazon cloud). For the rest of us… which may include a whole lot of corporate systems… are there any guarantees, apart from the certificates they have in their HSM?

https://en.wikipedia.org/wiki/Data_in_use

Marcel June 25, 2019 3:36 AM

@Nate: Indeed, the most important use case for an HSM is digital signing without revealing the private key. This is useful far beyond financial transactions, for proof of identity, ownership, etc.
I also agree with your assessment that if the HSM is used for encryption, the data to be encrypted was obviously outside the HSM in plaintext, and therefore could already have been stolen or compromised, regardless of the trusted channel used to access the HSM. However, it is a matter of fact that most data processing requires data in the clear, and thus this risk has to be accepted to have the functionality. Encryption helps to limit the exposure, especially for data at rest and for data distribution. By decrypting only the minimal data set required for processing, the exposure time and scope can also be limited.
The HSM safeguards the decryption key in these use cases, and thus protects against theft of the data AND the decryption key (a compromised application could still reveal the data currently being processed to an adversary). I believe safeguarding the private/secret keys is a very valuable property indeed, especially if paired with key access control based on role and user identities.

Clive Robinson June 25, 2019 5:49 AM

@ Nate,

I still don’t understand: What is it that an HSM actually provides, in terms of security in a cloud?

That depends on who you are…

Historically, HSMs were used for very limited functionality: to protect a secret from those that need to use it (i.e. a master key).

Thus the HSM functionality was very limited. They were also very expensive, due to the technical effort that should have gone into the design, implementation, supply and delivery chains involved in manufacturing and delivering them to the “end user”.

But sales were low, and even despite the high price, profit was low... So various vendors started to add functionality to broaden market appeal, thus increasing sales and therefore profit.

The problem with adding functionality is that you add complexity at a greater rate[1]. Which, without significant care by designers and implementers, leads to a weakening of security.

Another problem with adding functionality was “upping the throughput” on the same hardware, which is a double whammy all on its own. Not only does the number of new functions increase, demanding more of the hardware, but the traffic for those functions increases as well, which makes even greater demands.

As a general rule, as you increase throughput, you have to either upgrade the hardware or make the utilization of the existing hardware more efficient. The general result, for a number of production reasons, is to try to increase efficiency. At some point you run into a series of problems, but the net result is generally the same: as you increase throughput, you weaken security.

In effect it’s “Security-v-Efficiency”: as the throughput goes up, things like the bandwidth of side channels go up. Which has other effects, like increasing the potential for secrets to be leaked, whether passively or through careful manipulation by an attacker of both the inputs and outputs, even through the likes of data diodes.

For these reasons alone, HSMs should never be placed where untrusted entities, be they technical or natural, can get at them. That is, they need a high degree of both information and physical security. Remember, a number of CAs got hacked, and false but valid certificates got issued, because of the failure to separate “signing” from “untrusted entities”.

A general definition for “cloud” would be “making resources available to untrusted entities”, thus not really the place for HSMs to be without a whole load of extra precautions. Precautions that very few will get right in the light of currently known attacks, and even those very few will in all probability NOT “get lucky” in the face of future attacks.

Thus it’s not just “freedom” that “requires eternal vigilance”; all things “security”, or more correctly “privacy”, do as well.

Which raises the question of the best way to maintain “privacy”, to which the simplest answer is “total segregation”. That is, if an attacker does not know you exist or where you are located, and likewise cannot get to you, then they cannot breach your privacy. But that requires a degree of isolation that for most is impractical at best.

As a general rule, humans are social and have to communicate to exist at any kind of modern functional level; the same is true for computers if they are to be of any effective use beyond a certain minimal function[2]. Thus information has to get in and out of the “total segregation” area. Deciding what information, when, and how it will cross, and how you will monitor it for compliance, is the key to maintaining privacy and thus security.

Few places actually do this; they get sold an idea about HSMs and assume, incorrectly, that the rest of it comes with it as a “Turnkey Solution”. It does not.

[1] As you add functionality to a system, each new function ends up having relationships with existing functions in various ways. At the simplest level you can see that if you have N functions, then you have N^2-N relationships within the bounds of the system. So if you have five functions you have twenty relationships; if you add one more function you add another ten relationships... Whilst there are ways to control this, they require not just quite specialised –and classified– knowledge, but rather more hardware resources running at lower throughput.
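The footnote’s N^2-N count (ordered pairs of distinct functions) can be checked directly:

```python
def relationships(n: int) -> int:
    """Ordered pairs of distinct functions among n functions: n^2 - n."""
    return n * n - n

print(relationships(5))                       # prints 20
print(relationships(6) - relationships(5))    # prints 10: one more function adds ten
```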

[2] When talking about this, I use as an example the battery charger in a laptop, or that of a home solar/wind storage system designed to prolong its usable life whilst getting the best availability. You would probably be surprised at the complexity and all the information flows required.

Robert July 1, 2019 4:38 AM

@Rene Bastien
all SafeNet HSMs share the same crypto chip and OS…

That’s not correct. Gemalto has two HSM product lines, Luna and ProtectServer. Luna 6 and PS2 have the same chip but different FW (PS2 is still using Eracom functionality). Luna 7 is coming with a totally different chip and FW (it is more focused on ECC).

Alongside the logical security, there still has to be physical security. The first initialization should be done through a serial cable, and the same goes for the encryption key exchange. Then you can communicate with the HSM remotely via the client. If a thief has no key, he can’t communicate with the HSM and can’t upload his FW. The HSM should be in a secure environment: locked in a cage at the DC, secured with 2FA and CCTV.
Devices, including HSMs and smart cards, are only as safe as the user working with them.

Frank E. January 15, 2020 9:41 AM

I would love to know your views on software-defined networking, which Sidewalk Labs wants to apply to residential networks in Quayside, Toronto.

They claim that SDN will simplify and improve security for customers by letting the providers do all the configuring.

Isn’t this a bad idea from a privacy point of view?
