Thrangrycat: A Serious Cisco Vulnerability

Summary:

Thrangrycat is caused by a series of hardware design flaws within Cisco’s Trust Anchor module. First commercially introduced in 2013, the Cisco Trust Anchor module (TAm) is a proprietary hardware security module used in a wide range of Cisco products, including enterprise routers, switches and firewalls. TAm is the root of trust that underpins all other Cisco security and trustworthy computing mechanisms in these devices. Thrangrycat allows an attacker to make persistent modifications to the Trust Anchor module via FPGA bitstream modification, thereby defeating the secure boot process and invalidating Cisco’s chain of trust at its root. While the flaws are based in hardware, Thrangrycat can be exploited remotely without any need for physical access. Since the flaws reside within the hardware design, it is unlikely that any software security patch will fully resolve the fundamental security vulnerability.

From a news article:

Thrangrycat is awful for two reasons. First, if a hacker exploits this weakness, they can do whatever they want to your routers. Second, the attack can happen remotely: it’s a software vulnerability. But the fix can only be applied at the hardware level. Like, physical router by physical router. In person. Yeesh.

That said, Thrangrycat only works once you have administrative access to the device. You need a two-step attack in order to get Thrangrycat working. Attack #1 gets you remote administrative access, Attack #2 is Thrangrycat. Attack #2 can’t happen without Attack #1. Cisco can protect you from Attack #1 by sending out a software update. If your I.T. people have your systems well secured and are applying updates and patches consistently and you’re not a regular target of nation-state actors, you’re relatively safe from Attack #1, and therefore, pretty safe from Thrangrycat.

Unfortunately, Attack #1 is a garden variety vulnerability. Many systems don’t even have administrative access configured correctly. There’s opportunity for Thrangrycat to be exploited.

And from Boing Boing:

Thrangrycat relies on attackers being able to run processes as the system’s administrator, and Red Balloon, the security firm that disclosed the vulnerability, also revealed a defect that allows attackers to run code as admin.

It’s tempting to dismiss the attack on the trusted computing module as a ho-hum flourish: after all, once an attacker has root on your system, all bets are off. But the promise of trusted computing is that computers will be able to detect and undo this kind of compromise, by using a separate, isolated computer to investigate and report on the state of the main system (Huang and Snowden call this an introspection engine). Once this system is compromised, it can be forced to give false reports on the state of the system: for example, it might report that its OS has been successfully updated to patch a vulnerability when really the update has just been thrown away.

As Charlie Warzel and Sarah Jeong discuss in the New York Times, this is an attack that can be executed remotely, but can only be detected by someone physically in the presence of the affected system (and only then after a very careful inspection, and there may still be no way to do anything about it apart from replacing the system or at least the compromised component).

Posted on May 23, 2019 at 11:52 AM • 27 Comments

Comments

Steven Clark May 23, 2019 12:13 PM

An FPGA-based root of trust? Reading its bitstream from SPI flash? Isn’t some bit of OTP at the base of the root of trust standard best practice? I’d say I’m surprised, but these are the people who hardwire SSH keys, after all.
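The OTP practice Steven alludes to can be sketched in a few lines of C. This is purely illustrative: the function names are made up, and the FNV-1a hash is a stand-in for the real cryptographic hash (e.g. SHA-256 in mask ROM) a production design would use. The point is where the comparison value lives.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy sketch of an OTP-anchored root of trust: before configuring the
 * FPGA, hash the bitstream read from SPI flash and compare it against a
 * digest burned into one-time-programmable fuses at manufacture.
 * FNV-1a here is illustration only, not a secure hash. */
uint64_t fnv1a64(const uint8_t *buf, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;     /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 0x100000001b3ULL;              /* FNV prime */
    }
    return h;
}

/* Returns 1 if the bitstream matches the OTP-stored digest, 0 otherwise.
 * If this check itself lives in mutable flash, an attacker who can
 * rewrite the bitstream can rewrite the check too -- which is the core
 * of the Thrangrycat problem. */
int bitstream_ok(const uint8_t *bitstream, size_t len, uint64_t otp_digest) {
    return fnv1a64(bitstream, len) == otp_digest;
}
```

The design point is that `otp_digest` (and the code doing the comparison) must be immutable; anchoring the check in the same reprogrammable FPGA it is supposed to protect is what makes the flaw unpatchable in software.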

albert May 23, 2019 3:33 PM

Why not?

We already give users* the ability to remotely reprogram flash memory, why not make the FPGA available too? Let’s invite everyone to the party.


*by users, I mean anyone who can get admin access.
. .. . .. — ….

SpaceLifeForm May 23, 2019 4:30 PM

Can anyone show that the Apple Secure Enclave is not exploitable in similar manner?

(I expect crickets)

Clive Robinson May 23, 2019 8:30 PM

@ Bruce,

This “two step” approach is exactly the sort of thing I would do if I was putting a back door into a system, as I’ve indicated in the past (back when Bloomberg tried to do a number on Apple).

@ All,

For those discounting this attack, you have to realise just how far down it is in both the computing and security stacks, and thus how much greater an ability the attack has to bubble up through the stacks, eliminating any gains that might have been made from top-down security processes such as formal methods.

It’s nasty, it has significant scope, and my “thinking hinky” gland is itching, in effect saying there is “too much deniability” for this to be “by accident”; therefore, without further evidence otherwise, I would assume “by design”.

Oh and before people get all defensive ask yourself what you would think if it was found in a Chinese company product rather than a US company product.

Patriot May 23, 2019 9:03 PM

It almost does not matter what we do in the hopes of having an internet and devices which allow everyone to have confidentiality, anonymity, authentication, etc.

Somewhere there is a room full of smart people who are ordering take-out and getting paid well to get your data. They have the bandwidth to come up with stunningly clever ways to watch you.

It is not going to stop in the near future.

RealFakeNews May 23, 2019 9:08 PM

So the real reason Huawei are banned is because the home-grown stuff is terminally compromised by “bad design”, so to force people to buy it they will just block the competition?

Sounds fishy.

As Clive said, this seems just a little too basic to be an accident. This is worse than exposing private keys.

The device of ultimate trust can be compromised to give absolute control and access to the device it is supposed to be watching.

Nothing “Big Brother” about that.

Ismar May 23, 2019 9:37 PM

Good time to ask why it seems impossible to design any secure software that is independent of the underlying hardware providing the basis of that security?
Has anyone tried doing this and what would be the (obvious) reasons for it not working ?

Esteban May 23, 2019 9:43 PM

@SpaceLifeForm
You managed to set up an impossible task. Can anyone show that something is not exploitable? If you have read this blog or been around at least a decade you obviously know the answer. If not then you are just here to make a personal case against a company you do not like. Come clean either way.

John Smith May 23, 2019 9:46 PM

I’ve worked with engineers who went on to work for Cisco. Extremely smart, and experts at everything they did.

This “series of hardware design flaws” doesn’t pass the smell test.

Patriot May 24, 2019 12:17 AM

This part of that article was especially striking:

“The takeaway here is that we have to start thinking about privacy as a collective, environmental problem, not something that hits individual people, and certainly not something where the onus is on the individual to protect themselves. Privacy is starting to look like a problem similar to climate change — and in past eras, something similar to food safety.”

Clive Robinson May 24, 2019 1:18 PM

@ Ismar,

Good time to ask why it seems impossible to design any secure software that is independent of the underlying hardware providing the basis of that security?

The way computer architectures currently are, it is not possible to write software that remains secure once it is on hardware that is not Fully Trusted (and no fully trusted hardware of any practical use exists).

The reason is obvious when you realise why and you will say “why did I miss that?”.

Software is indistinguishable from data, because both are just bits in memory with no way to distinguish them. That is, a “bag of bits” is just another “bag of bits”; all meaning comes from metadata that is not in the “bag of bits”, since to carry the full range such metadata would have to be “out of band”. Thus data can be instructions and instructions can be data, and you cannot distinguish them.
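A trivial C illustration of the “bag of bits” point: the same three bytes are simultaneously readable text and valid machine instructions, and nothing in the bits says which reading is “right”. The opcode values are genuine one-byte x86 encodings; the helper names are made up for the sketch.

```c
#include <stdint.h>
#include <string.h>

/* Three bytes, two readings.  As ASCII they spell "PQR"; fed to an x86
 * decoder the very same bytes are `push rax; push rcx; push rdx`.  The
 * meaning lives entirely in which unit the bytes are routed to, not in
 * the bytes themselves. */
const uint8_t bag_of_bits[3] = { 0x50, 0x51, 0x52 };

/* Reading 1: the bytes as character data. */
void read_as_text(char out[4]) {
    memcpy(out, bag_of_bits, 3);
    out[3] = '\0';
}

/* Reading 2: the bytes as instruction encodings.  0x50-0x57 is the
 * one-byte x86 opcode range for push r64. */
int looks_like_push(uint8_t byte) {
    return byte >= 0x50 && byte <= 0x57;
}
```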

The big problem is that all these bits are stored in memory, which by general usage definition must be writable (ie the tape in a Turing machine).

Thus you have to ask a question that came up in the early days of computers,

    What happens if the bits stored in memory get changed whilst a program is running?

In current architectures the answer ranges from nothing to total mayhem with points in between where the program will apparently behave as it should but in fact do one or more things differently. In essence it’s what some forms of malware do to gain privileges of some form.

Thus the next question is usually,

    What can change any bit in memory?

And the answer is anything that can get access to the memory, and this includes high-field-strength non-ionising radiation, ionising radiation, and high-energy particles.

Likewise transients on power and signal busses, and even, in some extreme cases, mechanical vibration[1].

All of which are usually considered to be “random” even when they are deterministic, because they are not correlated with the functioning of the running program (the exception being active EM fault injection attacks).

But there are also other faults to consider: logic gates have issues around their input switching threshold that can and do result in “soft faults”[2] that fall under “metastability” concerns. Likewise all electronics have other faults that can affect the contents of memory.

Are there ways to mitigate these “random” effects? Well yes, we can increase the size of each memory word to contain extra bits. You then define various values that are considered good and the others bad.

For instance, we can add a single bit to a byte to store a parity bit. Thus with nine bits, half of the 512 values are valid and the others indicate a “parity error” in the memory. However, an error is only detected for odd numbers of bit flips such as 1, 3, 5, 7 or 9; even numbers of bit flips such as 2, 4, 6 or 8 produce valid values from initially valid content, so such errors are not detected.
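That nine-bit scheme can be sketched in a few lines of C (the function names are illustrative):

```c
#include <stdint.h>

/* Even-parity sketch for a nine-bit word: eight data bits plus one
 * parity bit.  Any odd number of flipped bits changes the parity and is
 * detected; any even number of flips leaves a word that still "looks
 * valid". */
uint8_t parity8(uint8_t b) {            /* 1 iff an odd number of 1-bits */
    b ^= b >> 4; b ^= b >> 2; b ^= b >> 1;
    return b & 1;
}

/* Store: nine-bit word = data byte followed by its parity bit. */
uint16_t store9(uint8_t data) {
    return ((uint16_t)data << 1) | parity8(data);
}

/* Check on read: 0 = looks valid, 1 = parity error. */
int parity_error(uint16_t word) {
    return parity8((uint8_t)(word >> 1)) != (word & 1);
}
```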

There are more complicated systems, but at the end of the day any change in bit values at a given memory location that produces a valid but incorrect value will not be detected.
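The same holds for the “more complicated systems”. A Hamming(7,4) sketch makes it concrete: the code corrects any single-bit error, but its distance is only 3, so flipping three well-chosen bits turns one valid codeword into another and the decoder silently returns wrong data. This is a from-scratch illustration, not any particular ECC hardware's implementation.

```c
#include <stdint.h>

/* Hamming(7,4), 1-indexed bit layout: parity at positions 1,2,4;
 * data bits d1..d4 at positions 3,5,6,7. */
uint8_t ham74_encode(uint8_t d) {       /* d: 4 data bits */
    uint8_t d1 = d & 1, d2 = (d >> 1) & 1, d3 = (d >> 2) & 1, d4 = (d >> 3) & 1;
    uint8_t p1 = d1 ^ d2 ^ d4;          /* covers positions 1,3,5,7 */
    uint8_t p2 = d1 ^ d3 ^ d4;          /* covers positions 2,3,6,7 */
    uint8_t p4 = d2 ^ d3 ^ d4;          /* covers positions 4,5,6,7 */
    /* codeword bit i-1 holds position i */
    return p1 | (p2 << 1) | (d1 << 2) | (p4 << 3) | (d2 << 4) | (d3 << 5) | (d4 << 6);
}

/* Syndrome 0 means "looks valid"; otherwise it names the single
 * position assumed to be in error, which is flipped before extraction. */
uint8_t ham74_decode(uint8_t cw, int *syndrome) {
    uint8_t b[8];
    for (int i = 1; i <= 7; i++) b[i] = (cw >> (i - 1)) & 1;
    int s = (b[1]^b[3]^b[5]^b[7]) | ((b[2]^b[3]^b[6]^b[7]) << 1)
          | ((b[4]^b[5]^b[6]^b[7]) << 2);
    *syndrome = s;
    if (s) b[s] ^= 1;                   /* "correct" the named position */
    return b[3] | (b[5] << 1) | (b[6] << 2) | (b[7] << 3);
}
```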

Likewise any process that writes entire new values directly to memory locations will go undetected. Such issues used to occur with badly programmed Direct Memory Access (DMA) controllers used for large data transfers without interrupting CPU functioning.

The point that many miss is that the contents of any memory location are only checked at that level when the location is read. So any changes made can easily go undetected, especially if they are made not just below the CPU layer in the computing stack but below the MMU layer, or even below the memory interface level.

Most modern computer systems cannot detect tampering with memory unless other hardware, such as parity or ECC hardware, detects an invalid value and raises some kind of exception.

Thus it does not matter what “formal methods” you designed the software with; the reach of such tools effectively ends at the ISA level, just above the CPU level in the computing stack. The same is true for any “top down” security method: in most current computer architectures they are all susceptible to “bottom up” attacks, and current architectures provide no protections against these “bubbling up” attacks.

Which brings us to,

Has anyone tried doing this, and what would be the (obvious) reasons for it not working?

The answer is yes, I have, and the reasons are as stated: CPU-and-above security fails to CPU-and-below tampering with memory. I also started working on ways to mitigate the issues quite some time ago and have found a number, which have been previously discussed on this blog.

[1] If you think about basic electrical components such as inductors and capacitors, their properties are defined by conductors held in a normally fixed mechanical arrangement. If however you can disturb that mechanical arrangement, you change their electrical properties and thus introduce an interfering signal. Such behaviour is most frequently observed with “free running oscillators” and is lumped under the common term of “microphonics”.

[2] In reality there really is no “digital electronics”; it’s actually at heart “analog electronics” with very high gain and little or no negative feedback. If you look at early CMOS inverters you will find that you can treat them as analog amplifiers simply by biasing their inputs to their switching threshold, which is in effect the middle of their analog gain characteristics. You can do similar with other logic family gates, and the most common use you see for it is for either XTAL or RC oscillators.

lurker May 24, 2019 3:13 PM

@ RealFakeNews

So the real reason Huawei are banned is because the home-grown stuff is terminally compromised by “bad design”…

So Cisco et al, all those given the nudge, nudge, wink, wink from On High, deliberately leave holes in their systems. On High is pissed off with Huawei because they won’t play that game. But Huawei could/should still be excluded because of their shoddy version control, and poor update procedures, leaving holes that anybody, anywhere, could find in their own good time and exploit for purposes beyond the control of On High. Ah, it’s not about trust, it’s about control.

George May 24, 2019 11:39 PM

@RealFakeNews wrote ,

“So the real reason Huawei are banned is because the home-grown stuff is terminally compromised by “bad design”, so to force people to buy it they will just block the competition?

Sounds fishy.”

If the first step is to design an unbreakable system, then the second step is to design one that can only be exploited by the designer and his delegates.

Cisco is undoubtedly run by “smart” folks, just as Huawei is. The trade war is looking more like a political stunt show designed to camouflage the real issues at hand. China has grown technologically big enough to pose a threat to US dominance in control systems.

AlexT May 25, 2019 6:01 AM

So can we say that almost any Cisco equipment less than 10 years old should be trashed ASAP ?

Wow !

Jon May 25, 2019 6:53 AM

@ Clive Robinson

“Software is indistinguishable from data, because both are just bits in memory with no way to distinguish them.”

That’s not quite true in some microcontroller architectures, where the program is burned into non-volatile memory and the data kept separate in RAM. For example, a buffer overrun on an Atmel (now Microchip) AVR will only corrupt other data, not the program itself.

But (you knew there was a but coming, huh?) there are certain instructions that will let an AVR modify the code itself, akin to what we have here – the FPGA’s configuration stream being tampered with.

If you write the code without those instructions, however, nothing in the data area can ever be executed. And there is nowhere to put any input except into the data area.

If you want a bootloader, you have to put those instructions in, and a corrupt bootloader image can program anything you want. So don’t do that – but accept that you cannot update chips in the field except by replacing them.

If you don’t put those instructions into the code, the chips cannot execute any data.
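A crude audit of Jon's point can be sketched in C: scan an AVR flash image for the self-programming instruction. I believe the SPM opcode is 0x95E8 (with 0x95F8 for the Z+ variant), but treat those encodings as an assumption to check against the AVR instruction set manual; AVR instructions are 16-bit words stored little-endian. As Jon notes next, a scan like this cannot catch an "opcode" constructed by jumping into the middle of a multi-word instruction.

```c
#include <stdint.h>
#include <stddef.h>

/* Count occurrences of the SPM ("store program memory") instruction in
 * an AVR flash image -- the only way application code can rewrite its
 * own flash.  Steps two bytes at a time over 16-bit little-endian
 * instruction words; misses opcodes reachable only via misaligned
 * jumps into multi-word instructions. */
size_t count_spm(const uint8_t *image, size_t len) {
    size_t hits = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        uint16_t op = (uint16_t)image[i] | ((uint16_t)image[i + 1] << 8);
        if (op == 0x95E8 || op == 0x95F8)   /* SPM, SPM Z+ (assumed encodings) */
            hits++;
    }
    return hits;
}
```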

Anyhow, just a minor thingy,

Jon

Clive Robinson May 25, 2019 8:59 AM

@ lurker,

But Huawei could/should still be excluded because of their shoddy version control, and poor update procedures, leaving holes that anybody, anywhere, could find in their own good time and exploit…

Those were the reasons given in a report by a UK agency. Which is odd, given that previous reports have been different. But now, suddenly, as the current US Administration is on its high horse trying to do a “5G our way or no way” routine, the company is pronounced terrible by a little-known UK agency… Coincidence?

Well maybe, maybe not. Remember the UK agency that produced the report is very strongly attached to a UK IC Agency better known as GCHQ.

Also remember that under “the special relationship”, the UK GCHQ IC agency gets paid by the US NSA IC agency to spy for them.

That is not just to spy on the UK and its citizens, but most European nations and their citizens, oh and on many other nations and their citizens as well. Why? Well, because the UK for various reasons is where a lot of the world’s sub-sea communications cables come to shore. It is these cables that many commercial companies use in preference to geo-stationary satellites due to the shorter path times. Remember it is “commercial” “economic activity” companies that favour these shorter path lengths, so the spying is mostly on “commercial economic activity”… That’s hardly the sort of thing the US claims it performs espionage on, so you have to wonder why they pay others to do it for them…

Thus when the current US Administration started up their “this company must be evil” nonsense, people really should have asked the question of,

    How was the company being judged?

That is, was it judged against the rest of the industry? Or against some esoteric or mythical ideal that somebody thought up for some altogether different and potentially corrupt reason?

The answer as anyone who works in the industry knows, is it was not against the rest of industry…

So the company was being judged against some mythical ideal that even in a perfect universe would not have been possible. So at best the report was a “strawman argument” used for some altogether different reason. Or to put it another way, someone in the “B-Team” tipped the playing field against the company so far that it was designed to “crush and bury” the company…

Let’s put it this way: if Huawei were judged against the realities of this very cut-throat industry, they would be up with the other “Leaders in Excellence”. As for Cisco, well, I guess their CERT record might be a place to start; after all, Cisco and their other brand-name companies are the ones we keep hearing about for things like hard-coded usernames and passwords etc. To be fair, you would have to ask what Cisco’s “version control” and “update procedures” are like… But given all those CERT advisories, oh and also the evidence of their equipment being tampered with by state spying entities, perhaps most could make an educated guess that they are not as good as Huawei’s…

Or is it all just a “smoke screen” because Cisco are secretly funded by the US state so that all nations and their citizens who buy Cisco can be spied upon?

It probably would cause some moral outrage in certain places if I ever considered asking the same of all US telecommunications companies as the US government asks of Chinese companies. But as they say, “tough”; that’s exactly what I am asking people to do.

Why? Because if people want to play on a fair playing field and have a chance at privacy in their communications, then the same questions and the same scrutiny must be asked and verified publicly for all market entrants, irrespective of which nation they originate from.

The fact that certain nation states are clearly not doing this, and then get ruffled feathers when you mention it, says where the real problem is and also gives a good indication of the honesty or not of their statements.

Which leads by a different path to almost the same conclusion you arrived at of,

Ah, it’s not about trust, it’s about control.

Let’s look at what “trust” really means if your objective is privacy of your communications as a nation state. It does not mean holding a popularity contest and going for the people you think are least likely to betray your “blind faith”; that’s the “teen school girl approach” at best, and we know how unreliable that can be. So “trust” has to be achieved in more reliable ways.

There is the old truism about the CIA motto of “In God We Trust”, because what it really means is “every other bu99er we check”. That is, they always try to establish trust rather than use blind faith.

That is the approach you have to take, if you want privacy in your communications, you assume that ALL entrants and their products will be malicious and act accordingly.

There was a lot of faux outrage noise when the UK Gov came out and said they were considering Huawei for use in 5G. Put more prosaically, the US Administration, orchestrated no doubt by B-Team interests, chose to “throw the toys out of the pram” and “have a major hissy fit” about the UK Gov decision.

Actually it was not just a faux “animated fit of pique” for the sake of publicity/spin. In essence there was genuine B-Team conniption at the fact that the “Special Relationship” was not the “choke-chain dog lead” they thought it was, and as a result of their failing it came to be, by implication, a quite public slap in the face for the current US Administration[1].

In reality the UK GCHQ position is “We assume ALL ‘equipment’ from all companies from all nations can and will be used to spy on us, therefore we take general mitigation activities based on that assumption”. It is a position they have taken for many years and it’s not ever been a secret.

However, take care to note it is about the ‘equipment’, not the companies or where they come from. That is, GCHQ very well know that they, the NSA and many other national and private SigInt agencies around the world hack/implant any equipment they can “as a matter of policy”.

Most readers on this blog should know this from the “Greek Olympics” tragedy, where the cavalier activities of a CIA operative led eventually to the suspicious death of an employee working for a Greek cellular network supplier and an international arrest warrant issued for the CIA operative. Likewise the tapping of the German Chancellor’s mobile phone, etc. etc.

Instead of joining in with the current US Administration nonsense, people should stop to ask why this common-sense, long-held attitude of GCHQ has so affronted the US representatives. Because it says rather more about the current US Administration’s views and activities than it does about the UK, GCHQ, or the company in question.

The “technical” or “security” point people should take on board is,

    No commercial equipment is immune to the effects of State Level espionage.

It’s a clear, simple, factual statement, and it has no prejudice or ulterior motives or agendas behind it.

Thus from my perspective the GCHQ advice to the UK Gov of basically “We assume it’s all got implants no matter where it comes from” is assuming a level playing field. As is the follow through action of we “mitigate all”.

Which given the nature of modern commercial equipment markets, the “mitigate all” is the only sensible technical response to ensure National Security in all ways, not just with regards the privacy of communications.

There is however the thorny issue of a “political” point to consider.

Take the US company this thread is about, “Cisco”. Who, let’s face it, are the ones who have had the embarrassment of pictures of US IC entities adding implants to their equipment appearing in the press. Likewise the frequent vulnerabilities reported in their equipment, which allow not only state-level SigInt agencies but nearly all other cyber-ne’er-do-wells to abuse Cisco’s products at the expense of their users worldwide.

Cisco is one of a number of major manufacturers of telecommunications equipment in the world, thus they are, like any other major manufacturer, “a target of choice”. Put simply, the ROI favours perverting a big corp’s product: the more customers they have, the more victims each dollar you spend creates…

So a general rule for cyber-ne’er-do-wells is “it pays to attack big”. You may remember we saw that in play in the early days of PC malware, when “Microsoft was the target of choice” and supposed security experts were telling people to “buy Macs” to be secure…

Which importantly also means that big telco manufacturers are targets not just for the IC of the country their head office is in but all countries SigInt agencies etc. In the case of Cisco their home country is the US which has a policy of spying on the world and his dog, hence the photos were not exactly unexpected.

So,

1, We know that Cisco equipment arrives at customers worldwide with implants in it, or later gets owned via software vulnerabilities.

2, We also happen to know that the US IC puts implants in Cisco equipment and also gains illegal entry and control of Cisco’s products.

Therefore,

3, Is US company Cisco to blame for those US implants and US exploits?

Most would say no, but ask yourself what they would say if I changed “US” to “China” and “Cisco” to “Huawei”?

Well, the current US Administration is indicating that if Huawei equipment is ever found to have implants, you should blame Huawei, not the country or agencies who implanted or exploited…

So why should I or anybody else treat Cisco any differently?

Therefore under US Gov rules I should claim that Cisco is not just aware of what the US IC is doing to their equipment but that they are actively responsible or even colluding with them, and therefore nobody should buy Cisco equipment anywhere in the world.

Do people in the US think that is a fair viewpoint?

At the end of the day the real point is, the current US Administration is effectively saying,

    One rule for the US Gov and US corps and a different rule for everyone else.

Thus exhibiting just another form of US Exceptionalism.

So the current US Administration not only do not want a level playing field in their country, they also want to insist that every other country play on the same playing field that so favours the US…

What do non US people think of that?

Well, it’s clear that the UK Government thinks the current US Administration viewpoint is wrong, even though the US SigInt agency, the NSA, gives the UK SigInt agency, GCHQ, money and equipment to spy for it. It’s also clear that the UK Government’s reasoning is based on the common-sense reasoning of its SigInt agency GCHQ, which has not really changed in as long as many people can remember. It’s also reasoning based on fact and not on prejudice or hidden agendas etc.

Oh and to cap it off, it appears the B-Team in the current US Administration are prepared to actually go to a real kinetic war over it… No matter what the cost in body bags.

For those in the US, ask yourself: do you really want your loved ones and friends in body bags because the B-Team want a war, any war, to put their name in the history books?

[1] Actually, when you consider how the information came out, it might well have been designed “to yank the current US Administration’s chain”, to remind them that “they do not set UK National Communications Policy”. However, it came about because of a “leak” that arose from a UK NSC meeting, where GCHQ presented their view on how the UK should proceed with 5G under all aspects of “National Security”. The basics of that presentation were not secret or even confidential, and most people in the telecommunications industry know them, as they have been based on “common sense” reasoning for at least the last half century or so. Unfortunately the MSM involved, rather than say what the message was really about, decided that “UK Minister being disloyal to PM over BREXIT differences” would make more sales etc.

Clive Robinson May 25, 2019 12:32 PM

@ Jon,

If you write the code without those instructions, however, nothing in the data area can ever be executed. And there is nowhere to put any input except into the data area.

You knew this was coming 😉

The PIC architecture which Microchip started with is Harvard: instructions on one bus into the instruction-decode logic, and data on the other bus into the ALU/registers and I/O.

It does give an improved level of security except when the instructions form an interpreter for “code” in the data memory.

Some years back I wrote a very simple BASIC-style interpreter for one of the PIC24 chips when developing a secure memory dongle. It enabled relatively fast design using just a VT52 terminal as a test instrument / control panel. I finally replaced it with a variation of a Forth interpreter to get multi-tasking/threading running on it, as well as getting quite a lot of “code compression”.

Stack based interpreters are a trick that programmers who have to write multi-tasking embedded systems on SoCs that don’t have MMUs can find rather useful. But few even hear of stack based interpreters after their college courses…
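A minimal sketch of the idea (not Clive's actual code; the opcode set and names are invented): the "program" is plain data driving a fixed dispatch loop that could live entirely in ROM. It also illustrates the security caveat above — once such an interpreter exists, bytes in data space are effectively executable again, Harvard separation or not.

```c
#include <stdint.h>

/* Tiny Forth-flavoured stack machine.  The program is an array in data
 * memory; only the dispatch loop below is "real" code. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

int run(const int *prog) {
    int stack[32];
    int sp = 0;                         /* operand stack pointer */
    for (int pc = 0; ; ) {
        switch (prog[pc++]) {
        case OP_PUSH: stack[sp++] = prog[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

The "code compression" Clive mentions comes from each data word standing in for a whole sequence of native instructions, and round-robin multitasking falls out easily because the loop can switch between per-task stacks at any opcode boundary.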

lurker May 25, 2019 4:13 PM

@ Clive Robinson
thanks for fleshing out the detail between my meagre lines. Of course little weight should be attached to China’s claim that they do not seek global hegemony. It is in the nature of hegemons, particularly the current, that they are ever fearful of rivals. So they will never allow any outsider to replace their current lapdog, subject of this thread.

Jon May 25, 2019 10:40 PM

@ Clive Robinson – Actually, I had in mind an AVR, which was bought by Microchip, not developed there. But anyhow, you wrote:

“It does give an improved level of security except when the instructions form an interpreter for “code” in the data memory.”

I would call that a bootloader. AVRs do keep the stack in data memory; smashing that stack will lead to the program jumping around wildly, but it won’t change “what to do when you get there”. If there is nowhere in the program you can land that says “here we move information from data to program”, it cannot happen.

Which leads to an interesting thought that some AVR instructions occupy multiple locations, and jumping into the middle of one of them and using, say, the last byte of one and the first byte of the next opcode to construct the required opcode for modifying the program might work. Tricky, that.

Aside from the above, my point was that unless you deliberately include instructions at the outset that permit moving of data from data to program (like, say, a self-programming bootloader), the data cannot write its own instructions – thus there is no data that can ever be executed.

You wrote a great deal more too, which I will read through more carefully later. I think I kinda agree with you though – all this Huawei paranoia stuff is entirely hypocritical.

J.

AlexT May 26, 2019 2:13 AM

I’m sorry to insist, but am I correct in my understanding that
1. This is a hardware design issue, most likely not “fixable” (I’m sure Cisco engineers are pulling their hair out to find a mitigation)
2. It can be exploited remotely

Given the prevalence of Cisco in the market I find the consequences nothing short of staggering.

There might or might not be active exploitation at the moment, but as the host of this blog contends, “attacks only get better”. Given the possible rewards, this will most definitely be weaponized.

Again, am I missing something or is this an absolute massive issue?

Clive Robinson May 26, 2019 5:24 AM

@ lurker,

thanks for fleshing out the detail between my meagre lines.

That’s alright, it’s a subject that many readers might not have thought about very much, so a little background helps them catch up a bit, or take fright and head for cover :-S

Clive Robinson May 26, 2019 7:03 AM

@ Jon,

my point was that unless you deliberately include instructions at the outset that permit moving of data from data to program … the data cannot write its own instructions – thus there is no data that can ever be executed.

There was an assumption some years ago that Harvard architectures were not just “more secure” but “secure” and many bought into the idea and to a certain extent still do.

However, back in the late “noughties” strange things were happening, and in 2007 a paper with the title “The Geometry of Innocent Flesh on the Bone” appeared, which in academia kicked off Return Oriented Programming (ROP). It was talked about a lot for the next half decade, but because many assumed it was a “solved problem” you tend not to hear about it much.

But it’s not gone away[1], and the more subtle variants are still ripe for exploitation, even when the instructions cannot be changed because they are in old-fashioned ROM that has to be taken off-board and reprogrammed, in some cases with a soldering iron (rope-and-diode memory, or reconnecting the pins used for writing).

The point is that outside of some specialised software such as state machines and DSP-style programs, programs in general, especially those whose instructions were emitted from a compiler, need RAM for their stacks and for the data on which branching decisions are made.

Thus if the program itself, or the library code it is built on, contains “useful code” then malicious activities are possible.

As you have realised and noted above, that “useful code” may not be obvious. One of the weaknesses of the extended x86 instruction sets is that an instruction that takes, say, five bytes of ROM also potentially contains a four-byte, three-byte, two-byte or single-byte instruction. And yes, people have exploited this before (usually only as a single-byte, or sometimes two-byte, initial instruction).
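To make that overlapping-encoding point concrete, here is a toy Python sketch. The opcode table covers just four real x86 one-byte opcodes and is my own illustration, not a real disassembler; it simply shows that the same byte string decodes to entirely different instructions depending on where you start reading:

```python
# Toy decoder for a handful of x86 opcodes, illustrating how one byte
# string yields different instruction streams at different offsets.
# Hand-rolled sketch for illustration only -- not a real disassembler.

OPCODES = {
    0xB8: ("mov eax, imm32", 5),   # opcode + 4 immediate bytes
    0x01: ("add r/m32, r32", 2),   # opcode + ModRM byte
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
}

def decode(code: bytes, offset: int):
    """Linearly decode from `offset` until an unknown byte or end of code."""
    out = []
    i = offset
    while i < len(code) and code[i] in OPCODES:
        mnemonic, length = OPCODES[code[i]]
        out.append(mnemonic)
        i += length
    return out

# Five bytes a compiler would emit as ONE instruction: mov eax, 0xc390c301
code = bytes([0xB8, 0x01, 0xC3, 0x90, 0xC3])

print(decode(code, 0))  # intended stream: just the mov
print(decode(code, 1))  # one byte in: add ; nop ; ret -- a ready-made ROP gadget
```

The `ret` hiding inside the immediate is exactly the kind of “useful code” a ROP attacker strings gadgets out of.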

However there is a new variation on ROP around which we will probably hear more about in the near future and it’s something I’m playing with currently.

As you have no doubt heard, Intel had a little Xmas prezzie for everyone a couple of Xmases back. It has turned out, as I said at the time, to be the “Xmas present that keeps on giving”, and probably will do for the next half decade or so. The problem is that the “go faster stripes” around the CPU’s ALU and instruction decode units are actually more complex, hardware-wise, than the core CPU functions.

Such complexity needs to be made highly efficient, and as I have warned for well over a decade on this blog, and prior to that in other places, there is the problem of “Efficiency-v-Security”: the more efficient you make things, then generally the more side channels you open up, unless you really know what you are doing[2].

Most, though, have assumed that “Efficiency-v-Security” means only “simple time based side channels”, even though I warned there were other types. One of these, “system transparency” from input to output, was demonstrated by Matt Blaze and some of his students, and I’ve talked about the reverse direction: getting back through data diodes, firewalls and other security measures using “error and exception” exercising tricks. But the majority have still not realised that what hit Intel, AMD and ARM in the way of Meltdown and Spectre were just a couple more items on the “Efficiency-v-Security” list. Oh, and there are quite a lot of further ones that will appear to researchers when they think just a little bit further. Some, I can promise you, will be real doozies, and will be down to hardware that has become so entwined with and critical to the system that it can be neither removed nor mitigated without significant penalties. This problem Cisco is having is just one more…

To see why, consider high-complexity CPU hardware such as the 64-bit variants of the old x86. Such chips are so complicated that they contain the basis of “virtual Turing engines”. For instance, the memory control hardware of the more modern x86 variants has been demonstrated to be Turing complete, and can and has been made to function as such. It is therefore possible to build a Turing engine out of that complexity, and it needs only the data RAM to act as its tape.
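For readers unfamiliar with what “Turing complete” buys an attacker here, a minimal sketch may help. The machine below is my own toy illustration, not Cisco- or x86-specific: any substrate that can read and write cells, and branch on their contents (here a plain dict standing in for data RAM as the “tape”), can in principle compute anything computable:

```python
# Minimal one-tape Turing machine. A dict stands in for data RAM:
# the point is that read/write/branch over a tape is all you need
# for general computation, no "real" CPU instructions required.

def run_tm(tape, state, rules, halt="H", max_steps=1000):
    """Run until the halt state or the step budget is exhausted."""
    pos = 0
    for _ in range(max_steps):
        if state == halt:
            break
        # (state, symbol) -> (symbol_to_write, head_move, next_state)
        write, move, state = rules[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
    return tape

# Example program: flip every 1 on the tape, then mark and halt at
# the first blank (0) after the input.
rules = {
    ("S", 1): (0, +1, "S"),
    ("S", 0): (1, +1, "H"),
}
print(run_tm({0: 1, 1: 1, 2: 1}, "S", rules))  # -> {0: 0, 1: 0, 2: 0, 3: 1}
```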

This sort of thing is the future of malware…

[1] https://securityintelligence.com/return-oriented-programming-rop-contemporary-exploits/

[2] You can probably take it as read that nobody, and I do mean nobody, knows what they are doing in this respect. The reason is the “Known, Knowns” through “Unknown, Unknowns” problem: what we know now is quite a bit less than we will know at some future point. In general, few designing high-end CPU hardware are even aware of the “Known, Known” issues, which is why Meltdown and the Spectre variants exist. As for those that do, well, the solutions to those problems, as we know, can have quite serious performance degradations. Anyone bringing them up in the “marketing advantage” or subsequent “implementation” / “engineering” meetings is not only likely to not get invited back, they might find all their personal effects in a cardboard box in the security office surprisingly quickly.

Clive Robinson May 26, 2019 7:58 AM

@ All,

Just so you know Cisco should have absolutely no excuse for this problem.

Back in 2012 Cambridge University’s Computer Laboratory presented a paper at CHES which showed that there were security issues with FPGA bitstreams even in mil-grade high-security parts,

https://www.cl.cam.ac.uk/~sps32/ches2012-backdoor.pdf

If you read the paper you will see just how vulnerable FPGAs are to these attacks on their “field programming” methods. In a way this is the same as malware being able to overwrite Flash BIOS ROMs, so that the old IBM/Microsoft “load I/O drivers at boot and make persistent” mechanism is exercised. That is what Lenovo did to its consumer-level laptops, which created such an uproar a few years back, and what a malware researcher was trying to find when trying to figure out what we now call BadBIOS, one idea of which was networking, inaudible to humans, via high-frequency audio signals that could happily cross traditional high-security “air-gaps”.

It’s fairly certain that Cisco engineers and security practitioners would have read the CHES paper and been able to draw the required conclusions. Thus the only reason Cisco could have had not to make design changes back then was that they decided for “business reasons” not to bother. Likewise with the BadBIOS and Lenovo security issues, which were not just well publicised but frequently analysed technically in excruciating detail (on this blog as well as many other places).

The most likely reason for those “business reasons” is not wanting to give up some perceived “marketing advantage”. It’s why we had Meltdown and the Spectre variations.

Usually “marketing advantage” is based around some notion of “desirable performance”, which security-wise almost always brings you smack into the middle of the “Efficiency-v-Security” domain of problems, where all sorts of nasty side-channel issues are just waiting to be brought to life and then found by those who have an interest in finding them.

Expect to see a lot more security issues come from the “Efficiency-v-Security” domain of problems. Contrary to what many think, the domain is not just about the leaking of information via time-based side channels.

I’ve mentioned it on a number of occasions in the past on this blog, and, admittedly later than I expected, the academic and open security communities are finally waking up to what it actually means…

That is, not just the leaking of information, but system transparency, and fault and similar injection attacks via error control and signalling channels, as well as those for exceptions. As I’ve mentioned before, you can establish a viable covert communications channel back through data diodes, firewalls, and similar security barriers, many of which people mistakenly believe are “one way”, so they take no security precautions.

Such lack of preparedness is going to keep biting, and biting very hard, for quite some time to come. As security engineers it’s up to you to make those above you aware of the issues. However, you are not going to be at all popular with anyone else, so I’d do it as discreetly as possible, just to “Cover Your Arse” (CYA) when the brown stuff does hit the wall, as it almost always does with security vulnerabilities.

Clive Robinson May 26, 2019 12:05 PM

@ Alex T,

Again, am I missing something or is this an absolute massive issue?

It depends on your viewpoint to a certain extent. Right now some people have their fingers in their ears whilst chanting “not listening” over and over, or are choosing to look elsewhere, whilst many others don’t yet see where this is going. Some of course will do as they did with Meltdown and the Spectre variants: shrug their shoulders and hope things will sort themselves out…

So,

1. This is a hardware design issue, most likely not “fixable”

Well, firstly, remember “All things man made, man can fix or replace”; the question really becomes one of resources and how best to utilise them. Have a think back to what happened with the Pentium maths bug as an example. I’m fairly certain that is going to be in the back of many of Cisco’s managers’ minds, or an equivalent like the Pinto gas tank.

From what’s been said by others it’s a fault that cannot be fixed by software, and that does not surprise me in the slightest. Think about modern PC BIOS chips: they are electrically erasable, and in many cases there is no hardware write protect, so you can change them without a lot of difficulty because you don’t have to open the case etc.

So there is a high probability this Cisco system is similar, which implies the only real fix is a replacement of the board or entire unit.

However, also from what’s been said, you need certain software privileges, which brings us to your second point:

2. It can be exploited remotely

Only if you can get system privileges remotely.

Which means that although the actual hardware fault may not be fixable, keeping the required privileges away from remote attackers may be possible until the hardware fault has been fixed.

I suspect that Cisco will issue some kind of software upgrade very soon if for no other reason than to buy themselves more time…

What we need now is more in-depth information on the hardware to judge by.

AlexT May 26, 2019 3:09 PM

@Clive: Thanks for this detailed response.

I guess some restraint is indeed required here, but this has the potential to turn into an absolute catastrophe.

Jon May 28, 2019 10:59 AM

@ Clive – Fascinating! But I don’t think it applies to me right here right now.

Fortunately, I use cheap’n’cheerful 8-bit AVRs that don’t have “streamlining” around their ALUs, or any pretence of simultaneous threads. There is no OS. There is no shell. There is one privilege escalation (“May I rewrite myself?”), and that requires some very specific opcodes, which I very rarely include.

Unless you byte-slice up opcodes, making the program jump around can only execute the opcodes that are there.

A denial of service seems doable: force some branch to incessantly branch to itself, or mixmaster up different ops from here and there to turn “Hello World” into an obscenity. But the program itself, unless* it includes self-modifying opcodes, cannot be forced into modifying itself. On a restart, it will be as it was.
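Jon’s point can be sketched in miniature. Below is a toy Harvard-architecture interpreter of my own devising (the instruction names are made up, not AVR mnemonics): program memory and data memory are separate objects, and no opcode in the set writes to program memory, so no input data can ever become code:

```python
# Toy Harvard-architecture VM: program memory and data memory are
# separate, and the instruction set has no opcode that writes to
# program memory -- so, absent an SPM-style "rewrite myself" opcode,
# data can never become code.

def run(program, data, max_steps=100):
    program = tuple(program)          # immutable: self-modification is impossible
    pc, acc = 0, 0
    for _ in range(max_steps):
        op, arg = program[pc]
        if op == "LOAD":
            acc = data[arg]
        elif op == "ADD":
            acc += data[arg]
        elif op == "STORE":
            data[arg] = acc           # writes touch DATA memory only
        elif op == "JMP":
            pc = arg                  # worst case: a self-branch, i.e. a DoS
            continue
        elif op == "HALT":
            break
        pc += 1
    return data

data = [3, 4, 0]
prog = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0)]
print(run(prog, data))  # -> [3, 4, 7]; prog itself is untouched
```

An attacker who controls `data` can corrupt results or wedge the machine in a `JMP`-to-self loop, but a restart restores the original behaviour, exactly as Jon describes.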

Most of the libraries I use I know pretty well because I wrote them. Found a bug in one the other day. Ouch. So generally I’m not too worried about unexpected behaviour from libraries.

Pulling and soldering pins would be a bit tricky, given that the program ROM is inside the MCU itself, but if you wanted to decapsulate and fiddle about you probably could…

That’s actually a problem with FPGAs: the configuration ROM is often external to the chip, making intercepting that signal near trivial. Xilinx, among others I suspect, claim their configuration data stream to be well encrypted with a trade-secret algorithm, but I dunno.

Fuse-based programmable logic doesn’t have that problem, but its density is often small enough that an exhaustive search isn’t very hard.

Anyhow, given my ruthless input validation (hehe) and the systems these chips operate in, I’m not too worried.

Have fun!

J.

  • See byte-slicing opcodes.
