Comments

Ron May 19, 2020 12:54 PM

It does seem really good, though one thing I don’t see is any mention of distributed machine-learning malware. If an attack can spread across multiple machines, the infected hosts could coordinate, at minimum sharing computing resources to scale up an analysis for deeper penetration. A distributed ML algorithm could train itself on inputs from many machines and converge more quickly on an estimate of how best to avoid detection. The combined power of many machines could help train ML that is better at attacking or hiding on any one machine. It could train a smarter algorithm for accomplishing the attacker’s goals using the victims’ own computing resources, rather than having to offload data to a remote compute cluster or limit the complexity of local computation.
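To sketch what I mean (purely hypothetical; the logistic-regression model, the synthetic data, and the three “nodes” are all made up): federated-style parameter averaging, where each infected host trains a small model on its own local observations and only the model coefficients are pooled.

```python
# Minimal sketch of federated-style parameter averaging across several
# compromised hosts (hypothetical illustration only). Each node fits a
# local model on its own observations; the coordinator averages the
# coefficients so every node benefits from data it never saw.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=20):
    """One node's local logistic-regression training pass."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # local predictions
        w -= lr * X.T @ (p - y) / len(y)   # gradient step
    return w

# Three hypothetical nodes, each with its own local data.
n_features = 5
true_w = rng.normal(size=n_features)
nodes = []
for _ in range(3):
    X = rng.normal(size=(200, n_features))
    y = (X @ true_w + rng.normal(scale=0.5, size=200) > 0).astype(float)
    nodes.append((X, y))

global_w = np.zeros(n_features)
for _ in range(10):
    # Each node trains locally, then the coordinator averages the results.
    local_ws = [local_update(global_w, X, y) for X, y in nodes]
    global_w = np.mean(local_ws, axis=0)

print("learned weights:", np.round(global_w, 2))
```

The point is only that no raw data has to leave any single machine for all of them to benefit from what the others observed.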

Another thing I don’t see, though I only skimmed the paper, is a view towards advancements in ML. Unsupervised or self-teaching algorithms that can adapt to changes better than anything available today are likely to appear at some point. So far the ML field has focused heavily on building static models from training data, but there has been a fair amount of work on systems that can truly learn over time and adapt.

Somewhat related to ML-enhanced phishing and propaganda generation, ML methods that trick a user into doing something they shouldn’t might also be considered: direct influence over a person, achieved by analyzing their machine-use patterns and the information they send and receive. Examples might be faking an email from someone they communicate with often, or making the person think a database has been corrupted to distract their attention from something else, all based on an ML algorithm that learns the person’s behavior and tailors the approach to it.
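Even a trivial heuristic over communication metadata, never mind a real learned model, could pick whom to impersonate. A made-up sketch (the senders, fields, and scoring rule are all invented for illustration):

```python
# Hypothetical sketch: score which sender a target is most likely to trust,
# using only metadata an attacker could observe (reply rate, reply speed).
# All data and field names here are invented.
from collections import defaultdict

# (sender, was_replied_to, minutes_until_reply or None)
observed_mail = [
    ("alice@example.com", True, 4),
    ("alice@example.com", True, 7),
    ("billing@vendor.com", False, None),
    ("bob@example.com", True, 90),
    ("alice@example.com", True, 3),
    ("bob@example.com", False, None),
]

stats = defaultdict(lambda: {"msgs": 0, "replies": 0, "total_delay": 0.0})
for sender, replied, delay in observed_mail:
    s = stats[sender]
    s["msgs"] += 1
    if replied:
        s["replies"] += 1
        s["total_delay"] += delay

def trust_score(s):
    # Higher reply rate and faster replies suggest a more "trusted" sender.
    reply_rate = s["replies"] / s["msgs"]
    avg_delay = s["total_delay"] / s["replies"] if s["replies"] else float("inf")
    return reply_rate / (1.0 + avg_delay)

best = max(stats, key=lambda sender: trust_score(stats[sender]))
print("most promising sender to impersonate:", best)
```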

Perhaps these are implementation details, though; the paper doesn’t really go into specific methods, it just outlines broad categories.

Drone May 19, 2020 10:47 PM

Excerpting from page 10:

“Data Poisoning: Can machine learning systems fail due to their training data? For many machine learning systems, training data is fundamental. Without the explicit instructions provided in other forms of computer programs, these systems learn virtually all they will ever know from the data they are provided. For hackers, this creates an opportunity: change what data the system sees during training and therefore change how it behaves. This risk is particularly acute for systems continually trained and retrained based on user input, such as recommendation systems and some spam filters. This class of activity, known as data poisoning, deserves further technical study. More in-depth research will likely reveal significant vulnerabilities for machine learning systems deployed in national security environments in which there are dedicated adversaries.”

IMO the paper does not give the topic of Data Poisoning in ML enough consideration:

Data poisoning is the Achilles’ heel of ML systems, especially in the case of autonomous protection against cyber-attacks. The problem is that once an ML system learns something wrong, it is almost impossible to un-learn. An attacker can do things to confuse and/or distract an ML protection agent, and each time the attacker does this the agent learns something wrong that can’t be un-learnt. The only way to trust the agent again is to reset it to a state prior to the poisoning attack, so that the misleading inputs are erased from the corpus of accumulated knowledge (along with all the good knowledge learned up to the time of the attack). And then there’s the problem of how to detect the poisoning in the first place! Without 100% detection of poisoning attacks (an almost impossible task), over time you will grow to mistrust the ML agent.
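As a toy illustration (assuming scikit-learn is available; the data are synthetic and the exact numbers are arbitrary): flip a modest fraction of training labels and the model typically degrades, yet nothing in the trained model itself flags that the poisoning ever happened.

```python
# Toy illustration of label-flipping data poisoning on a simple classifier.
# Synthetic data; results will vary, but a small fraction of flipped training
# labels is usually enough to measurably degrade the model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Poison 15% of the training labels by flipping them.
rng = np.random.default_rng(0)
flip = rng.choice(len(y_tr), size=int(0.15 * len(y_tr)), replace=False)
y_poisoned = y_tr.copy()
y_poisoned[flip] = 1 - y_poisoned[flip]

poisoned = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)

print("clean accuracy:   ", round(clean.score(X_te, y_te), 3))
print("poisoned accuracy:", round(poisoned.score(X_te, y_te), 3))
```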

Phaete May 20, 2020 3:01 AM

Just skimming through the malware attribution summaries shows a strong bias.

WannaCry is North Korean malware ‘misusing’ NSA exploits.

“Notable cyber accidents have likely already occurred. For example, it seems probable that the 2017 attack known as WannaCry, carried out by North Korean hackers and causing more than $4 billion in damage all over the world, was at least partially unintentional. Though designed to act as ransomware, the attack contained no mechanism for decrypting files once a ransom was paid. The North Koreans may not have understood the power of an NSA exploit, ETERNALBLUE, which they repurposed and added to the code.”

But Stuxnet is only ‘probably’ from the USA and Israel, and the attack on the German steel mill was “mysterious” rather than another Stuxnet-style operation.

“There are other cases, too. A mysterious attack on a steel plant in Germany in 2014 was apparently an espionage operation gone wrong. It triggered automated shutdown procedures that damaged the facility. The countrywide outage of Syria’s internet in 2012 was purportedly another espionage attempt, this time by the NSA, that inadvertently disabled key internet routing functions. Stuxnet, the reported U.S. and Israeli attack on Iran, propagated far further than intended, eventually turning a covert program into one of the most famous cyber operations ever.”

They even glorify their own industrial sabotage as “one of the most famous cyber operations ever.”

For me this reads like one side of the story of two parties that have an issue with each other. Every few sentences there is a small spin or stab at the other side.
This secondary agenda undermines the credibility of the report as a whole, which is otherwise a pretty decent summary/exploration of ML.

Ergo Sum May 20, 2020 5:43 AM

Quote from the article:

But another set of questions also deserves analysis: what about the cybersecurity vulnerabilities of machine learning systems themselves? It stands to reason that they will be vulnerable to many of the same weaknesses as traditional computer systems, such as the potential for software bugs that an attacker could exploit. Moreover, they offer new kinds of fundamental vulnerabilities providing hackers additional opportunities to undermine the effectiveness of machine learning systems in critical moments. Yet, for all of this, credible estimates suggest only one percent of AI research money is spent on machine learning security and safety.

The lack of attention to AI vulnerabilities from the start will be the downfall of AI systems. Adversaries will learn how to systematically feed disinformation to an AI system, essentially creating an automated double agent that fools it. That’s just great; does IoT ring a bell?

Ergo Sum May 20, 2020 5:57 AM

@Phaete…

For me this reads like one side of the story of two parties that have an issue with each other. Every few sentences there is a small spin or stab at the other side.

This secondary agenda undermines the credibility of the report as a whole, which is otherwise a pretty decent summary/exploration of ML.

Certainly, there’s a bias against real or perceived adversaries, even if they use(d) NSA exploits in their malicious code. The possibility of a “false flag” operation doesn’t even seem to have occurred to the authors…

vas pup May 21, 2020 5:47 PM

I guess the best target is the AI algorithm itself, so that it generates bad predictions (even when the training data was good, balanced, and relevant), or else the training data. As IT people, you know that to get the right output you need both: good programming logic and good input.
Moreover, as with fake video news built from artificially generated images of key politicians’ speeches, this works in a recurrent mode: an AI generating the fakes and an AI detecting the fakes run in cycles against each other until it becomes impossible to distinguish reality from fake. Attacking AI and defending AI should likewise be developed and tested in concert against each other.
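A rough sketch of what I mean by developing them in concert (a toy logistic-regression model with made-up data and step sizes): the attacking side nudges inputs against the model’s gradient, and the defending side retrains on a mix of clean and attacked inputs.

```python
# Minimal sketch of "attack and defense developed in concert": an attacker
# perturbs inputs against the current model (gradient-sign step), and the
# defender retrains on those perturbed inputs. Purely illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 10
true_w = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = (X @ true_w > 0).astype(float)

def train(X, y, epochs=200, lr=0.5):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def attack(w, X, y, eps=0.5):
    # Move each input a small step in the direction that increases the loss.
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad_x = np.outer(p - y, w)   # gradient of the loss w.r.t. the inputs
    return X + eps * np.sign(grad_x)

def accuracy(w, X, y):
    return float(np.mean(((X @ w) > 0).astype(float) == y))

w = train(X, y)
X_adv = attack(w, X, y)
print("clean accuracy:              ", round(accuracy(w, X, y), 3))
print("accuracy on attacked inputs: ", round(accuracy(w, X_adv, y), 3))

# Defender's move: retrain on a mix of clean and attacked inputs.
w_def = train(np.vstack([X, X_adv]), np.concatenate([y, y]))
print("defended model vs. new attack:", round(accuracy(w_def, attack(w_def, X, y)), 3))
```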

I’m just curious: did the Russians develop counter-rocket technology against their own recent supersonic weaponry? If not, see above.
