Symbiote Backdoor in Linux

Interesting:

What makes Symbiote different from other Linux malware that we usually come across, is that it needs to infect other running processes to inflict damage on infected machines. Instead of being a standalone executable file that is run to infect a machine, it is a shared object (SO) library that is loaded into all running processes using LD_PRELOAD (T1574.006), and parasitically infects the machine. Once it has infected all the running processes, it provides the threat actor with rootkit functionality, the ability to harvest credentials, and remote access capability.

News article:

Researchers have unearthed a discovery that doesn’t occur all that often in the realm of malware: a mature, never-before-seen Linux backdoor that uses novel evasion techniques to conceal its presence on infected servers, in some cases even with a forensic investigation.

No public attribution yet.

So far, there’s no evidence of infections in the wild, only malware samples found online. It’s unlikely this malware is widely active at the moment, but with stealth this robust, how can we be sure?
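For readers unfamiliar with the mechanism: LD_PRELOAD tells the dynamic linker to load a listed shared object ahead of everything else, so its exported symbols shadow libc’s. Below is a minimal, benign sketch of that general interposition technique – it is not Symbiote’s code, and the file and prefix names are made up for illustration.

```c
/* Illustrative sketch only -- not Symbiote's code. It shows the general
 * LD_PRELOAD interposition technique: a shared object that wraps a libc
 * function and filters what callers see. Hypothetical build and run:
 *   gcc -shared -fPIC -o hide.so hide.c -ldl
 *   LD_PRELOAD=$PWD/hide.so some_program
 */
#define _GNU_SOURCE
#include <dirent.h>
#include <dlfcn.h>
#include <string.h>

static const char *hidden_prefix = "secret_";   /* made-up name to hide */

struct dirent *readdir(DIR *dirp)
{
    /* Look up the real readdir() the first time we are called. */
    static struct dirent *(*real_readdir)(DIR *);
    if (!real_readdir)
        real_readdir = (struct dirent *(*)(DIR *))dlsym(RTLD_NEXT, "readdir");

    /* Pass through every entry except the ones we want to hide. */
    struct dirent *e;
    while ((e = real_readdir(dirp)) != NULL) {
        if (strncmp(e->d_name, hidden_prefix, strlen(hidden_prefix)) != 0)
            return e;
    }
    return NULL;   /* end of directory */
}
```

A binary built with 64-bit file offsets will call readdir64() instead, so a real rootkit has to interpose that – and a great deal more – as well; the sketch is only meant to make the mechanism concrete.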

Posted on June 22, 2022 at 6:07 AM • 16 Comments

Comments

RapidGeek June 22, 2022 8:11 AM

My first thought is that this could be a state actor. Many cloud companies use Linux, and desktop users could be used in DDoS attacks.

Nix June 22, 2022 8:15 AM

Using LD_PRELOAD doesn’t really strike me as being all that hard to detect, not once you know it’s a thing that’s happening. It would have to scrub an awful lot of detection routes, most of which aren’t discussed in the list above.

So it prunes its own name out of /proc/$pid/maps? That’s an absolute minimum. Perhaps it’s also clever enough to spot someone setting LD_PRELOAD in the environment when execve is called and add its own object before it, and similarly to interpose /etc/ld.so.preload and /etc/ld.so.preload.d (where implemented), but does it scrub ptrace() results, and thus strace output, so you can’t see the shared object being loaded if you’re looking for it? What about blktrace showing the shared object getting read in? What about LD_DEBUG? (libs, files, symbols, or bindings might all reveal this.) I can almost guarantee it doesn’t scrub the results of usdt or systrace provider firings in DTrace for Linux (v1, at least, is mostly kernel-level, which this thing never goes near, and v2 uses its own BPF probes, which, ah, this thing is unlikely to be editing on the fly). bpftrace is probably similarly unaffected, ditto SystemTap, and it would show up in neon lights in any of those.
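To make one of those cross-checks concrete, here is a rough sketch (assuming the checking binary is statically linked, so no preloading can touch it) that lists the shared objects mapped into a process by reading /proc/<pid>/maps directly:

```c
/* Rough sketch, not a real detector: list the shared objects mapped into a
 * process by reading /proc/<pid>/maps. Build it statically
 * (gcc -static scan_maps.c -o scan_maps) so an LD_PRELOAD'ed hook in the
 * target environment cannot interpose these libc calls.
 * Usage: ./scan_maps <pid>
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    char path[64];
    snprintf(path, sizeof(path), "/proc/%s/maps", argv[1]);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Reassemble lines from read(2) and print any mapping that names a .so. */
    char buf[4096], line[1024];
    size_t len = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] == '\n' || len == sizeof(line) - 1) {
                line[len] = '\0';
                if (strstr(line, ".so"))
                    puts(line);        /* candidate shared-object mapping */
                len = 0;
            } else {
                line[len++] = buf[i];
            }
        }
    }
    close(fd);
    return 0;
}
```

Anything listed there that isn’t accounted for by the binary’s known dependencies deserves a closer look; since this thing lives purely in userland, it cannot rewrite the kernel’s /proc view as seen by a statically linked reader.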

Hiding things like this is about as hard as jailing things in the old Python sandbox (which was removed because actually sandboxing anything with it was a hopeless quest because there were so many holes in it that nobody ever trusted that they were all patched). Linux has far more introspective “holes in the VM” than Python ever did.

As usual this thing is mostly a threat to the vast majority who don’t know it exists. Which is no doubt quite enough for the attackers.

Clive Robinson June 22, 2022 8:38 AM

@ Bruce, ALL,

I know I should not compliment attackers, but they have put logical thought, effort, and I suspect testing into what they have done.

More so than quite a few developers of commercial software have.

However, the simple fact is that even when running entirely in RAM, if you know how to instrument a system correctly, malware will show up somewhere in the system memory maps etc.

Back in the early MS NT days, it was very difficult to find out what was in core memory, let alone the kernel interfaces. OK, things have got better, but they are not where they could be.

Traditionally *nix has been better, but I feel that they all could do somewhat better these days, especially with both core and storage being effectively beyond mortal human searching. The places and opportunities for hiding malware appear to get larger by the day, whilst support feels like it is growing not even a quarter as fast…

Bill June 22, 2022 8:49 AM

@Nix Perhaps it could have been stated more clearly that its so-called “stealthiness” is mainly due to it being so novel, so people don’t know to look for it, not that it actually scrubs itself from all the ways of being found when one is specifically looking for it. Now some people at least know about it, but how long before it’s on the “traditional” list of things that all the tools check? It’s under the radar, so to speak, until the common man can detect it with “standard” tools.

Peter A. June 22, 2022 9:19 AM

I haven’t found anything in the article about how the initial infection with LD_PRELOAD is made.
Once some process has been infected, all its child processes can be infected by hooking exec*() calls and inserting LD_PRELOAD into the environment – but not other processes. How does it “infect all the running processes”, specifically the daemons that have been running since startup?

OK, if you somehow infect a user’s shell and the user then does su to root or similar, there is the potential for detecting that condition and autostarting some more harmful processes, but how are you going to do the LD_PRELOAD trick on already-running ones? On old sysv-init you could just kill off and restart daemons via the init.d scripts with LD_PRELOAD set, but on systemd there’s only a message to PID 1, and it restarts what’s configured with its own env. Anyway, such a restart is easily noticed.
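For what it’s worth, the child-process propagation step is the easy part. Here is a rough sketch of the general idea – not taken from the Symbiote analysis, and the payload path is a placeholder – as an execve() hook that rewrites the child’s environment:

```c
/* Sketch of the propagation mechanism described above, not Symbiote's code:
 * an LD_PRELOAD'ed object that wraps execve() and makes sure
 * LD_PRELOAD=<payload> is present in the child's environment.
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static const char *inject = "LD_PRELOAD=/tmp/payload.so";   /* placeholder */

int execve(const char *path, char *const argv[], char *const envp[])
{
    static int (*real_execve)(const char *, char *const[], char *const[]);
    if (!real_execve)
        real_execve = (int (*)(const char *, char *const[], char *const[]))
                          dlsym(RTLD_NEXT, "execve");

    /* Count the entries in the child's intended environment. */
    size_t n = 0;
    while (envp && envp[n])
        n++;

    /* Copy it, replacing any existing LD_PRELOAD entry or appending ours. */
    char **newenv = calloc(n + 2, sizeof(char *));
    if (!newenv)
        return real_execve(path, argv, envp);   /* give up quietly */

    size_t j = 0;
    int found = 0;
    for (size_t i = 0; i < n; i++) {
        if (strncmp(envp[i], "LD_PRELOAD=", 11) == 0) {
            newenv[j++] = (char *)inject;
            found = 1;
        } else {
            newenv[j++] = envp[i];
        }
    }
    if (!found)
        newenv[j++] = (char *)inject;
    newenv[j] = NULL;

    int ret = real_execve(path, argv, newenv);
    free(newenv);          /* only reached if execve() failed */
    return ret;
}
```

A real implementation would also have to cover execvp(), posix_spawn(), and friends – and, as noted above, it still only reaches descendants of an already-infected process.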

Nix June 22, 2022 11:02 AM

@Peter A, I speculate that possibly it adds itself to /etc/ld.so.preload, or to some file under /etc/ld.so.preload.d, and then edits itself out of that, in which case it would affect everything after the next reboot. (But before that reboot it would be thoroughly detectable, so maybe it just triggers a reboot itself… which is hardly subtle either.)

cmeier June 22, 2022 11:03 AM

How would this malware survive a system reboot? It either needs to save something on the machine or you start w/ some executable that was infected at the time the OS was installed. Otherwise, it needs to be re-infected from outside somehow at each system reboot.

As part of a defense-in-depth strategy, executables, libs, and some config files and directories like /lib, /usr/bin, /etc/ld.so.preload should have the immutable flag set. This won’t stop an attacker, of course, but hey, it makes the bastards work a tiny bit harder and, as part of a standard virus scan, it isn’t hard to run a check of a hash of a list of files kept on a non-writable cd to make sure they haven’t been modified.
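For the immutable flag, chattr +i is the usual tool; under the hood it is roughly the following ioctl() – a sketch, which needs CAP_LINUX_IMMUTABLE and a filesystem that supports the flag:

```c
/* Roughly what `chattr +i <file>` does: set the filesystem immutable flag
 * via ioctl(). Needs CAP_LINUX_IMMUTABLE (normally root) and a filesystem
 * that supports the flag (ext4, xfs, btrfs, ...).
 */
#include <fcntl.h>
#include <linux/fs.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    int flags;
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0) {   /* read current flags */
        perror("FS_IOC_GETFLAGS");
        return 1;
    }

    flags |= FS_IMMUTABLE_FL;                       /* the bit chattr +i sets */
    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0) {
        perror("FS_IOC_SETFLAGS");
        return 1;
    }

    close(fd);
    return 0;
}
```

As said, this only raises the bar: anything that already has CAP_LINUX_IMMUTABLE can simply clear the flag again.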

Also note that this malware has to have infected the network firewall in order to prevent an admin from seeing that a machine behind the firewall is connecting to one of the external machines listed in the article. That’s another potential point of detection that a machine has been compromised.

SpaceLifeForm June 22, 2022 4:22 PM

@ cmeier

it isn’t hard to run a check of a hash of a list of files kept on a non-writable cd to make sure they haven’t been modified.

That potentially could happen, likely does not in practice, and almost certainly would not happen in the cloud.

Same arguments apply to firewalling outbound traffic.

The ideal attack point would be libc where LD_PRELOAD would not have to visibly exist.

All of this leads to the argument that it is safer to use static executables or use MUSL with LD_PRELOAD disabled.

I realize that stuff will break or not even compile with this approach, because the code has become dependent upon glibc functionality and quirks.

I do not believe it is feasible to build glibc with LD_PRELOAD disabled.

If you go with MUSL as your libc, you may be forced to reduce your attack surface. 🙂

David Leppik June 22, 2022 4:22 PM

OMG! It’s an actual virus! Not just a worm that people who don’t know the difference call a virus!

Clive Robinson June 22, 2022 5:36 PM

@ SpaceLifeForm, ALL,

All of this leads to the argument that it is safer to use static executables

Back in the days when CLI was the only way to do things, the reason for shared code via “linked libraries” was to minimise the use of very expensive memory, and thus get something like double or triple the number of users on a machine.

Memory is “dirt cheap” these days, and heck, if you know what you are doing you can use a 1TByte SSD as “slow core” and not see any real speed degradation (paging / caching).

Shared-library executables, and all the issues they have brought up over the years, really are no longer necessary with CLI or non-graphical user interfaces.

So the question boils down to,

“Do we really need graphical interfaces on everything?”

The answer is almost certainly not. Which in turn means the need for those shared libraries is also over…

Ted June 22, 2022 5:46 PM

I am wondering if a different dynamic linker attack – this one hides a crypto miner – is all that different from the Symbiote attack? Both seem to use LD_PRELOAD.

https://www.cadosecurity.com/linux-attack-techniques-dynamic-linker-hijacking-with-ld-preload/

This researcher says the technique is relatively easy to detect.

“… To check the contents of the LD_PRELOAD envar, the export command can be used. If you suspect your system has been compromised and this envar is set then it’s likely that a malicious library has been used… “

SpaceLifeForm June 22, 2022 8:38 PM

@ Ted

“… To check the contents of the LD_PRELOAD envar, the export command can be used. If you suspect your system has been compromised and this envar is set then it’s likely that a malicious library has been used… “

False. Not only is it false, it is dangerous.

The safer command to run is

echo $LD_PRELOAD

But, it probably does not matter if you have already been hacked.

In general, you do not want to export an environment variable unless you know that your script is going to exec a child process that requires it.

The script could unset LD_PRELOAD before the exec.

hxtps://www.baeldung.com/linux/delete-shell-env-variable
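The same check-then-scrub idea in C rather than shell, as a minimal sketch (with the caveat above: if libc is already hooked, getenv() and unsetenv() may be lying to you):

```c
/* Minimal sketch: read LD_PRELOAD without exporting anything, drop it from
 * the environment, then exec the child with the cleaned environment.
 * /bin/true is just an illustrative child process.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* Safer equivalent of `echo $LD_PRELOAD`: read it, don't export it. */
    const char *p = getenv("LD_PRELOAD");
    if (p)
        fprintf(stderr, "warning: LD_PRELOAD is set to: %s\n", p);

    /* Remove it so it is not inherited by the child... */
    unsetenv("LD_PRELOAD");

    /* ...then exec the child. */
    execl("/bin/true", "true", (char *)NULL);
    perror("execl");   /* only reached if the exec failed */
    return 1;
}
```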

There is a huge can of worms involving the semantics of setenv(), getenv(), putenv(), etc.

I have a design that makes the semantics function properly, but it requires overhead and restrictions. I call it envsafe.

It requires using more RAM if the code needs to set or change an environment variable, and it requires using mmap(). The restrictions are that an envvar name be limited to under 4K, and that the same applies to the value. And it requires a guard page.

So, we are talking about 12K of RAM for just one single environment variable if it needs to be set or modified.

But, if you want to make it safe, this is the overhead price you pay.

Note that it is not a huge price to pay, because there are not that many programs changing environment variables. There is no overhead if a process does not create or modify environment variables.

The overhead does not impact the kernel. This all happens inside libc.

If one thinks they need environment variable names or values larger than 4K, to them I say: stuff it up your crappy config file.

See __environ and PATH_MAX.
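A stripped-down sketch of the constraints described above (name under 4K, value under 4K, a guard page, 12K total via mmap()); the exact layout is an assumption, not the actual envsafe code:

```c
/* Illustration of the stated envsafe constraints, not the actual design:
 * one 12K mapping per entry -- up to two 4K pages for "NAME=value\0" plus
 * a PROT_NONE guard page so any overrun faults immediately.
 */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE 4096   /* assumes the usual 4K page size */

static char *envsafe_entry(const char *name, const char *value)
{
    size_t nlen = strlen(name), vlen = strlen(value);
    if (nlen >= PAGE || vlen >= PAGE)       /* the "under 4K" restriction */
        return NULL;

    char *slot = mmap(NULL, 3 * PAGE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (slot == MAP_FAILED)
        return NULL;

    /* Make the last page inaccessible: the guard page. */
    if (mprotect(slot + 2 * PAGE, PAGE, PROT_NONE) != 0) {
        munmap(slot, 3 * PAGE);
        return NULL;
    }

    /* Build the conventional "NAME=value" string in the writable pages. */
    memcpy(slot, name, nlen);
    slot[nlen] = '=';
    memcpy(slot + nlen + 1, value, vlen + 1);
    return slot;    /* a setenv() replacement would splice this into __environ */
}

int main(void)
{
    char *e = envsafe_entry("DEMO_VAR", "hello");   /* made-up example */
    if (e)
        printf("built entry: %s\n", e);
    return 0;
}
```

That is only the allocation shape; a full setenv() replacement also has to manage __environ itself and release superseded entries.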

It is a rabbit hole that will make you learn stuff, especially the interaction between userland and the kernel and how execve() works.

SpaceLifeForm June 23, 2022 3:36 AM

@ Ted

re: much to think about

Sorry about that. 😉

I appreciate your feedback.

I spent about 40 minutes writing that up to try to be as clear and concise as possible. I still had a typo.

Don’t forget to come up for fresh air periodically.

Clive Robinson June 23, 2022 5:10 AM

@ Ted,

Re : Gives me much to think about

As I note from time to time, in the ICT Industry, especially the subsection that is the ICT Security Industry, we appear to learn little or nothing from our history.

Not just our “living memory” history, but often our history of half a decade or less, even just a couple of months…

Worse, the ICT Security Industry has the bad habit of viewing things as “individual” or singular events rather than “classes” of events, and so fails to abstract out predictors of future related events, something many malware writers and similar cheerfully do…

Why this should be so is open to various interpretations, but the frequent result is new security events that probably would have been avoided.

One such series of avoidable historical events has been due to past “resource issues” that have morphed into something worse, a lot worse, that has now become entrenched, apparently beyond eradication.

That is, the lack of core memory in systems in the last quarter of the last century led to what might be considered “compression by tokenisation”.

That is, code would be put in a library that got loaded into memory “once” – often at boot time – and called by many programs via an address token supplied by the “linker” at executable load time. This was in addition to writing programs so that only one copy of an oft-used program like an editor was loaded into memory and each user ran the same copy (see file system “sticky bit” usage on files[1]).

Whilst the “Sticky Bit” function on program code is now very thankfully considered obsolete, its legacy, along with that of libraries, leaves a bad stench that brings much in the way of current security failings, and without doubt many more in the future.

What has happened gets swept under the title “code reuse” and it is promoted by many as a virtue for many many reasons.

But like “white wedding dresses” it hides a lot of things that may be very far from virtuous.

I won’t go into all the details, but the more common failings are,

1, Code bloat
2, Hidden vulnerabilities
3, Common failings in many unrelated programs
4, Programs that are very fragile.

In theory abstracting out lower level “common code” is a good thing, as it only needs to be written and tested just the once and be available from then on to all programs. But it becomes less and less useful the higher up the stack you go and the less specific a function it abstracts is.

The problem is that unless it’s very low level, most people want it to do slightly different things; that is, it will not at its simplest be “All things to All Men”. Thus the “interface” gets messy, as does a whole load of otherwise unnecessary glue code, and you don’t really get “code reuse” but “code bloat” in the shared library.

The side effect of the “all things to all men” approach is not just bloated and messy code in the library; it often only gets tested for one “use case”, and thus all the other stuff “chucked in” the library frequently does not get tested at all and ends up containing hidden vulnerabilities.

Three things can happen when a failing in a library is found,

1, You strip the library out.
2, You fix the library code.
3, You code around it in the program.

The first almost never happens, as its consequences go against the “Code ReUse” mantra.

The second happens infrequently because it can and often does cause existing programs to break.

So the third option gets used, which just perpetuates the problem almost forever… because the new program will in most cases become a compiled executable for which the source code becomes unavailable, or the needed “history files” never get written, so it becomes subject to the first two problems.

The net result is that the program eventually has to be re-written from scratch, so a whole big chunk of the “code reuse” mantra should really get called what it is: “Bovine Scat”.

But… hidden away in that is the “money keeps rolling in”: that is, if code cannot be properly maintained for excuse AAA, then it has to be re-written, thus keeping programmers in the “Hamster wheel of pain” and shareholders seeing increasing corporate worth, thus senior management getting bonuses, whilst the users of the code are certainly not getting a lift from the elevator car…

But that “fragility” aspect has not just sharp but very long teeth, as two recent examples indicate. One was a coder who deliberately made his code non-functional and brought many, many projects to their knees overnight, to much anger from all the “free riders” who had been shown up to be the “cheap chisellers” they undoubtedly were.

The second was Log4j. It was clear version 1 was no longer viable a decade ago, so a replacement was built and announced and people were urged to move, but they declined for all sorts of excuses… Then Alibaba announced Log4Shell, an attack against Log4j… The result was a veritable blood bath, stirred up by much finger pointing, with even politicians diving in to splash about and get federal agencies to make dire prognostications…

All of which was entirely predictable, but less obvious were the big flows of money hidden under all the splashing of blood and hurling of bovine scat…

Any student of ICT history would have told you this was going to happen in some way, and that it will happen again and again… until the very unprofessional side of the “Code ReUse” mantra goes the way of the “Sticky Bit”… and, perhaps not uncoincidentally, the significant money flows that unprofessional side of “Code ReUse” has made so desirable to the despicable and cheap chisellers, freeloaders, and in some cases major tax-fiddling big corporates.

[1] The sticky bit appeared in 1974, and its purpose was to “speed things up” as well as “save memory”. A compiled program gets loaded into memory as several parts, one being the executable code in the “text segment”, which does not change throughout execution – the equivalent of what would later be called “ROM Code”. Any variables end up in what we tend to call “heap space”, which contains the stacks and memory tokens as well as what many programmers consider variables. Obviously loading or copying the “text segment” takes time. However, with virtual memory it requires only a token change and a few changes in the process user space to map the code into multiple users’ apparently unique process spaces:
https://en.m.wikipedia.org/wiki/Sticky_bit

[2] https://en.m.wikipedia.org/wiki/Log4j
