September 17th, 2017
Friends, SysAdmins, #infosec, lend me your ears;
After Equifax was pwned, word got out that CVE-2017-5638, a Remote Code Execution vulnerability in Apache Struts, was exploited in the attack. This vulnerability had been published on March 7th, 2017; the compromise was first detected on July 29th, 2017, with the earliest access dating as far back as May 13th, 2017. In other words, the vulnerability was first exploited roughly two months after it was announced.
This two-month window quickly became a major point of criticism from the #infosec twitterati and in some press coverage. I frequently argue in favor of regular, automated software updates; I know from experience that unpatched software poses the biggest risk to just about any organization, and that attackers are more likely to use known, published vulnerabilities than the ominous 0-day. But I also know that patching a large infrastructure is not trivial.
Over the last 17 years or so, I have helped design, develop, build, maintain, and secure large scale systems that you likely have used (albeit often indirectly). I have tried to solve the problem of applying software updates in a variety of ways, and I have come to the conclusion that it's not currently realistic to believe that you can keep your entire infrastructure and serving stack up to date at all times.
Today's software dependencies are so complex that your systems likely rely on components you've never heard of, with thousands of distinct packages and dozens of different versions of each in use. (And this is not to mention that the services in question nowadays frequently run inside virtual machines on hardware you do not even own.) We're back to shipping static binaries, and have in some ways lost visibility of what's running where.
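To make the scale of the problem concrete, here is a minimal sketch of the first step: aggregating per-host package inventories into a fleet-wide view of version spread. The host names, package names, and versions below are invented for illustration; in practice this data would come from your configuration management system or from package manager queries on each host.

```python
# Aggregate per-host package inventories into a fleet-wide view.
# All hosts, packages, and versions here are hypothetical examples.
from collections import defaultdict

fleet = {
    "web-01": {"openssl": "1.0.2k", "struts": "2.3.31", "bash": "4.3"},
    "web-02": {"openssl": "1.0.2g", "struts": "2.3.31", "bash": "4.4"},
    "db-01":  {"openssl": "1.1.0e", "bash": "4.3"},
}

# Collect the set of versions seen for each package across the fleet.
versions = defaultdict(set)
for host, packages in fleet.items():
    for pkg, ver in packages.items():
        versions[pkg].add(ver)

# How many distinct versions of each package are in use?
spread = {pkg: len(vers) for pkg, vers in versions.items()}
print(spread)  # {'openssl': 3, 'struts': 1, 'bash': 2}
```

Even in this toy fleet of three hosts, one package is already running three different versions; multiply that by thousands of packages and thousands of hosts and you see why "just patch it" is easier said than done.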
The logistics of getting a software update out to all your systems are complicated; see e.g. this Twitter thread for a summary. (There is much work needed to make this process easier. Auto-updating containers and auto-rebooting services are just some of my crazy ideas.) But that's step two -- step one is to actually be aware that a software update is available and required.
So, yes, you can slurp in all the vulnerability data, but making it actionable requires a lot of work. Worse, this uses a blacklist approach to system security: "we know software X, Y, and Z is used here, so we will keep an eye open for vulnerabilities in those". (Not to mention that this only covers vulnerabilities once they are actually assigned a CVE; a lot of software providers and open source projects out there don't even (know to) request CVEs for their security fixes.)
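The blacklist approach described above can be sketched in a few lines. The watchlist and all feed entries except CVE-2017-5638 are invented placeholders; the point is structural, not the data:

```python
# Sketch of the blacklist approach: check a vulnerability feed against
# the list of software we *know* we run. Anything we run without
# knowing it -- or anything that never got a CVE -- slips right through.

watchlist = {"struts", "openssl", "nginx"}  # what we know we run

# Hypothetical feed entries as (CVE ID, affected product) pairs;
# only CVE-2017-5638 is real, the others are placeholders.
feed = [
    ("CVE-2017-5638", "struts"),
    ("CVE-YYYY-0001", "somelib"),   # we run this too, but don't know it
    ("CVE-YYYY-0002", "openssl"),
]

actionable = [cve for cve, product in feed if product in watchlist]
missed = [cve for cve, product in feed if product not in watchlist]

print(actionable)  # ['CVE-2017-5638', 'CVE-YYYY-0002']
print(missed)      # ['CVE-YYYY-0001'] -- invisible to a blacklist
```

Note that the "missed" bucket is exactly the one you can't see: by definition, you don't know what you don't know you're running.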
None of this is inherently unsolvable, but all of this is hard. You may argue that we should be in the business of solving hard problems like these -- and I would applaud you, for I have made that case myself. But that sort of work is like insurance: generally regarded as not particularly sexy, non-revenue generating (and frequently interrupting revenue streams), and only of interest after the shit has hit the fan.
Ensuring that even just those packages you (know you should) care about are updated across your entire fleet remains, as of today, a difficult and laborious task that may take several weeks even when it is prioritized correctly.
One problem with prioritizing software updates is the perennial SysAdmin's Dilemma: if you're doing your job well, nobody notices. Patching systems and then not getting compromised earns no recognition; the business may only learn it happened at all if the update negatively impacted productivity, revenue, uptime, or what have you. That is one of the reasons why hardly anybody applies all software updates all the time. Instead, we (try to) spot-patch when we're under high pressure or imminent (known) threat.
But spot-patching packages only fixes symptoms of a fragile security infrastructure. Band-aids don't fix bullet holes. The only alternative is to design and build your systems from the ground up for resiliency, to allow for rapid, regular, automated, and unattended updates. Rest assured, though, that this requires a fundamental shift in how we approach software development and system and service maintenance.
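What "designed for unattended updates" might look like operationally can be sketched as a rolling update in waves, with a health check gating each wave. This is a deliberately simplified illustration; the hosts are placeholders, and `update` and `healthy` stand in for whatever your real deployment and monitoring hooks are:

```python
# Minimal sketch of a wave-based rolling update with health gating.
# `update` and `healthy` are stand-ins for real deployment and
# monitoring hooks; hosts are hypothetical.

def update(host):
    """Placeholder: apply pending updates on this host."""
    return True

def healthy(host):
    """Placeholder: service-level health check after the update."""
    return True

def rolling_update(hosts, wave_size=2):
    """Update hosts in waves; stop early if a wave comes up unhealthy."""
    updated = []
    for i in range(0, len(hosts), wave_size):
        wave = hosts[i:i + wave_size]
        for host in wave:
            update(host)
            updated.append(host)
        if not all(healthy(h) for h in wave):
            return updated, False  # stop here; roll back `updated`
    return updated, True

done, ok = rolling_update(["web-01", "web-02", "web-03", "web-04", "web-05"])
print(done, ok)
```

The point is that resiliency is a property of the system, not of the patch: if a bad update in one wave can be detected and contained automatically, updating regularly and unattended stops being a terrifying proposition.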
So no, I'm not excusing Equifax for their slow patching (doing so would be both irrelevant and wrong), but I suggest that the focus on the patch window is a silly whack-a-mole distraction. A distraction that happens to be in Equifax's interest: yes, they should have updated Struts. They should have known that they use Struts, and they should have tracked releases and security announcements, then taken action. But I do think there's a bit more to this:
I don't know the details of the full attack chain, but usually an RCE in the web framework is an entry point, not "mission accomplished". There is usually a need for lateral movement, for retaining and expanding access, for elevating privileges, and finally for exfiltrating the data without being detected. In other words, a number of other failures took place here that we are not talking about.
Given the quality of the breach response and whatever else has come to light since, I suspect that making public that the breach involved CVE-2017-5638 is a convenient way for Equifax to shift the blame: in the public mind, cyber is indistinguishable from magic, and so Equifax could not possibly have defended themselves against those wielding dark and evil spells. An "RCE in the Apache Struts Framework" sure sounds a lot more dangerous and difficult to defend against than "we used admin/admin to let third parties access our data".