AI just rewrote the offensive economics of finding and weaponizing vulnerabilities.
Most peers I’m talking to, and most vendor write-ups I’m reading, already get that patching alone isn’t enough. Yet patching still tends to land near the top of most response lists, and from what I’ve seen in the past 30 years, it’s the part of the plan least likely to deliver.
I don’t think patching should anchor the answer. Of course, we all want secure software, written well, and patched fast when things break. That’s the goal. It’s also not the reality, and I don’t see that changing.
Most of the recent news coverage centers on Anthropic’s Mythos. From what’s being reported, Mythos is a real jump in capability: 271 vulnerabilities discovered in Firefox, 181 working exploits in a single session, and decades-old bugs surfaced in OpenBSD. That’s worth paying attention to.
But the Mythos-as-the-threat framing misses the bigger picture, especially since Mythos isn’t unique. OpenAI’s GPT-5.5-Cyber solved about 82% of the tasks on CyberGym, a cybersecurity capability benchmark, and sits in the ‘High’ tier of OpenAI’s Preparedness Framework, a level that signals a model can develop working zero-day exploits or meaningfully assist real intrusion operations. The UK AI Security Institute confirmed it can run a multi-step simulated corporate attack end-to-end. Open-source models like DeepSeek V4 Pro, paired with a decent harness, have shown similar discovery and exploitation behavior at a fraction of the cost.
AISLE ran a follow-up after Mythos. They isolated the same vulnerable function and tested it against eight different models, including a 3.6 billion parameter open model at $0.11 per million tokens. All eight detected it. AISLE isn’t a hypothetical operation either. They’ve shipped all 12 CVEs in the January OpenSSL coordinated release, five in curl, and more than 180 externally validated CVEs across more than 30 projects. They conclude that what gets results is the system around the model, more than the model itself.
Praetorian has been making the same point publicly. They’re hitting Mythos-level results internally with Opus and their own orchestration, and their Constantine system finds vulnerabilities, proves them, patches them, and submits pull requests autonomously. There’s also OpenAnt from Knostic, an open-source LLM-based vulnerability discovery tool that explicitly supports any backend model.
Anthropic and OpenAI can gate access to their frontier models. No one can gate the open-source side, and the open-source side is already in the same neighborhood.
The capability is broad, and it isn’t going away. That’s the starting point.
Now to the patching part
I’ve been in security for more than 30 years, and I have heard “patching needs to be faster” the entire time. I’ve been one of the people saying it. It just hasn’t happened.
And I don’t think that’s because security teams haven’t been pushing. Patching is slow for a variety of reasons. Patches break things. Production dependencies are messy, and the teams running them often don’t have full visibility into what depends on what. Engineering teams are working on revenue, not eagerly waiting to install a kernel update. Change management exists for reasons that predate AI. Sometimes patching at AI speed isn’t possible at all. The rest of the time, it’s complicated enough that it takes weeks of careful work to do safely. Most organizations I’ve worked with don’t have the appetite to push unverified vendor patches into production hours after release, and they’re not wrong to be cautious about that.
None of that goes away because a model can write a working exploit in 90 seconds. Patching will keep getting a little faster, the way it always has. It will not get to an under-an-hour response across the full backlog.
So if the defensive plan rests on closing the patch gap, the plan is built on something that has never been delivered in 30 years. I would not bet on it delivering in the next two either.
So what should be in the mix instead?
Patching still matters, but it gets one slot on the board, not the whole board. There are a few other slots that must be considered.
The first one is faster detection. When the time from disclosure to working exploit is measured in hours, what happens after a bug ships matters at least as much as what happens before. Detection needs to be real-time and high enough fidelity that an analyst, or another agent, can act on it without spending an hour validating that it’s real. It also depends on visibility. You must actually capture the data in the first place. Sampling and aggregation made sense when humans drove discovery, but they break the moment an AI chains six low-priority signals into one path. If the early signal didn’t make it into your data, the attack is invisible later, and there’s nothing to reconstruct from.
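To make the sampling point concrete, here is a minimal sketch of chaining low-priority signals. The `Signal` fields, the one-hour window, and the three-kinds threshold are all illustrative assumptions, not a description of any particular detection product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    timestamp: float  # seconds since some reference point
    entity: str       # host, account, or service the signal is about
    kind: str         # hypothetical label, e.g. "odd_login"
    severity: int     # 1 = low ... 5 = critical


def find_chains(signals, window_seconds=3600.0, min_distinct_kinds=3):
    """Return entities where several distinct signal kinds land in one time window."""
    by_entity = defaultdict(list)
    for s in signals:
        by_entity[s.entity].append(s)

    chains = {}
    for entity, items in by_entity.items():
        items.sort(key=lambda s: s.timestamp)
        start = 0
        for end in range(len(items)):
            # Slide the window start forward until it spans at most window_seconds.
            while items[end].timestamp - items[start].timestamp > window_seconds:
                start += 1
            window = items[start:end + 1]
            if len({s.kind for s in window}) >= min_distinct_kinds:
                chains[entity] = window
                break
    return chains


full = [
    Signal(0.0, "host-a", "odd_login", 1),
    Signal(600.0, "host-a", "new_process", 1),
    Signal(1200.0, "host-a", "dns_burst", 2),
    Signal(100.0, "host-b", "odd_login", 1),
    Signal(200.0, "host-b", "new_process", 1),
]
sampled = [s for s in full if s.kind != "odd_login"]  # the early signal was dropped

chains_full = find_chains(full)
chains_sampled = find_chains(sampled)
```

With the full data, `host-a` surfaces as a chain. Once the early signal is sampled away, nothing does, and there is nothing left to reconstruct it from.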
Right behind that is more context. The person triaging a finding should already know whether the vulnerable function is reachable, what data the affected service touches, and who can ship the fix. Most teams I’ve talked to piece that together by hand after the alert lands. That’s time you don’t have.
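As a rough sketch of what "already know" could mean in practice, the alert can be enriched from lookups that exist before it fires. The dictionaries, field names, and service names below are hypothetical stand-ins for whatever reachability analysis, data classification, and ownership records a team actually has.

```python
def enrich(alert, reachability, data_classes, owners):
    """Attach reachability, data sensitivity, and fix ownership to a raw alert."""
    service = alert["service"]
    return {
        **alert,
        # None means "unknown", which is different from "not reachable".
        "reachable": reachability.get((service, alert["function"])),
        "data_touched": data_classes.get(service, []),
        "fix_owner": owners.get(service, "unassigned"),
    }


# Hypothetical pre-built context, maintained before any alert lands.
reachability = {("checkout", "parse_coupon"): True}
data_classes = {"checkout": ["payment", "pii"]}
owners = {"checkout": "team-payments"}

alert = {"service": "checkout", "function": "parse_coupon", "finding": "buffer overflow"}
enriched = enrich(alert, reachability, data_classes, owners)

unknown = enrich(
    {"service": "reports", "function": "render", "finding": "path traversal"},
    reachability, data_classes, owners,
)
```

The point of the sketch is the ordering: the lookups are cheap only if the context was assembled ahead of time.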
Virtual patching is underrated, and I don’t see it discussed enough. A WAF rule, a signature block, a feature flag in front of vulnerable code, circuit breakers or virtual fuses that pop when something looks off and isolate the critical pieces from whatever just went wrong: none of these fix the bug. They buy days while a real patch goes through normal change management. And in a meaningful share of cases, that’s the difference between exposure and incident.
Blast radius reduction is the most boring item on the list, yet it probably has the highest leverage. Segment, isolate, and scope service account permissions to the actual job. Assume the bug will eventually get exploited and constrain what an attacker can reach when that time comes. It’s unglamorous architecture work, and most of the leverage in a post-AI threat model comes from here.
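Here is a minimal sketch of the feature-flag-plus-fuse idea in front of a vulnerable handler. Everything in it is an assumption for illustration: the `Fuse` class, the thresholds, the `looks_suspicious` check, and the response shapes are hypothetical, and a real deployment would more likely live in a WAF or gateway than in application code.

```python
import time


class Fuse:
    """Trips after too many suspicious events in a window; resets after a cooldown."""

    def __init__(self, max_events=3, window=60.0, cooldown=300.0, clock=time.monotonic):
        self.max_events = max_events
        self.window = window
        self.cooldown = cooldown
        self.clock = clock
        self._events = []
        self._tripped_at = None

    def record_suspicious(self):
        now = self.clock()
        # Keep only events that are still inside the counting window.
        self._events = [t for t in self._events if now - t <= self.window]
        self._events.append(now)
        if len(self._events) >= self.max_events:
            self._tripped_at = now

    def is_open(self):
        if self._tripped_at is None:
            return False
        if self.clock() - self._tripped_at >= self.cooldown:
            self._tripped_at = None
            self._events.clear()
            return False
        return True


def guarded(handler, fuse, looks_suspicious, flag_enabled=lambda: True):
    """Wrap a handler with a kill-switch flag and a fuse. The bug itself stays unfixed."""
    def wrapper(request):
        if not flag_enabled() or fuse.is_open():
            return {"status": 503, "body": "temporarily disabled"}
        if looks_suspicious(request):
            fuse.record_suspicious()
            return {"status": 400, "body": "rejected"}
        return handler(request)
    return wrapper


# Usage with a fake clock so the behavior is deterministic.
now = [0.0]
fuse = Fuse(max_events=2, window=60.0, cooldown=300.0, clock=lambda: now[0])
handle = guarded(
    handler=lambda req: {"status": 200, "body": req},
    fuse=fuse,
    looks_suspicious=lambda req: "../" in req,  # stand-in for a real signature
)
```

The handler still contains the bug. The wrapper only limits who can reach it, and for how long, while the real patch moves through change management.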
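One way to make blast radius measurable is to treat allowed connections as a graph and ask what is reachable from a compromised service. This is a simplified sketch: the service names and the `allowed` maps are hypothetical, and real environments would also need to account for credentials and permissions, not only network paths.

```python
from collections import deque


def blast_radius(allowed, start):
    """Return every service reachable from `start` over allowed connections (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in allowed.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


# Flat network: the web tier can talk to almost everything.
flat = {
    "web": {"api", "db", "billing"},
    "api": {"db"},
    "billing": {"ledger"},
}

# Segmented: the web tier is scoped to the one service it actually needs.
segmented = {
    "web": {"api"},
    "api": {"db"},
    "billing": {"ledger"},
}

radius_flat = blast_radius(flat, "web")
radius_segmented = blast_radius(segmented, "web")
```

The exploit against `web` is identical in both cases. What changes is how much sits behind it.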
Faster patching still has a place, but only for the small number of things that warrant it, such as internet-facing, high-reachability, and sensitive data paths. Pre-authorize the playbook so those don’t sit in a queue waiting for a Tuesday window. The rest of the backlog isn’t going to move significantly faster than it does today.
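A pre-authorized playbook needs a rule that can be evaluated without a meeting. The sketch below assumes one strict reading of the criteria above, where a finding must meet all three to skip the queue. The `Finding` fields, the lane names, and the all-three policy are assumptions; a team could reasonably choose a different threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    finding_id: str               # hypothetical internal identifier
    internet_facing: bool
    function_reachable: bool
    touches_sensitive_data: bool


def patch_lane(finding):
    """Route a finding to the pre-authorized fast track or the normal change window."""
    if (
        finding.internet_facing
        and finding.function_reachable
        and finding.touches_sensitive_data
    ):
        return "fast-track"
    return "standard-window"


backlog = [
    Finding("F-1", internet_facing=True, function_reachable=True, touches_sensitive_data=True),
    Finding("F-2", internet_facing=True, function_reachable=False, touches_sensitive_data=True),
    Finding("F-3", internet_facing=False, function_reachable=True, touches_sensitive_data=False),
]

lanes = {f.finding_id: patch_lane(f) for f in backlog}
```

Keeping the rule this narrow is deliberate: the fast lane only stays fast if very little qualifies for it.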
One thing that doesn’t get talked about as much is that faster patching is partly an architecture problem. If your systems are built with enough resiliency that you can patch one piece without taking down the rest, the speed conversation gets a lot easier. It’s not a complete solution on its own, but a resilient architecture is what makes faster patching possible in the first place.
What does that program look like in practice?
The version I’d build today doesn’t have a patch Tuesday. It has a continuous detection-and-context loop, a small number of pre-authorized fast-track paths for the things that genuinely warrant them, and a heavy lean on segmentation and visibility so the things that don’t get patched in time are still survivable.
A lot of that is direction, not destination. Most of us aren’t there yet. But I think it’s a more honest plan than “we’ll patch faster this time.”
Another point to highlight is on the defender side. Everyone is saying defenders should use AI too. I agree, but the version I care about is more specific. Give the model the same context an attacker has, which includes the reachable surface, the asset inventory, and the actual ownership of every path, and let it move at the same speed as the offensive side. That’s a context problem before it’s a model problem. If your security data is sitting in five tools that don’t talk to each other, the model is going to be slow for the same reasons your analysts are.
AI changed the offensive math. The defensive response shouldn’t be “do the thing we’ve been failing to do for 30 years, but do it harder.” It should be a wider plan where patching is one part of the answer, and the rest of the program closes the gaps that patching won’t.
If your roadmap is mostly faster patching and AI-assisted patch generation, I’d push back a little. Not because those are bad ideas, but because they’re a comfortable half of the answer, and the uncomfortable half is the one that has to land for the program to hold up.
About the Author:
Joshua Scott is the CISO at Hydrolix, a real-time data platform company. Hydrolix addresses the two scale barriers facing observability and security platforms: global scale and real-time performance. The platform delivers real-time analytics that get you insights in seconds across globally distributed data at internet scale, from servers and microservices to AI agents, while enabling years of retention through next-generation compression. Hydrolix is trusted by Fox, ABC, and Paramount for mission-critical live events. For more information, visit hydrolix.io.







