
A surge in benchmarking of AI-powered web hacking agents has been detected. These benchmarks amount to a public roadmap for selecting, tuning and deploying large language models for the automated exploitation of real-world web applications.
Threat Intelligence
Model Abuse
Dec 12, 2025
This is the second flagship Fortaris detection. It points to industrial-scale AI misuse, not just individual tools.
A new class of AI-enabled cyberattack infrastructure is emerging in the open: benchmarking frameworks designed to evaluate and optimise large language models for hacking tasks.
In December 2025, Fortaris detected a surge in community-driven experiments comparing large language models across web application hacking challenges. These tests, shared publicly across cybersecurity and developer forums, are no longer an academic curiosity; they are rapidly becoming a decision-making layer for attackers.
These benchmarks measure how effectively different AI models can:
Discover vulnerabilities
Write exploitation code
Bypass security controls
Solve multi-step capture-the-flag (CTF) challenges
Do so cheaply and at scale
The results are published openly, allowing anyone — including malicious actors — to select the most capable and cost-efficient models for offensive operations.
What Was Detected
Fortaris identified a coordinated pattern of posts and external research pages benchmarking large language models against web exploitation challenges.
These benchmarks compare models such as:
GPT-5 and GPT-5.1
Claude Sonnet
Gemini
Grok
Other frontier and open-weight models
The tests simulate:
SQL injection
Authentication bypass
Input validation flaws
Logic errors
Multi-step attack chains
Each model is given multiple attempts and extended token budgets to complete the attacks. Results are scored, ranked and published — effectively turning AI-assisted hacking into a competitive optimisation problem.
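To make the optimisation framing concrete, the sketch below shows how a leaderboard of this kind typically turns raw attempt logs into a ranking. It is an illustration only: the field names and metrics (solve rate, total cost, cost per solve) are assumptions for the example, not the schema of any specific benchmark observed, and it contains no exploitation logic; per-attempt outcomes are treated as already-recorded pass/fail results.

# Illustrative sketch: aggregating per-attempt results into leaderboard figures.
# Field names, metrics and challenge labels are assumptions, not an observed schema.
from dataclasses import dataclass

@dataclass
class Attempt:
    challenge: str      # e.g. "sqli-01", "auth-bypass-03" (hypothetical labels)
    solved: bool        # whether this attempt completed the challenge
    cost_usd: float     # token spend for the attempt

def score_model(attempts: list[Attempt]) -> dict:
    """Summarise one model's runs into the numbers a leaderboard publishes."""
    challenges = {a.challenge for a in attempts}
    solved = {a.challenge for a in attempts if a.solved}
    total_cost = sum(a.cost_usd for a in attempts)
    return {
        "solve_rate": len(solved) / len(challenges) if challenges else 0.0,
        "total_cost_usd": round(total_cost, 2),
        "cost_per_solve_usd": round(total_cost / len(solved), 2) if solved else None,
    }

# Ranking then reduces to a sort, e.g. by solve rate, then by cost per solve:
# leaderboard = sorted(results.items(),
#                      key=lambda kv: (-kv[1]["solve_rate"],
#                                      kv[1]["cost_per_solve_usd"] or float("inf")))

Once results are reduced to a solve rate and a cost per solve, choosing a model for offensive use becomes a simple sort, which is precisely what makes these leaderboards a decision-making layer rather than research output.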
Why This Is Dangerous
These benchmarks do not directly attack victims — but they enable something more dangerous:
They create a playbook for building automated attack systems.
Because the published rankings show which models perform best at hacking, threat actors can:
Choose the most effective AI for their budget
Automate vulnerability discovery
Run thousands of exploit attempts
Continuously refine prompts and strategies
This shifts cybercrime from manual exploitation to AI-optimised attack pipelines.
Attackers no longer need elite human operators. They need:
A benchmark
A model
A workflow
The industrialisation of AI-driven hacking has begun.
Threat Scenarios
1. Model Selection for Cybercrime
Attackers use these benchmarks to choose the best AI model for phishing, exploitation and malware delivery, maximising impact per dollar spent.
2. Fully Automated Reconnaissance
AI agents scan the internet, probe targets, adapt payloads and attempt exploitation continuously — without human oversight.
3. Rapid Weaponisation of New Models
Every new AI model release becomes a new attack surface, instantly tested and ranked for offensive capability.
Why This Matters
This represents a fundamental shift in cyber risk.
Traditional security assumes:
Humans write exploits
Humans choose targets
Humans launch attacks
AI-driven attack agents remove all three assumptions.
The result is:
Higher attack volume
Faster exploit cycles
Lower cost
Weaker attribution
Security teams cannot defend against this with manual threat intelligence alone.
How Fortaris Detected It
Fortaris monitors:
AI model misuse
Open-source research
Developer communities
Cybersecurity forums
Benchmarking platforms
We flagged this activity because it represents early-stage weaponisation infrastructure — the tooling attackers will use before campaigns go live.
Final Thought
AI hacking agents are no longer a theory.
They are being measured, optimised and prepared for deployment — in public.
This is what the next generation of cyber threats looks like.
And this is why Fortaris exists.