OpenAI has introduced Aardvark, an autonomous agent based on GPT-5 that works as a digital security researcher: it reads code, finds vulnerabilities, validates that they can be exploited in a sandbox, and proposes ready-made patches. The tool has already discovered dozens of real bugs in open source projects, 10 of which received CVE identifiers. It is currently in private beta.

Aardvark takes a new approach to cybersecurity: rather than only analyzing code, it behaves like a real AppSec specialist. The agent connects to a repository, analyzes commits, builds a threat model, finds risks, confirms that they can be exploited in a sandbox, and sends a pull request with a fix generated via Codex.
Its key features:
analysis of the entire code base
tracking of commits and changes
exploitability assessment
launching an attack in a safe environment
auto-generation of patches + review by an agent
integration with GitHub
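The workflow described above (scan commits, confirm each finding in a sandbox, then propose a patch) can be illustrated with a minimal Python sketch. This is not Aardvark's API or implementation, which has not been published in this form; every name here (Finding, run_pipeline, the toy_* helpers) is hypothetical, and the three stages are passed in as plain functions so that the control flow is the only thing being shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Finding:
    """One suspected vulnerability tied to a commit (hypothetical structure)."""
    commit_id: str
    description: str
    validated: bool = False       # True once the exploit reproduces in the sandbox
    patch: Optional[str] = None   # proposed fix, only for validated findings


def run_pipeline(
    commits: List[str],
    scan: Callable[[str], List[str]],
    reproduce_in_sandbox: Callable[[str, str], bool],
    propose_patch: Callable[[str, str], str],
) -> List[Finding]:
    """Scan each commit, validate each finding, and patch the confirmed ones."""
    findings: List[Finding] = []
    for commit_id in commits:
        for description in scan(commit_id):
            finding = Finding(commit_id, description)
            finding.validated = reproduce_in_sandbox(commit_id, description)
            if finding.validated:
                finding.patch = propose_patch(commit_id, description)
            findings.append(finding)
    return findings


# Toy stand-ins for the real stages (assumptions for illustration only).
def toy_scan(commit_id: str) -> List[str]:
    return ["unsanitized SQL query"] if commit_id == "c2" else []


def toy_sandbox(commit_id: str, description: str) -> bool:
    return True  # pretend the exploit reproduced


def toy_patch(commit_id: str, description: str) -> str:
    return f"fix for {description} in {commit_id}"


findings = run_pipeline(["c1", "c2", "c3"], toy_scan, toy_sandbox, toy_patch)
for f in findings:
    print(f.commit_id, f.description, f.validated, f.patch)
```

Keeping validation as a separate step before patching mirrors the point made in the text: a finding is only worth a pull request once it has been shown to be exploitable.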
Aardvark performs well in testing: it detected 92% of the vulnerabilities in test repositories. Other giants are developing similar solutions, and Google is already testing CodeMender. But OpenAI is the first to launch such a tool as a standalone agent for security teams.
More than 40,000 CVEs were registered in 2024. About 1.2% of commits in projects contain vulnerabilities, which makes automatic, continuous code auditing critically important. OpenAI already uses Aardvark in its own repositories and plans to audit select open-source projects for free.
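To put the 1.2% figure in perspective, here is a small back-of-the-envelope calculation in Python. The commit count is a made-up example rather than data from any real project, and the function name is hypothetical; the only number taken from the text is the 1.2% rate.

```python
def expected_vulnerable_commits(total_commits: int, rate: float = 0.012) -> float:
    """Expected number of vulnerable commits if about 1.2% of commits contain one."""
    return total_commits * rate


commits_per_year = 5_000  # hypothetical project size, not a real measurement
print(round(expected_vulnerable_commits(commits_per_year)))
```

Under this assumption, a project of that size would introduce roughly 60 vulnerable commits a year, which is more than most teams could review by hand.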
Aardvark is a step into a new era of security, where AI becomes a permanent member of the Blue Team and catches vulnerabilities before attackers can exploit them. If the system becomes publicly available and proves stable, it could change how teams approach AppSec and make code auditing available to every team, not just large corporations.