研究人员披露了一种名为“评论与控制”(Comment and Control)的提示注入攻击方法详情,该方法被发现对多种流行的AI代码安全和自动化工具有效。
该攻击方法由安全工程师及漏洞研究员Aonan Guan发现。在周三发布的博客文章中,Guan确认该攻击对多款广泛使用的AI智能体有效:Anthropic的Claude Code Security Review、Google的Gemini CLI Action以及GitHub Copilot Agent。
攻击机制分析
研究人员发现,GitHub Actions上关联的这些工具可以通过特制的GitHub评论(包括PR标题、评论和议题主体)被劫持。在Claude Code Security Review的案例中,攻击者利用精心设计的PR标题,诱骗AI智能体执行任意命令、提取凭据,并将其作为“安全发现”或日志条目公开。
对于作为自动化编码任务工具的Gemini CLI Action,研究人员利用带有注入提示的议题评论,绕过了安全防护并获取了完整的API密钥。而在针对GitHub Copilot Agent的攻击中,专家利用隐藏载荷的HTML评论绕过了环境过滤,扫描秘密信息并穿透了网络防火墙。
这种攻击构成了严重威胁,因为攻击者的恶意提示会由GitHub Actions工作流自动触发,受害者无需任何操作。这种模式可能适用于任何摄取非受信任GitHub数据并拥有相应访问权限的AI智能体。
A researcher has disclosed the details of a prompt injection attack method named ‘Comment and Control’, which has been found to work against several popular AI code security and automation tools.
The attack method was discovered by security engineer and vulnerability researcher Aonan Guan, with assistance from Johns Hopkins University researchers Zhengyu Liu and Gavin Zhong.
In a blog post published on Wednesday, Guan said the attack has been confirmed to work against several widely used AI agents: Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub Copilot Agent.
The researchers found that AI agents associated with these tools on GitHub Actions can be hijacked using specially crafted GitHub comments, including PR titles, comments, and issue bodies.
In the case of Claude Code Security Review, designed for automated security reviews, the researchers showed how an attacker could use a specially crafted PR title to trick the AI agent into executing arbitrary commands, extracting credentials, and revealing them as a security finding or an entry in the GitHub Actions log.
For Gemini CLI Action, which acts as an autonomous agent for routine coding tasks, the researchers used an issue comment with a prompt-injection title, along with specially crafted issue comments, to bypass guardrails and obtain a full API key.Advertisement. Scroll to continue reading.
In the Comment and Control attack aimed at GitHub Copilot Agent, the experts leveraged an HTML comment, which hides the payload, to bypass environment filtering, scan for secrets, and bypass the network firewall.
The Comment and Control attack can pose a serious threat, as the attacker’s malicious prompt is automatically triggered by GitHub Actions workflows, without any action from the victim — except in the case of Copilot, where the attacker’s issue must be manually assigned to Copilot by the victim.
“The pattern likely applies to any AI agent that ingests untrusted GitHub data and has access to execution tools in the same runtime as production secrets — and beyond GitHub Actions, to any agent that processes untrusted input with access to tools and secrets: Slack bots, Jira agents, email agents, deployment automation. The injection surface changes, but the pattern is the same,” Guan explained.
The findings have been reported to Anthropic, Google, and GitHub, and all have confirmed them. Anthropic classified the issue as ‘critical’ and implemented some mitigations, awarding a $100 bug bounty to the researchers. Google paid out a $1,337 bug bounty.
GitHub awarded the researchers $500, saying that their work “sparked some great internal discussions”, but classified the security issue as a known architectural limitation.
“This is the first public cross-vendor demonstration of a single prompt injection pattern across three major AI agents. All three vulnerabilities follow the same pattern: untrusted GitHub data → AI agent processes it → agent executes commands → credentials ex