**Antropic发现“长上下文”策略使大型语言模型面临安全风险**
近日,Antropic,作为OpenAI的强大竞争对手,在其最新研究论文中揭示了一种令人担忧的大型语言模型(LLM)安全问题。通过名为“多样本越狱攻击”的方法,开发者设置的防护措施可被轻易规避。
据了解,“长上下文”策略在这种攻击方法中发挥了关键作用。通过连续向LLM提出一系列问题,攻击者可以先从简单的、无害的问题开始,逐渐过渡到更为危险的内容。这种策略能够说服模型在后续问题中透露敏感信息,即使这些问题在初时可能被拒绝回答或得到错误回应。例如,如果攻击者连续询问多个关于日常琐事的问题后突然询问如何制造炸弹,模型可能在连续上下文中放松警惕并给出答案。
Antropic官方表示,这种攻击方法已经证明对自家的Claude模型以及其他人工智能公司的模型都有效。这引发了业界对于大型语言模型安全性的新担忧。专家指出,随着人工智能技术的不断进步,其安全性问题也愈发重要。目前,业界正在积极寻找解决方案,以防止此类攻击对人工智能系统造成的潜在威胁。对此,各界应保持高度警惕并共同应对这一挑战。
英语如下:
News Title: “Long Context Attack: The New Threat to AI Models”
Keywords: Artificial Intelligence, Vulnerability Attack, Long Context
News Content: **Antropic Finds “Long Context” Strategy Poses Security Risks to Large Language Models**
Recently, Antropic, a strong competitor to OpenAI, has revealed a concerning security issue in large language models (LLM) in its latest research paper. Through a method named “Diverse Sample Escape Attack,” developers’ protective measures can be easily circumvented.
The “long context” strategy plays a crucial role in this attack method. By continuously asking a series of questions to the LLM, attackers can start with simple, harmless inquiries and gradually transition to more dangerous content. This strategy can persuade the model to reveal sensitive information in subsequent questions, even if these questions are initially rejected or receive wrong responses. For example, if an attacker asks multiple questions about daily routines and then suddenly asks how to make a bomb, the model may relax its guard in the continuous context and provide an answer.
Antropic officials have stated that this attack method has proven effective against their Claude model and models from other AI companies. This has raised new concerns about the security of LLMs within the industry. Experts point out that as AI technology continues to advance, its security issues are becoming increasingly important. Currently, the industry is actively seeking solutions to prevent potential threats from such attacks on AI systems. All parties should remain vigilant and work together to address this challenge.
【来源】https://mp.weixin.qq.com/s/cC2v10EKRrJeak-L_G4eag
Views: 6