GPT-4 able to hack websites without human help

Occurred: February 2024


Large language models (LLMs), including OpenAI's GPT-4, are capable of compromising vulnerable websites without human guidance, according to researchers.

University of Illinois Urbana-Champaign (UIUC) researchers showed that LLM-powered agents (LLMs provisioned with tools for accessing APIs, automated web browsing, and feedback-based planning) can conduct SQL injection and other malicious attacks on websites without human oversight. The tests were conducted in a secure sandbox.

GPT-4 was the most effective model at these tasks, with a success rate of 73.3 percent, followed by OpenAI's GPT-3.5. The researchers were unsure why GPT-4 was especially capable of conducting these attacks, though one explanation they offered was that it was better able to adapt its actions to the responses it received from the target website.


Operator: Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, Daniel Kang
Developer: OpenAI
Country: Global
Sector: Multiple
Purpose: Generate text
Technology: Chatbot; NLP/text analysis; Neural network; Deep learning; Machine learning; Reinforcement learning
Issue: Security