ChatGPT Live Video feature spotted, may be released soon
ChatGPT may soon get the ability to answer questions after looking through your smartphone’s camera. According to a report, evidence of the Live Video feature, which is part of OpenAI’s Advanced Voice Mode, was discovered in the latest ChatGPT for Android beta app. The capability was first demonstrated in May during the AI company’s Spring Updates event. It allows the chatbot to access the smartphone’s camera and answer questions about the user’s surroundings in real time. Although the emotional voice capability was released a few months ago, the company has not yet announced a release date for the Live Video feature.
ChatGPT Live Video feature discovered in the latest beta version
An Android Authority report detailed evidence of the Live Video feature, which was found during an Android Package Kit (APK) teardown of the app. Several strings of code related to the capability were spotted in ChatGPT for Android beta version 1.2024.317.
Notably, the Live Video feature is part of ChatGPT’s Advanced Voice Mode and lets the AI chatbot process video data in real time to answer questions and communicate with the user. This allows ChatGPT to, for instance, look inside a user’s refrigerator, scan the ingredients, and suggest a recipe. It can also analyze the user’s expressions and try to gauge their mood. The feature was demonstrated alongside the emotional voice capability, which allows the AI to speak in a more natural and expressive way.
According to the report, multiple strings of code related to the feature were spotted. One of them reads, “Tap the camera icon to let ChatGPT see and talk about your surroundings,” which is the same description OpenAI gave for the feature during the demo.
Other strings reportedly include the phrases “Live camera” and “Beta,” suggesting that the feature works in real time and that, while still under development, it will likely be rolled out to beta users first.
Another string includes an advisory warning users not to rely on the Live Video feature for live navigation or for decisions that could affect their health or safety.
While the existence of these strings does not confirm that the feature is about to be released, it is the first compelling evidence, after an eight-month delay, that the company is actively working on it. Previously, OpenAI said the feature was being delayed to protect users.
Notably, Google DeepMind also demonstrated a similar AI vision feature at the Google I/O event in May. This feature is part of Project Astra and gives Gemini the ability to see the user’s surroundings using the device’s camera.
In the demo, Google’s AI tool was able to correctly identify objects, infer current weather conditions, and even remember objects it had seen earlier in the live video session. So far, the Mountain View-based tech giant has not provided a timeline for when this feature might be introduced either.