Google Gemini will soon let you edit those AI-generated images to fix the three-eyed dogs and impossible buildings
Artificial intelligence can produce impressive images, but it’s not uncommon for these images to have strange problems, like people with too many teeth or cityscapes with Escher-style streets. Google Gemini is working on upgrading its AI image rendering feature to fix these types of issues, as first spotted in unfinished code by Android Authority. It appears that a fine-tuning feature is on the way, allowing users to make detailed edits to their AI-generated images.
Google Gemini’s text-to-image tools currently can’t perform edits after the image has been created. Instead, users must submit a new prompt and hope that it fixes any issues and produces something that matches what they want to see. That can be especially annoying when the only problem is a minor, yet distracting, error. According to the revealed code, Gemini’s fine-tuning feature will address the need for limited changes with two editing methods.
The first option allows users to submit a prompt about an AI-generated image and ask for a change to one aspect. For example, if you liked the image above but wanted to set it in a city, you could keep the robot and the bird but change the background by asking Gemini to move them there. The second method described in the code takes a more interactive approach. Users can circle the part of the image they want to change with their finger or a stylus. Once the area is selected, they can describe the changes they want to make, and Gemini will understand that the instructions apply only to the circled area.
AI editing success
These editing tools could be particularly useful for those in industries like graphic design, marketing, and social media, where visual accuracy and fast turnaround times are crucial. Google Gemini could better serve the needs of artists, designers, and everyday users who want to create polished visual content more efficiently. While the exact release date for these features is uncertain, their appearance in the code suggests they’re not far off. They also fit well with related features like the upcoming Ask Photos image search feature.
Google won’t be the first to deploy editing tools for AI image makers. These methods are largely the same as those available with OpenAI’s DALL-E portfolio of AI image models. In ChatGPT, users can request adjustments to an already-produced image, or they can highlight parts of it and submit a new text prompt to adjust that part of the image. Similar features exist in many other AI image makers, such as Ideogram.ai and Adobe Firefly. Still, Google’s plan to integrate these fine-tuning tools is a technical leap forward for Gemini. It marks Google’s continued push to match and surpass its rivals at OpenAI, Meta, and elsewhere when it comes to generative AI tools.