Gemini 2.0 Flash AI model now available on web and mobile apps
Google on Wednesday introduced Gemini 2.0, the successor to the Gemini 1.5 family of AI models. According to the company, the new models come with enhanced capabilities, including native support for image and audio generation. Currently, the larger Gemini 2.0 model is available in beta for select developers and testers, while the Gemini 2.0 Flash model has been added to the chatbot's web and mobile apps for all users. Google said the larger model will also be rolled out to its products soon.
Google Gemini 2.0 AI models
Nine months after the release of the Gemini 1.5 series of AI models, Google has now introduced an enhanced version of its large language model (LLM). In a blog post, the company announced that it was releasing the first model in the Gemini 2.0 family: an experimental version of Gemini 2.0 Flash. A Flash model generally contains fewer parameters and is less suited to complex tasks, but it compensates for this with lower latency and higher efficiency than larger models.
The Mountain View-based tech giant highlighted that Gemini 2.0 Flash now supports multimodal output, such as generating images alongside text and producing steerable text-to-speech (TTS) multilingual audio. In addition, the AI model is equipped with agentic capabilities: 2.0 Flash can natively call tools such as Google Search and code execution, as well as third-party functions that users define through the API.
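To illustrate what user-defined function calling looks like in practice, here is a minimal sketch using the google-generativeai Python SDK. The model name "gemini-2.0-flash-exp" and the toy get_weather helper are assumptions for illustration, not details confirmed in Google's announcement.

```python
# Hedged sketch: letting Gemini 2.0 Flash call a third-party function that the
# user defines, via the google-generativeai SDK's automatic function calling.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumes a key from Google AI Studio


def get_weather(city: str) -> dict:
    """Toy third-party function; a real app would call a weather service."""
    return {"city": city, "forecast": "sunny", "temp_c": 21}


# Pass the Python function as a tool; the SDK derives its declaration
# from the signature and docstring.
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash-exp",  # assumed experimental model name
    tools=[get_weather],
)

chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("What's the weather like in Paris today?")
print(response.text)  # the model may call get_weather and summarise its result
```

The same pattern applies to the built-in tools Google mentions, such as Google Search grounding and code execution, which are enabled through the tools configuration rather than a user-defined function.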
In terms of performance, Google shared Gemini 2.0 Flash’s benchmark scores based on internal testing. On the Massive Multitask Language Understanding (MMLU), Natural2Code, MATH, and Graduate-Level Google-Proof Q&A (GPQA) benchmarks, it even outperforms the Gemini 1.5 Pro model.
Gemini users can select the experimental model via the model selector at the top left of the web interface and at the top of the mobile apps. The model is also available to developers via the Gemini application programming interface (API) in Google AI Studio and Vertex AI, with support for multimodal input and text output. Image and text-to-speech output capabilities are currently limited to Google's early access partners.
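For developers, a basic call through the Gemini API with multimodal input and text output might look like the sketch below, again using the google-generativeai SDK; the model name, file path, and prompt are illustrative assumptions.

```python
# Hedged sketch: multimodal input (image + text) in, text out, via the Gemini API.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

model = genai.GenerativeModel("gemini-2.0-flash-exp")  # assumed experimental model name
image = PIL.Image.open("chart.png")  # any local image file

# Text and image go in together; the response comes back as text. Image and
# TTS *output*, per the article, remain limited to early-access partners.
response = model.generate_content(["Describe what this chart shows.", image])
print(response.text)
```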