Friday, September 20, 2024

Microsoft Unveils Phi-3.5 AI Models, Reportedly Beating Gemini and GPT-4o Mini

by Jeffrey Beilley

Microsoft on Tuesday released the Phi-3.5 family of artificial intelligence (AI) models, following up on the Phi-3 models introduced in April. The new release includes the Phi-3.5 Mixture of Experts (MoE), Phi-3.5 Vision, and Phi-3.5 Mini models. These are instruction-tuned models, so rather than behaving like open-ended conversational AI, they expect users to supply specific instructions to get the desired output. The open-source AI models are available for download from the tech giant's Hugging Face listings.
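As an illustration of what "instruction-tuned" means in practice, such models are prompted with an explicit request wrapped in the chat format they were trained on, rather than free-form conversation. The sketch below assembles a single-turn prompt by hand; the `<|user|>`/`<|assistant|>` tags follow the published Phi-3-style chat template, but treat the exact format as an assumption to verify against the model card.

```python
def build_phi_prompt(instruction: str, system: str = "") -> str:
    """Assemble a single-turn prompt in a Phi-3-style chat format.

    Instruction-tuned models expect the request as an explicit
    instruction inside role tags, not as open-ended chat.
    """
    parts = []
    if system:
        parts.append(f"<|system|>\n{system}<|end|>")
    parts.append(f"<|user|>\n{instruction}<|end|>")
    parts.append("<|assistant|>\n")  # the model's answer continues from here
    return "\n".join(parts)

prompt = build_phi_prompt("Summarize the following text in two sentences.")
```

The model then generates text as a continuation after the `<|assistant|>` tag, which is why the instruction itself must carry the task description.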

Microsoft Releases Phi-3.5 AI Models

The release of the new AI models was announced by Microsoft executive Weizhu Chen in a post on X (formerly known as Twitter). The Phi-3.5 models offer improved capabilities over their predecessors, but the architecture, dataset, and training methods remain largely the same. The Mini model has been updated with multi-language support, while the MoE and Vision models are new additions to the AI model family.

To get technical, the Phi-3.5 Mini has 3.8 billion parameters. It uses the same tokenizer (a tool that breaks text into smaller units) as its predecessor and a dense decoder-only transformer architecture. The model accepts only text as input and supports a context window of 128,000 tokens. The company says it was trained on 3.4 trillion tokens between June and August, and that its knowledge cutoff is October 2023.
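To put the 128,000-token window in perspective, a quick back-of-the-envelope check shows how much of the window a long document consumes. The one-token-per-four-characters ratio below is a rough rule of thumb for English text, not a measured property of Phi's tokenizer:

```python
CONTEXT_WINDOW = 128_000  # tokens supported by Phi-3.5 Mini
CHARS_PER_TOKEN = 4       # rough heuristic for English text

def fits_in_window(doc_chars: int, reserved_for_output: int = 1_000) -> bool:
    """Estimate whether a document, plus a token budget reserved for the
    model's response, fits within the context window."""
    est_tokens = doc_chars // CHARS_PER_TOKEN
    return est_tokens + reserved_for_output <= CONTEXT_WINDOW

# A ~300-page book (~500,000 characters) comfortably fits:
print(fits_in_window(500_000))
```

By this estimate the window accommodates roughly half a million characters of input, which is why long-document summarization is a natural use case for these models.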

A major highlight of this model is that it now supports several new languages, including Arabic, Chinese, Czech, Danish, Dutch, English, Finnish, French, German, Hebrew, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, Thai, Turkish, and Ukrainian.

The Phi-3.5 Vision AI model has 4.2 billion parameters and includes an image encoder that allows it to process information in an image. With the same context length as the Mini model, it accepts both text and images as input. It was trained on 500 billion tokens of data between July and August and has a text knowledge cutoff of March.

Finally, the Phi-3.5 MoE AI model has 16×3.8 billion parameters, but only 6.6 billion of them are active when two experts are used for a given input. Notably, MoE is a technique in which a model contains multiple specialized sub-networks (experts) and a router activates only a few of them per input, improving efficiency without sacrificing total capacity. This model was trained on 4.9 trillion tokens of data between April and August and has a knowledge cutoff date of October 2023.
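The routing idea can be sketched in a few lines of plain Python: a router scores every expert, only the two highest-scoring experts actually run, and their outputs are blended by the renormalized router weights. This is a simplified illustration of generic top-2 gating, not Phi-3.5's actual implementation:

```python
import math

def top2_moe(x: float, router_logits: list, experts: list) -> float:
    """Top-2 mixture-of-experts gating: run only the two highest-scoring
    experts and mix their outputs by renormalized softmax weights."""
    # Softmax over the router's scores.
    exps = [math.exp(l) for l in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Select the two most probable experts.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    # Only these two experts compute anything — the "active" parameters.
    return sum(probs[i] / norm * experts[i](x) for i in top2)

# Four toy experts; only two run per call.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 1, lambda x: x * 3]
out = top2_moe(2.0, router_logits=[0.1, 2.0, 0.1, 1.0], experts=experts)
```

This is how a model can hold 16×3.8 billion parameters in total while spending only a fraction of that compute on each token.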

In terms of performance, Microsoft shared benchmark scores for each of the individual models. Based on the shared data, the Phi-3.5 MoE outperforms both Gemini 1.5 Flash and GPT-4o mini on the SQuALITY benchmark, which tests readability and accuracy when summarizing a long block of text, exercising the AI model's long context window.

However, it should be noted that this is not an apples-to-apples comparison, as MoE models use a different architecture and require more storage and more advanced hardware to run. Separately, the Phi-3.5 Mini and Vision models have also outperformed relevant competing AI models in the same segment on some metrics.

Those interested in trying out the Phi-3.5 AI models can access them via Microsoft's Hugging Face listings. Microsoft says the models use flash attention, which means users will need higher-end GPUs to run them. The company has tested them on Nvidia A100, A6000, and H100 GPUs.
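For reference, a minimal loading sketch is shown below, assuming the Hugging Face `transformers` library and the `microsoft/Phi-3.5-mini-instruct` repository name from the company's listings; it is a configuration fragment, not runnable without a CUDA GPU and the separate `flash-attn` package installed.

```python
# Sketch: loading a Phi-3.5 model with flash attention enabled.
# Assumes `transformers` and `flash-attn` are installed and a CUDA GPU
# is available; the repository name is taken from the Hugging Face listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",                       # keep the checkpoint's dtype
    device_map="auto",                        # place weights on the GPU
    attn_implementation="flash_attention_2",  # why high-end GPUs are needed
)
```

Dropping the `attn_implementation` argument falls back to the default attention kernels, which run on more modest hardware at the cost of speed and memory on long contexts.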
