Friday, September 20, 2024
Meta Releases ‘Largest’ Llama 3.1 AI Model, Beating OpenAI’s GPT-4o

by Jeffrey Beilley

Meta released its latest and largest artificial intelligence (AI) model to the public on Tuesday. The company says the open-source model, dubbed Meta Llama 3.1 405B, outperforms large closed AI models such as GPT-4, GPT-4o and Claude 3.5 Sonnet on various benchmarks. The previously released Llama 3 8B and 70B AI models have also been upgraded. The newer versions are derived from the 405B model and now offer a context window of 128,000 tokens. Meta claims that both models are now the leading open-source large language models (LLMs) for their size.

Announcing the new AI model in a blog post, the technology conglomerate said, “Llama 3.1 405B is the first publicly available model that can compete with the best AI models in terms of advanced capabilities in general knowledge, controllability, mathematics, tool usage, and multilingual translation.”

Notably, 405B here refers to 405 billion parameters, the learned weights that encode what the LLM picked up during training. In general, the more parameters an AI model has, the more adept it tends to be at processing complex queries. The context window of the model is 128,000 tokens. It supports English, German, French, Italian, Portuguese, Hindi, Spanish and Thai.

The company claims that Llama 3.1 405B has been evaluated on more than 150 benchmark datasets across multiple domains. Based on the data shared in the post, Meta’s AI model scored 96.8 on the Grade School Math 8K (GSM8K) benchmark, compared with GPT-4’s 94.2, GPT-4o’s 96.1, and Claude 3.5 Sonnet’s 96.4. It also outperformed these models on the AI2 Reasoning Challenge (ARC) benchmark for scientific proficiency, Nexus for tool use, and the Multilingual Grade School Math (MGSM) benchmark.

Meta’s largest AI model was trained on over 15 trillion tokens using over 16,000 Nvidia H100 GPUs. One of the major introductions in Llama 3.1 405B is official support for tool calling, allowing developers to use Brave Search for web searches, Wolfram Alpha to perform complex mathematical calculations, and Code Interpreter to generate Python code.
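To make the idea of tool calling concrete, here is a minimal sketch of the pattern: the model emits a structured request naming a tool, and the application runs that tool and hands the result back. Everything in it is an assumption made for illustration. The JSON request shape, the tool names `web_search` and `calculator`, and the `dispatch` helper are hypothetical stand-ins, not Meta's actual tool-calling format or the real Brave Search or Wolfram Alpha APIs.

```python
import json
import operator

# Hypothetical stand-ins for real tools; neither function contacts a real service.
def web_search(query: str) -> str:
    return f"[stub search results for: {query}]"

_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def calculator(expression: str) -> str:
    # Illustration only: handles a single "a op b" expression separated by spaces.
    left, op, right = expression.split()
    return str(_OPS[op](float(left), float(right)))

# Registry mapping the tool name the model asks for to the function that runs it.
TOOLS = {"web_search": web_search, "calculator": calculator}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool request (assumed format) and run the matching tool."""
    request = json.loads(model_output)
    tool = TOOLS.get(request["name"])
    if tool is None:
        return f"unknown tool: {request['name']}"
    return tool(**request["arguments"])

if __name__ == "__main__":
    # Pretend the model answered a maths question by requesting the calculator.
    print(dispatch('{"name": "calculator", "arguments": {"expression": "6 * 7"}}'))
```

In a real deployment the tool result would be appended to the conversation so the model can use it in its final answer; that round trip is omitted here.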

Since Meta Llama 3.1 405B is available as open source, individuals can access it through the company’s website or its Hugging Face listing. However, since it is a large model, it requires about 750 GB of disk space to run. Inferencing also requires two nodes on Model Parallel 16 (MP16). Model Parallelism 16 is a specific implementation of model parallelism where a large neural network is divided across 16 devices or processors.
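A rough back-of-the-envelope calculation shows where figures like these can come from. The sketch below assumes the weights are stored at 16 bits (2 bytes) per parameter and are split evenly across devices; neither assumption is stated in the announcement, and the helper names are made up for illustration.

```python
PARAMS = 405_000_000_000   # 405 billion parameters (from the article)
BYTES_PER_PARAM = 2        # assumption: 16-bit weights
DEVICES = 16               # MP16: the model is divided across 16 devices
NODES = 2                  # two nodes (from the article)

def weight_footprint_gib(params: int, bytes_per_param: int) -> float:
    """Size of the raw weights in GiB (2**30 bytes)."""
    return params * bytes_per_param / 2**30

def split_evenly(total: int, parts: int) -> list[int]:
    """Divide `total` items into `parts` shards whose sizes differ by at most one."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]

total_gib = weight_footprint_gib(PARAMS, BYTES_PER_PARAM)
shards = split_evenly(PARAMS, DEVICES)      # parameters held by each device
per_device_gib = total_gib / DEVICES
gpus_per_node = DEVICES // NODES

if __name__ == "__main__":
    print(f"weights: about {total_gib:.0f} GiB in total")
    print(f"per device: about {per_device_gib:.0f} GiB, {gpus_per_node} devices per node")
```

Under these assumptions the weights alone come to roughly 754 GiB, in line with the quoted figure of about 750 GB, and each of the 16 devices holds a little over 47 GiB, with eight devices per node. Actual inference needs additional memory for activations and caches, which this estimate ignores.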

In addition to being publicly available, the model is also offered on major AI platforms from AWS, Nvidia, Databricks, Groq, Dell, Azure, Google Cloud, Snowflake and more. The company says that a total of 25 such platforms will offer Llama 3.1 405B. For safety and security, the company has released Llama Guard 3 and Prompt Guard, two new tools intended to protect the LLM from potential harm and misuse.
