Meta reveals Llama 3’s biggest AI model, claiming gains in language and math

The model is set to be free, challenging the subscription-based ChatGPT-4

New York:

Meta Platforms launched the biggest version of its mostly free Llama 3 artificial intelligence models on Tuesday, boasting multilingual abilities and general performance metrics that approach those of paid models from rivals like OpenAI.

The new Llama 3 model can converse in eight languages, write higher-quality computer code and solve more complex math problems than previous versions, Facebook’s parent company said in blog posts and a research paper announcing the launch.

With 405 billion parameters, or variables that the algorithm takes into account to generate answers to user queries, it surpasses the previous version released last year, although it is still smaller than the main models offered by competitors.

OpenAI’s GPT-4 model, by contrast, reportedly has one trillion parameters, and Amazon is preparing a model with 2 trillion parameters.

Promoting Llama 3 across multiple channels, Chief Executive Mark Zuckerberg said he expects future Llama models to overtake proprietary competitors next year. The Meta AI chatbot built using these models was on track to become the most popular AI assistant by the end of this year, with hundreds of millions of people already using it, he said.

The launch comes as tech companies race to show that their growing portfolios of resource-hungry large language models can deliver significant enough gains in known problem areas, like advanced reasoning, to justify the gargantuan sums that have been invested in them.

Meta’s lead AI scientist said he believes such models will face limits in reasoning and that other types of AI systems will be needed to produce breakthroughs.

In addition to its flagship 405-billion-parameter model, Meta is also releasing updated versions of its lighter 8-billion- and 70-billion-parameter Llama 3 models, initially introduced in the spring, the company said.

All three new models are multilingual and can handle larger user requests through an expanded “context window,” which Meta’s head of generative AI, Ahmad Al-Dahle, said would improve the experience of generating computer code in particular.

“That was the main feedback we got from the community,” Al-Dahle told Reuters in an interview, noting that larger context windows give models something akin to longer memory that helps with processing multi-step requests.

Separately, Al-Dahle said his team was able to improve the Llama 3 model’s performance on tasks such as solving mathematical problems by using AI to generate some of the data it was trained on.

Meta releases its Llama models largely free for developers to use, a strategy that Zuckerberg says will pay off in the form of innovative products, less dependence on would-be competitors, and greater engagement on the company’s key social networks. However, some investors have raised eyebrows at the costs involved.

The company will also benefit if developers choose to use its free models instead of paid ones, which would undermine its rivals’ business models. With its announcement, Meta touted gains on key math and knowledge tests that could make that prospect more appealing.

While measuring progress in AI development is notoriously difficult, test results provided by Meta seemed to suggest that its largest Llama 3 model was nearly matching, and in some cases surpassing, Anthropic’s Claude 3.5 Sonnet and OpenAI’s GPT-4o, which are widely considered to be the two most powerful frontier models on the market.

On the MATH benchmark of competition-level math word problems, for example, Meta’s model recorded a score of 73.8, compared to GPT-4o’s 76.6 and Claude 3.5 Sonnet’s 71.1.

The model scored 88.6 on MMLU, a benchmark that covers dozens of disciplines in mathematics, science and humanities, while GPT-4o scored 88.7 and Claude 3.5 Sonnet scored 88.3.

In their paper, Meta researchers also teased “multimodal” versions of the models due later this year, which layer image, video, and speech capabilities on top of Llama 3’s main text model.

Early experiments indicate that these models can perform “competitively” with other multimodal models, such as Google’s Gemini 1.5 and Anthropic’s Claude 3.5 Sonnet, they said.

(Except the headline, this story has not been edited by NDTV staff and is published from a syndicated feed.)



This story originally appeared on Ndtv.com.
