
Elon Musk’s xAI is working to make Grok multimodal



Elon Musk’s AI company xAI is making progress on adding multimodal inputs to its Grok chatbot, according to public developer documents. In practice, users will soon be able to upload photos to Grok and receive text responses.

xAI first teased the feature in a blog post last month, which said Grok-1.5V will offer “multi-modal models across multiple domains.” The latest update to the developer docs appears to show progress toward releasing a new model.

In the developer docs, an example Python script demonstrates how developers can use the xAI Software Development Kit library to generate a response from combined text and image input. The script reads an image file, sets up a text prompt, and uses the xAI SDK to generate a response.
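The article does not reproduce the script itself, so the following is only a rough sketch of the first two steps it describes: reading an image file and pairing it with a text prompt. The function name `build_image_prompt` and the payload layout are hypothetical and are not taken from the xAI SDK; the actual SDK call that generates the response is left out because its interface is not shown in the source.

```python
import base64
import mimetypes
from pathlib import Path


def build_image_prompt(image_path: str, prompt: str) -> dict:
    """Bundle an image file and a text prompt into one request payload.

    Hypothetical helper: the payload layout is an assumption for
    illustration and is not the xAI SDK's actual request format.
    """
    path = Path(image_path)
    # Guess the MIME type from the file extension, with a generic fallback.
    mime_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    # Base64-encode the raw bytes so the image can travel inside a text payload.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {
        "prompt": prompt,
        "image": {"mime_type": mime_type, "data": encoded},
    }
```

A developer would then hand a payload like this to whatever call the SDK provides for generating a response; that step depends on the real library and is deliberately omitted here.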

This is a major update for Grok, which xAI first released in November 2023 and which is available to users who pay for the X Premium Plus subscription. The last update was Grok 1.5 in March, which came with improved reasoning features.

The model is trained “on a variety of text data from publicly available internet sources through Q3 2023 and datasets reviewed and curated by… human reviewers,” according to a blog post from X. Grok-1 was not trained on X data (including public posts from X), the blog added. However, Grok has “real-time knowledge of the world,” including posts on X.

xAI, founded by Elon Musk in March 2023, is relatively new to the AI field, and Grok lags behind competitors like OpenAI’s ChatGPT. However, according to a blog post from xAI, its Grok 1.5 model is closing the gap with GPT-4 on several benchmarks covering competition problems that range from elementary to high school level. Benchmarks for large language models are often criticized because a model can score well if the benchmark questions were included in its training data. It’s like memorizing test answers instead of actually learning the material.

Multimodal chatbots appear to be the next frontier for AI, with several advances announced at Google I/O and OpenAI releasing GPT-4o. Grok’s lack of multimodal capabilities has put it behind – until now.


