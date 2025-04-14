Moscow, April 14 (IANS) Sber's GigaChat 2.0 is now available to all users, the company announced on Monday, adding that a new approach to training has significantly improved all of the model's skills.

The artificial intelligence (AI) has learned to recognise audio files, analyse user requests more deeply, process larger volumes of text, and recognise images, according to the company.

All GigaChat features are available in one product and on any interface, so the user does not have to switch between different services.

The model range includes two versions: GigaChat 2 Pro and GigaChat 2 Max. Max is the most advanced model for solving complex and professional tasks, while Pro is suitable for quick and high-quality solutions to everyday tasks, from getting answers to various questions to creating and editing texts.

According to the company, GigaChat 2.0 can now work with current data from the Internet. It analyses queries more deeply and provides concise answers with links to sources. The artificial intelligence finds information for the user, selects the most relevant material, and supports its conclusions with links the user can follow for additional detail.

For example, you can ask the model: "Where to go in St. Petersburg this weekend with children aged 7 and 12?"; "How much will it cost to repair a standard one-room apartment in Moscow?".

Users can now work with multiple files in one conversation, and a document of up to 200 A4 pages can be uploaded to the chat. Sample prompt, sent with the contract itself attached: "What should I pay attention to in the lease agreement? Focus on the laws of the Russian Federation".

GigaChat 2.0 processes audio files on a fundamentally new level. The model perceives audio data directly, without intermediate conversion to text. This allows it to more accurately highlight key points and answer questions about the content.

"Just attach a recording and formulate a query. It supports files up to 60 minutes long and 30 MB. And if typing is inconvenient or impossible, you can record a voice message. GigaChat 2.0 can communicate in different languages, understand complex terms better, and recognise spoken language and accents, as well as background noise and music," said the company.

Sample prompts: "Listen to the audio recording and tell me what in my words my colleague might not have liked"; "Write a list of medications and recommendations from my doctor's voice message"; "Listen to the recording of the video call and write down everything that was said about outdoor advertising"; "Help me structure my speech for a project presentation. [text of speech]".

Now all you have to do is upload links to the content you want, and GigaChat will extract important information. The model creates short summaries of website content, compares articles on the same topic, works with multiple links simultaneously, and recognises images from websites.

Sample prompt: "Help me prepare for an interview for this job."

GigaChat 2.0 can also process videos from links. By understanding the audio track, the model can explain the main point of a video essay or answer questions about a lecture (also works with English and other languages). Sample prompt: "What is this video about? link".

GigaChat's ability to generate music and songs from text prompts has reached a new level. The maximum song length is now 3 minutes, while the generation time is unchanged at about 1 minute. The team has improved how closely the result matches the prompt, the sound quality, and the generation of songs in Chinese.

Sample prompt: Click "Generate a song", enter the lyrics or theme to be generated, select a genre or describe your own, for example: "A song in the style of modern youth pop music. Use pulsating bass, bright synths, and a tight beat".

The model can analyse and extract more useful information from an image and give more accurate answers about its content. For example, it can advise on what style of clothing to choose for a particular case, help solve an equation from a textbook or interpret medical test results.

Sample prompt: "I received a bill for housing and utilities. Can you explain what I am paying for?"

For the first time in Russia, smart speakers have been fully integrated with a large language model, bringing their intellectual capabilities to a whole new level.

GigaChat conducts a live dialogue with the user in language they understand or in a chosen role, maintaining the thread of the conversation up to 10 times longer.

For example, it can explain the theory of relativity in simple terms to a child or give the weather forecast on behalf of a movie awards presenter.

The artificial intelligence now not only manages the dialogue but also applies skills such as music or reminders. You can also give several commands in a single query, and the speaker will switch between them on its own.

Interaction with the assistant is also now tailored to the user's preferences, with 18 combinations of settings available, including the communication style, the assistant's voice, and whether to address the user formally or informally.

Prompt samples: "Hi, I drew a giraffe, but it looks boring. What can I add?", "Salute, explain the theory of relativity to a seven-year-old", "Salute, set your alarm for 6 a.m. every day and play some workout music."

One of the first platforms where GigaChat 2.0 appeared was the Russian digital platform MAX by VK. It is an application with a built-in messenger, mini-apps, a chatbot builder, an online registration system, and a payment service.

“Using Sber's neural network model, MAX users can create texts and images, transcribe audio, and get short summaries of videos and articles as well as answers to many questions. To evaluate the capabilities of GigaChat, you need to search for @gigachat and then follow the instructions,” said the company.

