Google Gemini: All about the new generative AI platform

Google is trying to make waves with Gemini, its new generative AI platform. Gemini looks promising in some respects but falls short in others. So what is Gemini? How can you use it? And how does it compare with the competition?

This guide will be updated as new Gemini models and features are released, to make it easier to keep up with Gemini news.

What’s Gemini?
Gemini is Google’s long-promised next-generation family of generative AI models, developed by DeepMind and Google Research. It comes in three flavors:

The Gemini family includes the flagship Gemini Ultra, the “lite” Gemini Pro, and the smaller Gemini Nano, which runs on mobile devices like the Pixel 8 Pro.

All Gemini models were trained to be “natively multimodal,” meaning they can work with more than just text. They were pre-trained and fine-tuned on audio, images, videos, codebases, and text in multiple languages.

This sets Gemini apart from LaMDA, Google’s large language model, which was trained primarily on text data. LaMDA can’t understand or generate anything beyond text (essays, email drafts, and the like), but Gemini models can. Their ability to understand images, audio, and other modalities is still limited, but it’s better than nothing.

What’s the difference between Bard and Gemini?
Google, once again showing its lack of a knack for branding, didn’t make it clear from the start that Gemini is distinct from Bard. Bard is an app or client: an interface through which Gemini and other GenAI models can be accessed. Gemini, by contrast, is a family of models, not an app or front end. There is no standalone Gemini experience, and there probably never will be. To compare with OpenAI’s products, Bard corresponds to ChatGPT, OpenAI’s popular conversational AI app, while Gemini corresponds to the language model that powers it, which in ChatGPT’s case is GPT-3.5 or GPT-4.

Gemini is also entirely independent of Imagen-2, a text-to-image model that may or may not fit into the company’s overall AI strategy. If all of this confuses you, you’re not alone.

What can Gemini do?
Because the Gemini models are multimodal, they can in principle transcribe speech, caption images and videos, and generate artwork. Only a few of these capabilities have reached products so far (more on that later), but Google promises all of them and more in the near future.

Of course, it’s hard to take the company at its word.

Google underdelivered with the original Bard launch. More recently, a video purporting to show Gemini’s capabilities turned out to be heavily doctored and largely aspirational. Still, Gemini is available from the tech giant today, albeit in limited form.

Assuming Google is telling the truth, here is what the different tiers of Gemini models will be able to do:

Gemini Ultra
Gemini Ultra, the “foundation” model on which the others are built, is available only to a “select set” of users across a handful of Google products and services. That will change later this year, when Google’s largest model is due to launch more broadly. Most of what is known about Ultra comes from Google-led product demos, so it’s best taken with a grain of salt.

Google says Gemini Ultra can help with physics homework, working through problems on a worksheet and pointing out possible mistakes in already filled-in answers. According to Google, Gemini Ultra can also be used to identify scientific papers relevant to a particular problem, extract information from them, and “update” a chart by generating the formulas needed to recreate it with more recent data.

As mentioned earlier, Gemini Ultra can generate images. Google says the capability won’t be in the productized model at launch, perhaps because the mechanism is more complex than the way ChatGPT generates images. Rather than passing prompts to a separate image generator (such as DALL-E 3, in ChatGPT’s case), Gemini outputs images “natively,” without an intermediary step.

Gemini Pro
Unlike Gemini Ultra, Gemini Pro is publicly available today. Confusingly, though, its capabilities depend on where it’s used.

Google claims that in Bard, where Gemini Pro first launched in a text-only form, the model improves on LaMDA in reasoning, planning, and understanding. An independent study by Carnegie Mellon and BerriAI researchers found that Gemini Pro does outperform OpenAI’s GPT-3.5 at handling longer and more complex reasoning chains.

The study also found that, like other large language models, Gemini Pro struggles with arithmetic involving multi-digit numbers, and users have uncovered many examples of faulty reasoning and mistakes. It made several factual errors on simple questions, such as who won recent Oscars. Google has promised improvements, but it’s unclear when they will arrive.

Gemini Pro is also available via API in Vertex AI, Google’s fully managed AI developer platform, where it accepts text as input and generates text as output. A separate endpoint, Gemini Pro Vision, processes text and imagery (photos and video) and outputs text, much like OpenAI’s GPT-4 with Vision model.

Within Vertex AI, developers can tailor Gemini Pro to particular use cases through a “grounding” process. Gemini Pro can also be connected to external, third-party APIs to perform specific actions.

Sometime in “early 2024,” Vertex customers will be able to use Gemini Pro to power custom-built chatbots. Gemini Pro will also drive search summarization, recommendation, and answer generation features in Vertex AI, drawing on documents such as PDFs and images and on sources such as OneDrive and Salesforce to answer queries.

AI Studio, Google’s web-based tool for app and platform developers, offers workflows for creating freeform, structured, and chat prompts with Gemini Pro. Developers have access to both the Gemini Pro and Gemini Pro Vision endpoints, and they can adjust the model temperature to control the output’s creative range, supply examples to give tone and style instructions, and tune the safety settings.
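
For readers unfamiliar with the temperature setting mentioned above, here is a minimal Python sketch of the general idea. It shows the standard temperature-scaled softmax used in text generation broadly. It is not Gemini’s actual implementation, and the function name and example scores are made up for illustration.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw candidate scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.5]

low = softmax_with_temperature(scores, 0.2)   # sharper, more predictable
high = softmax_with_temperature(scores, 2.0)  # flatter, more varied

print("low temperature: ", [round(p, 3) for p in low])
print("high temperature:", [round(p, 3) for p in high])
```

At a low temperature nearly all of the probability lands on the top-scoring candidate, so the output is more predictable. At a higher temperature the probabilities even out, which is where the wider “creative range” comes from.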

Gemini Nano
Gemini Nano is a much smaller version of the Gemini Pro and Ultra models, efficient enough to run directly on mobile phones instead of on a server. So far, it powers two features on the Pixel 8 Pro: Summarize in Recorder and Smart Reply in Gboard.

The Recorder app, which lets users record and transcribe audio at the push of a button, offers a Gemini-powered summary of recorded conversations, interviews, presentations, and other snippets. Users get these summaries even without a signal or Wi-Fi connection, and as a privacy measure, no data leaves the phone in the process.

Gemini Nano is also available as a developer preview in Gboard, Google’s keyboard app. There, it powers Smart Reply, a feature that suggests the next thing to say in a messaging app conversation. The feature works only with WhatsApp at first, but Google says it will expand to other apps in 2024.

Is Gemini better than OpenAI’s GPT-4?
There’s no way to know how the Gemini family really stacks up until Google launches Ultra later this year, but the company claims improvements over OpenAI’s GPT-4.

Google has repeatedly claimed that Gemini Ultra exceeds current state-of-the-art results on “30 of the 32 widely used academic benchmarks used in large language model research and development.” Gemini Pro, according to the company, is better at summarizing, brainstorming, and writing than GPT-3.5.

Setting aside the question of whether benchmarks really indicate a better model, the scores Google cites are only slightly better than those of OpenAI’s models. And, as mentioned earlier, users and academics have complained that Gemini Pro gets basic facts wrong, struggles with translations, and gives poor coding suggestions.

How much will Gemini cost?
Gemini Pro is free in Bard, AI Studio, and Vertex AI for now.

Once Gemini Pro exits preview in Vertex, input to the model will cost $0.0025 per character, while output will cost $0.00005 per character. Vertex customers pay per 1,000 characters (about 140 to 250 words) and, for models like Gemini Pro Vision, per image ($0.0025).

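Suppose a 500-word article contains 2,000 characters. Summarizing that article with Gemini Pro would cost $5 (2,000 input characters at $0.0025 each). Generating an article of similar length would cost $0.10 (2,000 output characters at $0.00005 each).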
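
The arithmetic behind those two figures can be sketched in a few lines of Python. The sketch assumes, as the example above does, that the $0.0025 rate applies to each character fed into the model and the $0.00005 rate to each character it produces. The rates are the ones quoted in this article, not values read from any Google API, and the function and constant names are made up for illustration.

```python
from decimal import Decimal

# Rates as quoted in this article (assumption: charged per character).
INPUT_RATE = Decimal("0.0025")    # dollars per input character
OUTPUT_RATE = Decimal("0.00005")  # dollars per output character


def estimate_cost(input_chars: int, output_chars: int) -> Decimal:
    """Return the estimated cost in dollars for a single request."""
    return INPUT_RATE * input_chars + OUTPUT_RATE * output_chars


# Feeding in a 2,000-character article (output cost left out, as in the text).
summarize_cost = estimate_cost(input_chars=2000, output_chars=0)
# Producing a 2,000-character article (input cost left out, as in the text).
generate_cost = estimate_cost(input_chars=0, output_chars=2000)

print(f"Summarizing: ${summarize_cost:.2f}")
print(f"Generating:  ${generate_cost:.2f}")
```

Decimal is used instead of floats so that the tiny per-character rates multiply out without rounding noise. A real summarization request would incur both charges, since it reads the article and writes a summary.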

Where can you try Gemini?
Gemini Pro
The easiest place to try Gemini Pro is Bard. A fine-tuned version of Pro answers text-based Bard queries in English in the U.S., with other languages and countries to come.

Gemini Pro is also available in preview via API in Vertex AI. The API is free to use “within limits” for now and supports 38 languages and regions, including Europe, as well as features like chat and filtering.

Gemini Pro can also be found in AI Studio. Using the service, developers can build prompts and Gemini-based chatbots, then get API keys to use them in their applications or export the code to a more fully featured IDE.

In the coming weeks, Google’s Duet AI for Developers will start using a Gemini model for code completion and generation. Google also plans to bring Gemini models to its Chrome and Firebase dev tools in early 2024.

Gemini Nano
Gemini Nano is on the Pixel 8 Pro and will come to other smartphones in the future. Android app developers can get an early preview of the model.

Eltrys Team
Author: Eltrys Team
