What is Gemini in Google?

Google just released Gemini 1.0, its most capable and general AI model yet. Built natively to be multimodal, it is the first step in what Google calls its Gemini era of models. Gemini is a new and powerful artificial intelligence model from Google that can understand not just text but also images, videos, and audio. As a multimodal model, Gemini is described as capable of completing complex tasks in math, physics, and other areas, as well as understanding and generating high-quality code in various programming languages.

It is currently available through integrations with Google Bard and the Google Pixel 8 and will gradually be folded into other Google services.

“Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research,” according to Demis Hassabis, CEO and co-founder of Google DeepMind. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video.”

Who made Gemini?

Gemini was created by Google and Alphabet, Google’s parent company, and released as the company’s most advanced AI model to date. Google DeepMind also made significant contributions to the development of Gemini.

Are there different versions of Gemini?

Google describes Gemini as a flexible model that is capable of running on everything from Google’s data centers to mobile devices. To achieve this scalability, Gemini is being released in three sizes: Gemini Nano, Gemini Pro, and Gemini Ultra.

Gemini Nano: The Gemini Nano model size is designed to run on smartphones, specifically the Google Pixel 8.
It’s built to perform on-device tasks that require efficient AI processing without connecting to external servers, such as suggesting replies within chat applications or summarizing text.

Gemini Pro: Running on Google’s data centers, Gemini Pro is designed to power the latest version of the company’s AI chatbot, Bard. It’s capable of delivering fast response times and understanding complex queries.

Gemini Ultra: Though still unavailable for widespread use, Google describes Gemini Ultra as its most capable model, exceeding “current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.” It’s designed for highly complex tasks and is set to be released after finishing its current phase of testing.

How can you access Gemini?

Gemini is now available in Google products in its Nano and Pro sizes: Nano on the Pixel 8 phone and Pro in the Bard chatbot. Google plans to integrate Gemini over time into its Search, Ads, Chrome, and other services.

Developers and enterprise customers will be able to access Gemini Pro via the Gemini API in Google AI Studio and Google Cloud Vertex AI starting on December 13. Android developers will have access to Gemini Nano via AICore, which will be available on an early preview basis.

How does Gemini differ from other AI models, like GPT-4?

Google’s new Gemini model appears to be one of the largest, most advanced AI models to date, though that cannot be confirmed until the Ultra model is released. Compared to other popular models that power AI chatbots right now, Gemini stands out for being natively multimodal, whereas other models, like GPT-4, rely on plugins and integrations to be truly multimodal.
A comparison chart from Google shows how Gemini Ultra and Pro compare to OpenAI’s GPT-4 and Whisper, respectively. (Image: Google/ZDNET)

Compared to GPT-4, a primarily text-based model, Gemini performs multimodal tasks natively. While GPT-4 excels natively in language-related tasks like content creation and complex text analysis, it resorts to OpenAI’s plugins to perform image analysis and access the web, and it relies on DALL-E 3 and Whisper to generate images and process audio.

Google’s Gemini also appears to be more product-focused than other models available now. It is either already integrated into the company’s ecosystem, as with Bard and Pixel 8 devices, or planned to be. Other models, like GPT-4 and Meta’s Llama, are more service-oriented and are available to third-party developers for building applications, tools, and services.