Google's chatbot, Gemini (formerly Bard), is offering a free AI image generator upgrade.
To enhance its competitiveness against ChatGPT, Gemini is introducing new features based on its proprietary Imagen 2 model. The features include free image generation capabilities and an enhanced version called Gemini Pro, aiming to expand Gemini AI's user base.
VentureBeat reported Google's claim that the Imagen 2 model can generate highly customized, high-quality, and photorealistic images based on textual descriptions. This puts Gemini in direct competition with OpenAI's subscription-based ChatGPT Plus, which pairs with the DALL-E 3 image generation model.
Users can report any generated media for copyright or data protection issues. It is hoped that this will assuage infringement concerns.
AI-generated pitfalls
Advancements in AI image generation have received a mixed reception. Many champion its cause, while others caution against potential challenges and hidden dangers, which have already begun to reveal themselves.
Jason M. Allen's prize-winning AI-generated artwork drew scorn from artists. Fellow artist and X user @GenelJumalon said, "Someone entered an art competition with an AI-generated piece and won the first prize. Yeah that's pretty f*****g sh***y," per The New York Times.
The UK news publication The Telegraph reported that thousands of child abuse images were discovered in a database for artificial intelligence systems. The discovery prompted fears that AI tools have been trained on the illegal images.
Google's solution
Google restricts the generation of offensive, explicit, or violent images. It uses SynthID, developed by DeepMind, to embed digital watermarks in the pixels of output images and adds IPTC metadata, allowing viewers to distinguish images generated by Google AI from those created by humans.
Gemini comes in three versions: Nano, which supports mobile devices; Pro, which Google says is the best model for scaling across a wide range of tasks; and Ultra, designed to surpass GPT-4 in functionality and outperform other LLMs.
Gemini Pro supports medium-sized use cases but lags behind OpenAI's older GPT-3.5 Turbo in third-party evaluations. A fine-tuned version of Pro that supports only English was released to overcome this competitive disadvantage in generative AI.
The enhanced Gemini Pro supports over 40 languages, including Italian, Korean, Russian, Spanish, and Tamil. It is available in more than 230 countries and regions to serve more consumers.
It gives Gemini the ability to check its responses against web search results. It also boasts advanced coding, reasoning, summarization, and understanding abilities.
Google has begun testing ImageFX, an independent image generator based on the Imagen 2 model, which is already available on Google's AI Test Kitchen app. It offers users relevant suggestions through an "expressive chips" interface to refine textual descriptions and stimulate creativity. Competitors like Ideogram offer similar functionality.