Called Imagen, the program takes in text (for example, “a photo of a Persian cat wearing a cowboy hat and red shirt playing a guitar on a beach”) and outputs a corresponding image. The images it produces can be photorealistic or artistic renderings.
Imagen follows other text-to-image generators such as DALL-E, VQ-GAN+CLIP and Latent Diffusion Models. Google said that when people were asked to compare images created by Imagen with those from other text-to-image generators, they found its model outperformed competitors in accuracy and image fidelity. Google shared several examples of text prompts and the resulting images on its Imagen website, including gems such as “A cute corgi lives in a house made out of sushi,” but these may represent only the best results generated. Google declined to comment for this story.