AI art generators have emerged as a fascinating and increasingly popular trend in the world of digital art. These innovative systems utilize artificial intelligence algorithms to create original artworks, blurring the lines between human creativity and machine-generated art.
The rise of AI art generators can be attributed to advancements in machine learning and deep neural networks. These algorithms are trained on vast datasets of existing artworks, enabling them to learn patterns, styles, and techniques employed by human artists.
One of the notable aspects of AI art generators is their ability to produce diverse styles and genres. They can mimic the brushstrokes of famous painters, create abstract compositions, generate realistic portraits, or even craft entirely new and surreal art forms.
Now, Meta has introduced its latest AI art generator called CM3Leon, which the company claims achieves state-of-the-art performance for text-to-image generation, according to a report from TechCrunch.
CM3Leon's Capabilities
According to Meta, CM3Leon's capabilities enable image generation tools to produce more coherent and contextually aligned imagery based on input prompts. The company sees this as a step towards achieving higher-fidelity image generation and understanding.
Unlike most modern image generators, which rely on diffusion — such as OpenAI's DALL-E 2, Google's Imagen, and Stability AI's Stable Diffusion — CM3Leon is a transformer model.
By employing the "attention" mechanism, CM3Leon weighs the relevance of each part of its input, whether text or images. This defining feature of transformers not only speeds up model training but also makes the computation easy to parallelize, improving overall efficiency.
Meta asserts that CM3Leon is more efficient than most transformers, demanding fewer computational resources and a smaller training set. This advantage positions CM3Leon as an appealing option for applications that require real-time responsiveness.
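The attention step described above can be illustrated with a minimal sketch. This is not Meta's implementation — just the standard scaled dot-product attention that transformers are built on, written in NumPy: each query scores every key for relevance, the scores are normalized with a softmax, and the result is a weighted sum of the values.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention (illustrative sketch).

    Q, K, V: arrays of shape (num_tokens, dim).
    Returns the attended output and the attention weights.
    """
    d_k = Q.shape[-1]
    # Relevance score between every query and every key
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns each row of scores into a probability distribution
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: values weighted by how relevant each key is to the query
    return weights @ V, weights

# Toy example: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
```

Because every query-key score can be computed independently, the whole operation reduces to matrix multiplications, which is exactly why transformers parallelize so well on modern hardware.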
7 Billion Parameters
Meta trained CM3Leon on a vast collection of licensed images from Shutterstock. The most advanced version of CM3Leon has 7 billion parameters, more than DALL-E 2.
Supervised fine-tuning (SFT) played a crucial role in enhancing CM3Leon's performance. Meta employed this technique, which has been successfully used in training text-generating models like OpenAI's ChatGPT, to improve CM3Leon's capabilities in both image generation and image caption writing.
As a result, CM3Leon can now answer questions about images and edit them based on text instructions.
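The idea behind supervised fine-tuning can be shown with a toy sketch. This is not CM3Leon's training code — just the general pattern: take a pretrained model's weights and continue training them with ordinary gradient descent on a small curated set of (input, correct output) pairs, here reduced to a tiny linear classifier on synthetic data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for "pretrained" weights: a tiny linear model mapping
# 8-dim inputs to 3 output classes (a real SFT run would start
# from a large generative model's checkpoint instead).
W = rng.standard_normal((8, 3)) * 0.1

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Small curated supervised set, analogous to the labeled
# (prompt, preferred output) pairs used in SFT.
X = rng.standard_normal((32, 8))
true_W = rng.standard_normal((8, 3))          # hidden "ground truth"
y = (X @ true_W).argmax(axis=-1)              # correct labels

# Fine-tune: minimize cross-entropy on the supervised pairs.
lr = 0.5
for _ in range(200):
    probs = softmax(X @ W)                              # predictions
    grad = X.T @ (probs - np.eye(3)[y]) / len(X)        # CE gradient
    W -= lr * grad                                      # update step

acc = (softmax(X @ W).argmax(axis=-1) == y).mean()
```

The same loop shape applies at scale: the loss rewards reproducing the curated outputs, which is how SFT steers a pretrained model toward following instructions.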
Although Meta's CM3Leon marks a notable achievement in AI art generation, it is worth noting that OpenAI had previously explored transformer-based image generation with its Image GPT model.
However, OpenAI decided to pivot its focus towards diffusion-based methods and is now exploring the concept of "consistency" as a potential avenue for future advancements in the field.