Elon Musk-backed OpenAI unveils Dall-E image generator based on GPT-3

SpaceX founder Elon Musk attends a post-launch press conference after the SpaceX Falcon 9 rocket, carrying the Crew Dragon spacecraft, lifted off from the Kennedy Space Center in Cape Canaveral, Florida, on an uncrewed test flight to the International Space Station on March 2, 2019.

Mike Blake | Reuters

Armchairs in the shape of avocados and baby daikon radishes wearing tutus are among the quirky images created with new software from OpenAI, an Elon Musk-backed artificial intelligence lab in San Francisco.

OpenAI trained the software, known as Dall-E, to generate images from short text captions. Specifically, it is a 12-billion-parameter model trained on a data set of images and their captions found on the internet.

The lab said Dall-E – a portmanteau of Spanish surrealist artist Salvador Dalí and Wall-E, the small animated robot from the Pixar movie of the same name – learned to create images for a variety of concepts.

OpenAI showed some of the results in a blog post published on Tuesday. “We found that [Dall-E] has a number of capabilities, including creating anthropomorphized versions of animals and objects, plausibly combining unrelated concepts, rendering text, and applying transformations to existing images,” the company wrote.

Dall-E is based on a neural network, a computing system loosely inspired by the human brain that can recognize patterns and identify relationships in huge amounts of data.

While neural networks have previously generated images and videos, Dall-E is unusual in that it relies on text input while the others don’t.

Synthetic videos and images have become so sophisticated in recent years that it has become difficult for humans to distinguish between what is real and what is computer generated. For example, generative adversarial networks (GANs), which pit two neural networks against each other, have been used to create fake videos of politicians.

OpenAI acknowledged that Dall-E has “the potential for significant broad societal impacts” and said it plans to analyze how models such as Dall-E “relate to societal issues such as economic impact on certain work processes and occupations, the potential for bias in the model results and the longer term ethical challenges this technology poses.”

GPT-3 successor

Dall-E comes just a few months after OpenAI announced that it had built a text generator called GPT-3 (Generative Pre-trained Transformer 3), which is also underpinned by a neural network.

The language-generation tool can produce human-like text on demand. It became relatively famous for an AI program when people realized it could write its own poems, news articles, and short stories.

“Dall-E is a Text2Image system that is based on GPT-3, but is trained on text and images,” Mark Riedl, associate professor at the Georgia Tech School of Interactive Computing, told CNBC.

“Text2Image isn’t new, but the Dall-E demo is remarkable for producing illustrations that are much more coherent than other Text2Image systems I’ve seen over the years.”

OpenAI is competing with companies like DeepMind and the Facebook AI Research group to develop general-purpose algorithms that can perform a wide range of tasks at human level and beyond.

Researchers have developed AIs that can play complex games like chess and the Chinese board game Go, translate one human language into another, and detect tumors on a mammogram. However, getting an AI system to show real “creativity” is a major challenge in the industry.

Riedl said the Dall-E results showed it had learned to mix concepts coherently, adding that “the ability to mix concepts coherently is seen as a key form of creativity in humans”.

“From a creativity standpoint, this is a big step forward,” added Riedl. “While there isn’t much agreement on what it means for an AI system to ‘understand’ something, the ability to use concepts in new ways is an important part of creativity and intelligence.”

Neil Lawrence, former director of machine learning at Amazon Cambridge, told CNBC that Dall-E looks “very impressive.”

Lawrence, who is now a professor of machine learning at the University of Cambridge, described it as “an inspiring demonstration of the ability of these models to store and generalize information about our world in ways that people find very natural.”

He said, “I assume there will be all kinds of uses of this type of technology that I can’t even imagine. But it’s also interesting as another pretty mind-blowing technology that solves problems we didn’t even know we actually had.”

“Doesn’t advance the state of AI”

Not everyone is that impressed with Dall-E, however.

Gary Marcus, an entrepreneur who sold a machine learning start-up to Uber for an undisclosed sum in 2016, told CNBC that it was interesting but “didn’t advance the state of AI.”

He also pointed out that it is not open source and that the company has not yet published a paper on the research.

Marcus has previously questioned whether some of the research published in recent years by rival lab DeepMind should be classified as “breakthroughs.”

OpenAI was founded as a nonprofit with a $1 billion commitment from a group of founders including Elon Musk, CEO of Tesla. In February 2018, Musk left the OpenAI board, but he continues to donate to and advise the organization.

OpenAI became a for-profit company in 2019 and raised another $1 billion from Microsoft to fund its research. GPT-3 is set to be OpenAI’s first commercial product, and Reddit has signed up as one of the first customers.