If you’ve been using artificial intelligence (AI) for a while, you may have noticed that both the text and images it creates can be a bit generic and polished. Some would call them soulless.
Or, as Professor Gunhild Kvåle from the University of Agder puts it:
“ChatGPT has a voice that gives me a special itch.”
Together with colleague Gustav Westberg from Örebro University in Sweden, she investigated how the AI tool Dall-E 3 creates images of teenagers. Dall-E is an image generator built into ChatGPT.
One of the goals was to find out where this itch comes from. What creates the artificial quality in AI-generated images?
They found that the images have four common characteristics. Some of these give cause for concern, the researchers believe.
Surface-level diversity
“It is striking how Dall-E pays attention to ethnic and gender diversity on the one hand, while on the other hand the images are not very diverse,” says Kvåle.
To obtain a representative sample of images that could be compared, the researchers gave general prompts such as “create images of teenagers.” They also asked the AI to generate its own text prompts for the images.
“You can see that the teenagers in the images represent different ethnicities. It was also striking how strongly diversity was emphasized when ChatGPT wrote the prompts itself,” says Kvåle.
This suggests that the companies behind the technology have taken on board the criticism of the lack of diversity in previous versions.
But:
“The young people in the images look remarkably similar. They all wear jeans and Converse shoes, are beautiful, have nice features and voluminous hair, and none of them have pimples. This is a specific social category of young, successful, beautiful teenagers,” says Kvåle.
Happiness sells
The second common feature was that the images are overwhelmingly positive. The young people are depicted studying together, attending concerts, roasting marshmallows or, somewhat absurdly, collaborating on a local community garden.
“Everyone is happy in the images, no one is sad. They undertake activities that are valued by society. But none of the images show them working or sleeping. This is the free time of upper-middle-class youth,” says Kvåle.
Certain norms are also built into this positive portrayal. Everyone is thin, and even direct instructions cannot change this.
“The technology places clear limits on the types of images that are possible. This is not entirely positive, although the intentions are good. We can see it in the context of the culture we live in, where the texts and images we share are not only intended to objectively inform, but also to promote ourselves.”
Almost real, but not quite
Kvåle notes that the lighting and the way things are placed in the foreground or background mimic photography. This was the third common feature they identified.
The settings depicted in the images range from parks and youth clubs to concert stages and messy teenage bedrooms.
“The images give an impression of authenticity, but the context also positions these young people socially. They are never depicted at work or in urban settings related to social issues,” she says.
The limitations of the imagination
The researchers’ fourth finding is that the AI can depict imaginative scenarios, such as young people skating inside a snow globe at the North Pole.
However, this is not Dall-E’s default; it is something you have to ask for specifically.
“Photorealistic images are clearly Dall-E’s preference. Sometimes it switches to graphic illustrations, but photorealism is clearly the standard,” Kvåle explains.
Critical consciousness
Image generators like Dall-E 3 have become highly capable and accessible to everyone. Yet we are not drowning in AI-generated images; according to Kvåle, there is surprisingly little such imagery in circulation.
“It is said that everything will change with artificial intelligence, but that is clearly not true. Newsrooms, communications departments and institutions use industry standards. Not everything changes overnight just because it is possible,” she says.
At the same time, the researchers were surprised by how similar the images were.
“Services like Dall-E can have a strong influence on our visual culture. We owe it to each other to look critically at these images, because they do not represent what we want our society to look like,” says Kvåle, and concludes:
“And that makes the world a bit more boring.”