Generative AI’s Shifting Attention
It is October 2022.
I am writing an article about a process called optography that came to prominence in the second half of the 19th century. Optography is a photographic process that uses the eye of a living being instead of a camera and its retina instead of a traditional light-sensitive surface. In 1876, physiologist Franz Christian Boll discovered a regenerative light-sensitive protein (today known as rhodopsin) in the retina of frogs. Inspired by Boll’s finding, another physiologist, Wilhelm Friedrich Kühne, experimented further with rabbits. Kühne aimed to see if the eye of a living being could be used as a camera, and the rhodopsin-coated retina as a photographic plate, to capture the last vision of that being.
As my article progressed, I felt the need to somehow visualize this bizarre 19th-century story. At the suggestion of a friend, I turned to Midjourney for this endeavor. In October 2022, Midjourney had been accessible to the general public for only three months and, compared to today, was still under the radar of the wider public. To draw more interest to the platform, Midjourney was offering each user 25 free GPU minutes at the time, which meant that users could generate approximately 25-30 images free of charge. It was a good opportunity, so I started experimenting with it to visualize the story I was writing about.
One of the text prompts I wanted to turn into an image was “a bearded 19th-century scientist in a laboratory examining a rabbit's eye, black and white photography, wet plate collodion.” I wanted this image to look like a photograph from the 1870s. Since wet plate collodion was the leading technique for producing photographic negatives on glass from the 1850s onward, I wanted the image to carry the characteristics of that process. In response to my prompt, Midjourney returned the following variations.

The most eye-catching aspect of these images was the various distortions in the subjects. Their faces and hands had significant deformities, and the two intended subjects of the text prompt (a scientist and a rabbit) were mashed together into a hybrid creature. Despite these “problems” with the subjects, the characteristics of the intended medium were impressive. Because collodion images are exposed on glass surfaces, they are susceptible to scratches, stains, and the flaking of the emulsion from that surface. These inherent challenges of the medium contribute to the unique visual appearance of collodion images. To my surprise, while Midjourney struggled to generate coherent subjects, it was relatively successful at imitating the collodion look.
Surface imperfections such as scratches and flaked-off emulsion at the edges and corners of each AI-generated image are easy to observe. Furthermore, chemical stains and dust particles on the surface of the image are visibly imitated in the first one. The following wet collodion plates from the 1860s by Scottish photographer John Thomson demonstrate what actual imperfections on the glass surface look like and how Midjourney was able to generate similar artifacts. (I am using these photographs by Thomson because they are open access; there is no other particular reason.)

A little over a year later, in November 2023, I felt the urge to feed the same text prompt into Midjourney. I knew that in the intervening year, generative AI had made significant “progress” and had largely overcome the problem of bodily distortions, offering more coherent prompt following. Midjourney returned the following image variations for the same text prompt.

The subjects in this new set of generated images turned out to be highly photorealistic, with only slight bodily distortions and none of the quirky hybrid creatures of the earlier set. These changes were expected. However, the distinctive appearance of the collodion image was absent. With no surface imperfections in place to indicate the medium’s characteristics, these images felt too perfect, too artificial. The studio lighting made it even worse. It seemed as if the subjects were being pushed to the limits of photorealism to impress, while the tangible characteristics of the imitated medium were sacrificed or suppressed as secondary conditions in the prompt hierarchy.
In my opinion, this shift of importance from the medium to the subject of the generated image illuminates one of the main trajectories of mass-oriented generative AI technologies: to impress the average user with photorealistic depiction. It is, of course, possible to ask Midjourney to create images in different styles. However, as this example demonstrates, when asked for a photograph from the 19th century, Midjourney puts more emphasis on the subject than on the technique of mediation. Unfortunately, I was not as excited and impressed by these later images as I had been before. This could be because, as contemporary individuals, we quickly become desensitized to the abilities of media technologies, or because of the business decisions being made backstage at generative AI companies: to turn these tools into more marketable and profitable products by boosting the allure of the generated images.
I think generative AI carries a promising potential that we can use for our benefit. However, if the decisions behind the technology push it to promote a specific standardized look (against individual user intentions), it may become yet another contributor to the unbearable image inflation we are already experiencing. Moreover, the risk lies not only in homogenizing visual content but also in stifling creativity and diversity in personal expression. We can all agree that generative AI technologies will hold significant roles in everyday life in the years to come. We will not be asked to opt into this type of life; voluntarily or involuntarily, we will be absorbed by it. If generative AI becomes confined to a narrow set of aesthetic norms dictated by market demands and commercial interests, it may have substantial effects on our way of seeing and experiencing the world, and not in a good way. Therefore, as we navigate the evolving landscape of generative AI, it is imperative to advocate for approaches that prioritize user agency, diversity, and the preservation of creative individuality.
My article on optography can be read here → Eye-ppratus: Re-imagining the Human Eye in the Nineteenth Century


