Goda Plaum, Lars Grabbe and Klaus Sachs-Hombach
Lukas R.A. Wilde, Marcel Lemmes and Klaus Sachs-Hombach
By Lukas R.A. Wilde | This introduction examines whether generative imagery represents a new paradigm for image production and an emerging research field. It explores a humanities approach to machine learning-based image generation and questions posed by media studies. Rather than focusing on radical shifts in media history, it emphasizes continuities and connections. It highlights the unique aspects of generative imagery compared to photography, painting, and earlier computer-generated imagery. The ‘new paradigm’ is based on emergent or stochastic features, the interplay between immediacy-oriented and hypermediacy-oriented forms of realism, and a novel text-image relationship grounded in human language. The survey then discusses the conditions under which generative imagery should be seen as a distinct media form rather than a new technology. It suggests viewing it as a mediation within evolving socio-technological configurations that reshape agency and subject positions in contemporary media cultures, particularly between human and non-human actors. To understand the cultural distinctness, the essay proposes examining the establishment, attribution, and negotiation of cultural ‘protocols’ within existing and emerging media forms.
By Lev Manovich | I’ve been using computer tools for art and design since 1984 and have already seen a few major visual media revolutions, including the development of desktop media software and photorealistic 3D computer graphics and animation, the subsequent rise of the web, and later social media sites and advances in computational photography. The new AI ‘generative media’ revolution appears to be as significant as any of them. Indeed, it is possible that it is as significant as the invention of photography in the nineteenth century or the adoption of linear perspective in western art in the sixteenth. In what follows, I will discuss four aspects of AI image media that I believe are particularly significant or novel. To better understand these aspects, I situate this media within the context of visual media and human visual arts history, ranging from cave paintings to 3D computer graphics.
By Andreas Ervik | This paper explores generative AI images as new media, focusing on the questions of what these images depict, how image generation occurs, and how AI impacts the imaginary. It reflects on other forms of image production and identifies AI images as radically new, distinct from traditional methods as they lack light or brushstroke registration. However, they draw from the remains of other production forms, relying on connections between images and words as well as other forms of images as training data. AI image generators function as search engines, allowing users to enter prompts and explore the virtual potential of the latent space. Agency in AI image generation is shared between the program, platform holder, and users’ prompts. Generative AI creates a social form of images, relying on human-created training datasets and shared on social networks. It gives rise to a ‘machinic imaginary,’ characterized by techniques, styles, and fantasies from earlier media production. AI-generated images become part of the existing collective media imaginary. As discourse on AI images focuses on their future capabilities, the AI imaginary is filled with dreams of technological progress.
By Hannes Bajohr | The ongoing debate around machine learning focuses on ‘big’ terms like intentionality, consciousness, and intelligence; the philosophical challenge lies in more nuanced concepts. This contribution explores a limited type of meaning called “dumb meaning.” Traditionally, computers were seen as handling only syntax, their semantic abilities being limited by the “symbol grounding problem.” Since they operate with mere symbols lacking any indexical relation to the world, their understanding is restricted to empty signifiers whose meaning is ‘parasitically’ dependent on a human interpreter. This was true for classic or symbolic AI. With subsymbolic AI and neural nets, however, an artificial semantics seems possible that operates below meaning proper. I explore this limited semantics brought about by the correlation of data types by looking at two examples: the implicit knowledge of large language models and the indexical meaning of multimodal AI such as DALL·E 2.
By Amanda Wasielewski | Text-to-image generation tools, such as DALL·E, Midjourney, and Stable Diffusion, were released to the public in 2022. In their wake, communities of artists and amateurs sprang up to share prompts and images created with the help of these tools. This essay investigates two of the common quirks or issues that arise for users of these image generation platforms: the problem of representing human hands and the attendant issue of generating the desired number of any object or appendage. First, I address the issue that image generators have with generating normative human hands and how DALL·E has tried to correct this issue by only providing generations of normative human hands, even when a prompt asks for a different configuration. Second, I address how this hand problem is part of a larger issue in these systems where they are unable to count or reproduce the desired number of objects in a particular image, even when explicitly prompted to do so. This essay ultimately argues that these common issues indicate a deeper conundrum for large AI models: the problem of representation and the creation of meaning.
By Eryk Salvaggio | Generated images are data patterns inscribed into pictures, and close readings can reveal aspects of these image-text datasets and the human decisions behind them. Examining AI-generated images as ‘infographics’ informs a methodology, described in this paper, for the analysis of these images within a media studies framework. It proposes an analytical methodology to determine how information patterns manifest through visual representations. This methodology consists of generating a series of images of interest. It examines this sample of images as a non-linear sequence. The paper finds examples of patterns, absences, strengths, and weaknesses and connects them to structures of the underlying model and dataset. The hypothesis is extended to a broader sample. The paper offers a case study, applying this reading to images of humans kissing created through DALL·E 2. The paper draws conclusions and presents avenues of future exploration.
By Roland Meyer | Text-to-image generators such as DALL·E 2, Midjourney, or Stable Diffusion promise to produce any image on command, thus transforming mere ekphrasis into a means of production. However, prompts should not be understood as instructions to be carried out, but rather as generative search commands that guide AI models through the stochastic spaces of possible images. A comparison can thus be drawn between text-image generators and stock photography databases. But while stock photography searches retrieve pre-existing images, prompts are used to explore latent possibilities. This, the article argues, fundamentally changes how value is attributed to individual images. AI image generation fosters the emergence of a new networked model of visual economy, one that does not rely on closed image archives as monetizable assets, but rather conceives of the entire web as a freely available resource that can be mined at scale. Whereas in the older model each image has a precisely determinable value, what DALL·E, Midjourney, and Stable Diffusion monetize is not the individual image itself, but rather ‘styles’: repeatable visual patterns derived from the aggregation and analysis of large ensembles of images.
By Jens Schröter | As has been remarked several times in the recent past, the images generated by AI systems like DALL·E, Stable Diffusion, or Midjourney have a certain surrealist quality. In the present essay I want to analyze the dreamlike quality of (at least some) AI-generated images. This dreaminess is related to Freud’s comparison of the mechanism of condensation in dreams with Galton’s composite photography, which he reflected on explicitly with regard to statistics, which are also a basis of today’s AI images. The superimposition of images results at the same time in generalized images of an uncanny sameness and in a certain blurriness. Does the fascination of (at least some) AI-generated images result from their relation to a kind of statistical unconscious?