Analyzing Emojis Semiotically: Towards a Multi-Dimensional, Theoretical Model Inspired by Charles S. Peirce

By Deborah Enzmann


Emojis have become an essential part of the communicative sign repertoires of billions of people. Their use has increased rapidly in recent years. The variability of their meanings and the context-sensitive, polyfunctional, and ambiguous character of the signs are fascinating. The inconsistency of their many meanings, however, is a recurring topic in academic discourse. Research on the functions of emojis has mainly come from the field of linguistics, which tries to structure and classify these signs from its respective perspectives. But what do sign processes using emojis look like? What influences do the formal, especially pictorial aspects have on the effect and the meaning of these signs? What is the relation between form and content? And how are ›abstraction‹ and ›identification‹ connected to each other? In the present article, the sign process with emojis will be explained based on Charles Sanders Peirce’s semiotics. Using messages from the ›Textmoji‹ case study and proposing a semiotic model, it will be shown what a sign process with emojis looks like. The focus will be on the intentions with which users employ these signs. Charles S. Peirce’s semiotics is applied to specific messages from the case study through a novel model based on his theory of signs. In a concluding step, the present article will demonstrate how recognition of a face takes place across different degrees of abstraction, drawing especially on comic book theory and cognitive semiotics.


Digital communication, and with it the colorful world of digital signs, has changed our everyday communication in recent years. For this reason, it is important to analyze these communicative interactions semiotically. In my recent dissertation, published in German (Enzmann 2023), I examine emojis from a historical, design, and semiotic perspective with a focus on the intentions of the sign users. The present article offers a brief insight into the semiotic part of my work. It includes the development of a semiotic model based on the theory of Charles Sanders Peirce. The case study ›Textmoji‹ shows what the user wants to convey using emojis and explains sign meanings that would not be accessible on the basis of a reconstruction – in the sense of an interpretation by an outside person. Instead, the combination of a case study with a semiotic model makes it possible to show how a sign process with emojis could be conceptualized and visualized graphically within a semiotic model.


So far, there is no consistent terminology for the classification of various sorts of emojis across existing research. Terms like ›smiley‹, ›emoji‹, ›emoticon‹, ›kawaicon‹, and many more are used in different ways. For this reason, I propose a definition below for the use of the terms that differs from the usual usage. In the following, a distinction is made between ›ASCII-‹, ›graphic‹, and ›AR-emojis‹. ›ASCII-emojis‹[1] are composed of keyboard characters. Subforms are ›ASCII-emoticons‹ and ›ASCII-pictomojis‹. ASCII-emoticons can be further divided into vertical and horizontal emoticons. The latter are also often called ›kaomojis‹ in the relevant literature. Graphic emojis are picture signs that are inserted into the text. They have spread globally mainly due to their inclusion in Unicode in 2010. Until 2017, emojis were characteristically used mainly in text messages, and thus together with writing. More recent developments allow the use of ›AR-emojis‹, which are used as videos in connection with a voice message or as stickers. One form of AR-emojis consists of individually designed avatars, sometimes called ›AR-emoticons‹. According to this definition, emoticons are digital signs that represent emotional states through facial expressions. ASCII-, graphic and AR-emojis can thus all be addressed as emoticons (cf. fig. 1).

Figure 1: Structuring of the terms (cf. Enzmann 2023: 11)

The semiotics of Charles S. Peirce

Charles S. Peirce’s semiotics is especially suitable for analyzing and categorizing these sign processes further. In contrast to other approaches, Peirce was not only concerned with linguistic signs but with all kinds of signs and sign systems. In order to study emojis, a multidimensional and interdisciplinary approach is helpful because emojis are, in the first instance, images that are mainly used as signs in text-based messages. Emoji communication is thus rather novel – and multimodal. Accordingly, this phenomenon needs to be studied with an all-encompassing theory as provided by Peirce’s semiotics; its foundation lies in Peirce’s philosophy of universal categories. His approach is still relevant today, but unfortunately, it is often rendered in a highly reduced form or presented in such complex diagrams that the application to specific case studies is pushed into the background. It is my concern to make Peirce’s semiotics more accessible by means of a model based on his theory of signs and to apply it to specific text messages containing emojis in a practical manner. In the following, I will briefly explain the relevant areas of Peirce’s theory on which the subsequent semiotic model shall be based.[2]

Peirce has proposed one of the most complex interpretations and expositions of what a ›sign‹ is and how it works (cf. FRIEDRICH/SCHWEPPENHÄUSER 2010: 30). In his work, he created many different definitions. All of these have in common, however, that a ›sign‹ is understood as a triadic relation (cf. NÖTH 2000: 62), which can be represented by a semiotic triangle (cf. fig. 2):

Figure 2: The semiotic triangle (cf. Enzmann 2023: 115; also Eco 1977: 30)

The triadic sign relation is the precondition for the formation of any sign process and cannot be reduced to a dyadic or monadic relation. According to this, a ›Medium‹ (M)[3] relates to an ›Object‹ (O) and it is interpreted by the ›Interpretant‹ (I); all three elements are defining each other through these respective relations. This triad is understood through different terms in various semiotic traditions, some of which differ considerably.[4] Peirce subdivides the three references into further trichotomies, denoting sub-types of sign classes. This is usually presented in the form of a table (cf. NÖTH 2000: 66):

Fig. 3: The triadic relation within the types of sign classes (cf. Enzmann 2023: 116; also Peirce 2000b: 48; Nöth 2000: 66; Walther 1974: 56)

Fig. 4: A reduced representation of figure 3 (cf. Enzmann 2023: 116)

The numbering system is adopted from semiotician Max Bense (and many others later, cf. WALTHER 1974: 56). The table can then be further reduced to a mere numeric notation as in figure 4. This table serves as the basis for generating ten classes of signs.

Peirce’s ten classes of signs

The nine constituents are not complete signs by themselves. Only the combination of one sign type from each of the three sign references results in a sign. Ten well-established sign classes can then be derived from the table. The conceptualization of the classes follows the subsequent combination rule: Any first can only be combined with another first, any second with either a first or a second, and any third with either a third, a second, or a first (cf. SCHÖNRICH 1999: 28). This results in the following ten semiotically valid sign classes (ibid.: 27; cf. Walther 1974: 78f.):

  1. ›Rhematic Iconic Qualisigns‹ (1.1 / 2.1 / 3.1)
  2. ›Rhematic Iconic Sinsigns‹ (1.2 / 2.1 / 3.1)
  3. ›Rhematic Iconic Legisigns‹ (1.3 / 2.1 / 3.1)
  4. ›Rhematic Indexical Sinsigns‹ (1.2 / 2.2 / 3.1)
  5. ›Rhematic Indexical Legisigns‹ (1.3 / 2.2 / 3.1)
  6. ›Rhematic Symbolic Legisigns‹ (1.3 / 2.3 / 3.1)
  7. ›Dicent Indexical Sinsigns‹ (1.2 / 2.2 / 3.2)
  8. ›Dicent Indexical Legisigns‹ (1.3 / 2.2 / 3.2)
  9. ›Dicent Symbolic Legisigns‹ (1.3 / 2.3 / 3.2)
  10. ›Argument Symbolic Legisigns‹ (1.3 / 2.3 / 3.3)

From 1) to 10) the ›semioticity‹ (or ›representational capacity‹) increases respectively according to Elisabeth Walther (1974: 79).[5]
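
As an illustrative aside, the combination rule stated above can be written as a short Python sketch. It assumes that the rule amounts to the following constraint on the numeric notation: the category of the interpretant reference may not exceed that of the object reference, which in turn may not exceed that of the medium reference. The names `MEDIUM`, `OBJECT`, `INTERPRETANT`, `sign_classes`, and `label` are hypothetical and chosen only for this example.

```python
from itertools import product

# Sign types per sign reference, keyed by universal category (1, 2, 3).
MEDIUM = {1: "Qualisign", 2: "Sinsign", 3: "Legisign"}
OBJECT = {1: "Iconic", 2: "Indexical", 3: "Symbolic"}
INTERPRETANT = {1: "Rhematic", 2: "Dicent", 3: "Argument"}


def sign_classes():
    """Return all (medium, object, interpretant) triples allowed by the rule.

    Assumed reading of the rule: interpretant <= object <= medium.
    Iterating with the interpretant as the slowest index reproduces
    the order of the list above.
    """
    classes = []
    for i, o, m in product((1, 2, 3), repeat=3):
        if i <= o <= m:
            classes.append((m, o, i))
    return classes


def label(m, o, i):
    """Format one class in the notation used in the list above."""
    return f"{INTERPRETANT[i]} {OBJECT[o]} {MEDIUM[m]} (1.{m} / 2.{o} / 3.{i})"


if __name__ == "__main__":
    for number, (m, o, i) in enumerate(sign_classes(), start=1):
        print(f"{number}. {label(m, o, i)}")
```

Of the 27 arithmetically possible combinations, only ten satisfy the constraint. Under the stated assumption, these coincide with the ten classes listed above.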

The universal categories

The basis of Peirce’s theory of signs is the universal categories addressed as ›firstness‹, ›secondness‹, and ›thirdness‹. To Peirce, they amount to a theory of experience, a phenomenology. He understands a phenomenon to be anything that is present to the mind in some way at a specific point in time (cf. PEIRCE 1983: 40). He then differentiates three types of phenomenological elements that are, however, not absolutely distinguishable from each other. Firstness is characterized by possibilities; secondness by actual facts; and thirdness by the concluding achievement of thought (cf. ENZMANN 2023: 112-115). These three universal categories not only form the foundation of Peirce’s semiotics but also shape it structurally. As can be seen from the numeric notation, the sign types and the sign relations are only assigned relative to these categories (cf. fig. 3-4). The medium belongs to the firstness, the sign types of it are again divided into firstness, secondness, and thirdness. The same is true for the object reference, which stands for secondness, and the interpretant reference, which stands for thirdness. The three universal categories thus act doubly on sign references. Initially, the sign-references (medium, object, and interpretant) are divided into firstness, secondness, and thirdness; then, their respective sign types are divided into categories once again.

In addition to the foundational triad and its internal trichotomies, Peirce distinguished between two objects and three interpretants (cf. PEIRCE 2000b: 50). In the following diagram, again based on Peirce, such a subdivision is shown as a second level (cf. fig. 5):

Figure 5: The integration of the sign relata according to Peirce (cf. Peirce 2000b: 50)

All three sign references include an immediate level (cf. fig. 5). In addition, the object and interpretant references include a dynamic level, and the interpretant reference also includes a logical level. It can be seen that the interpretant reference is triadic, the object reference is dyadic, and the medium reference is monadic.

A semiotic model of analysis based on Peirce

This further subdivision can once again be organized into additional trichotomies, which adds to the complexity and granularity of Peirce’s theory. This suggests representing the initially presented table in three dimensions – in the form of a cube model (cf. fig. 6):

Figure 6: The semiotic cube model (cf. Enzmann 2023: 126).

If the sign references are separated, it is evident that the medium is monadic, the object is dyadic, and the interpretant is triadic (cf. fig. 7). The medium contains an immediate aspect, the object contains an immediate and a dynamic aspect, and the interpretant contains an immediate, a dynamic, and a logical – also called final – aspect (cf. PEIRCE 2000b: 51).

Figure 7: The sign references are separated (cf. Enzmann 2023: 129)
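
The distribution of levels across the three sign references can also be noted as a small data structure. The following Python sketch is only a rough aid for reading the model, not a full encoding of the cube. The names `LEVELS`, `ARITY`, and `arity` are hypothetical and introduced for illustration.

```python
# Levels contained in each sign reference, as described in the text.
# "final" stands for the logical (also called final) interpretant.
LEVELS = {
    "medium": ("immediate",),
    "object": ("immediate", "dynamic"),
    "interpretant": ("immediate", "dynamic", "final"),
}

# Terms for the number of levels a sign reference contains.
ARITY = {1: "monadic", 2: "dyadic", 3: "triadic"}


def arity(reference):
    """Return 'monadic', 'dyadic', or 'triadic' for a sign reference."""
    return ARITY[len(LEVELS[reference])]


if __name__ == "__main__":
    for reference, levels in LEVELS.items():
        print(f"{reference}: {arity(reference)} ({', '.join(levels)})")
```

In this sketch, the immediate level is the only one shared by all three references.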

The immediate object and the immediate interpretant depend on the representation of the medium. This can be seen in the model because the immediate is on the same level as the medium (cf. fig. 8):

Figure 8: The immediate, dynamic, and final level separated (cf. Enzmann 2023: 129)

The universal categories thus function not only from left to right and from back to front, but also from top to bottom (cf. fig. 9).

Figure 9: The categories of firstness, secondness, and thirdness are separated from top to bottom (cf. Enzmann 2023: 129)

This model clarifies the fundamental validity of the categories in Peirce’s semiotics. The yellow areas result in the table shown at the beginning (cf. fig. 4). The ten classes of signs can be represented by the cube model (cf. ENZMANN 2023: 130). It should thus be pointed out that the complexity of a sign increases the more complex the diagram becomes.[6]

In the following section, I would like to demonstrate how the cube model helps to visualize and analyze any actual sign process. To examine the sign process that takes place with emojis, the following section analyzes the intended use of emojis. I am going to present my case study ›Textmoji‹[7] and address some important aspects regarding the classification of emojis into sign classes. In a further step, the formal and aesthetic characteristics of emojis and their influence on the sign process are addressed.

Case study ›Textmoji‹

To investigate what is actually conveyed with emojis and how a respective sign process works accordingly, I am going to rely on messages from my case study in relation to the semiotic model introduced before. The case study is a qualitative analysis in which users were asked to explain the background of their message and the emojis employed. All black speech bubbles contain the respective original message of the sender while the yellow thought bubbles contain the interpretations (or ›translations‹) of the respective emojis created as part of the study.

The following figures 10 and 11 are two messages selected from the study that have a similar context but use emojis quite differently. First, the context and the message are presented. Then, the participating emojis are semiotically analyzed. The context of N5 is as follows: Sender (A5) and receiver (B5) are planning to meet up later in the week. They discover during their exchange that their appointments overlap inconveniently. Because of this, B5 reschedules and informs A5 that the meeting can take place next Thursday. In response, A5 writes the following message to B5:

Fig. 10: Message N5, from ›Textmoji‹ (cf. Enzmann 2023: 157)

Fig. 11: English translation of N5

N5 contains the emoji ›Folded Hands‹ (T9). If the sign is interpreted in the way the name suggests, it would be an icon because it is an image of the gesture ›folded hands‹. An icon refers to its object based on characteristics it has in common with the object. It should be noted that T9 is used differently across cultures. The pictured gesture means ›please‹ or ›thank you‹ in Japanese contexts. In addition, the gesture is used in Thailand as a traditional greeting – also called ›wai‹. The same sign, however, is also commonly used for praying hands or for a ›high-five‹. A5 used T9 indeed as a ›high-five‹ here, as is clear from ÜT9. A high five primarily signals a success. If the sign is interpreted accordingly, it refers to its object ›success‹ merely symbolically because ›success‹ shares no features with folded hands. The sign must instead be learned to understand it accordingly. In the case of N5, it represents the successful scheduling of the joint appointment.

Peirce distinguishes between rhema, dicent, and argument in interpretant reference. While a rhema is used for an isolated term like ›Folded Hands‹, a dicent is used for a propositional statement. Interpreted as something positive in the context of the message above (in the sense that the meeting on Thursday is successfully scheduled), T9 conveys a piece of information and provides a kind of judgment. Thus, T9 can be analyzed as a Dicent Symbolic Legisign (1.3/2.3/3.2). This class of sign could accordingly be represented with the cube model as follows:

Figure 12: Dicent Symbolic Legisign (1.3/2.3/3.2) (cf. Enzmann 2023: 139)

The context to the dialogue of the following message (N2, fig. 13/14) is: The sender and the recipient had agreed to meet each other soon. The recipient now cannot attend the planned meeting and suggests to the sender an alternative date in the subsequent week. The sender (A2) then writes the following reply to the receiver:

Fig. 13: Message N2 from ›Textmoji‹ (cf. Enzmann 2023: 148)

Fig. 14: English translation of N2[8]

The message contains an ASCII- (T3) and a graphic emoticon (T4). »So« with the ASCII-emoticon (T3) refers to the meeting that did not happen and the graphic emoticon (T4) refers to the upcoming meeting. Both emoticons represent facial expressions that act as a kind of personal comment on the preceding statement. The sender (A2) explained to me (in additionally provided information) the reason for using the almost obsolete ASCII-emoticon: its shape makes it seem less negative to A2 than its graphic counterpart, especially when combined with the graphic emoji (T4), which A2 uses as reinforcement. According to A2, it expresses ›joy/excitement about the upcoming meeting‹.

The functions of the two emoticons are in some respects the same. Both signs express the emotion of the sender to a given situation. What would that look like in Peirce’s sense? The reference to an object is made iconically in T3 and T4 by recognizing a negative and a positive facial expression. T3 and T4 resemble their objects (facial expressions) in some ways. Once the facial expression is recognized, it is in turn associated with A2’s emotional state intended to be conveyed. In the case of T3, this is the ›disappointment‹ (about the meeting that did not take place). Facial expressions are thus in some way related to the assumed mental life of the sender, namely emotions or states of mind. A feeling can, after all, be seen as the cause of a corresponding facial expression in a face-to-face (FTF) situation. In such a case, a facial expression can thus arise through causality and is, therefore, to be read as a kind of symptom: The medium (facial expression) is thereby connected to its object (emotion). It is influenced by it in the sense that facial expressions are expressions of emotions; a facial expression in an FTF situation can function for an interpreter as a cue or trace of some assumed emotion. An emoticon, however, is always deliberately chosen and used. When an emoji is employed to convey a certain emotional state, the sign acts as a deliberate cue to the sender’s intended emotional state to be conveyed. While the facial expression might represent the true, original, or genuine index, the emoji would be a degenerate index of an emotional state with an included icon. It does not matter whether A2 really feels that state, because an emoticon cannot be causal. Emoticons are always used consciously.

It could be objected that the use of facial expressions and gestures is also culturally rooted and accordingly subject to certain conventions. Ludwig Nagl, for instance, stated that the symbolic aspect of such signs is philosophically interesting even when we do not have a convention ›constructed‹ in the same way as in a verbal language (cf. NAGL 1992: 49f.). Whether emotions precede culture or are its product is of course a long, controversial discourse in emotion research (cf. Plamper 2012: 116-128). From a semiotic perspective, we could settle for the following: In order to know that a smile is linked to an emotion and is accordingly related to, for example, satisfaction or joy (something satisfying or something joyful), we have learned with the help of a fully developed, thoroughly structured symbolic language and through social interactions what a smile can do or mean (cf. NAGL 1992: 45). Peirce describes this through the distinction between a genuine and a degenerate index. Thus, degenerate indices refer to their objects indirectly via the detour of symbolic signs (cf. NÖTH 2000: 187). Genuine indices, on the other hand, refer directly to their object and, according to Peirce, stand in an existential relation with their object (cf. NÖTH 2000: 186). According to Nagl, indices and icons can accordingly only be thought of as embedded in the symbolic structure of language (cf. NAGL 1992: 45). It must be noted that Peirce describes interpretation from the point of view of the interpreter. It is not necessary that a smile must be linked to any positive emotion. When an emoji is used to convey a particular emotional state, the sign acts as a cue to the intended emotional state of the user to be conveyed. While the facial expression might constitute a true, original, or genuine index, the emoticon is merely a degenerate index of a feeling – with an included icon.

In summary, all of this means that an emoticon is in fact an index because it stands for an actual event in the mind of an interpreter, in this case for the emotional state of the sender that is allegedly intended to be conveyed. Since emoticons are always used consciously, however, and are not causally related to their objects, T3 and T4 can be classified as sub-indices/hyposemes (which would also include a finger pointing towards an object, or a signpost).

Regarding interpretant reference: while both signs (T3 & T4) are rhemas (3.1) in themselves (considered without the context of the message), in that T3 can convey something negative (disappointment or sadness) and T4 can convey something positive (joy or satisfaction), the actual meaning of the signs is revealed only through their use in context. T3, as mentioned above, is then interpreted as ›disappointment that the meeting did not take place‹. Thus, it conveys information and makes a judgment indicating that ›A2 is disappointed; not having the meeting is negative for A2‹. A similar situation applies to T4. The sign is interpreted as follows: ›A2 is looking forward to the upcoming meeting; the fact that a meeting is taking place is positive‹. Both emoticons thus convey information about A2’s state of mind related to the meeting.

While the interpretant reference of T3 and T4 is a rheme when considered without context, it actually becomes a dicent through that context. Accordingly, the two emoticons can be called Dicent Indexical Sinsigns (1.2/2.2/3.2). This class of sign could be represented with the cube model as follows:

Figure 15: Dicent Indexical Sinsign (1.2/2.2/3.2) (cf. Enzmann 2023: 139)

The case study – in combination with the semiotic cube model – allows us to crystallize the alleged intentions of emoji users and to consider the communication process that occurs surrounding any given emoji. The analysis extends Peirce’s formal and abstract theoretical construct toward applied sign use.[9] But the influence of different qualitative characteristics on the interpretation of the signs, such as the use of an ASCII-emoticon in contrast to a graphic emoticon in a message, remains still somewhat unclear. To analyze this, further research from cognitive semiotics and comic book theory will be consulted.

The recognition of a cognitive type

To investigate the formal aspects of emojis, the analysis of N2 is suitable, because the message contains a graphic and an ASCII emoticon. According to the sender (A2), formal properties of the emoticons regulate the intensity of the respective sign. First of all, the emerging process is reflected when A2 selects T3: At the beginning, there is the dynamic object – the sender’s emotion intended to be conveyed; in the case of T3, A2’s disappointment. Based on this, A2 selects an emoticon (T3) that is capable of referring to the dynamic object through its immediate object. Consequently, in the case of emojis, the medium represents an object through its appearance, the immediate object, which refers to the real – dynamic – object: for the sender A2, it appears to be capable of referring to the emotion intended to be conveyed through formal properties of the medium. A2 interpreted the formal aspects of T3 and T4 as regulators that weaken and intensify the respective statements. According to A2, the shape of the ASCII emoticon (T3) attenuates the intensity of the sign, while the graphic emoticon (T4) has a more intense effect. Accordingly, once again, the formal-aesthetic attributes of emoticons are crucial criteria of selection for A2. But what is the basis for such a selection? How is it possible that the formal properties of emojis (the medium) can have an impact on the intensity of the meaning of the signs? How are we able to recognize a face in the most rudimentary forms – a bracket and a simple colon? Borrowing from cognitive semiotics, perception is defined as follows according to Lukas R.A. Wilde: A sensory stimulus is related to a repertoire of known types and categorized as one of its elements (cf. WILDE 2018: 95f.). In this context more specifically, sensory or perceptual types, which include visual ones, are matched with a ›cognitive type‹.
This enables the production of an ›image object‹ in the mind that conforms to the type according to certain criteria of perceptual relevance. According to Börries Blanke (2003: 47-70) and Wilde, recognizing relevant cognitive types takes place even before additional cultural encodings or connotations come into play (cf. WILDE 2020: 17/186f.). Subsequently, in order to be able to interpret something pictorially, it must first be recognized or perceived (categorized) within a medium.

To recognize a facial expression, for example, we make use of these cognitive types just as well. These are not directly linked to the meaning of the signs but are inherent in the form of the sign. According to Wilde, once a face is recognized, a relevant cognitive type must be available that enables us to decode all signs conforming to the same type to a certain degree (cf. WILDE 2020: 182). If an interpreter is aware that an ASCII combination can represent a laughing facial expression rotated by 90°, the interpreter is able to recognize other facial expressions conceived in the same way. According to Wilde, the difference between looking at a word and looking at a pictorial sign is then mainly that, in the latter, we cannot help but recognize a represented object (cf. WILDE 2019). Although what is recognized in the picture may not be the meaning or content of the sign, recognizing the cognitive type enables us to further guess such communicative meanings of the picture sign. In the case of an unknown word, it is difficult or outright impossible to guess such meanings. Consequently, according to Wilde, ›pre-attentive intelligibility‹ is often present in image recognition (cf. WILDE 2018: 94). A large part of the recognition takes place without conscious attention.

The process of image recognition is further explained by Blanke and Wilde through the term ›categorization threshold‹ (cf. BLANKE 2003: 91-106; WILDE 2018: 94-112). In order to recognize an image, the medium must be above the ›iconic categorization threshold‹ of a type. Crucial in the categorization (crossing the iconic threshold) is thus the recognition of relevant cognitive types on which we rely when we recognize real-world objects as well as images. Given an image, at least one salient feature must be recognized that corresponds to a salient feature of the type. The iconic categorization threshold can thus be crossed to varying degrees, as we see from the difference between a photo of a face and an emoji: both cross the iconic categorization threshold for the type ›face‹ – but to different extents. This can be illustrated with emojis using Scott McCloud’s model for abstractions in comic books (cf. MCCLOUD 2001: 57; figure 16).

Figure 16: Gradual abstractions from photographs (cf. Enzmann 2023: 184; see also McCloud 2001: 57)

The photo on the left of figure 16 exceeds the iconic categorization threshold by a very wide margin, while the ASCII emoticons only just exceed the threshold. The word ›face‹ is certainly below the iconic categorization threshold. Even further below the categorization threshold, according to McCloud (2001: 57), would be the description of a face, such as ›two eyes, a nose, and a mouth‹. Thus, a so-called ›degree of iconicity‹ can be determined. It depends on three different criteria according to Blanke (2003: 96) and Wilde (2018: 101). The first is the quantity of iconic aspects. For example, the photo contains a lot of iconic aspects. The more detail a representation contains, the more likely it is that the iconic categorization threshold will be exceeded. The second criterion is the relevance of these iconic characteristics. What is typical for a type, however, can vary culturally. For example, in Japanese culture, it is common for interlocutors to focus more on the other person’s eyes to infer their feelings, as facial expressions are traditionally used with restraint in Japan (cf. YUKI et al. 2007: 303-310). Because of this, a wide variety of horizontal ASCII emoticons for the eyes emerged. This shows that eyes are relevant for representing emotions in Japan, while in other countries, the mouth is more relevant for representing emotions. The third criterion is the cognitive accessibility of the iconic type (cf. BLANKE 2003: 95). It requires experience or prior knowledge. By knowing the rule that punctuation marks can represent facial expressions rotated ninety degrees in vertical ASCII emoticons, it is possible to also decrypt previously illegible signs.

In consideration of the cube model, it is not quite clear which relevance criteria belong to the medium and which to the object reference – and which role the interpreter plays. On the one hand, this may be due to the different use of the semiotic terms. On the other hand, it may be because of differences in the underlying semiotic theories. However, a careful differentiation is necessary in order to find out which criteria are related to which relations in which way. This will be considered in the following on the basis of the proposed semiotic model. In order to ›locate‹ the cognitive type in the cube model, it is helpful to add the term of the repertoire, since – according to Blanke (2003: 95) – the accessibility of the cognitive type is relative to the structure of the repertoire. The Stuttgart School uses the term ›repertoire‹ to designate the medium’s reference (cf. WALTHER 1974: 50f.). Repertoires can address different sensory perceptions, as language, for instance, can be perceived acoustically and visually (cf. WALTHER 1974: 51). In its firstness, the medium reference consists of possibilities (qualisigns) that are realized as sinsigns on the level of secondness and can belong to a law or a rule on the level of thirdness. ›Repertoire‹ consequently addresses the fact that qualisigns are realized as sinsigns, such as tones, colors, or forms, which correspond to the types of a certain sign repertoire as legisigns. According to Gerhard Schönrich, legisigns describe culturally established routines that control cognition already in the medium reference (cf. SCHÖNRICH 1990: 345). A legisign is comparable to a rule or a law and is consequently equivalent to a cognitive type.

In the constitution of cognitive types, a distinction can be made between different sensory types, such as visual or auditory (cf. BLANKE 2003: 36). The cognitive type, however, comprises all sensory knowledge associated with a certain class of phenomena (cf. BLANKE 2003: 36). It is not in itself bound to any particular realization or appearance. It is an idea (thirdness), such as that of vanilla, the realization of which (secondness) may be a scent or a pictorial representation of vanilla. Thereby, all realizations correspond to the same cognitive type vanilla and while the word ›vanilla‹ may be far removed from any immediate sensory experience, it is still a cultural ›shorthand‹ to everything that experience might entail.

Thus, it can be assumed that due to the existence of sign repertoires – which in the case of emojis are constituted by visual types – relative cognitive types can be determined. Based on these types, the medium is associated with reality, with an object that exists in reality or in the imagination, and is interpreted as an image or a sign of it. Consequently, a medium (qualisigns realized in a sinsign) contains a relevant cognitive type (legisign), which in turn refers iconically, indexically, or symbolically to an object in a sign process by the interpretant. The degree of iconicity can be determined when a sign is interpreted iconically in relation to an object. Thus, the relevance criteria in recognizing iconic types are not to be located in the medium itself, but in the relation of the latter to its object. The degree of iconic relevance is to be located in the object relation and the meaning of the sign in the object and interpretant relation. Consequently, it is absurd to disregard the object and interpretant reference in a formal consideration of a sign, because the representations depend on what and how they represent and what effect they thereby evoke. Colors, shapes, and details can be used as reinforcements and influence the interpretation of the signs. Thus, the formal-aesthetic properties of an emoji can influence the effect of the characters, for example, by regulating intensity. This is why it is so important to study emojis from a formal and aesthetic perspective in the future.


BLANKE, BÖRRIES: Vom Bild zum Sinn: Das ikonische Zeichen zwischen Semiotik und analytischer Philosophie. Wiesbaden [Deutscher Universitätsverlag] 2003

ECO, UMBERTO: Zeichen: Einführung in einen Begriff und seine Geschichte. Translated by Günter Memmert. Frankfurt/M. [Suhrkamp] 1977

ENZMANN, DEBORAH: Emojisierung: Eine historische und semiotische Studie zu Emojis. Salenstein [Niggli] 2023

FRIEDRICH, THOMAS; SCHWEPPENHÄUSER, GERHARD: Bildsemiotik: Grundlagen und exemplarische Analysen visueller Kommunikation. Basel [Birkhäuser Verlag] 2010

MCCLOUD, SCOTT: Comics richtig lesen: Die unsichtbare Kunst. Translated by Heinrich Anders. Hamburg [Carlsen] 2001

NAGL, LUDWIG: Charles Sanders Peirce. Frankfurt/M. [Campus] 1992

NÖTH, WINFRIED: Handbuch der Semiotik. 2nd ed. Stuttgart [J.B. Metzler] 2000

PEIRCE, CHARLES S.: Phänomen und Logik der Zeichen, edited and translated by Helmut A. Pape. Frankfurt/M. [Suhrkamp] 1983

PEIRCE, CHARLES S.: Semiotische Schriften, vol. 1: 1865-1903, edited and translated by Christian J.W. Kloesel and Helmut Pape. Frankfurt/M. [Suhrkamp] 2000a

PEIRCE, CHARLES S.: Semiotische Schriften, vol. 2: 1903-1906, edited and translated by Christian J.W. Kloesel and Helmut Pape. Frankfurt/M. [Suhrkamp] 2000b

PEIRCE, CHARLES S.: Semiotische Schriften, vol. 3: 1906-1913, edited and translated by Christian J.W. Kloesel and Helmut Pape. Frankfurt/M. [Suhrkamp] 2000c

PLAMPER, JAN: Geschichte und Gefühl: Grundlagen der Emotionsgeschichte. Munich [Siedler Verlag] 2012

SCHÖNRICH, GERHARD: Zeichenhandeln: Untersuchungen zum Begriff einer semiotischen Vernunft im Ausgang von Ch. S. Peirce. Frankfurt/M. [Suhrkamp] 1990

WALTHER, ELISABETH: Allgemeine Zeichenlehre: Einführung in die Grundlagen der Semiotik. Stuttgart [Deutsche Verlags-Anstalt] 1974

WILDE, LUKAS R.A.: Im Reich der Figuren: Meta-narrative Kommunikationsfiguren und die ›Mangaisierung‹ des japanischen Alltags. Cologne [Herbert von Halem] 2018

WILDE, LUKAS R.A.: Bildlichkeit von Emojis: Cartoonisierung und Manga-Symboliken. Lecture at the interdisciplinary symposium ›Emojisierung – wie das digitale Schreiben unsere Kommunikation verändert.‹ 07.06.2019 at saasfee*pavillon, Frankfurt/M.

WILDE, LUKAS R.A.: The Elephant in the Room of Emoji Research: Or, Pictoriality, to what Extent? In: Elena Giannoulis; Lukas R.A. Wilde (eds.): Emoticons, Kaomoji and Emoji: The Transformation of Communication in the Digital Age. New York [Routledge] 2020, pp. 171-196

YUKI, MASAKI; WILLIAM W. MADDUX; TAKAHIKO MASUDA: Are the Windows to the Soul the Same in the East and West? Cultural Differences in Using the Eyes and Mouth as Cues to Recognize Emotions in Japan and the United States. In: Journal of Experimental Social Psychology, 43(2), 2007, pp. 303-311. [accessed July 14, 2023]


1 It is important to emphasize that ASCII emojis do not consist solely of ASCII code but can be composed of characters from different character sets and encoding systems (cf. Enzmann 2023: 76f.).

2 For a more detailed explanation of the conception and application of this semiotic model, see Emojisierung: Eine historische und semiotische Studie zu Emojis (Enzmann 2023). In the present article, only specific aspects relevant to the exemplary analysis are mentioned. A fuller understanding of the model’s complexity and its application requires consulting the original work.

3 Peirce sometimes called the means ›representamen‹ or simply ›sign (in itself)‹. However, he thought that the word ›medium‹ could replace the term sign since the sign stands as an intermediary between an object and an interpretant (cf. Nöth 2000: 467). In this paper, the term ›medium‹ is used in a similar way. This seems to make sense in that it refers to an aid or a tool that – like a sign – has been designed to achieve a certain goal.

4 In the remainder of this article, the terms shown in figure 2 will be used. Note especially that the one who interprets the sign is called the ›interpreter‹ (not the ›interpretant‹).

5 Peirce used a different ordering system of sign classes (cf. Peirce 1983: 133). However, since the following sections examine the semioticity of emojis, the ordering system of the Stuttgart Institute will be used.

6 In my dissertation, I introduce and discuss further classes of signs, based on the analyses of my case studies (cf. Enzmann 2023: 131-135; 166-171).

7 A kind of ›meta-level‹ to the emojis was created, on which their presumed meaning is paraphrased. This serves to crystallize the intentions of the sender. Despite my attempt to ›translate‹ the emojis in this way, I am not claiming that a textual interpretation of the signs can actually replace the emojis; the paraphrases have an explanatory function and are meant to clarify the user’s intention in using an emoji. All messages are taken from original dialogues, meaning they were not created explicitly for the study. Instead, they were explained or ›translated‹ by the actual authors of the respective messages. The original messages in German are reproduced in authentic form, preserving errors, dialectal elements, and colloquial expressions. In addition to the age of the sender, the examples are supplemented by labels such as ›W‹ for female and ›M‹ for male, and by abbreviations such as ›ÜT‹ for translation, ›A‹ for sender, ›B‹ for receiver, and ›T‹ for emoji(s) from the case study. For the analysis, 10 messages were selected and subjected to a qualitative semiotic analysis. The messages and analyses presented in this article are only a small part of a larger study. The English translations do not originate from the respective users, and it cannot be guaranteed that the authors would translate the messages in the same way; I created them specifically for this article.

8 Unfortunately, some linguistic expressions are lost in the translation process. The receiver used »freu freu«, which is here translated as »yeeeey«. The translation is not entirely satisfactory, however, because »freu« is an inflective form related to the noun ›Freude‹, which denotes a positive emotion and can be translated as ›joy‹; the English word ›joy‹, however, has no comparable verb-like use. The user wanted to use the graphic emoticon to express his happiness about the upcoming meeting, so he translated the emoji as »freu freu«. »yeeeey« can thus only approximate this, with the difference that the word does not designate an emotion.

9 In my book, I consider applying further subdivisions (cf. Enzmann 2023: 131-139, 166-171). The consideration of the subclasses makes it possible to examine aspects of the sign process that represent important factors for communication and interpersonal relationships. Subclasses illustrate the multi-layered nature and complexity of any sign process and provide additional differentiation through the classification of signs into more complex sign classes.

About this article


This article is distributed under Creative Commons Attribution 4.0 International (CC BY 4.0). You are free to share and redistribute the material in any medium or format. The licensor cannot revoke these freedoms as long as you follow the license terms. You must, however, give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.


Deborah Enzmann: Analyzing Emojis Semiotically: Towards a Multi-Dimensional, Theoretical Model Inspired by Charles S. Peirce. In: IMAGE. Zeitschrift für interdisziplinäre Bildwissenschaft (special issue: The Semiotics of Emoji and Digital Stickers), vol. 38, 19th year, (2)2023, pp. 178-195




