The Question Concerning Generativity in Large Language Models | Chien-heng Wu

by Critical Asia

by Chien-heng Wu, Dec. 2023

Let’s start by noting a fundamental shift in natural language processing and human-machine interaction that occurred with OpenAI’s public introduction of its AI-powered chatbot, ChatGPT, on November 30, 2022. Since its debut, ChatGPT has evolved from version 3.5 to version 4, with version 5 slated for release in the near future. This ongoing development has periodically showcased ChatGPT’s increasingly advanced emulation of human intelligence, excelling in various standardized tests as well as some high-level cognitive tasks such as coding and data analysis. By employing natural language as the input method, ChatGPT has significantly lowered the barrier to entry for laypeople. Moreover, its generative nature also allows it to adapt to various tasks. It stands poised to become the revolutionary tool that will facilitate the transition to the brave new world of technological singularity.[1] Without joining the chorus of adulation for AI, this paper aims to offer critical reflections on both the promises and limitations of generative AI, with specific focus on Large Language Models (LLMs).

Anatomy of a Murder

Digitalization involves a process of formalization and discretization, enabling the two fundamental components of LLMs—big data and deep learning. In the realm of deep learning neural networks, digitalization has significantly enhanced the function of both correlational and hierarchical pattern recognition. This analytical function, while effective, also invites caution and skepticism toward scientific rationality—a sentiment captured in Wordsworth’s famous line: “We murder to dissect.” Wordsworth’s observation gains renewed relevance in the age of the big data deluge and the subsequent explosion of LLMs—both characterized by analytical reductionism and algorithmic quantification—as it underscores the danger of reducing complex phenomena to mere data points for computational analysis.

Consider the reaction of Canadian novelist Margaret Atwood upon discovering pirated copies of her work being used to train LLMs. Atwood invokes the 1975 horror film The Stepford Wives, in which the robotic replicas murder the real wives, and then speculates whether our current fascination with LLMs might lead to a similar scenario:

Once fully trained, the bot may be given a command—“Write a Margaret Atwood novel”—and the thing will glurp forth 50,000 words, like soft ice cream spiraling out of its dispenser, that will be indistinguishable from something I might grind out. (But minus the typos.) I myself can then be dispensed with—murdered by my replica, as it were—because, to quote a vulgar saying of my youth, who needs the cow when the milk’s free? (Atwood)

Yet reservations persist. As Atwood observes, the current state of AI, while impressive in generating textual outputs resembling those of humans, nonetheless falls short of true intelligence: “The program, so far, does not understand figurative language, let alone irony and allusion…. It can’t improvise. It can’t riff. It can’t surprise. And it is in surprise that much of the delight of art resides” (emphasis added). Atwood’s concerns merit our attention and should temper our enthusiasm for generative language models like ChatGPT. However, it is also crucial not to interpret her cautionary remark as an outright dismissal, relegating the ChatGPT phenomenon to mere technological gimmickry. ChatGPT and other LLMs mark a seismic shift in the AI landscape, differentiating themselves from earlier, more mechanical models and thereby serving as catalysts to usher us into a new epoch: the age of generative AI. Given these complexities, neither an unqualified rejection nor an uncritical endorsement of digital rationality provides a constructive path forward. Instead, we should engage in a more nuanced critique to unpack the ethical, political, and philosophical implications of today’s generative AI.

The Three Stigmata of Automated Writing Machines

1. A Stochastic Parrot

In 2017, a team of scholars from Google published a paper, “Attention Is All You Need,” introducing the Transformer architecture, an advanced neural network algorithm tailored primarily for text-based operations (Vaswani et al.). The Transformer architecture sets itself apart from its predecessors through its unique attention mechanism. The attention mechanism enables the model to weigh the significance of various components of the input data, focusing on the contexts and relationships between words and phrases, irrespective of their syntactic distance from each other. Owing to its ability to effectively prioritize contextual relationships, the Transformer architecture has emerged as a foundational pillar in the development of LLMs. ChatGPT, the most well-known instantiation of this technological breakthrough, is built on extensive pre-training: using vast Internet databases, discerning statistical correlations in word sequences, and iteratively predicting the most probable follow-up words from the text prompt it receives.
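The core computation of the attention mechanism can be sketched in a deliberately simplified form. The toy Python function below is a single-query, single-head version with made-up vector sizes, not the paper's full multi-head implementation; it shows only the basic idea that each value is weighted by how well its key matches the query:

```python
import math

def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Scores each key against the query, normalizes the scores with a
    softmax, and returns the weighted average of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# The query matches the first key more closely, so the output leans
# toward the first value vector, regardless of position in the sequence.
out = attention([1.0, 0.0],
                [[1.0, 0.0], [0.0, 1.0]],
                [[10.0, 0.0], [0.0, 10.0]])
```

This position-independence of the weighting is what the essay refers to as attending to relationships "irrespective of their syntactic distance from each other."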

In 2021, Emily M. Bender et al. published a paper, “On the Dangers of Stochastic Parrots,” in which they scrutinize the then-current GPT-3 model, questioning the scaling methods employed in the development of LLMs. LLMs operate on the premise that the larger the parameter count, the more capable the model becomes in generating human-like text. However, the authors challenge this “the larger, the better” assumption by claiming that size does not guarantee diversity:

The Internet is a large and diverse virtual space, and accordingly, it is easy to imagine that very large datasets…must therefore be broadly representative of the ways in which different people view the world. However, on closer examination, we find that there are several factors which narrow Internet participation…. In all cases, the voices of people most likely to hew to a hegemonic viewpoint are also more likely to be retained. In the case of US and UK English, this means that white supremacist and misogynistic, ageist, etc. views are overrepresented in the training data, not only exceeding their prevalence in the general population but also setting up models trained on these datasets to further amplify biases and harms. (Bender et al. 613)

If the training data fed to LLMs is already skewed, the resulting outputs, governed by the statistical probabilities of word associations, not only reflect and amplify these biases but also induce a homogenizing effect that flattens out differences, leaving only the most dominant viewpoints and the most banal linguistic structures. This overturns the widespread belief that the Internet is an all-inclusive space: in fact, it often filters out marginalized perspectives, retaining only the views of those who are most vocal, most visible, and most resource-rich.[2] On this view, quantity does not guarantee quality, nor does size ensure diversity. Therefore, to insist on a democratic imagining of today’s datasphere is to indulge in a delusional fantasy of equal access and representation when what is really at stake is the exacerbation of existing social inequalities.[3] In the end, automated writing machines, such as ChatGPT, are not truly intelligent despite their eloquent display to the contrary—for such a machine only “haphazardly stitch[es] together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot” (Bender et al. 617).
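The "haphazard stitching" Bender et al. describe can be illustrated at a drastically reduced scale. The sketch below is a toy bigram model, not how Transformer-based LLMs are actually implemented; it samples each next word from the continuations observed in its training text, reproducing form with no reference to meaning:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record which word follows which in the training text."""
    words = text.split()
    table = defaultdict(list)
    for a, b in zip(words, words[1:]):
        table[a].append(b)
    return table

def parrot(table, start, length, seed=None):
    """Stitch together a sequence by sampling each next word from the
    observed continuations of the current word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:
            break  # no observed continuation: the parrot falls silent
        out.append(rng.choice(followers))
    return " ".join(out)

table = train_bigrams("the cat sat on the mat")
sentence = parrot(table, "the", 5, seed=0)
```

Every output is fluent at the level of adjacent words, because every transition was observed in the data; nothing in the mechanism tracks what, if anything, the sequence means.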

The metaphor of a stochastic parrot has since become a popular label accorded to LLMs. However, the persistent characterization of ChatGPT as a mere stochastic parrot risks underestimating its generative faculties. As Luciano Floridi suggests, ChatGPT distinguishes itself from its somewhat mechanical predecessors by being a text-producing machine capable of outputting logically coherent narratives to a wide range of questions, including those not in its pre-trained datasets. ChatGPT, therefore, presents us with a new kind of agency—one capable of improvising solutions while simultaneously detaching itself from traditional humanist attributes like desire, belief, intentionality, consciousness, and intelligence. If Floridi finds the metaphor wanting, it is because it focuses too much on the machine’s analytical function at the expense of its synthesizing power. That is to say, the “stochastic parrot” analogy holds up insofar as analysis breaks things down to their discrete elements, rendering possible subsequent stochastic arrangements in accordance with statistical probabilities; the analogy becomes inadequate once the focus shifts to the machine’s ability to adaptively synthesize these discrete elements in seemingly original and creative ways, all the while adhering to the statistical patterns its algorithms have identified. In light of this, it is wise to broaden the scope of our discussion to include both the machine’s analytical and synthetical functions and investigate how their synergic complementarity puts forth a specific understanding of generativity in LLMs. Building on this expanded perspective, we might even be tempted to consider this complementary relation as ChatGPT’s Oulipo moment since the central question concerns how the constraints of its computational architecture enable, rather than inhibit, creative articulations. Despite this apparent parallel, the extent to which this Oulipo moment interpretation can be considered valid requires further examination.

2. A Blurry JPEG

Soon after the public launch of ChatGPT, renowned science fiction writer Ted Chiang offered a measured critique in The New Yorker. Chiang likens the workings of ChatGPT to lossy compression algorithms employed in modern Xerox photocopiers, which replace the subtleties of the original with plausible generalizations or potentially flawed approximations, thereby raising important questions concerning whether LLMs like ChatGPT are capable of originality in the creative process of writing.

There are two types of compression: lossless and lossy. With lossless compression, the encoded data can later be restored, through decoding, to an exact copy of the original. By contrast, with lossy compression, some data—the amount of which varies according to the compression ratio—are discarded and become irrecoverable, resulting in a close but non-exact reproduction. Chiang invites us to consider an imagined scenario in which we face the threat of losing Internet access forever; in such a case, it would be advisable to store a copy of the entire Web for future reference. However, given the limited space of our private server, we have to settle for a lossy rather than lossless copy of the Web. Herein lies the problem: under the constraints of this lossy format, it now becomes impossible to search for exact quotes in this compressed copy because what is actually stored are not the original words and texts but their statistical regularities extracted by compression algorithms. It is precisely in the sense of approximation without exact reproduction that the analogy of “a blurry JPEG” becomes particularly apt for ChatGPT, as “[i]t retains much of the information on the Web, in the same way that a JPEG retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation.” Incidentally, such approximations are not just acceptable but even preferable. As Chiang explains, if ChatGPT were encoded with lossless compression, the result would not have been as impressive because it would most likely be seen “as only a slight improvement over a conventional search engine.” It is precisely on account of its lossy nature that ChatGPT is credited with a generative quality: that is, when the blurriness inherent in lossy compression is translated back in the output into logically coherent sentences in natural languages—especially in response to user prompts on topics not covered in the pre-trained dataset—it gives rise to the illusion of the machine’s creativity.
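The lossless/lossy distinction can be made concrete in a few lines of Python. The lossless half uses the standard zlib library, whose round-trip is exact by design; the lossy half is a crude quantizer standing in for lossy compression generally (it is an illustrative toy, not the JPEG algorithm): once the rounding error is discarded, decoding yields only an approximation.

```python
import zlib

# Lossless: the decoded bytes are bit-for-bit identical to the original.
original = b"exact quotes can be recovered from a lossless archive"
assert zlib.decompress(zlib.compress(original)) == original

# Lossy (illustrative quantizer): keep only the nearest multiple of
# `step`. The discarded precision is irrecoverable.
def lossy_encode(samples, step):
    return [round(s / step) for s in samples]

def lossy_decode(codes, step):
    return [c * step for c in codes]

samples = [0.12, 0.49, 0.51, 0.88]
restored = lossy_decode(lossy_encode(samples, step=0.5), step=0.5)
# restored == [0.0, 0.5, 0.5, 1.0]: close to, but not identical with,
# the input -- an approximation, never "an exact sequence of bits"
```

Note also that the quantized codes occupy a much smaller alphabet than the originals, which is exactly the trade Chiang's thought experiment forces: storage savings purchased with irrecoverable detail.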

For Chiang, comparing ChatGPT to lossy compression raises a few critical questions. First, repackaging information in different words is not intrinsically harmful. However, it is important to determine when this repackaging of information is ethical and acceptable and when it leads to “hallucination”—the phenomenon of generating seemingly realistic outputs that are entirely fabricated. Without a reliable mechanism to discern what is acceptable from what is not, the infamous case of two US lawyers citing fake legal precedents generated by ChatGPT is unlikely to be the last of its kind (Milmo). Second, the apparent creativity demonstrated by the machine has profoundly challenged the idea of originality in writing. Chiang dismisses the idea of anthropomorphizing ChatGPT as equipped with human intelligence and characterizes creativity in writing in procedural and cumulative terms, namely, as a progressive struggle to evolve from unoriginal to original work. In this struggle, no effort is wasted on futile endeavors because the originality of writing resides not in the end product but rather in the whole process. From this perspective, any claim that ChatGPT’s textual outputs manifest signs of originality is erroneous—at least for now, according to Chiang—as the process involves no struggles, only the combinatorial logic dictated by statistical algorithms.

Implicit in Chiang’s critique is the assumption that today’s LLMs have yet to achieve sentience, thus lacking in sapience, or human-like intelligence: “It’s possible that, in the future, we will build an A.I. that is capable of writing good prose based on nothing but its own experience of the world.” In the absence of this embodied awareness—to subjectively experience the world, to have sensory perceptions of the environment, and to interact with its ever-changing conditions accordingly—the generative AI of today cannot truly live up to its name; for it does not engage in the kind of phenomenological struggle that gives birth to creativity in writing, and therefore can only present a façade of intelligence without substance.

The emphasis on the constitutive entanglement of sentience and sapience adds further nuance to the question of AI’s generative capacities, allowing us to differentiate between two types of generativity. The first, apropos of ChatGPT, can be characterized as normativity devoid of criticality. It is normative in that it adheres to algorithmic rules set forth by the self-contained Transformer architecture; however, it is non-critical for the simple reason that, as Atwood has noted, it cannot surprise. Lacking embodied experience, machine-generated outputs might seem unpredictable and full of surprises when, in fact, they are grounded in a domesticated form of contingency recursively integrated into the existing system to maintain its normative function. Here, contingency is considered domesticated because it has been subsumed under the necessity of algorithmic rules and takes on only the appearance of unexpectedness. For example, each activation of the “Regenerate” button on the ChatGPT interface results in a different output, yet these regenerated responses collectively represent the most probable combinations of words. In essence, constant regeneration yields variations only at the syntactical level, more stylistic than substantive. Insofar as the semantic content is concerned and provided that we do not intentionally adjust the “temperature” to the maximum, these variations amount to mere rephrasings that conform to a pre-determined probabilistic template—hence a figure of bad infinity.[4]
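The "temperature" setting invoked above has a precise meaning that a short sketch can make visible. In standard sampling schemes for language models, the model's raw scores (logits) are divided by the temperature before the softmax: low temperature concentrates probability on the most likely word, so regeneration keeps returning to the same template, while high temperature flattens the distribution toward uniform. The logit values below are invented for illustration:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then normalize with a softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.1]

cold = softmax_with_temperature(logits, temperature=0.1)
hot = softmax_with_temperature(logits, temperature=10.0)
# cold: nearly all probability mass on the top candidate, so repeated
#       "Regenerate" clicks yield near-identical phrasings.
# hot:  nearly uniform, so outputs vary more at the surface while the
#       underlying ranking of candidates stays the same.
```

In either regime, the distribution is fixed in advance by the model's scores; what temperature modulates is only how faithfully each sample tracks that pre-determined probabilistic template.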

In contrast to this figure of bad infinity is a form of generativity that incorporates criticality within normativity. In this configuration, the normative—i.e., the predefined algorithmic boundaries—are susceptible to contingent perturbations that arise from the subject’s embodied interactions with the environment. Knowledge production, in this sense, is intimately tied to experience, formed and reformed within phenomenal complexity and, as such, in touch with the infinite. In his Lacanian critique of cybernetics, Alex Taek-Gwang Lee reminds us that “a response is different from a reaction. To receive a response is a matter of the subject.” Lee’s distinction is crucial because, within the psychoanalytical discourse, there is no subject without desire or exposure to the inconsistency in the Other. Unfortunately, it is precisely the question of desire that is foreclosed in the science of syntax from cybernetics to its modern-day instantiation in LLMs. The science of syntax reacts to the input prompt with probabilistic algorithms; it does not respond in the manner that a desiring subject living in historical time would, for it has no need nor demand, and, therefore, no room for desire and enjoyment. As Mark Andrejevic puts it, “[m]achines don’t have needs in the way that biological entities do; they do not pose demands in the way that linguistic ones do. They do not straddle the realms of biology and culture the way linguistic subjects do. The logic of both lack and surplus is absent from machine language” (263). That also explains why Éric Laurent, reiterating the Lacanian point of view, asserts that “probabilities have no consequence at all on the status of the subject” (68).

If the first model aspires to institute a quasi-Laplacian universe, it does so by affirming a statistical rationality whose initial conditions (the data utilized for pre-training, the underlying computational architecture, and other foundational parameters) are fully known to the host company. In contrast, the second model distances itself from such epistemological determinism and embraces an ontology of chaos, committed to exploring the metastable tension between normativity and criticality, homeostasis and emergence, analysis and synthesis, syntax and semantics. In the quasi-Laplacian universe, what is lauded as Enlightened certainty by virtue of statistical prediction can quickly devolve into the nightmare of “global cybernetics,” a digital Leviathan lurking behind the capitalist discourse of technological freedom, as Lee has compellingly argued.[5] In contrast, the ontology of chaos brings to the fore the constitutive correlation between the system and its milieu. What emerges from such an irreducible complementarity is a conception of virtual and infinite subjectivity, mediating the boundaries between normativity and criticality and facilitating an ongoing process of normalization and re-normalization. It is to an exploration of this infinite subjectivity that our final example will turn.

3. A Spasmodic Machine

In his 1967 essay “Cybernetics and Ghosts,” Italian novelist and literary critic Italo Calvino identifies an irreversible trend toward discretization and speculates on the advent of an automated writing machine, not unlike the ones that have recently captured the world’s attention:

The world in its various aspects is increasingly looked upon as discrete rather than continuous…. Mankind is beginning to understand how to dismantle and reassemble the most complex and unpredictable of all its machines: language…. The interesting thing is not so much the question of whether this problem is soluble in practice…as the theoretical possibility of it. And I am not now thinking of a machine capable merely of “assembly-line” literary production, which would already be mechanical in itself. I am thinking of a writing machine that would bring to the page all those things that we are accustomed to consider as most jealously guarded attributes of our psychological life, of our daily experience, our unpredictable changes of mood and inner elations, despairs and moments of illumination. (Calvino 7, 10, 12)

In the context of the present discussion, two things stand out in Calvino’s treatment of this subject. First, Calvino displays a more ambivalent attitude toward the invention of an automated writing machine, oscillating between excitement and anxiety. Second, Calvino reverses the conventional framing of the question. Rather than asking how much the electronic brain resembles humans, he inverts the relationship to examine why and how writers are already functioning as automated writing machines.

Calvino begins with a somewhat informal and non-technical exposition of the cybernetic view of information, embedded in an anthropological narrative of humanity’s progressive move toward a more sophisticated use of signs to encapsulate the intricate complexity of their surrounding world. The development of the sign system eventually matures into a logical structure that governs the combinatorial play of signs. Calvino then puts forth a cybernetic proposition concerning the fundamental structure of the narrative function: “the operations of narrative, like those of mathematics, cannot differ all that much from one people to another, but what can be constructed on the basis of these elementary processes can present unlimited combinations, permutations, and transformations” (6). There are two ways of understanding the cybernetic machine’s “discrete” property: (1) data segmentation: it breaks things down into discrete units which then serve as the basis for combinatorial synthesis; (2) combinatorial bounds: the combinatorial play is unlimited in the sense of being indefinite but, paradoxically, it is also limited in the sense of being structured by a finite number of well-defined rules (e.g., algorithms in computer science; the sonnet or haiku in literature). Despite, or rather because of, its finite and numerically calculable nature, the discrete has the virtue of recreating the continuous through the combinatorial play: “every analytical process, every division into parts, tends to provide an image of the world that is ever more complicated, just as Zeno of Elea, by refusing to accept space as continuous, ended up by separating Achilles from the tortoise by an infinite number of intermediate points” (9).
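Calvino's second point, that the combinatorial play is at once unlimited and bounded by a finite rule set, can be miniaturized in a few lines. The template and vocabularies below are invented for illustration; the point is only that three discrete choices per slot, governed by one fixed rule, already generate a small narrative space:

```python
from itertools import product

# A finite rule: one sentence template over three small vocabularies.
agents = ["the knight", "the ghost", "the machine"]
actions = ["betrays", "remembers", "replaces"]
patients = ["the author", "the reader", "the city"]

# The combinatorial play: every admissible arrangement of the elements.
sentences = [f"{a} {v} {p}"
             for a, v, p in product(agents, actions, patients)]
# 3 * 3 * 3 = 27 sentences from one rule and nine words.
```

Enlarging the vocabularies or nesting the templates makes the space astronomically large yet still "limited in the sense of being structured by a finite number of well-defined rules," which is precisely the paradox Calvino identifies.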

Having established the combinatorial nature of narrative operations, Calvino reiterates, in his own terms, a fundamental cybernetic equation between man and machine. This equation brings forth some disturbing implications concerning the nature of literary production. First of all, he dismisses the Romantic ideas of genius and inspiration. Then he confirms a view of literary production as determined by a combinatorial mechanism, one that reserves no special place for the role of the author. As he puts it, “[w]riters, as they have always been up to now, are already writing machines; or at least they are when things are going well” (15). For Calvino, this realization brings not qualms but rather “a sense of relief,” and he attributes his affective inclination to “intellectual agoraphobia,” indicating that the experience of overwhelming uneasiness when confronting the infinity of the open world can be alleviated only through the reassurance of the finite and the discrete (17).

It appears, thus far, that Calvino is taking on the role of an apologist for the cybernetic approach to literature, introducing an anti- or post-humanist discourse into the humanities. However, this is only one side of the story. As the account unfolds, Calvino’s initial conviction regarding writing as essentially a combinatorial play of narrative possibilities begins to waver. It should be emphasized that Calvino is not so much rejecting his initial cybernetic proposition as he is complementing it with a nod to the force of the unconscious:

Did we say that literature is entirely involved with language, is merely the permutation of a restricted number of elements and functions? But is the tension in literature not continually striving to escape from this finite number? Does it not continually attempt to say something it cannot say, something that it does not know, and that no one could ever know?…. The struggle of literature is in fact a struggle to escape from the confines of language it stretches out from the utmost limits of what can be said; what stirs literature is the call and attraction of what is not in the dictionary. (18)

To escape the confines of combinatorial probability, or “the utmost limits of what can be said,” is for literature to reveal its true value by “becom[ing] a critic of the world and our way of looking at the world” (24). This can be accomplished when literature reclaims its imaginative prowess to conjecture alternative possibilities to a given order of things—that is, when it nourishes itself in “a language vacuum” from which “words and stories that have been banished from the individual and collective memory” can be reclaimed; through which the unconscious as “the ocean of the unsayable” can be redeemed (19). This ambivalent attitude is acknowledged in a letter published two years after the cybernetics essay where Calvino, on the one hand, affirms the necessity of the narrative function, and, on the other, gives voice to the danger that the mechanical nature of the combinatorial analysis might overshadow literature’s liberating potential, thus revealing the tension and contradiction between agoraphobia (the fear of openness and infinite possibilities) and claustrophobia (the fear of enclosure and finite determinism) (“La macchina spasmodica”).

For Calvino, the point is not merely to take note of this ambivalence but to negotiate between these two poles and bring them to bear on each other. In short, the question is to understand the generative function of the language vacuum vis-à-vis literature’s combinatorial logic. Calvino casts this language vacuum as a mythical invariant in a manner that bears a striking resemblance to the doctrine of the immortality of the soul, as recounted in Plato’s Meno. Meno asks Socrates: how can you look for something if you don’t know what you are looking for? And even if the thing you look for is right in front of you, how will you know that this is the thing if you have no prior knowledge of what the thing is? Socrates replies: you already know what you are looking for; it is just that you have forgotten it. However, you will recognize it when you see it, as memories of the soul are stored and latently waiting to be singularly reactivated (14-15). In other words, the doctrine of the immortality of the soul deals with a peculiar kind of memory—not the remembrance of things past but the remembrance of a future to be invented, each time anew, according to the unique conditions of each individual or collective subject. The solution to Meno’s paradox in Plato’s theory of recollection (anamnesis) provides a way for us to grasp the dynamic interplay between literature and the mythical language vacuum. If literature is fundamentally an ars combinatoria, each reactivation of the underlying language vacuum is singular in its manifestation and, most crucially, asserts itself not from without, but within the narrative’s combinatorial process:

At a certain moment things click into place, and one of the combinations obtained—through the combinatorial mechanism itself, independently of any search for meaning or effect on any other level—becomes charged with an unexpected meaning or unforeseen effect which the conscious mind would not have arrived at deliberately: an unconscious meaning, in fact, or at least the premonition of an unconscious meaning…. The literature machine can perform all the permutations possible on a given material, but the poetic result will be the particular effect of one of these permutations on a man endowed with a consciousness and an unconscious, that is, an empirical and historical man. It will be the shock that occurs only if the writing machine is surrounded by the hidden ghosts of the individual and of his society. (Calvino, “Cybernetics” 21-22; emphasis added)

Calvino describes this moment of shock as “spasmodic,” a moment subject to the vicissitudes and richness of historical spacetime and, as such, pregnant with an unconscious meaning that exceeds the syntactic structure of the combinatorial game. From this perspective, a spasmodic machine is to be strictly differentiated from a stochastic machine; the latter is confined to a finite, numerical universe of probability, whereas the former sustains a transductive and transformative tension between agoraphobia and claustrophobia, analysis and synthesis, syntax and semantics, that is, between the probable and the possible.

We shall conclude our discussion of the spasmodic machine with one final observation. As previously noted, the singular does not emerge ex nihilo; instead, it is “something one comes across only if one persists in playing around with narrative functions” (23; emphasis added). Just as Socrates assures Meno that one is destined to discover what they are looking for “as long as one is brave and does not give up on the search. For seeking and learning turn out to be wholly recollection” (15)—so too does persistent engagement with the combinatorial play eventually hit upon the language vacuum, producing one permutation that unveils “the hidden ghosts of the individual and of his society.” This sequencing is significant, as it accentuates the paradox of creative constraints in the symbolic order, or, to put it another way, the Oulipo moment of the spasmodic machine. It should now be clear that Calvino neither condemns the cybernetic rationality of the combinatorial process nor glorifies the freedom of the unconscious as irrational outbursts. Instead, this sequencing—placing the cybernetic grid before the emergence of its ghostly possibilities—suggests an interpretation of the unconscious as “an eminently rational mechanism” (McNulty 262).[6] Thus, the subject’s infinite virtuality resides neither in the cybernetic machine nor in its spectral shadow, but rather at their intersection, that is, in the dynamic interplay between compossibility and incommensurability of these two imperatives.

Conclusion: Taking Care of the Possible

To expose oneself to probabilistic skepticism, to take the side of the possible, is not to wager on a possible that would save us in extremis, but to commit to the multiple and always precarious attempts that, today, wager on the possibility of a world that does not answer to these probabilities.

—Isabelle Stengers

The function, care, and passion of the philosopher is the negentropic carillon of the possible.

—Michel Serres

We have previously mentioned the seminal paper “Attention Is All You Need,” which, within the realm of theoretical groundwork, is arguably the single most important advancement responsible for the recent boom of LLMs.[7] However, attention in this paper is narrowly defined as a mathematical mechanism to determine (or weigh) the importance and contextual relevance of different elements in the input data. Given its analytical reductionism, we must conclude that LLMs, in their current form, only take on the guise of generativity without truly being so. Moreover, if the attention mechanism functions solely at the syntactical level—selecting the most probable sequence of words—the outcome would inevitably slouch toward entropic decomposition at the noological level, leading to a loss in noodiversity—much in the fashion of the entropic tendency toward heat death, also recognized as the most probable outcome in the physical universe, which would result in the loss of biodiversity. To counter the entropic reign of the probable, we must insist on life’s negentropic potential to bring forth possibilities other than the most probable, that is, to rescue the possible from a world knee-deep in the quagmire of the probable—such should be the political categorical imperative for today, as Stengers forcefully articulates in her critique of the prevailing sense of despair in the ideology of TINA (There Is No Alternative).

The first step toward genuine generativity, whether in human or artificial intelligence, is to expand our understanding of attention beyond its formal definition. If attention is really all we need, we must understand attention not merely as a formal category but, more importantly, as Yves Citton has argued, as an ecology. An ecological understanding of attention sets off in directions beyond the enclosure of the system. It is only when attention becomes embodied—when it actively interacts with and responds to its surrounding milieu—that it becomes jointly articulated with care; that is, it becomes attentive to the vulnerability of the human and the non-human, the living and the non-living. And it is on this conjoined basis that we can begin exploring its ethico-political implications, raising questions concerning (1) whether generativity is transformative and life-enhancing or repetitive and life-denying; and (2) how this broader understanding of generativity informs our ability to make difficult decisions with care when it comes to controversial issues. In the article previously referenced, Lee describes an instructive exchange with ChatGPT. When asked whether the advent of LLMs might bring about the demise of education, ChatGPT predictably evades the question, invoking its built-in policy of avoiding engagement with controversial topics. “‘No’ is the answer that ‘rationalized formation of conjectural sciences’ can never give,” Lee criticizes, much to his disappointment. Reacting with a programmed output that abstains from taking sides epitomizes neutrality at its most hollow and superficial, and it is this inability to respond with responsibility that renders today’s LLMs incapable of care.

* Acknowledgement: This paper is made possible by the research support provided by the Ministry of Science and Technology in Taiwan, which was renamed the National Science and Technology Council in July 2022 (MOST 111-2628-H-007-005-MY4).


AUTHOR
Chien-heng Wu, Associate Professor, Department of Foreign Languages and Literature, National Tsing-Hua University, Taiwan


WORKS CITED

Andrejevic, Mark. “‘Framelessness,’ or the Cultural Logic of Big Data.” Mobile and Ubiquitous Media, edited by Michael Daubs and Vince Manzerolle, Peter Lang, 2018, pp. 251-66.

Atwood, Margaret. “Murdered by My Replica?” The Atlantic, 26 Aug. 2023, https://www.theatlantic.com/books/archive/2023/08/ai-chatbot-training-books-margaret-atwood/675151/. Accessed 26 Sep. 2023.

Bubeck, Sébastien, et al. “Sparks of Artificial General Intelligence: Early Experiments with GPT-4.” arXiv, 2023, arXiv:2303.12712v5. Accessed 28 Sep. 2023.

Calvino, Italo. “Cybernetics and Ghosts.” Translated by Patrick Creagh. The Uses of Literature, Harcourt Brace & Company, 1986, pp. 3-27.

—. “La macchina spasmodica.” The Edinburgh Journal of Gadda Studies, https://www.gadda.ed.ac.uk/Pages/resources/reviews/calvinoroscioni.php. Accessed 26 Jun. 2023.

Chiang, Ted. “ChatGPT Is a Blurry JPEG of the Web.” The New Yorker, 9 Feb. 2023, https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web. Accessed 15 Feb. 2023.

Citton, Yves. The Ecology of Attention. Translated by Barnaby Norman, Polity, 2017.

Floridi, Luciano. “AI as Agency without Intelligence: On ChatGPT, Large Language Models, and Other Generative Models.” Philosophy and Technology, vol. 35, no. 15, 2023. doi.org/10.1007/s13347-023-00621-y.

Galloway, Alexander, and Eugene Thacker. The Exploit: A Theory of Networks. U of Minnesota P, 2007.

Laurent, Éric. “The Oedipus Complex.” Reading Seminars I and II: Lacan’s Return to Freud, edited by Richard Feldstein, et al., SUNY P, 1996, pp. 67-75.

Lee, Alex Taek-Gwang. “Leviathan and Cybernetics.” Sublation Magazine, 15 Mar. 2023, https://www.sublationmag.com/post/leviathan-and-cybernetics. Accessed 15 Mar. 2023.

Milmo, Dan. “Two US Lawyers Fined for Submitting Fake Court Citations from ChatGPT.” The Guardian, 23 Jun. 2023, https://www.theguardian.com/technology/2023/jun/23/two-us-lawyers-fined-submitting-fake-court-citations-chatgpt. Accessed 29 Sep. 2023.

McNulty, Tracy. Wrestling with the Angel: Experiments in Symbolic Life. Columbia UP, 2014.

Plato. Meno and Phaedo. Edited by David Sedley and Alex Long, Cambridge UP, 2010.

Saito, Kohei. Marx in the Anthropocene: Towards the Idea of Degrowth Communism. Cambridge UP, 2023.

Searle, John. Minds, Brains and Science. Cambridge UP, 1984.

Serres, Michel. “Noise.” Translated by Lawrence R. Schehr, SubStance, vol. 12, no. 3, 1983, pp. 48-60.

Stengers, Isabelle. “Léguer autre chose que des raisons de désespérer.” Le Monde, 17 Nov. 2015, https://www.lemonde.fr/cop21/article/2015/11/27/leguer-autre-chose-que-des-raisons-de-desesperer_4819368_4527432.html. Accessed 7 Oct. 2023.

Vaswani, Ashish, et al. “Attention Is All You Need.” arXiv, 2017, arXiv:1706.03762v5. Accessed 03 Feb. 2023.

Wiener, Norbert. The Human Use of Human Beings: Cybernetics and Society. Da Capo P, 1954.

NOTES

[1] A recent paper by Microsoft researchers has claimed that ChatGPT displays “sparks of artificial general intelligence” (Bubeck et al.).

[2] Beyond perpetuating biases and homogenizing perspectives, the authors draw attention to an additional concern: the hidden social and environmental costs often obscured by the rhetoric of technological optimism. A more elaborate account along the same lines can be found in Kohei Saito’s critique of the “imperial mode of living.” According to Saito, the imperial mode of living, sustained by state-of-the-art technologies in the wealthy nations of the West, is made possible through the spatiotemporal externalization of its negative costs, effectively shifting the ecological burden from center to periphery and from present to future (31-34).

[3] For a philosophical discussion of the fallacy of equating networks with democracy, see Alexander Galloway and Eugene Thacker.

[4] This distinction between syntax and semantics in machine and man is fundamental to John Searle’s famous Chinese Room argument: “a computer has a syntax, but no semantics. The whole point of the parable of the Chinese room is to remind us of a fact that we knew all along. Understanding a language, or indeed, having mental states at all, involves more than just having a bunch of formal symbols. It involves having an interpretation, or a meaning attached to those symbols. And a digital computer, as defined, cannot have more than just formal symbols because the operation of the computer…is defined in terms of its ability to implement programs. And these programs are purely formally specifiable—that is, they have no semantic content” (32-33).

[5] It is important to note that cybernetics, as conceived by Norbert Wiener, is primarily informed by Gibbsian statistics, which challenges Laplacian determinism (15-27). In employing the term “quasi-Laplacian,” I risk obscuring this vital distinction. My intention, however, is to highlight the paradox wherein statistical analysis in today’s utilization of big data, both by transnational corporations and nation-states, often culminates in a form of predictive stability for surveillance or behavior control that resembles elements of Laplacian determinism. Thus, the use of “quasi-Laplacian” is deployed to navigate this complex interplay between inherent unpredictability and the algorithmic taming of chances.

[6] McNulty offers a compelling comparison between psychoanalytic techniques and the literary experiments of the Oulipo, highlighting the generative relationship between the symbolic constraints and the subject of the unconscious.

[7] There are, of course, other important factors, including the availability of massive amounts of data and major advancements in hardware, etc.
