What is the entropy of a ghost?
“Welcome to YOV (You, Only Virtual), the AI startup pioneering advanced digital communications so that we Never Have to Say Goodbye to those we love,” read the company’s website, before it was recently updated.
Rohrer said that his grief bot has an “in-built” limiting factor: users pay $10 for a limited conversation.
The fee buys time on a supercomputer, with each response varying in computational cost. This means $10 doesn’t guarantee a fixed number of responses, but can allow for one to two hours of conversation. As the time is about to lapse, users are sent a notification and can say their final goodbyes.
By now I think everyone knows that LLMs work by stringing together patterns extracted from the huge amounts of text they were trained on. It's not true that they paste together the response like cut-up poetry, where each clipping comes from some web page or book written at some point in the past, because the text information is stored in the network's weights and not in a separate index (though you could generate entirely with retrieval-augmented generation, RAG, and then it would work like that). But it's true in spirit. LLM output is pastiche but it's also copy/paste.
If the LLM is trained on the text that a person has written then that person is, in spirit, part of the LLM. The LLM will repeat the same patterns of speech and of content that the person used. This is starkly true for "grief bots": conversational language models fine-tuned on the text of a specific dead person for their loved ones to use as a simulacrum of them. When you're using one of these services you are, in a very literal sense, renting someone's ghost. People's speech patterns seem to be predictable enough that these services are convincing, though having never used one myself I don't know how much of that is due to the system and how much is priming on the part of the user. The fine-tuning to generate one of these for a specific person is apparently cheap enough that they can be offered for $10 per conversation, which means making one must cost less than tens of dollars in compute time.
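One way to make "predictable" concrete is entropy: the fewer bits of surprise in what someone says next, the easier they are to imitate. The sketch below is a toy, not how a real LLM or grief bot works. It counts which word follows which in a tiny invented corpus (the counts stand in, very loosely, for weights) and measures the Shannon entropy of the next-word distribution. The function names and the corpus are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text):
    """Count which word follows which. The counts are aggregate
    statistics, not pointers back to any particular source sentence."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_entropy(counts, word):
    """Shannon entropy, in bits, of the distribution over the word
    that follows `word`. Zero means the next word is fully predictable."""
    followers = counts[word]
    total = sum(followers.values())
    return -sum((c / total) * math.log2(c / total) for c in followers.values())


# Invented stand-in for the messages a person left behind.
corpus = "i love you . i love tea . i miss you ."
model = train_bigram(corpus)

for word in ("i", "love", "miss"):
    print(word, round(next_word_entropy(model, word), 3))
```

In this corpus "miss" is always followed by "you", so its entropy is zero, while "love" is split evenly between two continuations and costs exactly one bit. A real model conditions on far more context than one word, but the intuition carries over: a low-entropy speaker makes a cheap ghost.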
Haunts of the rich and famous
In his book "Specters of Marx", Jacques Derrida introduces the term "hauntology", originally with respect to Marx, arguing that Marxism will haunt society from beyond the grave. Mark Fisher appropriated it to refer to a process of cultural production that, rather than generating anything new, draws on echoes of the 20th century and feeds them back into themselves until they're sufficiently loud. He argues that "capitalist realism" is preventing culture from moving into the future by insisting that, since we live in the best of all possible worlds, any movement to a new world must be a movement to a worse one. As a result we are not only in a state of political stasis, constantly trying to resurrect political identities rooted in modes of production that haven't existed in the developed world in most people's lifetimes, but also in a state of cultural stasis, constantly trying to resurrect culture from the 1950s and 1960s, hoping that listening to rockabilly and recapitulating Operation Wetback will make houses cost one year of median wages. This cargo cult consumes the energies of people who might otherwise push things in a new direction, and if you're sitting at the top of the power structure and in a position to influence mass culture, you like things just where they are.
He's dead, Jim, but not as we know it
Dead Internet theory is the idea that all the content on the internet is generated not by actual people, but by bots. Proponents point to the shift of social media feeds toward selecting content to show you based not on what you asked for but on an opaque recommender system. They point to the facility of LLMs to generate text that looks like it was written by a person. They point to the proliferation of obvious astroturfing bot accounts on all the major social media platforms. They can also point to the practice of hellbanning (also called shadowbanning), where a social media website deals with a bad poster by making it appear that they're participating in the normal way but actually blackholing all their posts. If a user is banned and knows it they might create a new account, but if they think they're not banned and no one is responding to their posts, the idea is that they get bored and leave. But wouldn't it be nice if you could still keep all that engagement and those ad impressions? With modern LLMs you can.
There's no reason to think that 100% of the content on the internet is fake. In fact I run my own website and ActivityPub server and I know exactly what happens when I post to those. But it is true that more and more content is, if not actually generated by a predictive model, chosen to be served to the user by one. And it is true that the incentives baked into those systems are to maximize engagement and revenue. And it is true that as search engines get flooded with slop it becomes harder to find organic content.
The ghosts in the machine
Why do people find it plausible that everything on the internet might be synthetic? Since LLMs can only write things that are like text that is in their training data, and the training data is from the past, if people were writing things that were new it would be easy to identify machinic text. But if culture is hauntological, then people are doing the same thing that LLMs do, referencing old ghosts instead of creating anything new.
This diagnosis implies a cure; if we want to rise above the slop all we have to do is start creating the future.