The words we ban: a look at the AI slop vocabulary

Every Döppelscript generation passes through a filter that strips a small list of words and phrases before the text reaches you. The filter runs server-side, in code, on every single generation, regardless of which voice profile you are using. This post is the list, the reasoning behind each entry, and what Döppelscript does instead when the model tries to reach for one of them.

The list

The hard ban, as of this writing, includes:

- delve (and its conjugations)
- leverage (as a verb)
- synergy (and its conjugations)
- game-changer
- deep dive
- landscape (in its cliché usage)
- the em dash
- "In today's..." openers

There is also a longer soft list that the prompt warns the model to avoid, but the entries above are the ones that get replaced in post-processing even if the model manages to emit them anyway. They are belt and braces.

Why these specific words

Most banned-word lists come from a marketing team's subjective taste. These came from a specific observation: these words and phrases are the ones that, if I see them in a sentence, I cannot unread the fact that the sentence was written by an AI. They are the tells. And the tells have a distribution: some of them come from the training data, some come from the prompt defaults that most AI writing tools ship with, and some come from the statistical tendencies of large language models themselves.

Delve is the most notorious. It started showing up in ChatGPT output around early 2023 and never stopped. It is the word every researcher studying AI-generated academic text identifies first. Human writers used to use "delve" occasionally; post-2023 AI writes "delve" at roughly 60 to 80 times the pre-2023 human frequency. Any sentence with "delve" in it triggers a pattern-recognition reflex in readers who are paying attention. Replacing it with "explore" is not a big loss. Explore is a fine word.

Leverage as a verb is the corporate equivalent of delve. It is passive, it is fashionable, and it says nothing. "Leverage this opportunity" and "use this opportunity" mean the same thing, except one of them is what a middle manager would write if a middle manager had just been told to sound strategic. The model picks leverage because leverage is the most common verb in the corporate writing LLMs trained on. Döppelscript replaces it with "use."

Synergy and its conjugations got banned for the same reason anyone over thirty bans them at work: they stopped meaning anything the first time a consulting firm put them on a slide in 1994, and nothing has rehabilitated them since. The word is a functional replacement for "collaboration" or "work together." Döppelscript picks one of those instead.

Game-changer and deep dive are stock phrases that signal "I could not think of a more specific thing to say so I reached for the most marketable generic available." A game-changer becomes "significant shift" or gets rewritten around. A deep dive becomes "close look." In almost every real sentence, a more specific verb would have been better than either of them, which is what the replacement forces.

Landscape is a tricky one. The word has legitimate uses, including geographic and visual design contexts. The ban is specifically on the cliché usage, where "the current landscape" or "the evolving landscape" is a throat-clearing phrase that delays the actual point of the sentence. Döppelscript strips these openers when it can detect them and lets the sentence start at the real subject.

The em dash is a sentence-level tell, not a word. The em dash is fine on its merits (this post uses semicolons and colons and commas and still communicates everything it needs to), but when generative AI reaches for an em dash, it usually does so in the same specific way: to insert a mid-sentence parenthetical aside that sounds smart. The tell is not the character; the tell is the frequency. Human writers use em dashes maybe once every few paragraphs. LLMs tend to use them several times per paragraph, especially in the "confident authority" voice they default to. Stripping them forces the model to commit to a sentence structure instead of hedging with an aside.
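One way that stripping can work is to rewrite the aside rather than delete it. The sketch below is illustrative, not the actual Döppelscript code: it converts a paired em dash aside into a comma aside and a lone trailing em dash into a colon.

```python
import re

def strip_em_dashes(text: str) -> str:
    """Illustrative em dash removal; a real editor handles more cases.

    Paired em dashes around an aside become commas; a remaining single
    em dash introduces what follows, so it becomes a colon.
    """
    # "a \u2014 b \u2014 c" -> "a, b, c"
    text = re.sub(r"\s*\u2014\s*([^\u2014]+?)\s*\u2014\s*", r", \1, ", text)
    # Any leftover single em dash -> colon.
    text = re.sub(r"\s*\u2014\s*", ": ", text)
    return text
```

This forces the sentence to commit to one structure, which is the stated goal; where the comma version reads badly, the editor step is the fallback.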

"In today's..." is the most obvious tell of all. Every AI-generated article about anything seems to open with "In today's fast-paced digital landscape, ..." or "In today's interconnected world, ..." or "In today's competitive market, ..." None of these openers add information. All of them signal "this was written by a model that was trained on a lot of content marketing blog posts." Döppelscript strips the opener and lets the post start at the actual first real sentence.

What the filter catches that the prompt doesn't

The filter exists because asking the model nicely to avoid these words does not work reliably. You can tell Claude "do not use the word delve under any circumstances" in the system prompt (and Döppelscript does), and the model will comply 95 percent of the time and then slip it in the 5 percent of cases where the context feels like it called for delve. The filter catches that 5 percent. It is simple regex substitution and it runs in milliseconds. It is not sophisticated and it does not need to be.
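A substitution filter of this shape can be sketched in a few lines. The replacement table below is assembled from the substitutions this post mentions, not taken from the Döppelscript source, so treat it as illustrative:

```python
import re

# Illustrative hard-ban table; the real list and replacements may differ.
# The em dash and opener stripping are handled by separate passes.
REPLACEMENTS = {
    r"\bdelve\b": "explore",
    r"\bdelves\b": "explores",
    r"\bleverage\b": "use",
    r"\bleverages\b": "uses",
    r"\bsynergy\b": "collaboration",
    r"\bgame-changer\b": "significant shift",
    r"\bdeep dive\b": "close look",
}

def strip_slop(text: str) -> str:
    """Apply each case-insensitive substitution in turn."""
    for pattern, replacement in REPLACEMENTS.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text
```

Note one failure mode visible even in this sketch: a capitalized "Leverage" becomes a lowercase "use," which is exactly the kind of graceless substitution the next paragraph describes.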

There are failure modes. Sometimes the replacement is graceless. "We will use this opportunity" in place of "We will leverage this opportunity" reads fine, but some substitutions occasionally produce an awkward phrase that a human editor would rewrite differently. Döppelscript is not trying to be a perfect editor. It is trying to make sure that the one thing that does ship is not recognizably generic AI writing. If you spot a bad substitution in your output, the editor step is right there and you can fix it in three seconds.

What's not on the list

The absence of certain words from the list is deliberate.

"Actually," "basically," "literally" are conversational fillers that can be signals of real voice, not AI slop. A particular user might write "literally" sincerely and often, and banning it would flatten their voice.

"Utilize" was considered. It is a worse version of "use" in almost every sentence. But it is also a word some people genuinely write in engineering documentation and academic contexts. Banning it would be overreach.

The semicolon is absolutely not on the list. The semicolon is fine. The em dash is banned because of its distinctive frequency and context in AI output, not because any punctuation mark deserves to be blacklisted on principle.

Bullet lists are not banned but they are discouraged in the prompt. LLMs default to bullet lists for anything that could conceivably be broken into points, and that default is wrong for LinkedIn posts, where prose is more effective than lists for the short-form audience. If a user's voice profile genuinely calls for bullet lists, the profile can override the default.

Why this matters for the product

Döppelscript is a trust product. The whole thing only works if the output feels like the person it claims to represent. If a user generates a post, spots "delve" in the fourth sentence, and thinks "oh, this is the same tool as everything else," the product has failed, regardless of how good the rest of the post was. The ban list is not the main thing that prevents that; the voice profile does most of the work by matching sentence rhythm and word choice and stance to the user's own writing. But the ban list is the last line of defense before a generic AI tell can reach the screen, and it is cheap enough to run on every generation that there is no reason not to.

If you see a word in your Döppelscript output that should be on this list and isn't, email support@doppelscript.com. If your voice profile genuinely uses one of these words and you want it off the ban list for your account specifically, that is also worth emailing about. The list is not immutable; it is the current best guess at which words signal AI and which ones are just words.