There is a particular feeling writers describe when they read an AI draft of their own work. The grammar is clean. The sentences are competent. And it is not theirs. For a long time that was a vibe, the kind of complaint you could feel but not prove. As of 2026 there is a stack of peer-reviewed research that measures it, and the measurements are not flattering to the way most people use these tools.
What did the 2026 study on AI writing actually find?
A study led by Natasha Jaques of the University of Washington and Google DeepMind found that heavy reliance on AI does not only change how you write. It changes what you end up saying, and strips out the markers that make the writing recognisably yours.
The team gave 100 people an old, simple prompt: does money lead to happiness? Some wrote with heavy AI help, some with light edits, some with none. The participants who generated more than 40% of their text with a model produced essays that landed on a flat, neutral answer 69% more often than the writers who used little or no AI. The light and no-AI writers were passionate, one way or the other. The heavy-AI writers came out beige.
Then the harder number. The heavy-AI essays had 50% fewer pronouns, part of a broad shift toward impersonal language with fewer anecdotes and fewer references to actual human experience. The writing got further from a person. As Jaques put it, reported by NBC News, the models “are pushing the essays away from anything that a human would have ever written.” She calls it blandification. The study ran on three of 2025’s widely used systems: Claude 3.5 Haiku, GPT-5 Mini, and Gemini 2.5 Flash.
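If you want to see what a pronoun count looks like on your own drafts, here is a minimal sketch. It is not the study's method: the pronoun list and the function name `pronoun_rate` are assumptions made for illustration, and a real analysis would use a proper tagger.

```python
import re

# Illustrative list of personal pronouns. This is an assumption for the
# sketch, not the lexicon the study used.
PRONOUNS = {
    "i", "me", "my", "mine", "myself",
    "we", "us", "our", "ours", "ourselves",
    "you", "your", "yours",
    "he", "him", "his", "she", "her", "hers",
    "they", "them", "their", "theirs",
}


def pronoun_rate(text: str) -> float:
    """Return the share of word tokens that are personal pronouns."""
    # Letters only, so a contraction like "I'm" still yields the token "i".
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(w in PRONOUNS for w in words) / len(words)


personal = "I lost my job and we moved in with my mother."
flat = "Money can contribute to happiness in some circumstances."
print(pronoun_rate(personal), pronoun_rate(flat))
```

Run it on a draft you wrote alone and on one you wrote with heavy help, and compare the two numbers rather than reading either one in isolation.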
The quiet, awful detail: the heavy-AI writers were just as satisfied with their work as everyone else. They reported it was less creative and less in their own voice, and they were happy with it anyway. You do not feel the voice leaving. That is the part that should bother working writers most.
The same paper looked at editing, not just drafting, and found the models “replace a much larger fraction of the original writing than humans do when revising their own work.” A human editor swaps a word here and there and leaves your vocabulary mostly intact. The model rewrites wholesale. “This substitution of words,” the authors write, “contributes to the loss of individual voice, style, and meaning, as the unique lexical fingerprint of each writer is overwritten by the given model’s preferred vocabulary.” Which means the flattening happens even when you think you are just running a grammar check.
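You can approximate that "fraction replaced" idea at home. The sketch below is an assumption about how one might measure it, not the paper's procedure: it aligns the original and revised text word by word with Python's standard `difflib` and reports how much of the original did not survive. The name `fraction_replaced` is hypothetical.

```python
from difflib import SequenceMatcher


def fraction_replaced(original: str, revised: str) -> float:
    """Share of the original's words that do not survive into the revision."""
    a = original.split()
    b = revised.split()
    if not a:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # Matching blocks are runs of words kept in the same order in both texts.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - kept / len(a)


original = "the cat sat on the mat"
light_edit = "the cat sat on the rug"
heavy_edit = "a feline rested upon a carpet"
print(fraction_replaced(original, light_edit))
print(fraction_replaced(original, heavy_edit))
```

Paste in a paragraph before and after a "quick grammar pass" and see which of the two edits above it resembles.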
Is this the same thing as AI “homogenizing” writing?
Yes, and it is not one study having a bad day. A line of research now shows that when many writers lean on the same model, their writing converges toward the same generic centre.
Vishakh Padmakumar and He He, in a paper accepted to ICLR 2024, ran a controlled experiment where people wrote argumentative essays with a feedback-tuned model, with a base model, or alone. Writing with the feedback-tuned model produced a statistically significant drop in diversity and made different authors’ essays measurably more alike. The cause was specific: the model contributed the less diverse text, and everyone’s prose slid toward it.
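To make "measurably more alike" concrete, here is a toy version of the idea. It is a stand-in, not the metric from the paper: it scores each pair of essays by word overlap (Jaccard similarity) and averages across pairs, so a higher number means the group sounds more like one writer. Both function names are hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two texts, from 0 (disjoint) to 1 (identical)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def mean_pairwise_similarity(essays: list[str]) -> float:
    """Average overlap across every pair of essays in a group."""
    pairs = list(combinations(essays, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


distinct = [
    "money bought my freedom",
    "wealth ruined our family",
    "cash means nothing here",
]
converged = [
    "money can support happiness",
    "money can support wellbeing",
    "money can support contentment",
]
print(mean_pairwise_similarity(distinct))
print(mean_pairwise_similarity(converged))
```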
A Cornell team presented work at CHI 2025 showing AI writing suggestions push prose toward generic Western norms and sand off cultural specificity, with non-Western writers paying the higher tax because they spent more time correcting the model back toward themselves. A 2026 piece in Trends in Cognitive Sciences pulls the thread up a level and frames the whole thing as a large-scale homogenizing effect on human expression and thought.
The pattern underneath all of it is the same. The model has a centre of gravity, a most-likely way of saying any given thing, and your prose drifts toward that centre unless something holds it in place.
But can’t AI write in my voice if I ask it to?
It can, but only when it is deeply conditioned on your actual writing, not when you hand it a one-line style description. That distinction is the entire game.
A 2025 study from the University of Michigan, Stony Brook, and Columbia (on arXiv) is the sharpest evidence here. With simple in-context prompting, the kind of “write in the style of” instruction most people use, expert judges preferred the human-written text 82.7% of the time. The AI was caught. But after the model was conditioned on an author’s complete body of work, the preference flipped: 62% of the time the experts preferred the machine. Same model. The only thing that changed was how much of the writer’s real prose was governing the output.
Their method was full fine-tuning, which is heavy and raises real questions about consent and copyright, so it is not a recipe for the rest of us. But the lesson is directional and it is the one that matters: weak conditioning loses the voice, strong conditioning holds it. “AI cannot do voice” is the wrong conclusion. “AI does voice badly when you barely tell it who you are” is the right one.
Then why does the voice drift back even when I give it samples?
Because in most tools your samples live in the prompt, and the prompt is the first thing the model lets go of. A style note at the top of a long generation decays within a few paragraphs as the model reverts to its own centre.
This is the mechanism behind the blandification finding. A prompt is a suggestion. The model weighs it against everything else it believes about “good writing” and, over a long passage, its own defaults win. You can watch it happen in real time. The first paragraph sounds like you. By the third the adverbs are back, the sentences have settled into an even, mid-tempo cadence, and a polite stranger has moved into the middle of your prose.
The fix is not a cleverer prompt. It is architectural. Your voice has to sit outside the prompt layer, as a constraint the output is measured against every time, rather than a hint the model is free to round off once it gets going. The difference between a setting and a constraint is the difference between asking nicely and not giving the option.
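Here is roughly what "a constraint the output is measured against every time" could look like, as a sketch under stated assumptions. Everything named here is hypothetical: `VoiceProfile`, `violations`, and `generate_checked` are invented for illustration, `generate` stands in for whatever model call you use, and the two checks (sentence length and banned words) are far cruder than a real voice profile would be. The point is the shape: the draft is accepted or rejected by code that sits outside the prompt.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class VoiceProfile:
    # Hypothetical constraints. A real profile would be derived from the
    # writer's own prose rather than typed in by hand.
    max_mean_sentence_words: float
    banned_words: set[str] = field(default_factory=set)


def violations(text: str, profile: VoiceProfile) -> list[str]:
    """List the ways a draft breaks the profile (empty list means it passes)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z]+", text.lower())
    problems = []
    if sentences and len(words) / len(sentences) > profile.max_mean_sentence_words:
        problems.append("sentences too long")
    used = profile.banned_words & set(words)
    if used:
        problems.append("banned words: " + ", ".join(sorted(used)))
    return problems


def generate_checked(
    generate: Callable[[str], str],
    prompt: str,
    profile: VoiceProfile,
    max_attempts: int = 3,
) -> str:
    """Call the model, check every draft, and retry until one passes."""
    feedback = ""
    for _ in range(max_attempts):
        draft = generate(prompt + feedback)
        problems = violations(draft, profile)
        if not problems:
            return draft
        feedback = "\nFix: " + "; ".join(problems)
    raise ValueError("no draft satisfied the voice profile")
```

The retry note still travels through the prompt, but the decision to accept a draft does not. The model can ignore the note; it cannot talk its way past the check.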
What does this mean for writers using AI?
The category of AI writing tool worth using is the one that treats your voice as a binding constraint, compiled from your own prose, rather than a setting you pick from a dropdown. Everything else flattens you by default, and now there is data to prove it.
Two practical things follow. First, stop handing the machine a single adjective. “Literary” and “punchy” are not voices, they are vibes, and the research is clear that vibes do not survive contact with a long draft. Give the tool real, extended samples of your own writing and make it derive the actual patterns: the rhythm, the sentence shapes, the diction, the way you handle dialogue, the things you do that you have stopped noticing.
Second, insist that the profile applies to every generation, not just the first. If a tool only respects your voice in the opening paragraph, it does not respect your voice. It respects your prompt, briefly, and then it goes back to being itself.
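For a sense of what "derive the actual patterns" might mean in practice, here is a deliberately small sketch. It is an assumption about one possible starting point, not how any particular tool works: the function `style_profile` is hypothetical, and it captures only sentence length, how much that length varies, and your most frequent words. Real voice involves far more than three numbers.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def style_profile(samples: list[str], top_n: int = 5) -> dict:
    """Summarise rhythm and diction from samples of a writer's own prose."""
    text = " ".join(samples)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    words = re.findall(r"[a-z]+", text.lower())
    return {
        # Rhythm: how long the sentences run and how much they vary.
        "mean_sentence_words": mean(lengths) if lengths else 0.0,
        "sentence_length_spread": pstdev(lengths) if lengths else 0.0,
        # Diction: the words this writer reaches for most often.
        "top_words": [w for w, _ in Counter(words).most_common(top_n)],
    }


profile = style_profile(["I ran. I ran far and fast."], top_n=2)
print(profile)
```

Feed it a few thousand words rather than a sentence, and the numbers start to describe a person instead of a paragraph.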
This is the same fingerprint I wrote about in “voice isn’t a vibe”: voice is measurable enough that a model can identify you from a couple of hundred words. The new research is the other half of that proof. The same fingerprint that makes you identifiable is the thing the average AI workflow quietly sands off. If you want it back, see how to find your writing voice and the constraint-based approach that holds it across a whole novel.
The honest summary is small and a little uncomfortable. Used the ordinary way, AI does not help you sound more like yourself. It helps you sound like the average of everyone, and it does it so smoothly that you stay satisfied while it happens. The tools that survive the next year will be the ones that treat that as the problem to solve, not a footnote. Writers will sort them by it, because writers are the ones who can feel it going.