There is a genre of advice thread that appears daily wherever writers use AI: how to humanise the output. Kill the em dashes. Ban “delve” and “tapestry.” Break up the rule-of-three sentences. Run it through a rewriter. The advice works, narrowly. Surface style is the easiest thing to change, and the newest models are already scrubbing their own tics. GPT’s latest release cut its em dash habit; the tells writers police hardest are the ones evaporating on their own.
Which makes the new research uncomfortable. In April 2026, a team at the University of Maryland and Google DeepMind released StoryScope, the largest study yet of what AI fiction is actually like underneath the sentences. Their finding, in one line: you can take every stylistic cue away, and a classifier still tells AI-generated stories from human ones at a macro-F1 of 93.2%, because the machine writes a different kind of story.
What did the study actually test?
The researchers compared 10,272 human-written stories against versions of the same premises written by five current AI models, then deliberately blinded their analysis to style.
The models were Claude Sonnet 4.6, GPT-5.4, Gemini 3 Flash, DeepSeek V3.2, and Kimi K2.5. The corpus came to 61,608 stories in total, averaging around 4,800 words each. Instead of reading the prose, the pipeline converted every story into a structured map of its narrative decisions across ten dimensions, including plot, character agency, temporal structure, setting, and how information is revealed. Sentence rhythm, word choice, and figurative language were withheld on purpose.
On narrative features alone, a classifier told human from machine with a macro-F1 of 93.2%, nearly matching detectors that read the raw text. It could also tell which model wrote a story 68.4% of the time from structure alone. Five models, one shared region of story-space, each with its own fingerprint inside it.
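For readers unfamiliar with the metric: macro-F1 averages the F1 score of each class, so the human class and the AI class count equally no matter how many stories each contains. Here is a minimal sketch in Python. The labels are toy values invented for illustration, not data from the study, and the function names are my own.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class: the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels, invented for illustration only (not data from the study).
y_true = ["human", "ai", "ai", "ai", "ai", "ai"]
y_pred = ["human", "ai", "ai", "ai", "ai", "human"]

score = macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"macro-F1: {score:.3f}, accuracy: {accuracy:.3f}")
```

In this toy case one AI story is mislabelled as human. Accuracy barely moves, but the human class's F1 takes the hit, and macro-F1 reflects that. This is why the metric suits a corpus where AI stories outnumber human ones.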
The timing matters. In March 2026, Hachette pulled a commercially published horror novel after it was flagged as roughly 78% AI-generated, a first for a major publisher. One detection firm found nearly 20% of a 14,000-novel sample of self-published Amazon books was largely AI-written, up 41% year over year. Readers, agents, and publishers are all learning to smell this. The question of what exactly they are smelling now has data.
What are the tells of AI fiction?
The study distilled 30 core features that separate human stories from AI stories across all five models. They read like a craft workshop’s list of notes.
- AI explains its themes. The narrator states the story’s meaning outright in 77% of AI stories, versus 52% of human ones. A grieving character’s arc ends with the lesson spelled out. Dialogue turns into philosophical debate 59% of the time, versus 34% for humans. The pattern is over-determination: telling the reader what to conclude rather than trusting them to infer it.
- AI keeps the plot on rails. 79% of AI stories have no subplots at all (57% for humans). Causal chains are tight, resolutions come from the protagonist’s own tidy choice or acceptance far more often, and endings resolve rather than linger. Human stories jump in time, open at the funeral and spiral backwards, leave loose ends deliberately untied.
- AI over-writes the body. Emotion arrives as physical sensation in 81% of AI stories, versus 38% of human ones: the tightening chest, the cold sweat, the breath they didn’t know they were holding. Smell-based imagery shows up in 82% of AI stories against 57% of human ones. Human authors are far more willing to just say a character felt afraid (29% vs 8%). What workshops teach as “show, don’t tell,” the models have turned into a compulsion.
- AI writes as if no one is watching. Human authors name real books, writers, and places at nearly double the AI rate (47% vs 24%), and break the fourth wall far more often (67% vs 39%). The models allude vaguely; people point at actual things.
- AI stories are simply more alike. Measured for statistical rarity, 24.7% of human stories land in the rarest tenth of the whole corpus, versus 7.1% of AI stories. Given six versions of the same premise, the human one was the rarest of the six 57.8% of the time. Chance would be 16.7%.
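The chance figure in the last item is simple arithmetic: with six versions of a premise (one human, five AI), a rarity ranking that carried no signal would put the human story first one time in six. The sketch below works through that baseline and the rarest-tenth figures quoted above. The variable names are my own, the pooled check assumes each of the six sources contributes about the same number of stories, and the study's actual rarity measure is not reproduced here.

```python
def chance_of_being_rarest(n_versions: int) -> float:
    """Probability that one given story ranks rarest of n if rarity were random."""
    return 1 / n_versions


baseline = chance_of_being_rarest(6)  # one human version + five AI versions
observed = 0.578                      # figure quoted in the article
lift = observed / baseline            # how far above chance the human story lands

# Share of each group that lands in the rarest tenth of the whole corpus.
# A proportionally represented group would place 10% of its stories there.
human_top_decile = 0.247
ai_top_decile = 0.071
human_over_representation = human_top_decile / 0.10
ai_over_representation = ai_top_decile / 0.10

# Sanity check, ASSUMING equal story counts per source (1 human + 5 AI):
# the pooled share in the rarest tenth should come out close to 10%.
pooled = (human_top_decile + 5 * ai_top_decile) / 6

print(f"chance baseline: {baseline:.1%}")
print(f"human story rarest: {lift:.2f}x chance")
print(f"pooled share in rarest tenth: {pooled:.1%}")
```

Under that equal-size assumption the two quoted percentages pool to almost exactly 10%, which is what a "rarest tenth" should contain. The figures are at least consistent with each other.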
If you have read an AI draft of your own chapter and felt a competent stranger at the wheel, this list is probably why. It is the same flattening I wrote about in “AI doesn’t just change your style, it overwrites your voice,” one level up: not your sentences this time, your storytelling.
Does every AI model write fiction the same way?
No, and this is the study’s most quotable finding. Each model has a house style distinct enough that a classifier can attribute a story to its author-model two times out of three from structure alone.
The researchers’ own headings are hard to improve on.
- Claude keeps it cool: the most distinctive of the five, defined by restraint. Its event intensity escalates less than any other source, it takes a reverent, continuist approach to literary tradition (62% of its stories, versus 39% to 56% elsewhere), favours epilogues, avoids dream sequences, and prefers a quiet ending to an avalanche.
- GPT likes to gossip: rumour and hearsay drive the plot in 64% of its stories, casts are ensemble-heavy at human levels, events are framed in retrospect from years later, and it subverts expectations more than any other model.
- Gemini writes the tidiest endings, the longest denouements, and the bleakest worlds: 88% of its settings were tagged bleak and oppressive.
- DeepSeek front-loads context that other sources withhold.
- Kimi has the fewest quirks of all, which is its own kind of tell: it sits at the generic centre of the AI distribution.
Two things follow. First, if you draft with one model exclusively, your manuscript inherits its house style at the structural level, not just the sentence level. Second, none of them escape the cluster. In the study’s feature space, the distance from the human centre to the AI cluster is 1.6 times the distance between any two models. The models are more like each other than any of them is like a person.
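One way to picture that ratio: represent each story as a vector of feature values, average each source's stories into a centroid, and compare the distances between centroids. The sketch below does this with invented two-dimensional points. Every number in it is made up for illustration. Using the mean pairwise distance as the between-model figure is my assumption, and the study's actual feature space and distance measure may differ.

```python
from itertools import combinations
from math import dist
from statistics import mean

# Hypothetical centroids in a toy 2-D feature space (invented, not study data).
centroids = {
    "human": (0.0, 0.0),
    "model_a": (4.0, 0.0),
    "model_b": (4.0, 2.5),
    "model_c": (6.0, 1.0),
}

ai_names = [name for name in centroids if name != "human"]

# Centre of the AI cluster: the mean of the model centroids, axis by axis.
ai_centre = tuple(mean(centroids[n][axis] for n in ai_names) for axis in range(2))

# Distance from the human centroid to the centre of the AI cluster.
human_to_ai = dist(centroids["human"], ai_centre)

# Mean distance between every pair of model centroids (an assumed summary).
mean_between_models = mean(
    dist(centroids[a], centroids[b]) for a, b in combinations(ai_names, 2)
)

ratio = human_to_ai / mean_between_models
print(f"human-to-AI distance is {ratio:.2f}x the mean model-to-model distance")
```

A ratio above 1 means the models sit closer to one another than their shared centre sits to the human one, which is the shape the study reports.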
Why can’t you edit the tells out?
Because editing operates on sentences, and the tells live in decisions. The researchers tested exactly this, and the result should end the “humanizer” industry’s claims about fiction.
They took AI-generated stories and rewrote them with a span-level editing framework built from professional writers’ edits of AI text. It targets seven categories of artifact, among them cliché, redundant exposition, and purple prose: the surface layer writers are told to police. Detection on the edited stories dropped from 95.5% to 93.9%. About a point and a half. The narrative classifier barely noticed, because nothing it measures had changed. The story still explained its theme, still ran on one track, still resolved through the protagonist’s neat internal acceptance.
To remove structural tells you have to restructure: add the subplot, break the timeline, cut the moral, unresolve the ending. At which point you are not cleaning up an AI story. You are writing a story, using the machine for the labour in between. The study quietly proves that the only reliable “humanizer” is a human making the narrative choices.
What does this mean if you write with AI?
It means the important question about any AI writing workflow is: who is making the story decisions? Everything measured in this paper is a decision. Whoever makes them is the author the classifier detects.
Note what the study measured: models handed a premise and asked to write the whole story. Autonomous AI fiction. The prompt-and-pray workflow. That is the thing readers are learning to identify, and that publishers are starting to pull from shelves.
The alternative workflow inverts it. The writer decides the structure: which chapter holds which turn, what stays unresolved, which thread runs underneath, what the reader is trusted to infer. The model drafts inside those decisions, and a voice constraint built from the writer’s own prose governs the sentence level. Then the structure being measured is yours, because it came from you.
This is the architecture bookmoth is built on, and this research is why I keep insisting the distinction is not marketing. Your story plan is the spine: chapters and scenes draft to your beats, not to the model’s preferred single track. Your voice profile is a binding constraint, derived from your prose, applied to every generation. And the drafting rules push directly against the reflexes in this paper’s list: trust the reader’s memory, do not moralise the theme, do not render every feeling as a tightening chest. The tells the study found at the sentence level, we were already fighting. The tells it found at the story level are the reason the writer, not the model, owns the plan.
The honest summary: the em dashes were never the problem. They were the visible edge of a deeper pattern, and the deeper pattern is that a model left to tell a story tells its own, in its own house style, more like every other model than like any person who ever wrote. If your name is going on the cover, the decisions need to be yours. The machine can carry water. It cannot be trusted with the architecture, and now there are 61,608 stories’ worth of evidence saying so. If you want to see what your own patterns look like from the outside, start with how to find your writing voice, or let the voice tool read a few pages of your prose.