How AI Writing Detectors Actually Work (And Why They Get It Wrong)
A university professor failed a student for "using AI" on an essay. The student had written every word themselves. The AI detector gave it a 94% probability of being AI-generated. The student appealed, showed their drafts and research notes, and was cleared. But the damage was done — weeks of stress over a false positive from a tool that the professor treated as infallible.
How Detection Works
AI detectors analyze text for statistical patterns common in AI-generated content. There are two main approaches:
Perplexity Analysis
Perplexity measures how "surprising" each word is given the words before it. AI models tend to choose the most probable next word, resulting in low perplexity (predictable text). Human writing is more varied and surprising, resulting in higher perplexity.
The problem: academic writing, technical documentation, and formulaic business writing are naturally low-perplexity. A well-structured essay with clear topic sentences and logical flow looks "AI-like" to a perplexity analyzer because good writing IS predictable.
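The idea can be sketched in a few lines. This is a minimal illustration, not a real detector: the per-token probabilities below are hypothetical numbers a language model might assign, and real tools compute them with an actual model.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability.
    Lower values mean the model found the text more predictable."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

# Hypothetical per-token probabilities for two texts.
predictable = [0.90, 0.80, 0.85, 0.90, 0.75]  # each word was "expected"
surprising  = [0.30, 0.05, 0.40, 0.10, 0.20]  # varied, unexpected word choices

print(round(perplexity(predictable), 2))  # low perplexity -> flagged "AI-like"
print(round(perplexity(surprising), 2))   # high perplexity -> reads as "human"
```

Note that the formula is agnostic about *why* the text is predictable, which is exactly the weakness described above: a clean, well-structured essay scores low for the same reason AI output does.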
Burstiness Analysis
Burstiness measures variation in sentence complexity. Humans write with "bursts" — some sentences are short and punchy, others are long and complex. AI tends to produce more uniform sentence lengths and complexity.
The problem: some humans write uniformly (especially non-native English speakers who learned formal writing patterns), and some AI can be prompted to vary its output.
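A crude burstiness score can be computed as the spread of sentence lengths. The sketch below uses a naive sentence splitter and standard deviation; real detectors use more robust tokenization and additional complexity features.

```python
import statistics

def burstiness(text):
    """Population standard deviation of sentence lengths in words.
    Higher values = more 'bursty', human-like variation."""
    normalized = text.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in normalized.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths)

human = ("I tried it. It failed, badly, in ways I had not anticipated "
         "when I started the experiment. Why? Nobody knows.")
uniform = ("The system processes the input. The module validates the data. "
           "The service returns the result.")

print(round(burstiness(human), 2))    # higher: short and long sentences mixed
print(round(burstiness(uniform), 2))  # lower: near-identical sentence lengths
```

The `uniform` text scores near zero even though a person wrote it, which is the failure mode described above for writers trained on rigid formal patterns.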
Accuracy Numbers (Honest Ones)
| Detector | True Positive Rate | False Positive Rate | What This Means |
|---|---|---|---|
| Best commercial detectors | 70-85% | 5-15% | Misses 15-30% of AI text, falsely flags 5-15% of human text |
| Free online detectors | 50-70% | 10-25% | Essentially a coin flip with bias |
| Watermark-based detection | 95%+ | <1% | Only works if the AI provider embeds watermarks |
According to writing technology research, no current detector can reliably distinguish between AI-generated and human-written text with the certainty needed for academic or legal decisions.
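To see why those error rates matter, consider a quick base-rate calculation using Bayes' rule. The numbers here are assumptions for illustration: a mid-range detector from the table (80% true positive rate, 10% false positive rate) applied to a pool where 15% of submissions are actually AI-written.

```python
def p_ai_given_flag(base_rate, tpr, fpr):
    """Bayes' rule: probability a flagged text is actually AI-written."""
    flagged = tpr * base_rate + fpr * (1 - base_rate)
    return (tpr * base_rate) / flagged

# Assumed: 15% of submissions AI-written; detector with 80% TPR, 10% FPR.
print(round(p_ai_given_flag(0.15, 0.80, 0.10), 2))  # -> 0.59
```

Under these assumptions, roughly 4 in 10 flagged texts are human-written, even though the detector sounds "80% accurate." The lower the true rate of AI use in the pool, the worse this gets.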
Why False Positives Happen
- Non-native English speakers. Many learned English from textbooks and formal sources, so their writing patterns resemble the formal text AI models were trained on.
- Technical writers. Precise, structured, low-creativity writing looks "robotic."
- Students who follow writing guides closely. Good structure = predictable = "AI-like."
- Edited and polished text. Heavy editing removes the "human messiness" that detectors look for.
What AI Detectors Cannot Do
- Detect AI-assisted writing (human writes, AI suggests improvements)
- Detect AI-generated text that has been manually edited
- Distinguish between AI text and human text that happens to be well-structured
- Provide legally or academically defensible proof of AI use
The Practical Takeaway
If you are a writer worried about false positives: write naturally, include personal anecdotes and specific examples, vary your sentence structure, and keep your drafts as evidence. Use the AI Detector to check your own writing before submitting.
If you are evaluating others' writing: never rely solely on an AI detector. Use it as one signal among many — alongside writing samples, drafts, and in-person discussion.
As Google has stated, the focus should be on content quality, not on whether AI was involved in creating it. Good content is good content regardless of the tool used.
Related Tools
Check your writing for AI patterns.
Try the AI Detector →