
U+202F: the space that is not a space

The narrow no-break space looks identical to a normal space but breaks string matching, code and macOS text rendering. Where U+202F comes from and how to handle it.

March 31, 2026 · 3 min read · CopyClean Blog

Of all the invisible characters, U+202F NARROW NO-BREAK SPACE might be the most successful impostor. It is not zero-width; it renders as a slightly slimmer gap than a regular space. Side by side in a paragraph, you will never tell them apart. To software, they could not be more different: 0x20 versus a three-byte UTF-8 sequence, e2 80 af.

on screen

249 USD == 249 USD

in the bytes

249[0x20]USD  !=  249[U+202F]USD
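The comparison above can be reproduced in a few lines of Python. This is only an illustrative sketch; the variable names are made up for the example.

```python
plain = "249 USD"        # U+0020 SPACE between number and unit
narrow = "249\u202fUSD"  # U+202F NARROW NO-BREAK SPACE instead

# Same length in characters, different content.
print(len(plain), len(narrow))           # 7 7
print(plain == narrow)                   # False

# The difference only becomes visible in the encoded bytes.
print(plain.encode("utf-8").hex(" "))    # 32 34 39 20 55 53 44
print(narrow.encode("utf-8").hex(" "))   # 32 34 39 e2 80 af 55 53 44
```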

A legitimate character with a real job

U+202F is correct, professional typography in several languages. French orthography calls for a narrow non-breaking space before tall punctuation: "Vous venez ?" is properly set with one before the question mark, so the mark never wraps onto a line of its own. German number formats, Russian typography and scientific notation (a thin gap between a number and its unit) all use it. Any text that passed through serious typesetting - academic publishing, quality newspapers - is full of them, on purpose.

This is exactly why AI models picked the character up. Their training data included mountains of professionally typeset text, and in 2025, users started finding U+202F scattered through ChatGPT output where plain spaces belonged - which fueled a watermarking theory we examined in "Does ChatGPT watermark its text?". The mundane explanation held up: models reproduce the typography of their sources. The character even made headlines again when a GPT-5 variant emitted so many of them that macOS apps hit text-rendering glitches, and developers filed it as a bug.

What it breaks

Everything that compares strings byte-for-byte:

  • "249 USD" with a narrow no-break space does not equal "249 USD" with a plain one. Spreadsheet lookups, database joins and dedupe passes fail invisibly.
  • Search fails: Ctrl+F for "9:41 AM" typed with a plain space will not find "9:41 AM" written with U+202F before the AM.
  • Code and configs: a U+202F inside a shell command or YAML file is a syntax error dressed as a space. The error message will point at a line that looks perfect.
  • CSV parsing, regex \s assumptions (Unicode-aware engines match it, ASCII-only modes do not), fixed-width formats: all of it wobbles.
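A short Python sketch of the failures listed above, using the standard re module. The sample string is invented for illustration, and other regex engines may behave differently.

```python
import re

price = "249\u202fUSD"  # narrow no-break space between number and unit

# Exact comparison and substring search both miss the plain-space form.
equal_to_plain = price == "249 USD"
found_plain = "249 USD" in price

# A literal space in a pattern matches only U+0020.
literal_match = re.fullmatch(r"\d+ USD", price)

# For str patterns, Python's \s covers Unicode whitespace, U+202F included.
unicode_match = re.fullmatch(r"\d+\sUSD", price)

# The same pattern under re.ASCII restricts \s to ASCII whitespace.
ascii_match = re.fullmatch(r"\d+\sUSD", price, flags=re.ASCII)

print(equal_to_plain, found_plain)   # False False
print(literal_match is not None)     # False
print(unicode_match is not None)     # True
print(ascii_match is not None)       # False
```

The last two lines are the "wobble": the same pattern passes or fails depending on a flag that is easy to overlook.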

Its cousin U+00A0, the ordinary non-breaking space, causes the same breakage and is even more common: every &nbsp; on the web becomes one when you copy.
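When an error points at a line that looks perfect, a small scanner can show where the impostors sit. The sketch below uses Python's unicodedata module; find_space_impostors is a hypothetical helper name, and it only looks at Unicode category Zs (space separators), so zero-width characters are out of its scope.

```python
import unicodedata

def find_space_impostors(text):
    """Return (index, 'U+XXXX', name) for each space separator other than U+0020."""
    hits = []
    for i, ch in enumerate(text):
        # Category Zs is "Separator, space"; skip the ordinary space itself.
        if ch != " " and unicodedata.category(ch) == "Zs":
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits

sample = "9:41\u202fAM costs 249\u00a0USD"
for hit in find_space_impostors(sample):
    print(hit)
# (4, 'U+202F', 'NARROW NO-BREAK SPACE')
# (17, 'U+00A0', 'NO-BREAK SPACE')
```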

The right way to handle it

Blanket destruction is wrong: a French user's punctuation spacing is not junk, and a cleaner that flattens it corrupts correct writing. The right policy is locale-aware normalization: replace space impostors with plain spaces where they are noise, keep them where the language genuinely uses them.

That policy is built into CopyClean: space variants are normalized to a regular space on copy, with per-language preservation rules (French, German, Russian and Mongolian keep their narrow no-break spaces, CJK keeps the ideographic space, and so on). If you just need to check one suspicious string right now, here are five free ways to see hidden characters - U+202F shows up as e2 80 af in a hex dump.
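For readers who want to experiment, a locale-aware policy of this kind can be sketched in Python. This is not CopyClean's implementation: the PRESERVE table, the normalize_spaces name and the per-character rule are assumptions made for illustration, loosely following the languages named above.

```python
import unicodedata

NNBSP = "\u202f"              # NARROW NO-BREAK SPACE
IDEOGRAPHIC_SPACE = "\u3000"  # IDEOGRAPHIC SPACE

# Hypothetical preservation table keyed by primary language tag.
PRESERVE = {
    "fr": {NNBSP}, "de": {NNBSP}, "ru": {NNBSP}, "mn": {NNBSP},
    "zh": {IDEOGRAPHIC_SPACE}, "ja": {IDEOGRAPHIC_SPACE}, "ko": {IDEOGRAPHIC_SPACE},
}

def normalize_spaces(text, lang="en"):
    """Replace space separators with U+0020 unless the language keeps them."""
    keep = PRESERVE.get(lang.split("-")[0].lower(), set())
    out = []
    for ch in text:
        is_impostor = ch != " " and unicodedata.category(ch) == "Zs"
        out.append(" " if is_impostor and ch not in keep else ch)
    return "".join(out)

print(normalize_spaces("249\u202fUSD", "en") == "249 USD")                 # True
print(normalize_spaces("Vous venez\u202f?", "fr") == "Vous venez\u202f?")  # True
```

A real tool would also have to decide which language a piece of text is in; here the caller simply passes a tag, which keeps the sketch short but sidesteps the hard part.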

A space should be a space. When it isn't, you deserve to know, and your clipboard can handle it for you.
