Mojibake Fixer (Repair Garbled UTF-8)
Mojibake is the garbled text you get when UTF-8 bytes are mistakenly read as a single-byte encoding, almost always Windows-1252, the legacy Windows default. An é becomes Ã©, a curly apostrophe (’) becomes â€™, a non-breaking space becomes Â followed by a non-breaking space, and an emoji such as 😀 turns into a run of four odd characters like ðŸ˜€. This tool reverses that: it re-encodes each character back to the Windows-1252 byte it came from and decodes the resulting bytes as UTF-8, recovering the original text. It applies the fix repeatedly for doubly-mangled text, and it's safe by design: correctly encoded text (including non-Latin scripts) doesn't form valid UTF-8 when reversed, so it's left untouched rather than damaged. Paste the broken text and copy the repaired version. Everything runs locally; nothing is uploaded.
Reverses UTF-8 mis-decoded as Windows-1252. Correct text (any script) is detected as valid and left unchanged.
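The round trip described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names `garble` and `repair_once` are hypothetical, and it relies on Python's built-in `cp1252` codec, which leaves five byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) undefined, so a production fixer would likely need extra handling for characters whose UTF-8 bytes include those values.

```python
from typing import Optional


def garble(text: str) -> str:
    """Simulate the bug: store text as UTF-8, then read it back as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair_once(text: str) -> Optional[str]:
    """Reverse one layer of UTF-8-read-as-Windows-1252 mojibake.

    Returns None when the text cannot be mapped back to Windows-1252 bytes,
    or when those bytes are not valid UTF-8 (the text is not this kind of
    mojibake). Pure ASCII round-trips unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


if __name__ == "__main__":
    broken = garble("café")
    print(broken)                  # the garbled form
    print(repair_once(broken))     # the recovered original
    print(repair_once("日本語"))    # None: not encodable as Windows-1252
```

Returning `None` instead of raising keeps the "leave correct text untouched" decision with the caller, which mirrors the safety property described above.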
How to use
- Paste the garbled text.
- Read the repaired output — the tool shows how many passes it took, or that nothing needed fixing.
- Copy the corrected text.
Frequently asked questions
- What causes mojibake?
- It happens when text saved as UTF-8 is later read using a different, single-byte encoding, most often Windows-1252 or ISO-8859-1. Each non-ASCII character was stored as two or more UTF-8 bytes, and reading those bytes one at a time produces the wrong characters: é (two bytes, C3 A9) shows up as the two characters Ã©. CSV imports, database migrations, and copy-paste between mismatched systems are common culprits.
- Will it damage text that's already correct?
- No. The repair only succeeds when the reversed bytes form valid UTF-8, which genuine mojibake does but correctly-encoded text does not. So 'café', 'Köln', '한국어', or '日本語' that are already right are detected as valid and left exactly as they are — the tool reports that no fix was needed.
- Why does it sometimes apply more than one pass?
- If text was mis-decoded twice — for example UTF-8 read as Windows-1252, saved, then read as Windows-1252 again — the garbling is layered. The tool repeats the repair until the text stops changing or no longer reverses to valid UTF-8, and tells you how many passes it used.
- It didn't fix my text — why?
- Either the text is already correct, or the corruption isn't the common UTF-8-as-Windows-1252 kind (for example it was mis-decoded as Shift_JIS or EUC-KR, or bytes were actually lost). This tool targets the most frequent case; for opening a file in a specific legacy encoding, use a text-encoding converter instead.
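The multi-pass behaviour and the "leave correct text alone" rule from the answers above could look roughly like this in Python. It is a sketch under assumptions: the function name `repair` and the `max_passes` cap are illustrative and not taken from the tool, and Python's `cp1252` codec rejects five byte values that a browser-based implementation may handle differently.

```python
from typing import Tuple


def repair(text: str, max_passes: int = 5) -> Tuple[str, int]:
    """Repeatedly undo UTF-8-read-as-Windows-1252 garbling.

    Returns the repaired text and the number of passes applied.
    A pass count of 0 means nothing needed (or could be) fixed.
    max_passes is an assumed safety cap, not a documented limit.
    """
    passes = 0
    while passes < max_passes:
        try:
            # Map each character back to its Windows-1252 byte,
            # then read those bytes as UTF-8.
            candidate = text.encode("cp1252").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not reversible to valid UTF-8: stop and keep what we have.
            break
        if candidate == text:
            # Pure ASCII or otherwise unchanged: nothing left to fix.
            break
        text = candidate
        passes += 1
    return text, passes


if __name__ == "__main__":
    once = "Köln".encode("utf-8").decode("cp1252")
    twice = once.encode("utf-8").decode("cp1252")
    print(repair(once))      # one layer undone
    print(repair(twice))     # two layers undone
    print(repair("한국어"))   # already correct, zero passes
```

Stopping on the first encode or decode error is what keeps already-correct text intact: 'café' maps to a lone 0xE9 byte, which is not valid UTF-8, and Korean or Japanese characters have no Windows-1252 byte at all.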