Question 1

Which form should I use?

Accepted Answer

NFC is the safest default for storage, transport, and the web: it is usually the most compact canonical form and the one most systems expect. Use NFD when a system requires decomposed text (for example, some macOS file-name contexts). Use NFKC or NFKD only when you deliberately want compatibility folding (ligatures, full-width characters, and superscripts/subscripts collapsed to their plain equivalents), since those transformations are lossy.
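To see the four forms side by side, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `normalize_all` and the sample string are illustrative choices, not part of the tool.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normalize_all(text):
    """Return the text in each of the four Unicode normalization forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

# Sample: U+FB01 (the "fi" ligature) + "anc" + U+00E9 (precomposed e-acute).
sample = "\ufb01anc\u00e9"
forms = normalize_all(sample)

for form, value in forms.items():
    # Show the code points so the differences are visible.
    print(form, [f"U+{ord(ch):04X}" for ch in value])
```

In this sample the canonical forms (NFC, NFD) keep the ligature, while the compatibility forms (NFKC, NFKD) replace it with plain "f" and "i".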

Question 2

What does 'strip diacritics' do?

Accepted Answer

It decomposes the text (NFD), removes all combining marks, then re-normalizes to your chosen form, so 'café' becomes 'cafe' and 'Crème Brûlée' becomes 'Creme Brulee'. This is handy for building ASCII slugs or accent-insensitive search keys, but it changes meaning in many languages, so do not apply it to text that must keep its original spelling.
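The three steps above can be sketched in Python roughly as follows. The function name `strip_diacritics` is hypothetical, and treating every code point in general category M as a "combining mark" is an assumption about the tool's rule, not something it documents.

```python
import unicodedata

def strip_diacritics(text, form="NFC"):
    """Decompose, drop combining marks, then re-normalize to `form`."""
    decomposed = unicodedata.normalize("NFD", text)
    # Assumption: "combining mark" means general category M (Mn, Mc, Me).
    base_only = "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )
    return unicodedata.normalize(form, base_only)

print(strip_diacritics("caf\u00e9"))                    # cafe
print(strip_diacritics("Cr\u00e8me Br\u00fbl\u00e9e"))  # Creme Brulee
```

Letters that are not a base letter plus a combining mark (such as 'ø' or 'ß') pass through unchanged, so this alone does not guarantee ASCII output.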

Question 3

Why do the byte counts differ between forms?

Accepted Answer

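Decomposed forms (NFD/NFKD) often use more code points: a precomposed 'é' is a single code point that takes 2 bytes in UTF-8, while 'e' plus a combining acute accent is two code points totaling 3 bytes. Compatibility forms can go either way: folding a full-width character to its ASCII equivalent saves bytes, while expanding a ligature adds code points. The table lets you compare exact code-point and byte lengths.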
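A comparison like the one in the table can be reproduced with a short Python sketch. The helper name `measure` is illustrative, and it assumes the byte column refers to UTF-8, as in the 'é' example above.

```python
import unicodedata

def measure(text):
    """Return (form, code points, UTF-8 bytes, same as input?) per form."""
    rows = []
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        normalized = unicodedata.normalize(form, text)
        rows.append((
            form,
            len(normalized),                  # number of code points
            len(normalized.encode("utf-8")),  # number of UTF-8 bytes
            normalized == text,               # unchanged by this form?
        ))
    return rows

# Precomposed e-acute: one code point, 2 bytes in NFC; two code points, 3 bytes in NFD.
for row in measure("\u00e9"):
    print(*row, sep="\t")
```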

Question 4

Is normalization reversible?

Accepted Answer

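Converting between NFC and NFD is lossless: text in one form can be converted to the other and back without change, because both represent the same canonically equivalent text. Input that was in neither form cannot always be recovered exactly, though; for example, the Angstrom sign (U+212B) becomes 'Å' (U+00C5) under NFC and stays that way. NFKC and NFKD are not reversible: once a ligature or full-width digit is folded, the original distinction is lost. Stripping diacritics is also one-way.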
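As a small illustration of the difference, the sketch below round-trips a string through NFD and back, then shows two distinct inputs collapsing to the same NFKC output. The variable names and sample strings are illustrative only.

```python
import unicodedata

# Canonical round trip: NFC -> NFD -> NFC returns the same string.
nfc = unicodedata.normalize("NFC", "caf\u00e9")
nfd = unicodedata.normalize("NFD", nfc)
round_trip = unicodedata.normalize("NFC", nfd)
print(round_trip == nfc)  # True

# Compatibility folding: U+FB01 (fi ligature) and U+FF11 (full-width 1)
# become plain ASCII, so the result is identical to folding "fi 1" itself.
folded = unicodedata.normalize("NFKC", "\ufb01 \uff11")
plain = unicodedata.normalize("NFKC", "fi 1")
print(folded == plain)  # True: the original distinction is gone
```

Because two different inputs produce the same output, no function can map the NFKC result back to the original text.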

Form	Code points	UTF-8 bytes	Same as input?
NFC	17	32	yes
NFD	18	33	no
NFKC	20	21	no
NFKD	21	22	no

Unicode Normalizer (NFC, NFD, NFKC, NFKD)

