Question 1

Why doesn't this tool use a detection library like chardet?

Accepted Answer

Browsers don't ship chardet, and importing a large encoding-detection library (the JS port of ICU's `CharsetDetector` is ~200KB) for a few common cases is overkill. This tool covers the 95% case: BOMs, ASCII-only, valid UTF-8, and UTF-16 by null-byte pattern. Legacy East Asian encodings (Shift_JIS, GB2312, EUC-KR) carry no BOM, so for those you'll need chardet or a similar library, but this tool will tell you 'not UTF-8' so you know to look elsewhere.
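The checks that run once no BOM is found could look roughly like the Python sketch below. It is not this tool's code: the function name `guess_without_bom`, the returned labels, and the 40% / 10% null-byte thresholds are all assumptions made for illustration. The null-byte test runs first because UTF-16 text made of Latin characters consists entirely of bytes below 0x80 and would otherwise pass as ASCII.

```python
def guess_without_bom(data: bytes, threshold: float = 0.4) -> str:
    """Guess the encoding of BOM-less bytes. Labels and thresholds are illustrative."""
    if not data:
        return "empty"

    # UTF-16 by null-byte pattern: mostly-Latin text has a zero byte in
    # every other position (odd offsets for LE, even offsets for BE).
    pairs = len(data) // 2
    if pairs:
        body = data[: pairs * 2]
        even_nulls = body[0::2].count(0) / pairs
        odd_nulls = body[1::2].count(0) / pairs
        if odd_nulls >= threshold and even_nulls < 0.1:
            return "utf-16-le"
        if even_nulls >= threshold and odd_nulls < 0.1:
            return "utf-16-be"

    # ASCII-only: every byte is below 0x80.
    if data.isascii():
        return "ascii"

    # Valid UTF-8: strict decoding raises on any malformed sequence.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"
```

A Shift_JIS sample such as `"こんにちは".encode("shift_jis")` falls through every check and comes back as `"not utf-8"`, which is the cue to reach for a full detector.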

Question 2

What's the deal with BOMs?

Accepted Answer

A byte order mark (BOM) is a 2-4 byte prefix that explicitly marks the encoding. The UTF-8 BOM is `EF BB BF`; it is technically unnecessary and controversial (Microsoft tools often add it, while Unix tools usually omit it and may not expect it). UTF-16/32 BOMs (`FF FE` etc.) are useful because they also signal endianness. If a file has a BOM, trust it over any heuristic guess.
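As a minimal sketch of BOM sniffing, the Python snippet below matches the leading bytes against the BOM constants in the standard `codecs` module. The helper name `sniff_bom` and the table layout are hypothetical and not taken from this tool.

```python
import codecs
from typing import Optional, Tuple

# Longer marks come first: the UTF-32 LE BOM (FF FE 00 00) starts with
# the UTF-16 LE BOM (FF FE), so checking UTF-16 first would misfire.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def sniff_bom(data: bytes) -> Tuple[Optional[str], bytes]:
    """Return (encoding, payload without the BOM), or (None, data) if no BOM matches."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data
```

For example, `sniff_bom(b"\xef\xbb\xbfhi")` yields `("utf-8", b"hi")`. One caveat: `FF FE 00 00` could in principle also be UTF-16 LE text whose first character is U+0000, but that is rare enough that treating it as UTF-32 LE is the usual choice.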

File Encoding Detector
