✦ SymbolFYI

Encoding Survival Guide

A 6-part practical series on character encoding — UTF-8 byte structure, mojibake diagnosis, encoding detection, and the Unicode sandwich.

1. UTF-8: The Complete Guide to the Web's Dominant Encoding

Everything about UTF-8: how it works, why it won, byte patterns, BOM handling, validation, and common pitfalls for developers.

2. Mojibake: Why Text Turns to Garbage and How to Fix It

Understand mojibake, the garbled text produced by encoding mismatches. Learn to diagnose, fix, and prevent encoding errors in files, databases, and web applications.

3. Character Encoding Detection: How Browsers and Tools Guess Your Encoding

How encoding detection works: the algorithm browsers use, statistical detectors like chardet, BOM sniffing, and why detection is never 100% reliable.

4. UTF-16 and Surrogate Pairs: Why JavaScript Strings Are Complicated

Understand UTF-16 encoding and surrogate pairs: why a single emoji can have a .length of 2 in JavaScript, how to handle supplementary characters, and when UTF-16 matters.

5. Legacy Encodings: Latin-1, Windows-1252, Shift-JIS, and When You Still Need Them

A practical guide to legacy character encodings: when you'll encounter Latin-1, Windows-1252, Shift-JIS, and EUC-KR, and how to convert them to UTF-8.

6. Punycode and IDN: How Unicode Domain Names Work

How Internationalized Domain Names work: Punycode encoding, IDNA 2003 vs 2008, homograph attacks, and implementing IDN support in your applications.