SymbolFYI

Guides

The Private Use Area: Custom Characters in Unicode

Explore Unicode's Private Use Areas — how they work, why icon fonts use them, PUA in corporate fonts, and the risks of PUA characters in data exchange.

Reference August 27, 2024

Punycode and IDN: How Unicode Domain Names Work

How Internationalized Domain Names work — Punycode encoding, IDNA 2003 vs 2008, homograph attacks, and implementing IDN support in your applications.

Reference August 20, 2024

Legacy Encodings: Latin-1, Windows-1252, Shift-JIS, and When You Still Need Them

A practical guide to legacy character encodings — when you'll encounter Latin-1, Windows-1252, Shift-JIS, EUC-KR, and how to convert them to UTF-8.

Reference August 6, 2024

UTF-16 and Surrogate Pairs: Why JavaScript Strings Are Complicated

Understand UTF-16 encoding and surrogate pairs: why an emoji such as 😀 has .length 2 in JavaScript, how to handle supplementary characters, and when UTF-16 matters.

Reference July 23, 2024

Character Encoding Detection: How Browsers and Tools Guess Your Encoding

How encoding detection works — the algorithm browsers use, statistical detectors like chardet, BOM sniffing, and why detection is never 100% reliable.

Reference July 9, 2024

Mojibake: Why Text Turns to Garbage and How to Fix It

Understand mojibake — garbled text from encoding mismatches. Learn to diagnose, fix, and prevent encoding errors in files, databases, and web applications.

Reference June 25, 2024

UTF-8: The Complete Guide to the Web's Dominant Encoding

Everything about UTF-8 — how it works, why it won, byte patterns, BOM handling, validation, and common pitfalls for developers.

Reference June 18, 2024

Diacritical Marks: Understanding Accents, Umlauts, and Combining Characters

A complete guide to diacritical marks in Unicode — precomposed vs combining characters, normalization, typing accented letters, and handling diacritics in code.

Reference March 12, 2024

Mathematical Symbols in Unicode: A Complete Reference

The definitive reference for mathematical symbols in Unicode — operators, Greek letters, set theory, logic, arrows, and where to find them by block.

Reference January 30, 2024

Bullet (•) vs Middle Dot (·): Small Dots, Big Differences

Compare the bullet (•), middle dot (·), and other dot-like characters — proper usage in lists, navigation separators, and interpuncts.

Reference November 7, 2023

Space Characters in Unicode: 20+ Invisible Characters Compared

Explore Unicode's space characters — regular space, non-breaking space, zero-width space, em space, thin space, and other invisible formatting characters.

Reference October 24, 2023

Zero vs Letter O: Unicode Confusables and Homograph Attacks

How 0, O, and О (Cyrillic) create confusion — from font design to IDN homograph attacks, confusable detection, and security implications.

Reference October 10, 2023

Minus vs Hyphen vs Dash: Five Characters That Look Like a Line

Navigate the confusing world of horizontal line characters — hyphen-minus, en dash, em dash, minus sign, and horizontal bar.

Reference September 26, 2023

Variation Selectors: How Unicode Controls Text vs Emoji Display

Understand Unicode variation selectors: VS15 (U+FE0E) for text presentation, VS16 (U+FE0F) for emoji presentation, and how they control whether a character such as ☺ (U+263A) renders as a text glyph or as a color emoji.

Reference September 19, 2023

Multiplication Sign (×) vs Letter X: Spot the Difference

Distinguish the multiplication sign (×, U+00D7) from lowercase x and uppercase X — visual comparison, Unicode properties, and proper usage in math.

Reference September 12, 2023

Ellipsis (…) vs Three Dots (...): One Character or Three?

Compare the Unicode ellipsis character (…) with three period characters (...) — typographic differences, CSS text-overflow, and when each is appropriate.

Reference August 29, 2023

Curly Quotes vs Straight Quotes: Typography's Most Common Mix-Up

Understand the difference between smart quotes (“ ”) and straight quotes (" ") — when to use each, code vs prose, and auto-conversion pitfalls.

Reference August 15, 2023

En Dash vs Em Dash: When to Use – and —

Learn the difference between en dash (–) and em dash (—) — usage rules, typing methods, HTML entities, and CSS implementation.

Reference August 1, 2023

Grapheme Clusters: Why String Length Is More Complicated Than You Think

Understand grapheme clusters — why 'café' can be 4 or 5 code points, why emoji have .length 2+ in JavaScript, and how to count what users actually see.

Reference June 20, 2023

Code Point vs Character vs Glyph: The Three Levels of Text

Understand the three levels of text representation — code points (numbers), characters (abstract identities), and glyphs (visual shapes in fonts).

Reference May 2, 2023

What Is a Code Point? Understanding Unicode's U+ Notation

Learn what Unicode code points are — the U+ notation system, how code points differ from characters and glyphs, and how to find any character's code point.

Reference April 4, 2023