SymbolFYI

Reference

In-depth reference guides for symbols, encodings, and character comparisons — from confusable character pairs to encoding survival guides.

The Private Use Area: Custom Characters in Unicode

Explore Unicode's Private Use Areas — how they work, why icon fonts use them, PUA in corporate fonts, and the risks of PUA characters in data exchange.

Aug 27, 2024

Punycode and IDN: How Unicode Domain Names Work

How Internationalized Domain Names work — Punycode encoding, IDNA 2003 vs 2008, homograph attacks, and implementing IDN support in your applications.

Aug 20, 2024

Legacy Encodings: Latin-1, Windows-1252, Shift-JIS, and When You Still Need Them

A practical guide to legacy character encodings — when you'll encounter Latin-1, Windows-1252, Shift-JIS, EUC-KR, and how to convert them to UTF-8.

Aug 6, 2024

UTF-16 and Surrogate Pairs: Why JavaScript Strings Are Complicated

Understand UTF-16 encoding and surrogate pairs — why emoji have .length 2 in JavaScript, how to handle supplementary characters, and when UTF-16 matters.

Jul 23, 2024

Character Encoding Detection: How Browsers and Tools Guess Your Encoding

How encoding detection works — the algorithm browsers use, statistical detectors like chardet, BOM sniffing, and why detection is never 100% reliable.

Jul 9, 2024

Mojibake: Why Text Turns to Garbage and How to Fix It

Understand mojibake — garbled text from encoding mismatches. Learn to diagnose, fix, and prevent encoding errors in files, databases, and web applications.

Jun 25, 2024

UTF-8: The Complete Guide to the Web's Dominant Encoding

Everything about UTF-8 — how it works, why it won, byte patterns, BOM handling, validation, and common pitfalls for developers.

Jun 18, 2024

Diacritical Marks: Understanding Accents, Umlauts, and Combining Characters

A complete guide to diacritical marks in Unicode — precomposed vs combining characters, normalization, typing accented letters, and handling diacritics in code.

Mar 12, 2024

Mathematical Symbols in Unicode: A Complete Reference

The definitive reference for mathematical symbols in Unicode — operators, Greek letters, set theory, logic, arrows, and where to find them by block.

Jan 30, 2024

Bullet (•) vs Middle Dot (·): Small Dots, Big Differences

Compare the bullet (•), middle dot (·), and other dot-like characters — proper usage in lists, navigation separators, and interpuncts.

Nov 7, 2023

Space Characters in Unicode: 20+ Invisible Characters Compared

Explore Unicode's space characters — regular space, non-breaking space, zero-width space, em space, thin space, and other invisible formatting characters.

Oct 24, 2023

Zero vs Letter O: Unicode Confusables and Homograph Attacks

How 0, O, and О (Cyrillic) create confusion — from font design to IDN homograph attacks, confusable detection, and security implications.

Oct 10, 2023

Minus vs Hyphen vs Dash: Five Characters That Look Like a Line

Navigate the confusing world of horizontal line characters — hyphen-minus, en dash, em dash, minus sign, and horizontal bar.

Sep 26, 2023

Variation Selectors: How Unicode Controls Text vs Emoji Display

Understand Unicode variation selectors: VS15 for text presentation, VS16 for emoji presentation, and how they control whether a character such as ☺ renders as text or as emoji.

Sep 19, 2023

Multiplication Sign (×) vs Letter X: Spot the Difference

Distinguish the multiplication sign (×, U+00D7) from lowercase x and uppercase X — visual comparison, Unicode properties, and proper usage in math.

Sep 12, 2023

Ellipsis (…) vs Three Dots (...): One Character or Three?

Compare the Unicode ellipsis character (…) with three period characters (...) — typographic differences, CSS text-overflow, and when each is appropriate.

Aug 29, 2023

Curly Quotes vs Straight Quotes: Typography's Most Common Mix-Up

Understand the difference between smart quotes (“ ”) and straight quotes (" ") — when to use each, code vs prose, and auto-conversion pitfalls.

Aug 15, 2023

En Dash vs Em Dash: When to Use – and —

Learn the difference between en dash (–) and em dash (—) — usage rules, typing methods, HTML entities, and CSS implementation.

Aug 1, 2023

Grapheme Clusters: Why String Length Is More Complicated Than You Think

Understand grapheme clusters — why 'café' can be 4 or 5 code points, why emoji have .length 2+ in JavaScript, and how to count what users actually see.

Jun 20, 2023

Code Point vs Character vs Glyph: The Three Levels of Text

Understand the three levels of text representation — code points (numbers), characters (abstract identities), and glyphs (visual shapes in fonts).

May 2, 2023

What Is a Code Point? Understanding Unicode's U+ Notation

Learn what Unicode code points are — the U+ notation system, how code points differ from characters and glyphs, and how to find any character's code point.

Apr 4, 2023