SymbolFYI

Variation Selector

Unicode Standard
Definition

Unicode characters (most commonly U+FE00–U+FE0F) that select an alternate appearance for the preceding character, including text versus emoji presentation.

What Is a Variation Selector?

A variation selector is a Unicode character used to choose between alternate visual representations (variants) of a preceding character. Variation selectors are invisible — they contribute no visible content themselves — but they signal to the rendering system which specific glyph or presentation style should be used for the character that precedes them.

Variation selectors are used primarily for two purposes: emoji vs. text presentation and standardized glyph variants in CJK and other scripts.

Variation Selector Ranges

| Range | Count | Name | Primary Use |
|---|---|---|---|
| U+FE00–U+FE0F | 16 | Variation Selectors 1–16 (VS1–VS16) | Emoji/text presentation, standardized glyph variants |
| U+E0100–U+E01EF | 240 | Variation Selectors Supplement (VS17–VS256) | CJK ideographic variation sequences (Ideographic Variation Database) |
| U+180B–U+180D, U+180F | 4 | Mongolian Free Variation Selectors (FVS1–FVS4) | Mongolian script glyph variants |
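
The ranges above can be turned into a small lookup. The following is a minimal Python sketch, not a reference implementation: the helper name `selector_label` and the `VS`/`FVS` label format are assumptions made for illustration. The Mongolian selectors are listed as U+180B–U+180D plus U+180F, because U+180E (Mongolian Vowel Separator) is not a variation selector.

```python
from typing import Optional

# (first, last, number of the first selector in the range, label prefix).
# The Mongolian selectors are split in two because U+180E is not a selector.
_RANGES = [
    (0xFE00, 0xFE0F, 1, "VS"),
    (0xE0100, 0xE01EF, 17, "VS"),
    (0x180B, 0x180D, 1, "FVS"),
    (0x180F, 0x180F, 4, "FVS"),
]


def selector_label(ch: str) -> Optional[str]:
    """Return a label such as 'VS16' or 'FVS2', or None if ch is not a selector."""
    code = ord(ch)
    for first, last, start, prefix in _RANGES:
        if first <= code <= last:
            return f"{prefix}{start + code - first}"
    return None


print(selector_label("\uFE0F"))      # VS16
print(selector_label("\U000E0100"))  # VS17
print(selector_label("\u180E"))      # None
```

Keeping the ranges in a table rather than in chained `if` statements makes it easy to see that the supplement continues the numbering at 17.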

VS-15 and VS-16: Emoji Presentation

The most commonly encountered variation selectors for web developers are VS-15 (U+FE0E) and VS-16 (U+FE0F):

  • VS-15 (text presentation selector): Requests the text/symbol rendering of the preceding character.
  • VS-16 (emoji presentation selector): Requests the colorful emoji rendering.

Python:

print('\u2665')           # ♥   default presentation (usually text)
print('\u2665\uFE0F')     # ♥️  emoji presentation
print('\u2665\uFE0E')     # ♥︎   explicit text presentation

print('\u2764')           # ❤   default presentation
print('\u2764\uFE0F')     # ❤️  emoji presentation

# Detecting variation selectors (VS1–VS16) in a string
text = '\u2665\uFE0F'
for ch in text:
    code = ord(ch)
    if 0xFE00 <= code <= 0xFE0F:
        vs_number = code - 0xFE00 + 1
        print(f'Variation Selector-{vs_number} found (U+{code:04X})')
# Output: Variation Selector-16 found (U+FE0F)

JavaScript:

const heart = '\u2665';            // ♥ text
const heartEmoji = '\u2665\uFE0F'; // ♥️ emoji

console.log(heart.length);        // 1
console.log(heartEmoji.length);   // 2 (base + VS-16)

// Strip variation selectors (VS1–VS16 and the supplement, VS17–VS256).
// The u flag is required for the \u{...} code point escapes.
function stripVariationSelectors(str) {
  return str.replace(/[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]/gu, '');
}
console.log(stripVariationSelectors(heartEmoji) === heart); // true

CJK Glyph Variants

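For CJK characters, variation selectors choose among glyph variants: forms that carry the same meaning but differ in how strokes are drawn. Two mechanisms exist. Standardized variation sequences use selectors from the VS1–VS16 block and are listed in the Unicode StandardizedVariants.txt data file. Ideographic variation sequences use VS17–VS256 and are registered in the Ideographic Variation Database (IVD).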

For example, U+845B (葛) has more than one registered glyph form, reflecting different regional typographic conventions. Each form is selected by following the base character with a specific variation selector from the supplement, such as U+E0100.
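
As a rough illustration of how such a sequence behaves in code, the Python sketch below appends U+E0100 (Variation Selector-17) to U+845B. The pairing is used here only as an example; whether a given sequence is actually registered should be checked against the published data. The helper `strip_ideographic_selectors` is a hypothetical name introduced for this sketch.

```python
import unicodedata

base = "\u845B"        # 葛
vs17 = "\U000E0100"    # first selector in the Variation Selectors Supplement
sequence = base + vs17

print(unicodedata.name(vs17))                   # VARIATION SELECTOR-17
print(len(sequence))                            # 2 code points
print(len(sequence.encode("utf-16-le")) // 2)   # 3 UTF-16 code units
print(sequence == base)                         # False


def strip_ideographic_selectors(text: str) -> str:
    """Remove selectors in U+E0100..U+E01EF, leaving base characters intact."""
    return "".join(ch for ch in text if not 0xE0100 <= ord(ch) <= 0xE01EF)


print(strip_ideographic_selectors(sequence) == base)  # True
```

Because supplement selectors lie outside the Basic Multilingual Plane, each one occupies two UTF-16 code units, which matters for length checks in languages whose string length counts code units.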

Variation Sequences

A variation sequence is a base character followed by a variation selector. Not every base + VS combination is valid: only sequences explicitly listed in the Unicode data files (StandardizedVariants.txt and emoji-variation-sequences.txt) or registered in the Ideographic Variation Database are defined. An unregistered combination is effectively a no-op: conforming renderers ignore the selector and display the base character as usual.
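
One way to act on this is to extract every base + selector pair from a string and compare it with a set of known sequences. The sketch below is a minimal Python illustration under stated assumptions: `SAMPLE_REGISTERED` is a tiny hand-written stand-in for the real registry (production code would load the full data files), and the function names are hypothetical.

```python
from typing import Iterator, List, Set, Tuple


def is_selector(ch: str) -> bool:
    """True for VS1-VS16 and VS17-VS256."""
    code = ord(ch)
    return 0xFE00 <= code <= 0xFE0F or 0xE0100 <= code <= 0xE01EF


def variation_sequences(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (base, selector) for each selector that directly follows a non-selector."""
    for prev, cur in zip(text, text[1:]):
        if is_selector(cur) and not is_selector(prev):
            yield prev, cur


# Hypothetical sample only: a handful of (base, selector) code point pairs.
SAMPLE_REGISTERED: Set[Tuple[int, int]] = {
    (0x2665, 0xFE0E), (0x2665, 0xFE0F),
    (0x2764, 0xFE0E), (0x2764, 0xFE0F),
}


def unregistered(text: str,
                 registered: Set[Tuple[int, int]] = SAMPLE_REGISTERED
                 ) -> List[Tuple[str, str]]:
    """Return the pairs in text that are absent from the given registry set."""
    return [(b, s) for b, s in variation_sequences(text)
            if (ord(b), ord(s)) not in registered]


print(unregistered("\u2665\uFE0F A\uFE0F"))  # [('A', '\ufe0f')]
```

Passing the registry as a parameter keeps the check independent of where the sequence data comes from.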

Practical Notes for Developers

When processing user-generated text or normalizing strings for storage and comparison:

  1. Be aware that U+2665 and U+2665 U+FE0F are semantically the same character with different presentation hints.
  2. For full-text search and deduplication, consider stripping VS-15 and VS-16 before indexing.
  3. For display, respect the original variation selector to honor the user's intended presentation.
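
The notes above could be combined into a small indexing helper. The following Python sketch assumes a design where the original string is stored unchanged for display and a separate key is derived for search; `search_key` is a hypothetical name. It also shows that Unicode normalization alone does not remove variation selectors, so the stripping step has to be explicit.

```python
import unicodedata

PRESENTATION_SELECTORS = {"\uFE0E", "\uFE0F"}  # VS-15 and VS-16


def search_key(text: str) -> str:
    """Build a comparison key: NFC-normalize, then drop VS-15 and VS-16."""
    normalized = unicodedata.normalize("NFC", text)
    return "".join(ch for ch in normalized if ch not in PRESENTATION_SELECTORS)


original = "I \u2665\uFE0F Unicode"   # keep this value for display

# NFC leaves the selector in place, so normalization is not enough on its own.
print(unicodedata.normalize("NFC", original) == original)       # True

# Both spellings map to the same key for search and deduplication.
print(search_key(original) == search_key("I \u2665 Unicode"))   # True
```

Only the derived key loses the selector; the stored text still carries the user's intended presentation.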
