SymbolFYI

Variation Selectors: How Unicode Controls Text vs Emoji Display

Reference Sep 19, 2023

Some Unicode characters exist in two visual modes: a simple black-and-white text rendering, and a full-color emoji rendering. The mechanism that controls which version appears is completely invisible — an variation selector, a zero-width code point that modifies the display of the character before it. Understanding variation selectors explains why ☺ and ☺️ look different despite starting with the same code point, and why certain emoji may render unexpectedly depending on platform and context.

What Are Variation Selectors?

A variation selector is a Unicode code point that, when placed immediately after another character, signals a specific presentation preference to the renderer. Variation selectors produce no visible output of their own — they are invisible modifiers.

Unicode defines 256 variation selectors in total across two blocks:

  • Variation Selectors (U+FE00–U+FE0F): VS1 through VS16, 16 selectors in the BMP
  • Variation Selectors Supplement (U+E0100–U+E01EF): VS17 through VS256, 240 selectors in Plane 14

For most developers, the two that matter most are the last two in the first block:

Selector Code Point Name Effect
VS15 U+FE0E VARIATION SELECTOR-15 Request text presentation
VS16 U+FE0F VARIATION SELECTOR-16 Request emoji presentation

Text vs Emoji Presentation

Many Unicode characters that predate emoji have both a text and emoji rendering. The text form is a monochrome glyph from the character's original encoding context. The emoji form is a colorful, platform-styled image.

Examples

Take U+2603 SNOWMAN (☃). On its own, most platforms render it as a monochrome text glyph. Add VS16 and you may get a colorful emoji snowman:

☃        U+2603                  — text glyph (monochrome)
☃️       U+2603 U+FE0F           — emoji presentation requested
☃︎       U+2603 U+FE0E           — text presentation explicitly requested

Similarly, U+263A WHITE SMILING FACE:

☺        U+263A alone            — text form
☺️       U+263A + VS16           — emoji presentation

And U+2764 HEAVY BLACK HEART:

        U+2764 alone             text form (simple black heart)
❤️       U+2764 + U+FE0F          emoji presentation (colored)
❤️‍🔥    U+2764 + VS16 + ZWJ + 🔥  heart on fire (ZWJ sequence)

Default presentation

Every emoji-capable character has a default presentation — what it looks like when no variation selector is present. The Unicode Emoji data specifies which characters have an "emoji default" vs a "text default":

  • Characters marked Emoji_Presentation=Yes display as emoji by default (e.g., 😀 U+1F600)
  • Characters marked Emoji_Presentation=No but Emoji=Yes display as text by default and can opt into emoji with VS16 (e.g., ☃ U+2603)

In practice, platform behavior varies. iOS and Android tend to render emoji-capable characters as emoji even without VS16. Desktop operating systems and web browsers are more conservative and respect the default presentation more strictly.

How Variation Selectors Appear in Data

Because variation selectors are invisible, they are one of the most common sources of mysterious text handling bugs. A string that looks 4 characters long may actually be 6 code points if it contains two variation selectors.

Detection in JavaScript

const heart = '❤️';  // U+2764 + U+FE0F

heart.length           // 2 (two UTF-16 code units)
[...heart].length      // 2 (two code points)

// Check if a character has a variation selector
function hasVariationSelector(str) {
  for (const cp of str) {
    const n = cp.codePointAt(0);
    if ((n >= 0xFE00 && n <= 0xFE0F) || (n >= 0xE0100 && n <= 0xE01EF)) {
      return true;
    }
  }
  return false;
}

hasVariationSelector('❤️')   // true
hasVariationSelector('❤')    // false
hasVariationSelector('A')    // false

// Strip variation selectors
function stripVariationSelectors(str) {
  return str.replace(/[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]/gu, '');
}

stripVariationSelectors('❤️')  // '❤'

Detection in Python

import unicodedata

def has_variation_selector(s: str) -> bool:
    for ch in s:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
            return True
    return False

def strip_variation_selectors(s: str) -> str:
    return ''.join(
        ch for ch in s
        if not (0xFE00 <= ord(ch) <= 0xFE0F or 0xE0100 <= ord(ch) <= 0xE01EF)
    )

has_variation_selector('❤️')   # True
strip_variation_selectors('❤️')  # '❤'

cjk-ideographic-variation-sequences">CJK Ideographic Variation Sequences

Beyond emoji, the largest systematic use of variation selectors is for CJK Ideographic Variation Sequences (IVS). Chinese, Japanese, and Korean characters (kanji, hanzi) often have multiple historically distinct glyph forms — different stroke patterns, radical forms, or stylistic traditions — that are semantically the same character but visually distinct.

Unicode's standard unification policy merges characters with the same meaning across languages into a single code point (called Han unification). But for uses requiring a specific glyph variant — legal documents, personal names, historical texts — this is insufficient. IVS solves this by allowing a CJK character to be followed by VS17–VS256 to specify a particular variant.

The collection of permitted IVS pairs is maintained by the Ideographic Variation Database (IVD), organized into collections:

Collection Registered by Purpose
Adobe-Japan1 Adobe Print and typographic variants
Hanyo-Denshi Japan Japanese government use
Moji_Joho Japan Ministry of Economy variants
UTC Unicode Consortium Unicode-standardized variants

For example, U+8FBA (辺) followed by U+E0100 (VS17) selects a specific Japanese variant of that character used in personal names.

IVS is primarily relevant to Japanese publishing, government systems, and applications that handle personal names (which in Japan may use rare or archaic kanji variants). Most web applications do not need to handle IVS explicitly, but storing and transmitting text without stripping supplementary plane characters is important if you work with Japanese data.

Variation Selectors and CSS

CSS provides the font-variant-emoji property to control emoji presentation in web typography:

/* Request emoji presentation */
.emoji {
  font-variant-emoji: emoji;
}

/* Request text presentation */
.text-symbol {
  font-variant-emoji: text;
}

/* Use platform default */
.normal {
  font-variant-emoji: normal;
}

/* Respect Unicode variation selectors in the text */
.unicode-controlled {
  font-variant-emoji: unicode;
}

Browser support for font-variant-emoji is partial as of 2025. The unicode value is designed to preserve what the text itself specifies via variation selectors, rather than overriding it with CSS. This is generally the most semantically correct setting for content that may contain explicit variation selectors.

The emoji and text values are useful for decorative contexts where you always want one presentation regardless of what variation selectors are in the text.

Practical Impact on String Handling

Database storage

If you are storing user-submitted text in a database, variation selectors will be stored as part of the string. Two strings that look identical on screen may differ in the database if one has a VS16 and the other does not. This affects:

  • Exact-match queries: WHERE content = '❤' will not match '❤️'
  • UNIQUE constraints: Two rows could have ostensibly "identical" content
  • Full-text search: Indexing behavior varies by database and FTS engine

Normalize by stripping or standardizing variation selectors if exact matching is important.

Regular expressions

Regex patterns that match emoji may or may not account for variation selectors:

// Matches ❤ but not ❤️
/❤/.test('❤️')     // true (matches the base character, ignores VS16)

// VS16 aware match
/❤\uFE0F?/.test('❤️')  // true (optional VS16)
/❤\uFE0F?/.test('❤')   // true

// Unicode property escapes (modern environments)
/\p{Emoji}/u.test('❤')   // true
/\p{Emoji}/u.test('❤️')  // true

Clipboard behavior

When users copy text containing variation selectors from one application and paste it into another, the variation selectors are typically preserved. An application that does not understand variation selectors will receive and store them transparently (they are just code points), but may display the character unexpectedly if it renders them as text or emoji regardless.

Variation Selectors as Invisible Data

One security-adjacent concern: because variation selectors are invisible, they can be used to embed hidden data in text. A string that looks like "hello" could contain variation selectors between each visible character, encoding arbitrary data. This technique has been used for watermarking text.

For applications that process untrusted input, consider normalizing or stripping variation selectors from text if the specific glyph variant has no semantic importance in your context.

Use the SymbolFYI Unicode Lookup tool to inspect any character and see whether it has an emoji presentation, and use the Encoding Converter tool to examine the individual code units in a string that may contain invisible variation selectors.

Simbol Terkait

Glosarium Terkait

Alat Terkait

Panduan Lainnya