SymbolFYI

Emoji

Unicode Standard
التعريف

Pictographic symbols defined in Unicode, originally from Japanese mobile phones, now a universal visual communication system.

What Is Emoji?

Emoji (from Japanese 絵文字, e = picture, moji = character) are pictographic symbols encoded in Unicode that render as colorful images on modern platforms and operating systems. Originally developed by Shigetaka Kurita for NTT DoCoMo in 1999, emoji were standardized in Unicode 6.0 (2010) and have since grown to become one of the most actively expanded areas of the Unicode standard.

As of Unicode 16.0 (2024), the standard includes over 3,700 emoji across a variety of categories including faces, animals, food, travel, activities, objects, symbols, and flags.

Emoji Encoding

Emoji are not a separate encoding system — they are regular Unicode code points. Most emoji reside in supplementary planes (above U+FFFF):

  • Emoticons block: U+1F600U+1F64F (faces, hand gestures)
  • Misc Symbols and Pictographs: U+1F300U+1F5FF
  • Transport and Map: U+1F680U+1F6FF
  • Supplemental Symbols: U+1F900U+1F9FF
  • Symbols and Pictographs Extended-A: U+1FA00U+1FA6F

Some older emoji are in the BMP, including ✈ (U+2708), ☀ (U+2600), and ♥ (U+2665).

Emoji Presentation vs. Text Presentation

Many characters can be rendered as either a text glyph or a colorful emoji. Variation selectors control this:

  • U+FE0E (VS-15): Force text presentation
  • U+FE0F (VS-16): Force emoji presentation
print('\u2665')           # ♥ (text presentation)
print('\u2665\uFE0F')    # ♥️ (emoji presentation, colorful heart)
print('\u2665\uFE0E')    # ♥ (explicit text presentation)

Emoji Sequences

Many visible emoji are not single code points but sequences of multiple code points:

// Skin tone modifier
const thumbsUp = '\u{1F44D}';           // 👍
const thumbsUpDark = '\u{1F44D}\u{1F3FF}'; // 👍🏿 (dark skin tone)

// ZWJ sequence (family)
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';
// 👨‍👩‍👧

// Flag (regional indicator letters)
const japan = '\u{1F1EF}\u{1F1F5}';    // 🇯🇵

// Grapheme-cluster-aware length
const segmenter = new Intl.Segmenter();
console.log([...segmenter.segment(family)].length);  // 1
console.log(family.length);  // 8 (code units)
import emoji  # pip install emoji

text = 'I love Python! '
print(emoji.demojize(text))   # 'I love Python! :snake:'
print(emoji.emojize(':snake:'))  # snake emoji

# Count emoji in a string
print(emoji.emoji_count('  '))  # 3

Emoji Versions and Platform Differences

New emoji are added with each Unicode release, but platform support lags behind. Apple, Google, Microsoft, and others maintain their own emoji designs and update on their own schedules. A new Unicode emoji may appear as an empty box (tofu) on older devices.

The RGI (Recommended for General Interchange) emoji set defined by Unicode identifies which sequences are most widely supported across platforms. Developers building emoji pickers or text processors should reference the emoji-test.txt data file published with each Unicode release.

الرموز ذات الصلة

المصطلحات ذات الصلة

الأدوات ذات الصلة

الأدلة ذات الصلة