SymbolFYI

Unicode Plane

Unicode Standard
Tanım

A group of 65,536 consecutive code points. Unicode has 17 planes (0–16), with Plane 0 being the BMP.

What Is a Unicode Plane?

A plane in Unicode is a contiguous group of 65,536 (2¹⁶) code points. Unicode is organized into 17 planes numbered 0 through 16, providing a total of 17 × 65,536 = 1,114,112 code points. Each plane spans a range from U+XX0000 to U+XXFFFF, where XX is the plane number in hexadecimal (00 through 10).

The plane structure reflects the historical evolution of Unicode from a 16-bit system (one plane) to the current 21-bit system (17 planes).

The 17 Planes

Plane Range Name Contents
0 U+0000–FFFF Basic Multilingual Plane (BMP) Most living scripts, common symbols
1 U+10000–1FFFF Supplementary Multilingual Plane (SMP) Historic scripts, emoji, musical notation
2 U+20000–2FFFF Supplementary Ideographic Plane (SIP) Rare and historic CJK ideographs
3 U+30000–3FFFF Tertiary Ideographic Plane (TIP) CJK Extension G–H
4–13 U+40000–DFFFF Unassigned Reserved for future use
14 U+E0000–EFFFF Supplementary Special-purpose Plane Language tags, variation selectors
15 U+F0000–FFFFF Supplementary Private Use Area-A Application-specific
16 U+100000–10FFFF Supplementary Private Use Area-B Application-specific

Plane 0: Basic Multilingual Plane

The BMP contains the overwhelming majority of characters used in modern text worldwide. It was the original scope of Unicode before the standard was extended. All BMP characters are represented as single 16-bit code units in UTF-16.

Plane 1: Supplementary Multilingual Plane

The SMP is the most actively developed supplementary plane. It contains: - Historic scripts: Linear B, Cuneiform, Phoenician, Egyptian Hieroglyphs - Emoji: Emoticons, pictographs, and symbol blocks - Musical notation: Western musical symbols - Mathematical Alphanumeric Symbols: Bold, italic, script mathematical letters - Playing cards, dominos, chess symbols

Plane 2: Supplementary Ideographic Plane

The SIP houses CJK Extension B through F — rare, archaic, and dialect-specific Chinese characters needed by scholars and for processing historical texts.

Encoding Across Planes

# Python handles all planes natively
bmp_char = chr(0x4E2D)          # U+4E2D '中' — Plane 0
smp_emoji = chr(0x1F600)        # U+1F600 '😀' — Plane 1
sip_rare = chr(0x20000)         # U+20000 '𠀀' — Plane 2

print(f'Plane 0: {bmp_char}  (code point {hex(ord(bmp_char))})')
print(f'Plane 1: {smp_emoji}  (code point {hex(ord(smp_emoji))})')
print(f'Plane 2: {sip_rare}  (code point {hex(ord(sip_rare))})')

# Determine which plane a code point is in
cp = ord('😀')   # 128512
plane = cp >> 16
print(f'Plane: {plane}')  # 1
// JavaScript UTF-16: BMP chars are 1 code unit, others are 2 (surrogate pair)
const bmp = '\u4E2D';          // 1 code unit
const emoji = '\u{1F600}';    // 2 code units (surrogate pair)
console.log(bmp.length);       // 1
console.log(emoji.length);     // 2

// Determine plane from code point
const cp = emoji.codePointAt(0);  // 128512
const plane = cp >> 16;
console.log(plane);  // 1

The 17-Plane Limit

The maximum code point U+10FFFF is not arbitrary — it is the highest value that can be encoded using UTF-16 surrogate pairs without collision. The high surrogate range is U+D800U+DBFF (1,024 values) and the low surrogate range is U+DC00U+DFFF (1,024 values), providing 1,024 × 1,024 = 1,048,576 supplementary code points — exactly 16 additional planes above the BMP.

İlgili Semboller

İlgili Terimler

İlgili Araçlar

İlgili Kılavuzlar