SymbolFYI

Unicode Version

Unicode Standard
Definition

Numbered releases of the Unicode Standard (e.g., 16.0), each adding new characters, scripts, and emoji.

What Is a Unicode Version?

A Unicode version is a numbered, dated release of the Unicode Standard that may add new characters, introduce new scripts, update character properties, define new algorithms, and publish revised data files. Each version is a stable snapshot of the standard: once a version is released, its character assignments are permanent. Later versions may add characters, but under the stability policy (in force since Unicode 2.0) existing code point assignments are never changed or removed.

Version numbers have three fields, major.minor.update (for example, 15.1.0), and are commonly shortened to major.minor. Major releases (1.0, 2.0, 3.0, and so on) carry the largest additions, while minor releases (e.g., 6.3, 15.1) make smaller ones.
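
Because version numbers are dotted integers, comparing them as plain strings gives wrong answers once a field reaches two digits. The following is a minimal sketch of a numeric comparison; the helper name parse_unicode_version is hypothetical and not part of any library.

```python
import unicodedata

def parse_unicode_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '15.1.0' into (15, 1, 0)."""
    return tuple(int(part) for part in version.split("."))

# Tuples compare numerically, field by field.
print(parse_unicode_version("9.0.0") < parse_unicode_version("10.0.0"))  # True

# Raw strings compare character by character, so '9' sorts after '1'.
print("9.0.0" < "10.0.0")  # False

# Version of the Unicode database bundled with the running interpreter.
current = parse_unicode_version(unicodedata.unidata_version)
print(current >= (6, 0))
```

Tuples of different lengths still compare sensibly here, so (15, 1, 0) can be checked against a two-field threshold such as (6, 0).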

Version History and Milestones

Version  Year  Highlights
1.0      1991  Initial standard; ~7,000 characters
2.0      1996  Surrogate pairs introduced; Hangul restructured
3.0      1999  About 49,000 characters, all within the BMP
4.0      2003  Linear B, Cypriot, Ugaritic (supplementary planes were first populated in 3.1, 2001)
5.0      2006  N'Ko, Balinese, Phoenician
6.0      2010  First emoji encoded (722 emoji)
7.0      2014  About 250 new emoji; Linear A
8.0      2015  5 skin tone modifiers; Cherokee lowercase letters
9.0      2016  Emoji 3.0 (72 new emoji)
10.0     2017  Bitcoin sign (₿); 56 new emoji
11.0     2018  Copyleft symbol; 66 new emoji
12.0     2019  61 new emoji; Elymaic script
13.0     2020  55 new emoji; Chorasmian script
14.0     2021  37 new emoji; Vithkuqi script
15.0     2022  20 new emoji; 4,192 CJK Extension H ideographs
15.1     2023  627 new characters, including 622 CJK Extension I ideographs
16.0     2024  7 new scripts (including Garay and Todhri); additional emoji

Querying Unicode Version in Code

import sys
import unicodedata

# Version of the Unicode Character Database bundled with this Python build
print(unicodedata.unidata_version)  # e.g., '15.1.0'
print(sys.version)                  # Python interpreter version, for reference

# The standard-library 'unicodedata' module does not expose the version in
# which a character was introduced, but the 'Age' property is available via
# the third-party 'regex' module (pip install regex).
import regex

# Age is the Unicode version in which a code point was first assigned.
# Age=6.0 targets characters first assigned in Unicode 6.0, not "6.0 and later";
# some engines that follow UTS #18 instead treat it as "6.0 or earlier".
pattern = regex.compile(r'\p{Age=6.0}')
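
// JavaScript does not expose the Unicode version directly,
// but you can test for feature support.

// ES2018: Unicode property escapes. Build the pattern with the RegExp
// constructor: an unsupported regex literal is a parse-time SyntaxError
// that a try/catch in the same script cannot intercept.
try {
  const re = new RegExp('\\p{Emoji}', 'u');
  console.log('Unicode property escapes supported');
} catch (e) {
  console.log('Not supported');
}

// Test whether the engine's Unicode data knows a specific character.
// Printing alone proves nothing: whether ₿ displays depends on installed fonts.
const bitcoinSign = '\u20BF';  // ₿ added in Unicode 10.0
const isCurrency = new RegExp('\\p{Sc}', 'u').test(bitcoinSign);
console.log(isCurrency);  // true if the engine's data is Unicode 10.0 or later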
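
With the Python standard library alone, one rough way to ask whether a given database version knows a character is to check its general category, since unassigned code points report 'Cn'. The sketch below assumes a hypothetical helper named is_assigned. It relies on unicodedata.ucd_3_2_0, the frozen Unicode 3.2 database that ships with the standard library, as a second database to compare against.

```python
import unicodedata

def is_assigned(char: str, db=unicodedata) -> bool:
    """Return True if the given Unicode database assigns this code point.

    Unassigned code points (and noncharacters) report general category 'Cn'.
    """
    return db.category(char) != "Cn"

bitcoin = "\u20BF"  # BITCOIN SIGN, added in Unicode 10.0

# Against the database bundled with the running interpreter.
print(unicodedata.unidata_version, is_assigned(bitcoin))

# Against the frozen Unicode 3.2 database kept in the standard library;
# U+20BF did not exist yet in that version.
old = unicodedata.ucd_3_2_0
print(old.unidata_version, is_assigned(bitcoin, old))  # 3.2.0 False
```

This only reports what the interpreter's data files contain. It says nothing about whether the platform has a font that can draw the character.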

Version Stability and Forward Compatibility

The Unicode stability policy guarantees:
- No character removal: Once assigned, a code point is permanent.
- No code point reassignment: A code point's meaning never changes.
- Character name immutability: Formal character names are fixed (corrections go to aliases).
- Normalization stability: NFC/NFD/NFKC/NFKD of existing text never changes between versions.

This stability enables developers to safely hardcode Unicode code points in software without fear of breakage across platform or library updates.
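
As a small illustration of what these guarantees mean in practice, the sketch below hardcodes a code point and checks its formal name and normalization forms. The constant name E_ACUTE is chosen for this example only. The checks should give the same results on any Python version, because the policy forbids these values from changing.

```python
import unicodedata

E_ACUTE = "\u00E9"  # hardcoded code point for é

# Name immutability: the formal name of an assigned character is fixed.
print(unicodedata.name(E_ACUTE))  # LATIN SMALL LETTER E WITH ACUTE

# Normalization stability: 'e' + COMBINING ACUTE ACCENT composes to U+00E9
# under NFC, and U+00E9 decomposes back to the same pair under NFD.
decomposed = "e\u0301"
print(unicodedata.normalize("NFC", decomposed) == E_ACUTE)  # True
print(unicodedata.normalize("NFD", E_ACUTE) == decomposed)  # True
```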

Version Detection for Emoji Support

Emoji support detection is a practical concern for apps targeting diverse devices. Emoji added in a recent version such as Unicode 16.0 generally do not render on devices whose OS and fonts predate that version, and typically appear as a missing-glyph box instead. Libraries like emoji-data and unicode-emoji-json provide per-emoji version metadata for building graceful fallbacks.
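
A fallback based on version metadata might look like the sketch below. Everything in it is illustrative: the EMOJI_VERSION table is hand-written rather than loaded from either library, the render function is hypothetical, and the platform's supported Unicode version is assumed to be known already (how to detect it depends on the platform).

```python
# Hypothetical, hand-written metadata for illustration only; a real app would
# load this from a dataset such as emoji-data or unicode-emoji-json.
EMOJI_VERSION = {
    "\U0001F600": (6, 1),   # grinning face
    "\U0001FAE0": (14, 0),  # melting face
    "\U0001FAE8": (15, 0),  # shaking face
}

def render(emoji: str, platform_version: tuple[int, int], fallback: str) -> str:
    """Return the emoji if the platform's Unicode version covers it, else fallback."""
    introduced = EMOJI_VERSION.get(emoji)
    if introduced is not None and introduced <= platform_version:
        return emoji
    return fallback

melting = "\U0001FAE0"
print(render(melting, (13, 0), "[melting face]"))             # [melting face]
print(render(melting, (15, 1), "[melting face]") == melting)  # True
```

Emoji missing from the table fall back as well, which errs on the side of showing readable text rather than a missing-glyph box.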
