What Is a Unicode Version?
A Unicode version is a numbered, dated release of the Unicode Standard that may add new characters, introduce new scripts, update character properties, define new algorithms, and publish revised data files. Each version represents a stable snapshot of the standard — once a version is released, its character assignments are permanent and immutable. Future versions may add new characters, but existing code point assignments never change or disappear.
Versioning follows a simple major.minor numbering scheme: Unicode 1.0, 2.0, 3.0, and so on for major releases, with minor point releases (e.g., 6.3, 8.0) for smaller additions.
Version History and Milestones
| Version | Year | Highlights |
|---|---|---|
| 1.0 | 1991 | Initial standard; ~7,000 characters |
| 2.0 | 1996 | Surrogate pairs introduced; Hangul restructured |
| 3.0 | 1999 | Extended to BMP capacity |
| 4.0 | 2003 | Supplementary planes opened; Linear B |
| 5.0 | 2006 | N'Ko, Balinese, Phoenician |
| 6.0 | 2010 | First emoji encoded (722 emoji) |
| 7.0 | 2014 | Emoji skin tone modifiers |
| 8.0 | 2015 | 5 skin tone modifiers; Linear A |
| 9.0 | 2016 | Emoji 3.0 (72 new emoji) |
| 10.0 | 2017 | Bitcoin sign (₿); 56 new emoji |
| 11.0 | 2018 | Copyleft symbol; 66 new emoji |
| 12.0 | 2019 | 61 new emoji; Elymaic script |
| 13.0 | 2020 | 55 new emoji; Chorasmian script |
| 14.0 | 2021 | 37 new emoji; Vithkuqi script |
| 15.0 | 2022 | 20 new emoji; 4,489 CJK Extension I |
| 15.1 | 2023 | 627 CJK Extension I |
| 16.0 | 2024 | New scripts; additional emoji |
Querying Unicode Version in Code
import unicodedata
import sys
# Python's Unicode version
print(unicodedata.unidata_version) # e.g., '15.1.0'
print(sys.version) # Python version
# Check when a character was introduced (requires external data)
# The 'unicodedata' module doesn't expose version per character,
# but 'age' property is available via the 'regex' module
import regex
# Age property: the Unicode version in which the character was first assigned
pattern = regex.compile(r'\p{Age=6.0}') # Characters assigned in Unicode 6.0+
// JavaScript does not expose the Unicode version directly,
// but you can test for feature support
// ES2018: Unicode property escapes (requires engine supporting Unicode 10+)
try {
const re = /\p{Emoji}/u;
console.log('Unicode property escapes supported');
} catch (e) {
console.log('Not supported');
}
// Test if a specific character is recognized
const bitcoinSign = '\u20BF'; // ₿ added in Unicode 10.0
console.log(bitcoinSign); // ₿ (if platform supports Unicode 10+)
Version Stability and Forward Compatibility
The Unicode stability policy guarantees: - No character removal: Once assigned, a code point is permanent. - No code point reassignment: A code point's meaning never changes. - Character name immutability: Formal character names are fixed (corrections go to aliases). - Normalization stability: NFC/NFD/NFKC/NFKD of existing text never changes between versions.
This stability enables developers to safely hardcode Unicode code points in software without fear of breakage across platform or library updates.
Version Detection for Emoji Support
Emoji support detection is a practical concern for apps targeting diverse devices. New emoji in Unicode 16.0 will not render on devices running older OS versions. Libraries like emoji-data and unicode-emoji-json provide per-emoji version metadata for building graceful fallbacks.