SymbolFYI

ASCII

Encoding
Definition

American Standard Code for Information Interchange: a 7-bit encoding of 128 characters, including English letters, digits, punctuation, and control characters.

ASCII (American Standard Code for Information Interchange) is the foundational character encoding standard that underpins virtually all modern text encoding systems. First published in 1963, with the character set used today settled by the 1967 and 1968 revisions, ASCII defines a 7-bit encoding covering 128 characters -- enough to represent English letters, decimal digits, common punctuation, and a set of control characters.

The ASCII Table

ASCII assigns an integer value from 0 to 127 to each character, divided into two groups:

Control Characters (0-31 and 127): Non-printable characters used to control devices and formatting. Key examples:

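| Decimal | Hex  | Name | Purpose                 |
|---------|------|------|-------------------------|
| 0       | 0x00 | NUL  | Null character          |
| 9       | 0x09 | HT   | Horizontal tab          |
| 10      | 0x0A | LF   | Line feed (Unix newline)|
| 13      | 0x0D | CR   | Carriage return         |
| 27      | 0x1B | ESC  | Escape                  |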
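As a rough illustration of the table above, the following Python sketch writes each listed control character as an escape sequence and prints its code. The `controls` dictionary is an illustrative name, not part of any library.

```python
# Control characters written as Python escape sequences.
# The keys are the ASCII abbreviations from the table above.
controls = {
    "NUL": "\0",
    "HT": "\t",
    "LF": "\n",
    "CR": "\r",
    "ESC": "\x1b",
}

for name, ch in controls.items():
    # ord() gives the code; repr() shows the character without emitting it
    print(f"{name:<3} {ord(ch):>3} 0x{ord(ch):02X} {ch!r}")

# DEL (127) is also a control character, even though it sits after the printables
print(ord("\x7f"))  # 127

# str.isprintable() reports False for every control character
print(any(ch.isprintable() for ch in controls.values()))  # False
```

Using `repr()` here avoids sending the raw control character to the terminal, where ESC or CR could change how the output is displayed.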

Printable Characters (32-126):

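| Range                 | Decimal                      | Content                                  |
|-----------------------|------------------------------|------------------------------------------|
| Space                 | 32                           | The space character                      |
| Punctuation & symbols | 33-47, 58-64, 91-96, 123-126 | ! " # $ % & ' ( ) * + , - . / and others |
| Digits                | 48-57                        | 0-9                                      |
| Uppercase letters     | 65-90                        | A-Z                                      |
| Lowercase letters     | 97-122                       | a-z                                      |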
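The ranges above can be turned into a small lookup. The sketch below is one possible way to do it; `classify_ascii` is a hypothetical helper written for this example, not a standard-library function.

```python
from collections import Counter

def classify_ascii(code: int) -> str:
    """Return the ASCII group for a code in 0..127 (hypothetical helper)."""
    if not 0 <= code <= 127:
        raise ValueError("not an ASCII code")
    if code < 32 or code == 127:
        return "control"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    # Everything left over: 33-47, 58-64, 91-96, 123-126
    return "punctuation/symbol"

for ch in "A z7~\n":
    print(repr(ch), ord(ch), classify_ascii(ord(ch)))

# Count how many of the 128 codes fall into each group
counts = Counter(classify_ascii(code) for code in range(128))
print(counts["control"], counts["punctuation/symbol"])  # 33 32
```

The counts add up to 128: 33 control characters, 1 space, 10 digits, 26 uppercase letters, 26 lowercase letters, and 32 punctuation marks and symbols.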

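Note that uppercase A is 65 (0x41) and lowercase a is 97 (0x61) -- a difference of exactly 32 (0x20). Every letter pair differs only in that one bit, which is why toggling bit 5 (counting from bit 0) of a letter's code flips its case.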
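The case bit can be seen directly in the binary form of the codes. The following is a minimal sketch of the trick; `CASE_BIT` and `toggle_case` are names invented for this example. The guard matters because flipping the same bit on a non-letter gives an unrelated character (for instance `@` becomes a backtick).

```python
CASE_BIT = 0x20  # bit 5, value 32

def toggle_case(ch: str) -> str:
    """Flip the case of a single ASCII letter; leave anything else untouched."""
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ CASE_BIT)
    return ch

print(f"{ord('A'):07b}")  # 1000001
print(f"{ord('a'):07b}")  # 1100001
print(toggle_case("A"))   # a
print(toggle_case("a"))   # A
print("".join(toggle_case(c) for c in "Hello, World 42"))  # hELLO, wORLD 42
```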

Working with ASCII in Code

# Python: ord() and chr()
print(ord('A'))   # 65
print(chr(65))    # 'A'
print(ord('a') - ord('A'))  # 32 (case offset)

# Check if a string is pure ASCII
text = 'Hello'
print(text.isascii())    # True
print('café'.isascii())  # False (é is outside ASCII)

# Encode to ASCII bytes
print('Hello'.encode('ascii'))  # b'Hello'

// JavaScript: charCodeAt and fromCharCode
console.log('A'.charCodeAt(0));        // 65
console.log(String.fromCharCode(65));  // 'A'

// ASCII range check
function isAscii(str) {
  return /^[\x00-\x7F]*$/.test(str);
}
console.log(isAscii('Hello')); // true

ASCII's Role in Modern Encodings

ASCII's 128-character set maps directly onto the first 128 code points of Unicode (U+0000-U+007F). UTF-8, the dominant web encoding, encodes these 128 characters using a single byte identical to the ASCII value. This means any pure ASCII document is also byte-for-byte valid UTF-8 and Latin-1 -- a key reason ASCII compatibility remains so important. UTF-16 keeps the same code point values but stores each one in a two-byte code unit, so ASCII bytes are not valid UTF-16 as they stand.
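A quick way to check this byte-level compatibility is to encode the same ASCII-only string with several codecs and compare the results. This is a small sketch using Python's built-in codecs; the sample string is arbitrary, and the UTF-16 bytes are printed only for comparison.

```python
text = "Hello, ASCII!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark

# ASCII, UTF-8 and Latin-1 produce exactly the same bytes
print(ascii_bytes == utf8_bytes == latin1_bytes)  # True

# UTF-16 uses two bytes per character here, so the byte stream differs
print(len(ascii_bytes), len(utf16_bytes))  # 13 26
print(utf16_bytes[:4])                     # b'H\x00e\x00'

# Every character's Unicode code point equals its ASCII value (below 0x80)
print(all(ord(ch) < 0x80 for ch in text))  # True

# Bytes written as ASCII can be read back as UTF-8 unchanged
print(ascii_bytes.decode("utf-8") == text)  # True
```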

When working with network protocols (HTTP, SMTP, FTP), configuration files, and source code, you are almost always working with ASCII-compatible content, even if the encoding is officially declared as UTF-8.

Limitations

ASCII's 128-character limit excludes virtually all non-English text. It has no accented characters, no currency symbols beyond $, and no characters from any non-Latin script. This limitation drove the creation of extended encodings like Latin-1 (ISO 8859-1) and ultimately the Unicode standard, which aims to encode every character used by every human writing system.
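The limit shows up as soon as text leaves the 128-character set. The sketch below, using two arbitrary sample strings, shows Python refusing to encode an accented letter as ASCII, the lossy fallbacks available, and how Latin-1 and UTF-8 handle the same input.

```python
word = "café"   # contains U+00E9, which is outside ASCII
price = "€5"    # contains U+20AC, outside both ASCII and Latin-1

try:
    word.encode("ascii")
except UnicodeEncodeError as exc:
    # exc.object is the input string; start/end mark the offending slice
    print("ASCII cannot encode:", exc.object[exc.start:exc.end])

# Lossy fallbacks: substitute or drop what ASCII cannot represent
print(word.encode("ascii", errors="replace"))  # b'caf?'
print(word.encode("ascii", errors="ignore"))   # b'caf'

# Latin-1 fits é in one byte; UTF-8 needs two
print(word.encode("latin-1"))  # b'caf\xe9'
print(word.encode("utf-8"))    # b'caf\xc3\xa9'

# Latin-1 still has gaps, such as the euro sign
try:
    price.encode("latin-1")
except UnicodeEncodeError:
    print("Latin-1 cannot encode the euro sign")

print(price.encode("utf-8"))   # b'\xe2\x82\xac5'
```

Of the two fallbacks, `errors="replace"` at least leaves a visible marker where data was lost, whereas `errors="ignore"` silently drops it.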
