SymbolFYI

Tofu (Missing Glyph)

Typography
Definition

The empty rectangle (□) displayed when a font cannot render a character, named for its tofu-like appearance.

Tofu is a colloquial term in typography and font engineering for the blank rectangle that appears when a font cannot render a particular character -- the character has no entry in the font's character map (cmap), so no glyph is available for it. The name is a playful reference to the resemblance between the blank rectangle and a block of tofu. Google named its comprehensive Unicode font family "Noto" (for "no tofu") to reflect the goal of eliminating these boxes.

The .notdef Glyph

Every valid OpenType/TrueType font must contain a special glyph at index 0, conventionally named .notdef. This is the glyph that appears when the font is asked to render a character it does not have. The .notdef glyph is typically a rectangle, or a rectangle containing a question mark or an X. It is what you see when the text renderer has no fallback font left to try.
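
The lookup described above can be sketched in a few lines. This is a minimal illustration, not a real font parser: the cmap contents and glyph indices are made-up sample data, and glyphIndexFor is a hypothetical helper name.

```javascript
// Hypothetical, hand-written character map: code point -> glyph index.
// Real fonts store this mapping in the binary `cmap` table.
const NOTDEF_GLYPH_INDEX = 0; // glyph 0 is .notdef

const cmap = new Map([
  [0x41, 1], // LATIN CAPITAL LETTER A
  [0x42, 2], // LATIN CAPITAL LETTER B
  [0xe9, 3], // LATIN SMALL LETTER E WITH ACUTE
]);

// Return the glyph index for a character, or 0 (.notdef) if it is unmapped.
function glyphIndexFor(char, characterMap) {
  const codePoint = char.codePointAt(0);
  return characterMap.get(codePoint) ?? NOTDEF_GLYPH_INDEX;
}

console.log(glyphIndexFor('A', cmap));      // 1
console.log(glyphIndexFor('\u5b57', cmap)); // 0, so .notdef would be drawn
```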

The appearance is not standardized -- different fonts implement their .notdef glyph differently. Common forms include:

- Plain empty rectangle
- Rectangle with an X or question mark
- Small box with the hex code of the missing character (drawn by the renderer itself, as Firefox does, rather than stored in the font)
- Just whitespace (invisible)

Why Tofu Appears

Primary font lacks the glyph
  -> System fallback chain consulted
    -> No fallback font found for this Unicode block
      -> "Last Resort" font consulted
        -> .notdef glyph rendered = TOFU
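
The chain above can be modeled as a simple ordered search. The sketch below is a simplification under stated assumptions: the font names and coverage ranges are invented, the Last Resort step is omitted, and real text engines make this decision per script and per run rather than per character.

```javascript
// Hypothetical fonts, each described only by the code points it covers.
const fallbackChain = [
  { name: 'PrimaryFont', covers: (cp) => cp <= 0x024f },                  // Latin only
  { name: 'FallbackCJK', covers: (cp) => cp >= 0x4e00 && cp <= 0x9fff }, // CJK ideographs
];

// Walk the chain in order; report .notdef (tofu) when no font covers the character.
function resolveFont(char, chain) {
  const codePoint = char.codePointAt(0);
  const match = chain.find((font) => font.covers(codePoint));
  return match ? match.name : '.notdef';
}

console.log(resolveFont('A', fallbackChain));         // PrimaryFont
console.log(resolveFont('\u5b57', fallbackChain));    // FallbackCJK
console.log(resolveFont('\u{1F9CC}', fallbackChain)); // .notdef
```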

Common situations that produce tofu:

- Emoji on older operating systems without emoji fonts
- CJK characters when no East Asian fonts are installed
- Mathematical symbols in fonts without math coverage
- Rare scripts (Tibetan, Buginese, Cuneiform) on systems without specialized fonts
- Newly assigned Unicode characters before font vendors add support

The Replacement Character vs Tofu

U+FFFD  REPLACEMENT CHARACTER

The replacement character (U+FFFD) is often confused with tofu but is distinct. U+FFFD is an actual Unicode character used to indicate decoding errors -- when a byte sequence cannot be interpreted as valid Unicode. Tofu (the .notdef glyph) indicates a rendering failure -- valid Unicode that the current font cannot display.

// U+FFFD appears when decoding fails
const decoder = new TextDecoder('utf-8', { fatal: false });
const badBytes = new Uint8Array([0xFF, 0xFE]);
decoder.decode(badBytes); // Contains U+FFFD

// A valid code point that renders as tofu is different
const rareChar = String.fromCodePoint(0x1F9CC); // Valid emoji, may show as tofu on old systems
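
To tell the two failure modes apart in code, one option is to decode strictly first. The sketch below assumes the input is meant to be UTF-8; inspectBytes is an illustrative helper name, not part of any API.

```javascript
// Classify a byte sequence: did UTF-8 decoding fail, or did it succeed
// (in which case any box on screen is a font problem, not a data problem)?
function inspectBytes(bytes) {
  try {
    // fatal: true makes decode() throw a TypeError instead of inserting U+FFFD
    const text = new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return { ok: true, text };
  } catch (err) {
    // Decode again leniently to see where the replacement characters land
    const text = new TextDecoder('utf-8').decode(bytes);
    return { ok: false, text };
  }
}

// "Hi" followed by a byte that is never valid in UTF-8
const bad = inspectBytes(new Uint8Array([0x48, 0x69, 0xff]));
// The valid UTF-8 encoding of U+1F9CC
const good = inspectBytes(new TextEncoder().encode('\u{1F9CC}'));

console.log(bad.ok, bad.text === 'Hi\uFFFD'); // false true
console.log(good.ok);                         // true
```

If good.text still shows as a box on screen, the data is intact and the problem lies with the available fonts.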

Detecting Missing Glyphs

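// Canvas-based heuristic: compare against a code point that is assumed to have no glyph.
// Unreliable where missing glyphs are drawn as per-character hex boxes (e.g. Firefox).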
function mightShowTofu(char, fontFamily) {
  const canvas = document.createElement('canvas');
  canvas.width = 50;
  canvas.height = 50;
  const ctx = canvas.getContext('2d');

  ctx.font = `20px "${fontFamily}"`;
  ctx.fillText(char, 0, 30);
  const charData = ctx.getImageData(0, 0, 50, 50).data;

  // Clear and draw U+100000, a private-use code point that fonts normally leave unmapped
  ctx.clearRect(0, 0, 50, 50);
  ctx.fillText(String.fromCodePoint(0x100000), 0, 30);
  const missingData = ctx.getImageData(0, 0, 50, 50).data;

  // Low pixel difference = same glyph rendered = likely tofu
  let diff = 0;
  for (let i = 0; i < charData.length; i++) {
    diff += Math.abs(charData[i] - missingData[i]);
  }
  return diff < 100;
}

Preventing Tofu in Web Design

/* Use system emoji font stack for emoji */
.emoji {
  font-family: 'Twemoji Mozilla', 'Apple Color Emoji',
               'Segoe UI Emoji', 'Noto Color Emoji',
               'Android Emoji', sans-serif;
}

/* Use Noto for broad Unicode coverage */
/* @import url('https://fonts.googleapis.com/css2?family=Noto+Sans&display=swap'); */

/* Restrict a font face to specific code points with unicode-range */
@font-face {
  font-family: 'SafeFont';
  src: local('Noto Sans'), url('noto-sans.woff2');
  unicode-range: U+0000-FFFF; /* face is used only for BMP code points */
}
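
A unicode-range descriptor limits which code points a font face is used for, so characters outside the range still go through normal fallback. The sketch below checks a character against a simplified range value; parseUnicodeRange and isInUnicodeRange are hypothetical helpers, and wildcard forms such as U+4?? are deliberately not handled.

```javascript
// Parse a simplified unicode-range value: comma-separated single code points
// ("U+26") or ranges ("U+0000-FFFF"). Returns an array of [start, end] pairs.
function parseUnicodeRange(value) {
  return value.split(',').map((part) => {
    const [start, end = start] = part.trim().replace(/^U\+/i, '').split('-');
    return [parseInt(start, 16), parseInt(end, 16)];
  });
}

// True if the first code point of `char` falls inside any listed range.
function isInUnicodeRange(char, value) {
  const codePoint = char.codePointAt(0);
  return parseUnicodeRange(value).some(
    ([low, high]) => codePoint >= low && codePoint <= high
  );
}

console.log(isInUnicodeRange('A', 'U+0000-FFFF'));         // true
console.log(isInUnicodeRange('\u{1F9CC}', 'U+0000-FFFF')); // false: outside the BMP
```

With the range U+0000-FFFF, a face is never selected for supplementary-plane characters such as U+1F9CC, so the emoji stack or another fallback still has to cover them.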

The Last Resort Font

Apple ships a font called "Last Resort", and the Unicode Consortium maintains a reference implementation with the same purpose: display a representative glyph for every Unicode block when no other font covers that character. Rather than a blank box, it shows a small pictogram representing the script or block of the missing character -- allowing readers to understand what kind of character is missing even when it cannot be fully rendered.
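
The per-block idea can be illustrated with a small lookup table. This is only a sketch of the concept, not how the Last Resort font is implemented: sampleBlocks lists just a handful of Unicode blocks, and placeholderFor is a hypothetical helper that returns a text label where the real font would show a pictogram.

```javascript
// A few Unicode blocks with their code point ranges (a small sample, not the full list).
const sampleBlocks = [
  { name: 'Basic Latin', start: 0x0000, end: 0x007f },
  { name: 'Tibetan', start: 0x0f00, end: 0x0fff },
  { name: 'Buginese', start: 0x1a00, end: 0x1a1f },
  { name: 'CJK Unified Ideographs', start: 0x4e00, end: 0x9fff },
  { name: 'Cuneiform', start: 0x12000, end: 0x123ff },
];

// Return a block-level placeholder label for a character.
function placeholderFor(char) {
  const codePoint = char.codePointAt(0);
  const block = sampleBlocks.find(
    (b) => codePoint >= b.start && codePoint <= b.end
  );
  return block ? `[${block.name}]` : '[unknown block]';
}

console.log(placeholderFor('\u0f40'));    // [Tibetan]
console.log(placeholderFor('\u{12000}')); // [Cuneiform]
```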
