SymbolFYI

Font Fallback and Tofu: Why Characters Display as Empty Boxes

You have seen it before: a perfectly composed page suddenly interrupted by a row of small empty rectangles where characters should be. Or a symbol that renders beautifully on your MacBook, only to appear as a plain question mark on a colleague's Windows machine. These blank boxes have a name — tofu — and understanding why they appear is the first step to eliminating them.

What Is Tofu?

"Tofu" is typographic slang for the .notdef glyph: the fallback glyph a font renders when it has no specific glyph for a requested character. The name comes from the shape — a small rectangular block that resembles a cube of firm tofu.

Every font file contains a .notdef glyph (it is the only glyph that the OpenType spec mandates). When the rendering engine asks a font for the glyph corresponding to U+1F44D (👍), and that font has no emoji support, it returns .notdef. The result on screen is a rectangle, a question mark in a box, or a diamond with a question mark — the exact appearance depends on the font and operating system.

The OpenType specification recommends a few standard shapes for .notdef, but font designers are free to customize it. Some fonts draw an elegant open box; others draw a filled rectangle or a box containing a question mark. Some browsers and last-resort fonts instead draw a small box showing the hex code of the missing code point.

How Font Fallback Works

When the browser renders text, it does not ask a single font for every glyph. It walks through the font-family stack in order, trying each font for each character:

  1. The browser requests the glyph for character U+0041 (A) from the first font in the stack.
  2. If the font has that glyph, use it.
  3. If not (or if the glyph is .notdef), try the next font in the stack.
  4. If no font in the stack has the glyph, the operating system's font fallback mechanism is engaged.
  5. If the OS fallback also fails, the .notdef glyph from the first font in the stack is rendered — producing tofu.

This cascade means that the fonts named in a declaration like font-family: "Helvetica Neue", Arial, sans-serif cannot themselves render CJK characters, emoji, or mathematical symbols that none of them contain. Those characters reach the screen only if the OS fallback finds a font for them. The OS fallback handles many cases transparently (macOS will fall back to Hiragino for Japanese, Windows to Meiryo), but it is inconsistent across platforms and unpredictable for unusual characters.
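The lookup order described above can be sketched as a toy model. This is only an illustration: the makeFont and resolveFont helpers and the tiny coverage sets are assumptions made up for the example, and real engines consult each font's character map rather than a JavaScript Set.

```javascript
// Toy model: each "font" is a name plus the set of code points it covers.
// The coverage sets below are invented for illustration only.
function makeFont(name, chars) {
  return { name, coverage: new Set(Array.from(chars, (c) => c.codePointAt(0))) };
}

// Returns the name of the font that would supply the glyph, or ".notdef" for tofu.
function resolveFont(char, stack, systemFallback) {
  const cp = char.codePointAt(0);
  for (const font of stack) {
    if (font.coverage.has(cp)) return font.name; // steps 1-3: walk the declared stack
  }
  for (const font of systemFallback) {
    if (font.coverage.has(cp)) return font.name; // step 4: OS fallback
  }
  return ".notdef"; // step 5: nothing covers the character
}

const stack = [makeFont("Helvetica Neue", "Aa"), makeFont("Arial", "Aaé")];
const system = [makeFont("Hiragino Sans", "あ")];

const resolved = ["A", "é", "あ", "👍"].map((c) => resolveFont(c, stack, system));
// resolved holds: Helvetica Neue, Arial, Hiragino Sans, .notdef
console.log(resolved);
```

Note that the decision is made per character, which is why a single line of mixed-script text can end up drawn with several different fonts.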

The Operating System Fallback Table

| Script / Category | macOS | Windows | Linux |
|---|---|---|---|
| Latin, Greek, Cyrillic | System font (excellent) | System font (excellent) | Varies by distro |
| CJK (Chinese) | PingFang SC/TC (excellent) | MS YaHei / JhengHei (good) | Noto (if installed) |
| CJK (Japanese) | Hiragino (excellent) | Meiryo UI (good) | Noto (if installed) |
| CJK (Korean) | Apple SD Gothic Neo (excellent) | Malgun Gothic (good) | Noto (if installed) |
| Arabic | Geeza Pro (good) | Arial (adequate) | Varies |
| Devanagari | Kohinoor Devanagari (good) | Mangal (adequate) | Varies |
| Emoji | Apple Color Emoji (excellent) | Segoe UI Emoji (good) | Noto Color Emoji (if installed) |
| Math symbols | STIX Two Math (good) | Cambria Math (good) | Varies |
| Box drawing | Terminal fonts | Consolas (good) | Varies |
| Symbols & Dingbats | SF Pro (good) | Segoe UI Symbol (good) | Varies |

The critical observation: Linux has no guaranteed system fonts beyond basic Latin. If your audience includes Linux desktop users — developers, power users — explicit font loading for any non-Latin script is important.

Building Robust Font Stacks

A robust font stack declares fonts from most-preferred to least-preferred, with multiple fallbacks covering each target platform:

/* Maximum coverage body text stack */
body {
  font-family:
    /* Primary: your web font (loaded via @font-face) */
    "Inter",
    /* macOS / iOS system */
    -apple-system,
    BlinkMacSystemFont,
    /* Windows system */
    "Segoe UI",
    /* Android */
    Roboto,
    /* Older macOS */
    "Helvetica Neue",
    Arial,
    /* Final generic */
    sans-serif;
}

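The -apple-system and BlinkMacSystemFont values select the macOS and iOS system font (San Francisco), which provides excellent Latin, Greek, and Cyrillic coverage. -apple-system is recognized by Safari (and Firefox on macOS), and BlinkMacSystemFont by Chrome on macOS; other browsers treat them as unknown family names and skip to the next entry. List them before other fonts that are installed on Apple platforms, such as Helvetica Neue and Arial, or those fonts will match first.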

Emoji Font Stack

Emoji require special handling because emoji fonts are large and their platform-specific naming varies:

/* Include emoji fonts explicitly for consistent rendering */
.emoji-compatible {
  font-family:
    "Inter",
    -apple-system,
    BlinkMacSystemFont,
    "Segoe UI",
    Roboto,
    /* Emoji-specific fallbacks */
    "Apple Color Emoji",
    "Segoe UI Emoji",
    "Segoe UI Symbol",
    "Noto Color Emoji",
    sans-serif;
}

Place emoji fonts after your text fonts — you want Latin text rendered in your text face, with the emoji font stepping in only when an emoji code point is requested.

Symbol and Dingbat Stack

For pages that use Unicode symbols (arrows, mathematical operators, geometric shapes, miscellaneous symbols):

.symbol-text {
  font-family:
    "Fira Code",         /* Many symbols + code */
    "Symbola",           /* Extensive Unicode symbol coverage */
    "Segoe UI Symbol",   /* Windows symbols */
    "Apple Symbols",     /* macOS symbols */
    "Noto Sans Symbols", /* Google Noto symbols */
    "Noto Sans Symbols 2",
    sans-serif;
}

@font-face unicode-range for Targeted Loading

The unicode-range descriptor in @font-face tells the browser to use a font file only for characters in the specified ranges. This is the mechanism behind Google Fonts' efficient CJK loading and is directly available to you in self-hosted setups:

/* Load different files for different scripts */
@font-face {
  font-family: "MySiteFont";
  src: url("myfont-latin.woff2") format("woff2");
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC,
    U+02C6, U+02DA, U+02DC, U+2000-206F, U+2074, U+20AC,
    U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
}

@font-face {
  font-family: "MySiteFont";
  src: url("myfont-latin-extended.woff2") format("woff2");
  unicode-range: U+0100-024F, U+0259, U+1E00-1EFF,
    U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F,
    U+A720-A7FF;
}

@font-face {
  font-family: "MySiteFont";
  src: url("myfont-cyrillic.woff2") format("woff2");
  unicode-range: U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}

@font-face {
  font-family: "MySiteFont";
  src: url("myfont-greek.woff2") format("woff2");
  unicode-range: U+0370-03FF;
}

When the browser lays out text, it checks each character against the unicode-range of every matching @font-face rule and downloads only the files whose ranges contain a character actually present on the page, not the entire set. This is how Google Fonts delivers CJK fonts quickly: a complete Japanese font runs to several megabytes, so it is pre-split into many small slices, each with its own unicode-range, and a page that uses only a few dozen kanji downloads just the slices that contain them.
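To make the matching step concrete, here is a minimal sketch of how a set of unicode-range descriptors decides which files a given text would trigger. The parseUnicodeRange and filesNeeded helpers are hypothetical names for this example, the ranges are shortened versions of the ones above, and wildcard ranges such as U+4?? are deliberately not handled.

```javascript
// Parse a unicode-range descriptor such as "U+0400-045F, U+2116" into [start, end] pairs.
// Wildcard ranges like "U+4??" are valid CSS but are not handled in this sketch.
function parseUnicodeRange(descriptor) {
  return descriptor.split(",").map((part) => {
    const [start, end = start] = part.trim().replace(/^U\+/i, "").split("-");
    return [parseInt(start, 16), parseInt(end, 16)];
  });
}

// Return the src of every face whose unicode-range contains a character in the text.
function filesNeeded(text, faces) {
  const needed = new Set();
  for (const char of text) {
    const cp = char.codePointAt(0);
    for (const face of faces) {
      const hit = parseUnicodeRange(face.unicodeRange).some(
        ([lo, hi]) => cp >= lo && cp <= hi
      );
      if (hit) needed.add(face.src);
    }
  }
  return [...needed];
}

// Shortened, illustrative versions of the @font-face rules above.
const faces = [
  { src: "myfont-latin.woff2", unicodeRange: "U+0000-00FF, U+0131, U+20AC" },
  { src: "myfont-cyrillic.woff2", unicodeRange: "U+0400-045F, U+0490-0491, U+2116" },
  { src: "myfont-greek.woff2", unicodeRange: "U+0370-03FF" },
];

const latinOnly = filesNeeded("Hello", faces);
const mixed = filesNeeded("Hello Привет", faces);
console.log(latinOnly, mixed);
```

In this model a Latin-only page needs one file, and adding a single Cyrillic word pulls in exactly one more; the Greek file is never requested.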

Practical unicode-range Values

| Script | unicode-range |
|---|---|
| Basic Latin | U+0000-007F |
| Latin Extended | U+0080-024F |
| Cyrillic | U+0400-04FF |
| Greek | U+0370-03FF |
| Arabic | U+0600-06FF |
| Devanagari | U+0900-097F |
| CJK Unified Ideographs (common) | U+4E00-9FFF |
| Hiragana | U+3040-309F |
| Katakana | U+30A0-30FF |
| Hangul | U+AC00-D7A3 |
| Emoji (basic) | U+1F300-1F9FF |
| Mathematical Operators | U+2200-22FF |
| Box Drawing | U+2500-257F |
| Arrows | U+2190-21FF |
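As a rough way to see which of these ranges a piece of content actually touches, the table can be turned into a small lookup. This is a sketch only: BLOCKS simply mirrors the rows above, and countByBlock is a hypothetical helper name, not part of any library.

```javascript
// The ranges from the table above, as [name, start, end].
const BLOCKS = [
  ["Basic Latin", 0x0000, 0x007f],
  ["Latin Extended", 0x0080, 0x024f],
  ["Greek", 0x0370, 0x03ff],
  ["Cyrillic", 0x0400, 0x04ff],
  ["Arabic", 0x0600, 0x06ff],
  ["Devanagari", 0x0900, 0x097f],
  ["Arrows", 0x2190, 0x21ff],
  ["Mathematical Operators", 0x2200, 0x22ff],
  ["Box Drawing", 0x2500, 0x257f],
  ["Hiragana", 0x3040, 0x309f],
  ["Katakana", 0x30a0, 0x30ff],
  ["CJK Unified Ideographs", 0x4e00, 0x9fff],
  ["Hangul", 0xac00, 0xd7a3],
  ["Emoji (basic)", 0x1f300, 0x1f9ff],
];

// Count how many characters of the text fall into each range.
// Anything outside the table is grouped under "Other".
function countByBlock(text) {
  const counts = {};
  for (const char of text) {
    const cp = char.codePointAt(0);
    const block = BLOCKS.find(([, lo, hi]) => cp >= lo && cp <= hi);
    const name = block ? block[0] : "Other";
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

const audit = countByBlock("Tokyo 東京 → 🗼");
console.log(audit);
```

Any range that shows up in the result is one your font stack, or a unicode-range subset, needs to cover.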

Detecting Missing Glyphs in JavaScript

You can detect whether a font has a glyph for a specific character by rendering it to a canvas and comparing the result to the .notdef rendering:

function hasGlyph(fontFamily, char) {
  const canvas = document.createElement('canvas');
  canvas.width = 48;
  canvas.height = 24;
  const ctx = canvas.getContext('2d');

  // fontFamily is a CSS font-family list, e.g. 'sans-serif' or '"Inter", sans-serif'
  ctx.font = `16px ${fontFamily}`;

  // Draw the text on a cleared canvas and return the pixels as a data URL
  const render = (text) => {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.fillText(text, 2, 18);
    return canvas.toDataURL();
  };

  // Reference rendering: U+FFFF is a noncharacter, so no font should map it
  // to a real glyph and the browser draws its missing-glyph fallback instead
  const missing = render('\uFFFF');

  // If the character looks identical to the reference, it rendered as .notdef.
  // Note: the canvas still applies system fallback after fontFamily.
  return render(char) !== missing;
}

// Example: check if the system has support for a specific character
if (!hasGlyph('sans-serif', '𝕳')) {
  // Load a supplemental font
  document.documentElement.classList.add('needs-math-font');
}

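The FontFaceSet.check() API is not a substitute: it reports whether the font faces needed for a given font string (and, optionally, a given text) have finished loading, not whether they contain a specific glyph. The canvas comparison is imperfect. It reports a glyph as missing if the character happens to render identically to .notdef, and it reports a glyph as present when system fallback supplies it or when the browser draws a different placeholder for each missing code point (for example, a box showing its hex code). Even so, it is the most practical client-side glyph detection method available without third-party libraries.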

CSS @supports does not help here either. It tests whether the browser understands a property and value, not whether any font covers a glyph:

/* This approach is limited — only works for font technology support, not glyph coverage */
@supports (font-variant-emoji: text) {
  /* Browser supports font-variant-emoji */
}

For content auditing, use our Character Counter tool to identify which Unicode blocks your text uses — you can then make deliberate decisions about which blocks your font stack needs to cover.

The Noto Fonts Project

Google's Noto (short for "No Tofu") project aims to build fonts that cover every Unicode character, eliminating tofu entirely. The project now covers nearly every script encoded in Unicode, split across many families:

  • Noto Sans — sans-serif, 100+ scripts
  • Noto Serif — serif, major scripts
  • Noto Sans Mono — monospace, Latin + supplemental
  • Noto Color Emoji — full emoji coverage on Linux
  • Noto Music — musical notation
  • Noto Sans Math — mathematical symbols

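Noto fonts are available through Google Fonts and as downloadable packages. For applications that must handle arbitrary Unicode input (user-generated content, multilingual databases, document viewers), including Noto families as the final fallbacks in your font stack provides a reliable safety net. Noto Sans itself covers Latin, Greek, and Cyrillic; other scripts ship as separate families such as Noto Sans JP or Noto Sans Arabic, so list the ones your content needs: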

body {
  font-family:
    "Inter",
    -apple-system,
    BlinkMacSystemFont,
    "Segoe UI",
    Roboto,
    "Noto Sans",        /* Latin, Greek, Cyrillic fallback */
    "Noto Sans JP",     /* Example script-specific family; choose per content */
    sans-serif;
}

Note that loading all Noto fonts is impractical: the complete set is several gigabytes. For web use, load only the Noto families and subsets that correspond to scripts you actually expect to encounter. Google Fonts serves Noto with unicode-range subsetting, so a page that links or @imports Noto Sans from Google Fonts downloads only the subset files its text needs.
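One way to decide which Noto families to request is to look at the scripts present in the text. The sketch below uses Unicode script property escapes in regular expressions. The NOTO_BY_SCRIPT mapping and the notoFamiliesFor helper are assumptions for illustration, and mapping Han characters to "Noto Sans JP" is an arbitrary choice: Chinese or Korean content would want the SC, TC, or KR families instead.

```javascript
// Hypothetical mapping from Unicode script to a Noto family name.
// Han is ambiguous: "Noto Sans JP" is an assumption; SC, TC, or KR may suit your content.
const NOTO_BY_SCRIPT = [
  [/\p{Script=Arabic}/u, "Noto Sans Arabic"],
  [/\p{Script=Devanagari}/u, "Noto Sans Devanagari"],
  [/\p{Script=Hangul}/u, "Noto Sans KR"],
  [/[\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Han}]/u, "Noto Sans JP"],
];

// Return the Noto families worth loading for this text, baseline first.
function notoFamiliesFor(text) {
  const families = ["Noto Sans"]; // baseline: Latin, Greek, Cyrillic
  for (const [pattern, family] of NOTO_BY_SCRIPT) {
    if (pattern.test(text)) families.push(family);
  }
  return families;
}

const plain = notoFamiliesFor("Hello");
const mixedScripts = notoFamiliesFor("こんにちは 안녕");
console.log(plain, mixedScripts);
```

The returned list can feed a font-family declaration or a font request, so that scripts which never appear in your content cost nothing.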

Debugging with Browser DevTools

Chrome DevTools provides the most detailed font rendering information:

Elements panel → Computed → Rendered Fonts: Inspect any text element to see exactly which font(s) were used to render it, broken down by glyph count. If a system font such as Arial is rendering glyphs you expected your custom font to handle, a fallback is occurring.

Network panel: Filter requests by Font to see which font files were fetched. Missing fonts appear as 404 errors, and with unicode-range subsetting the list shows exactly which subset files the page's text triggered.

Console → document.fonts.check():

// Check whether the faces needed for this font string are loaded
// (also returns true if no @font-face rule matches the family at all)
document.fonts.check('16px "Inter"');  // true/false

// List all loaded font faces
document.fonts.forEach(font => {
  console.log(font.family, font.style, font.weight, font.status);
});

// Wait for all fonts to load before rendering
document.fonts.ready.then(() => {
  console.log('All fonts loaded');
  // Safe to measure text dimensions now
});

Firefox Font Inspector: Firefox has a dedicated Fonts panel in DevTools that lists the fonts actually used to render the selected element, including fallback fonts, and hovering over a font name highlights the text drawn with it. This is particularly useful for diagnosing fallback in mixed-script content.

Preventing Common Tofu Scenarios

Emoji in Page Titles and Metadata

Emoji in <title> elements, Open Graph og:title, and <meta> descriptions rely on OS-level emoji font support. These render correctly on modern platforms but may appear as tofu in older email clients that preview page titles. Test in your specific deployment contexts.

Mathematical and Technical Symbols

Unicode mathematical operators (U+2200–U+22FF), arrows (U+2190–U+21FF), and technical symbols (U+2300–U+23FF) are present in many system fonts but inconsistently styled. For mathematical content, STIX Two Math or Cambria Math (Windows) provide consistent rendering. For web-published mathematics, MathML with a dedicated math font is preferable to Unicode approximations.

Private Use Area (PUA) Characters

Characters in the Private Use Areas (U+E000–U+F8FF, U+F0000–U+FFFFD, and U+100000–U+10FFFD) have no meaning assigned by Unicode; they are intentionally left for applications to define. Icon fonts like Font Awesome use PUA code points for their icons. If the icon font fails to load, those characters render as tofu or are invisible. This is a fundamental fragility of PUA icon fonts, and SVG icons or emoji are more robust alternatives.
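When auditing content for this risk, it can help to list the private-use code points a string contains. The following is a minimal sketch covering the three Private Use ranges that Unicode defines; isPrivateUse and findPrivateUse are hypothetical helper names, and U+F0C7 in the example is just an arbitrary PUA code point of the kind an icon font might assign.

```javascript
// True if the code point lies in one of Unicode's three Private Use ranges.
function isPrivateUse(cp) {
  return (
    (cp >= 0xe000 && cp <= 0xf8ff) ||     // Private Use Area (BMP)
    (cp >= 0xf0000 && cp <= 0xffffd) ||   // Supplementary Private Use Area-A
    (cp >= 0x100000 && cp <= 0x10fffd)    // Supplementary Private Use Area-B
  );
}

// Return every private-use character in the text, formatted as U+XXXX.
function findPrivateUse(text) {
  const found = [];
  for (const char of text) {
    const cp = char.codePointAt(0);
    if (isPrivateUse(cp)) {
      found.push("U+" + cp.toString(16).toUpperCase().padStart(4, "0"));
    }
  }
  return found;
}

const hits = findPrivateUse("Save \uf0c7 file");
console.log(hits);
```

Any hit is a character whose appearance depends entirely on one specific font loading successfully.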

Regional Indicator Sequences and Flag Emoji

Country flag emoji are encoded as pairs of Regional Indicator letters (U+1F1E6–U+1F1FF). Not all platforms render flag emoji: the Windows system emoji font does not include them, so most flags show as two-letter codes instead. If flag emoji are semantically important in your content, use an <img> or SVG flag alongside the emoji.
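The encoding is simple enough to compute in both directions, which is useful when you need the two-letter code to pick an image or write alt text. This is a sketch under the assumption that input is an ISO-style two-letter code; countryCodeToFlag and flagToCountryCode are hypothetical helper names.

```javascript
const REGIONAL_INDICATOR_A = 0x1f1e6; // Regional Indicator Symbol Letter A

// "JP" -> the pair of Regional Indicator letters that renders as a flag where supported.
function countryCodeToFlag(code) {
  if (!/^[A-Za-z]{2}$/.test(code)) {
    throw new RangeError(`Expected a two-letter code, got "${code}"`);
  }
  return [...code.toUpperCase()]
    .map((letter) => String.fromCodePoint(REGIONAL_INDICATOR_A + letter.charCodeAt(0) - 65))
    .join("");
}

// A flag sequence -> "JP"; returns null if the input is not exactly two Regional Indicators.
function flagToCountryCode(flag) {
  const cps = [...flag].map((c) => c.codePointAt(0));
  const valid = cps.length === 2 && cps.every((cp) => cp >= 0x1f1e6 && cp <= 0x1f1ff);
  if (!valid) return null;
  return cps.map((cp) => String.fromCharCode(cp - REGIONAL_INDICATOR_A + 65)).join("");
}

const japan = countryCodeToFlag("jp");
const germanyCode = flagToCountryCode("🇩🇪");
console.log(japan, germanyCode);
```

With the code recovered, a page can pair each flag emoji with a labeled image, so the meaning survives on platforms that show only the letters.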

Summary: Font Stack Strategy

| Content Type | Strategy |
|---|---|
| Latin body text | System fonts + web font, generic fallback |
| Multilingual content | lang-attribute targeted stacks + Noto fallback |
| Emoji | Place emoji fonts after text fonts in stack |
| CJK text | Platform fonts + Google Fonts Noto subsetting |
| Mathematical | STIX / Cambria Math / MathML |
| Icon fonts | Migrate to SVG or emoji where possible |
| User-generated content | Noto as final fallback; audit with Character Counter |

Font fallback is ultimately about building a safety net: your primary font handles the expected cases beautifully, and a well-constructed stack of fallbacks catches everything else before the browser has to resort to the tofu box. With unicode-range subsetting, the Noto project, and modern DevTools diagnostics, the conditions that produce tofu are increasingly avoidable — but they require deliberate planning rather than luck.


This concludes the Typography for the Web series. The five articles together cover the full landscape of advanced Unicode typography for frontend developers: ligatures, whitespace, CJK text, box drawing, and font fallback. Each topic connects to the others — a robust font stack (Part 5) is what makes your ligature CSS (Part 1) and CJK font choices (Part 3) actually render.

For hands-on exploration of any Unicode character discussed in this series, use our Character Counter tool to analyze code points, properties, and script assignments in any text you provide.
