Unicode Accessibility Checklist: 15 Checks for Inclusive Text

How-To Accessibility & Symbols Feb 4, 2025



Accessibility for Unicode symbols is not a single setting or a one-time audit. It is a set of practices applied consistently across the design and development lifecycle. This checklist distills the principles from the entire Accessibility & Symbols series into 15 concrete, testable checks — each one something you can verify in code review, automated testing, or a manual screen reader session.

Work through these in order. The earlier items address the most common and highest-impact problems. The later items cover edge cases that matter for specific types of content. You may not encounter all 15 in every project, but knowing them all prevents the kind of gradual regression that turns an accessible codebase into an inaccessible one.

Check 1: Decorative Symbols Are Hidden

Every Unicode symbol used for visual decoration — and only decoration — must be explicitly removed from the accessibility tree with aria-hidden="true". Do not rely on screen reader verbosity settings to suppress decorative content.

Test: Search your codebase for Unicode symbols outside the ASCII range (U+0080 and above) that appear in HTML without aria-hidden. For each one, ask: does removing this symbol cause any user to miss information? If no, add aria-hidden.

<!-- Fail: decorative separator announced by screen readers -->
<p class="section-break">✦✦✦</p>

<!-- Pass: explicitly hidden -->
<p class="section-break" aria-hidden="true">✦✦✦</p>
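The search described in the test can be partly automated. What follows is a minimal sketch using only Python's standard library. The names `ExposedSymbolFinder` and `find_exposed_symbols` are illustrative, and the sketch assumes that any non-ASCII character in a Unicode symbol category (So, Sm, Sk, Sc) is a candidate. It only lists candidates; deciding whether each one is decorative is still a manual judgment.

```python
from html.parser import HTMLParser
import unicodedata

# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ExposedSymbolFinder(HTMLParser):
    """Collect symbols that are not inside an aria-hidden="true" subtree."""

    def __init__(self):
        super().__init__()
        self.stack = []     # one (tag, hidden) pair per open element
        self.findings = []  # (line of the text node, character, Unicode name)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        parent_hidden = self.stack[-1][1] if self.stack else False
        hidden = parent_hidden or dict(attrs).get("aria-hidden") == "true"
        self.stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        tag, hidden = self.stack[-1] if self.stack else ("", False)
        if hidden or tag in ("script", "style"):
            return
        line = self.getpos()[0]
        for ch in data:
            # Categories starting with "S" are symbols; letters are skipped.
            if ord(ch) >= 0x80 and unicodedata.category(ch).startswith("S"):
                self.findings.append((line, ch, unicodedata.name(ch, "UNKNOWN")))


def find_exposed_symbols(html):
    """Return every exposed symbol found in an HTML string."""
    finder = ExposedSymbolFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Run against the two snippets above, the sketch would report the three stars in the first paragraph and nothing from the second. Template syntax (Jinja, JSX, and so on) is not understood by this parser, so treat the output as a starting list rather than a complete audit.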

WCAG: 1.1.1 (A). Decorative content must be implemented so that assistive technology can ignore it.

Check 2: Meaningful Symbols Have Accessible Labels

Any symbol that conveys information — status, action, category, rating, sentiment — must have a text alternative accessible to screen readers. The symbol alone is never sufficient.

Test: Identify every symbol that you would describe in alt text if it were an image. Apply aria-label with role="img", or use visually hidden text alongside an aria-hidden symbol.

<!-- Fail: status conveyed only by symbol -->
<span class="status">⚡</span>

<!-- Pass: label communicates the status -->
<span class="status" role="img" aria-label="High priority">⚡</span>
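A rough way to find candidates for this check is to look for elements whose text consists only of symbols and that carry neither a label nor aria-hidden. The sketch below is one possible approach using Python's standard library; `is_symbol_only` and `find_unlabeled_symbols` are hypothetical names, and the sketch does not detect visually hidden sibling text, so a flagged element may already be handled by that pattern.

```python
from html.parser import HTMLParser
import unicodedata

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def is_symbol_only(text):
    """True if text holds at least one symbol and otherwise only spaces,
    format characters (such as ZWJ), or combining marks (such as VS16)."""
    stripped = text.strip()
    has_symbol = False
    for ch in stripped:
        category = unicodedata.category(ch)
        if category.startswith("S"):
            has_symbol = True
        elif category not in ("Zs", "Cf", "Mn"):
            return False
    return has_symbol


class UnlabeledSymbolFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, hidden, labeled) per open element
        self.findings = []  # (line, tag, symbol text)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr_map = dict(attrs)
        parent_hidden = self.stack[-1][1] if self.stack else False
        hidden = parent_hidden or attr_map.get("aria-hidden") == "true"
        labeled = bool(attr_map.get("aria-label") or attr_map.get("aria-labelledby"))
        self.stack.append((tag, hidden, labeled))

    def handle_endtag(self, tag):
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if not self.stack:
            return
        tag, hidden, labeled = self.stack[-1]
        if not hidden and not labeled and is_symbol_only(data):
            self.findings.append((self.getpos()[0], tag, data.strip()))


def find_unlabeled_symbols(html):
    finder = UnlabeledSymbolFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

On the two snippets above, only the first `<span>` would be reported. Only the element that directly contains the symbol is inspected, so a label on an ancestor (for example a `<button aria-label="…">` wrapping a bare symbol span) still produces a finding that needs a manual look.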

WCAG: 1.1.1 (A), 1.3.1 (A).

Check 3: Emoji Used as Images Have `role="img"`

Emoji that function as standalone images — conveying a discrete unit of meaning outside of flowing prose — should use the role="img" pattern with an aria-label that describes their meaning in context, not just their CLDR name.

Test: Find every emoji that would be described differently in a sentence than its CLDR name. The 🔥 on a "Trending" badge should be labeled "Trending" or "Popular," not "fire."

<!-- Provides only the CLDR name -->
<span>🔥</span>
<!-- Screen reader: "fire" — confusing out of context -->

<!-- Provides contextual meaning -->
<span role="img" aria-label="Trending">🔥</span>
<!-- Screen reader: "Trending" -->

WCAG: 1.1.1 (A).

Check 4: `lang` Attributes Are Set for Non-Latin Scripts

Any Unicode content in a script different from the page's primary language must carry a lang attribute. This allows screen readers to switch to an appropriate TTS voice and pronounce the characters correctly.

Test: Search for CJK characters (U+4E00–U+9FFF), Arabic (U+0600–U+06FF), Cyrillic (U+0400–U+04FF), or any other non-Latin script on pages declared as a Latin-primary language. Each such segment needs a lang attribute.

<!-- Page is English, but contains Japanese -->
<html lang="en">

<!-- Fail: Japanese text without lang attribute -->
<p>The symbol <span>円</span> means yen.</p>

<!-- Pass: lang attribute enables correct pronunciation -->
<p>The symbol <span lang="ja">円</span> means yen.</p>
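The script search in the test can be sketched with regular expressions over the code point ranges listed there. This is an illustrative sketch, not a complete script detector: `SCRIPT_PATTERNS` and `find_script_runs` are made-up names, and real content usually needs more blocks than these three (Hiragana and Katakana for Japanese, Hebrew, Devanagari, and so on).

```python
import re

# Ranges taken from the test description; extend as your content requires.
SCRIPT_PATTERNS = {
    "CJK": re.compile(r"[\u4E00-\u9FFF]+"),
    "Arabic": re.compile(r"[\u0600-\u06FF]+"),
    "Cyrillic": re.compile(r"[\u0400-\u04FF]+"),
}


def find_script_runs(text):
    """Return (offset, script, run) tuples for each non-Latin run, sorted by offset."""
    runs = []
    for script, pattern in SCRIPT_PATTERNS.items():
        for match in pattern.finditer(text):
            runs.append((match.start(), script, match.group()))
    return sorted(runs)


if __name__ == "__main__":
    for offset, script, run in find_script_runs("The symbol 円 means yen."):
        print(f"offset {offset}: {script} run {run!r} may need a lang attribute")
```

The script name alone does not determine the `lang` value: a CJK run can be Japanese, Chinese, or Korean Hanja, so each finding still needs a human to pick the right language tag.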

WCAG: 3.1.2 (AA).

Check 5: RTL Scripts Use Correct `dir` Attributes

Right-to-left scripts such as Arabic, Hebrew, Persian, and Urdu require an explicit dir="rtl" on their containing elements when embedded in a primarily LTR page. Without it, the Unicode bidirectional (BiDi) algorithm may still produce correct visual output, but the reading order exposed through the accessibility tree may be wrong.

Test: Inspect any Arabic, Hebrew, or Persian text on the page. Verify that the containing element or a close ancestor has dir="rtl" and lang set appropriately.

<!-- Fail: RTL text with no direction attribute -->
<p>In Arabic: مرحبا</p>

<!-- Pass: direction and language explicitly set -->
<p>In Arabic: <span lang="ar" dir="rtl">مرحبا</span></p>

Screen readers use dir to determine the reading order of mixed-direction text. Without it, punctuation and numerals embedded in RTL text may be announced in the wrong order.
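To locate strings that need this review, you can check each character's bidirectional class. The sketch below uses `unicodedata.bidirectional` from the standard library; the function names are illustrative, and `base_direction` is a simple first-strong heuristic rather than a full implementation of the BiDi algorithm.

```python
import unicodedata

# Strong right-to-left classes: "R" (e.g. Hebrew) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text):
    """True if any character in text is strongly right-to-left."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


def base_direction(text):
    """Direction of the first strongly directional character, defaulting to ltr."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in RTL_CLASSES:
            return "rtl"
        if bidi_class == "L":
            return "ltr"
    return "ltr"


if __name__ == "__main__":
    sample = "In Arabic: مرحبا"
    if contains_rtl(sample) and base_direction(sample) == "ltr":
        print("Mixed-direction text: wrap the RTL run in an element with dir and lang")
```

A string that contains RTL characters but starts left-to-right, like the failing example above, is exactly the mixed-direction case where the embedded run should get its own element.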

WCAG: 1.3.2 (A) — meaningful sequence; 3.1.2 (AA) — language of parts.

Check 6: Proper Quotation Marks Are Used

Straight quotation marks (" and ') are ambiguous and can confuse text-to-speech engines, which may announce them as "inch" or "foot" marks depending on context. Use proper typographic quotation marks (U+201C/U+201D, U+2018/U+2019) for prose, or the correct marks for the page's language.

Test: Search HTML templates for straight quote characters used in prose content (not in code samples, attribute values, or <code> elements). Replace with typographic equivalents or HTML entities.

<!-- Ambiguous straight quotes -->
<p>"Typography matters," she said.</p>

<!-- Proper curly quotes -->
<p>“Typography matters,” she said.</p>

<!-- Or via HTML entities -->
<p>&ldquo;Typography matters,&rdquo; she said.</p>
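For English prose held as plain strings (not HTML source, where the quotes around attribute values must stay straight), the replacement can be approximated with a small heuristic. The sketch below is an assumption-laden simplification: `curl_quotes` is a hypothetical helper that treats a quote at the start of the text, or after whitespace or an opening bracket, as an opening quote and everything else as closing. It will get cases like a leading apostrophe in '90s wrong.

```python
import re


def curl_quotes(text):
    """Replace straight quotes with typographic ones using a simple position rule."""
    # A double quote at the start, or after whitespace or an opening bracket, opens.
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201C", text)
    # Any double quote still left is treated as closing.
    text = text.replace('"', "\u201D")
    # Same rule for single quotes; one directly after an opening double quote also opens.
    text = re.sub(r"(^|[\s(\[{\u201C])'", "\\1\u2018", text)
    # Remaining single quotes are closing quotes or apostrophes (same code point).
    text = text.replace("'", "\u2019")
    return text


if __name__ == "__main__":
    print(curl_quotes('"Typography matters," she said.'))
```

Because the rule is positional, review the output before committing it, and keep it away from code samples and markup.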

For non-English content, use the correct quotation style for the language: German uses „…“, French uses «…», Japanese uses 「…」.

WCAG: 3.1.1 (A) — language of page affects expected punctuation.

Check 7: Interactive Symbols Have Visible Focus Indicators

Any clickable, tappable, or keyboard-activatable symbol — an icon button, a symbol link, an emoji reaction toggle — must have a visible focus indicator when keyboard focus lands on it. The browser default :focus outline is often removed by CSS resets.

Test: Tab through the page without a mouse. Verify that every interactive element containing a symbol shows a clear, visible focus ring.

/* Never do this universally */
* { outline: none; }
:focus { outline: none; }

/* Instead, provide a custom focus style */
button:focus-visible {
  outline: 2px solid #005fcc;
  outline-offset: 2px;
  border-radius: 4px;
}

The focus indicator should also contrast at least 3:1 against adjacent colors. At Level AA this falls under WCAG 1.4.11 (non-text contrast); WCAG 2.4.13 Focus Appearance (Level AAA, introduced in WCAG 2.2) adds explicit size and contrast requirements.

WCAG: 2.4.7 (AA), 1.4.11 (AA), 2.4.13 (AAA, WCAG 2.2).

Check 8: Symbol-Containing Text Meets Contrast Requirements

Unicode symbols rendered as text must meet the same contrast requirements as any other text: 4.5:1 for normal-size text, 3:1 for large text (≥ 18pt or ≥ 14pt bold).

Test: Sample the computed color and background color of symbol-heavy UI elements. Run them through a contrast checker. Common failure cases: gray checkmarks on white, light-colored emoji descriptions, muted icon colors in form controls.

Note that emoji rendered by the OS emoji font use full-color rendering and are exempt from text contrast requirements — but they must still meet non-text contrast requirements (3:1) when used as UI components (WCAG 1.4.11, AA).
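If you want to script the contrast check rather than paste colors into a tool, the ratio can be computed from the WCAG definitions of relative luminance and contrast ratio. The sketch below assumes six-digit hex colors and fully opaque text; the function names are illustrative, and `meets_text_contrast` simply encodes the 4.5:1 and 3:1 thresholds quoted above.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear-light value."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#RRGGBB' or 'RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_text_contrast(foreground, background, large_text=False):
    """Check the AA text threshold: 3:1 for large text, 4.5:1 otherwise."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    # A mid-gray checkmark on a white background, as in the failure cases above.
    print(round(contrast_ratio("#999999", "#ffffff"), 2))
```

Semi-transparent colors have to be composited against their background before being passed in, since the formula works on the final rendered color.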

WCAG: 1.4.3 (AA) — contrast (text); 1.4.11 (AA) — non-text contrast.

Check 9: No Symbol-Only Content Without a Text Alternative

No user-facing content should convey information exclusively through a symbol when that information is not also available as text somewhere on the page. This includes table headers, form labels, navigation items, status indicators, and error messages.

Test: View your page with CSS disabled (Firefox: View → Page Style → No Style). Everything critical to understanding the page should still be readable as text. Every symbol-only element should reveal its meaning via aria-label, visually hidden text, or, as a last resort, the title attribute (title tooltips cannot be reached by keyboard, do not appear on touch devices, and are exposed inconsistently by screen readers).

<!-- Fail: column meaning visible only in symbol -->
<th>✓</th>
<th>✗</th>

<!-- Pass: symbol supplemented with accessible text -->
<th>
  <span aria-hidden="true">✓</span>
  <span class="sr-only">Included</span>
</th>
<th>
  <span aria-hidden="true">✗</span>
  <span class="sr-only">Not included</span>
</th>

WCAG: 1.1.1 (A), 1.3.1 (A).

Check 10: Mathematical Notation Uses MathML or Has Text Alternatives

Mathematical symbols and equations require careful treatment. Screen readers handle mathematical Unicode inconsistently, and complex expressions rendered as plain Unicode text often produce garbled or meaningless speech output.

Test: Identify any mathematical content on the page. For simple expressions (x², m/s, ∑), verify that the screen reader announces something meaningful. For complex expressions, check whether MathML is used.

<!-- Potentially ambiguous for screen readers -->
<p>E = mc²</p>

<!-- Explicit with MathML for broader support -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mi>E</mi>
  <mo>=</mo>
  <mi>m</mi>
  <msup>
    <mi>c</mi>
    <mn>2</mn>
  </msup>
</math>

<!-- Or: accessible title on a code/abbr element -->
<abbr title="E equals m times c squared">E = mc²</abbr>

WCAG: 1.1.1 (A).

Check 11: Special Whitespace Characters Are Avoided in Content

The Unicode standard includes many whitespace-like characters beyond the regular space (U+0020): non-breaking space (U+00A0), thin space (U+2009), zero-width space (U+200B), en space (U+2002), em space (U+2003), and others. These can appear in copy-pasted content and cause unexpected screen reader behavior — the non-breaking space is sometimes announced as "non-breaking space," and zero-width spaces can disrupt word boundary detection.

Test: Use our Character Counter tool to inspect content strings for non-standard whitespace characters. In particular, check imported content, CMS-pasted text, and developer-authored template strings.

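# Python: detect non-standard whitespace and invisible format characters
import unicodedata

def find_unusual_whitespace(text):
    """Return (index, character, name) for each space or format character other than U+0020."""
    unusual = []
    for i, char in enumerate(text):
        # Zs = space separators (NBSP, thin space, ...); Cf = format characters
        # (zero-width space, ZWJ, soft hyphen). A ZWJ inside an emoji sequence is expected.
        if unicodedata.category(char) in ('Zs', 'Cf') and char != ' ':
            # The default avoids a ValueError for characters without a name
            unusual.append((i, char, unicodedata.name(char, 'UNNAMED')))
    return unusual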

Replace non-standard whitespace with regular spaces in content strings, or use CSS for spacing effects (letter-spacing, word-spacing, padding).
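The replacement step can be sketched as follows. This is one possible policy, not the only reasonable one: the hypothetical `normalize_spaces` helper maps every space separator (category Zs) to a regular space and drops only the zero-width space and the zero-width no-break space. It deliberately leaves other format characters alone, because the zero-width joiner and non-joiner are required by emoji sequences and by some scripts.

```python
import unicodedata

# Characters removed outright: zero-width space and zero-width no-break space (BOM).
ZERO_WIDTH = {"\u200B", "\uFEFF"}


def normalize_spaces(text):
    """Replace space-separator characters with U+0020 and drop zero-width spaces."""
    cleaned = []
    for ch in text:
        if ch in ZERO_WIDTH:
            continue
        if unicodedata.category(ch) == "Zs":
            cleaned.append(" ")
        else:
            cleaned.append(ch)
    return "".join(cleaned)


if __name__ == "__main__":
    sample = "10\u00A0km\u200B away\u2009now"
    print(repr(normalize_spaces(sample)))
```

Replacing a non-breaking space also removes its line-break behavior, so where a value and its unit must stay together, a CSS rule such as `white-space: nowrap` on the wrapping element is the usual substitute.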

WCAG: 1.3.1 (A) — information conveyed through presentation.

Check 12: Font Fallback Stacks Cover Symbol Characters

When a Unicode symbol is not in the primary font, the browser falls back to another font in the stack — or renders a missing-character box (□ or ▯). Missing characters may be announced by screen readers as "white square" or similar, which is confusing. They also signal to sighted users that something is broken.

Test: Audit the symbol characters used in your UI and verify they are covered by at least one font in your stack. Use browser DevTools to inspect which font actually renders each character.

/* Symbol-aware font stack */
body {
  font-family:
    'Inter',           /* Primary UI font */
    'Segoe UI Symbol', /* Windows symbol coverage */
    'Apple Color Emoji', /* macOS/iOS emoji */
    'Noto Sans Symbols', /* Broad Unicode coverage */
    sans-serif;
}

For pages that use many technical or mathematical Unicode characters, consider loading the Google Noto font family, which is designed to cover the entire Unicode standard.

WCAG: 1.1.1 (A) — content must be perceivable; invisible or replaced characters may not be.

Check 13: Braille-Aware Output Is Considered

Screen readers can output to refreshable Braille displays as well as TTS engines. Braille output has different characteristics: it is spatial rather than temporal, and the user reads it by moving their fingers across cells. Long ARIA labels that work well as speech can be unnecessarily verbose on a Braille display.

This does not mean you should write separate content for Braille users — that is neither practical nor necessary. It means:

Keep aria-label text concise and informative (it benefits both speech and Braille)
Do not use aria-label strings that read like spoken sentences ("Click this button to submit the form") — prefer short, noun-phrase labels ("Submit")
Test with a Braille display emulator if your audience is likely to include Braille users (this is common in government, education, and accessibility-focused products)
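The first two points lend themselves to a simple lint over aria-label values. The sketch below is purely illustrative: `label_warnings`, the four-word limit, and the list of instruction verbs are assumptions chosen for the example, not thresholds taken from WCAG or from any screen reader.

```python
import re

# Hypothetical starter list of verbs that suggest a spoken instruction.
INSTRUCTION_VERBS = ("click", "press", "tap")


def label_warnings(label, max_words=4):
    """Return a list of warnings for one aria-label value (empty list if none)."""
    words = re.findall(r"\w+", label.lower())
    if not words:
        return ["empty label"]
    warnings = []
    if len(words) > max_words:
        warnings.append(f"{len(words)} words; consider a shorter noun phrase")
    if words[0] in INSTRUCTION_VERBS:
        warnings.append(f"starts with instruction verb '{words[0]}'")
    return warnings


if __name__ == "__main__":
    for label in ("Submit", "Click this button to submit the form"):
        print(label, "->", label_warnings(label))
```

Treat the output as a prompt for review rather than a pass or fail result; some labels legitimately need more words to be unambiguous.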

No WCAG criterion directly requires Braille testing, but WCAG 4.1.2 (A) requires that names, roles, and values be programmatically determinable, which is what enables both TTS and Braille output.

WCAG: 4.1.2 (A).

Check 14: Automated Accessibility Testing Is Integrated

Automated tools catch a significant subset of Unicode accessibility problems — missing accessible names on icon buttons, empty link text, and some ARIA misuse — at near-zero marginal cost when integrated into a CI pipeline.

Recommended tools:

Tool	Integration	What It Catches
axe-core	Jest, Playwright, CI	~30% of WCAG issues including missing names
Lighthouse	Chrome DevTools, CI	Accessibility score including contrast, missing alt
Pa11y	CI/CD pipeline	WCAG 2.2 checks via axe and HTML_CodeSniffer
WAVE	Browser extension	Visual overlay of issues, good for spot checking
eslint-plugin-jsx-a11y	Build-time (React)	Missing alt, bad ARIA patterns in JSX

A reasonable CI configuration runs axe-core against key pages on every pull request:

// Jest + Playwright + axe-playwright example
const { chromium } = require('playwright');
const { checkA11y, injectAxe } = require('axe-playwright');

test('homepage has no accessibility violations', async () => {
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto('http://localhost:3000');
    await injectAxe(page);
    // checkA11y throws when violations are found, which fails the test
    await checkA11y(page, null, {
      detailedReport: true,
      // axe-core run options are passed under axeOptions
      axeOptions: {
        runOnly: { type: 'tag', values: ['wcag2a', 'wcag2aa', 'wcag22aa'] }
      }
    });
  } finally {
    // Close the browser even when the check fails
    await browser.close();
  }
});

WCAG: Automated testing does not map to a specific criterion but supports conformance with A and AA requirements broadly.

Check 15: Symbol-Containing Content Is Tested with a Real Screen Reader

No automated tool can fully replicate the experience of a screen reader user. The final check on this list is the most important: test your symbol-containing content with a real screen reader.

Minimum viable screen reader test matrix:

Screen Reader	Browser	Platform	Coverage
NVDA (free)	Firefox or Chrome	Windows	~40% of global screen reader users
VoiceOver (built-in)	Safari	macOS	Desktop Mac users
VoiceOver (built-in)	Safari	iOS	Mobile screen reader users

What to test manually:

Navigate by headings (NVDA: H key, VoiceOver: VO+Command+H). Every heading containing a symbol should announce its level and a meaningful name.
Navigate by landmarks (NVDA: D key). Confirm landmark regions have appropriate labels.
Tab through interactive elements. Every button, link, and form control that uses a symbol should have a clear accessible name.
Read through key body content using continuous reading mode. Listen for unexpected symbol announcements that interrupt the reading flow.
Test at two verbosity levels — default and one level lower — to understand how your symbols behave as verbosity changes.

Document your findings and include screen reader test results in your accessibility audit reports. Retesting after major feature additions is essential — accessibility regressions are common when teams add new UI patterns without reviewing prior decisions.

Using This Checklist in Practice

The most effective way to use this checklist is not as a post-launch audit tool but as a design review and code review checklist. Many of the problems it covers are easiest to fix at the point of implementation — adding aria-hidden to a decorative symbol takes seconds; retrofitting an entire design system's icon set takes weeks.

Suggested integration points:

Design review: Checks 3, 8, and 9 (emoji semantics, contrast, symbol-only states)
Development: Checks 1, 2, 7, 11, 12 during implementation
Code review: Checks 4, 5, 6, 10 when reviewing markup templates
CI pipeline: Check 14 on every PR
Pre-launch audit: All 15, with Check 15 (manual testing) as the final gate
Post-launch: Checks 14 and 15 on a quarterly schedule

Use our Character Counter tool as part of your content review process — it identifies the Unicode categories, code points, and properties of characters in any text, making it easier to spot unexpected symbols in user-generated or imported content before it reaches production.

This concludes the Accessibility & Symbols series. Return to Series Overview: Accessibility & Symbols to review all five articles, or explore related guides in the Symbols for Developers series.