SymbolFYI

Soft Hyphen: Controlling Line Breaks in Web Typography

Web Development May 7, 2024

Long words, URLs, technical identifiers, and compound terms in German or Finnish can wreak havoc on web typography. They overflow containers, break layouts, and force ugly horizontal scrollbars. The soft hyphen — a Unicode character specifically designed for this problem — has existed since the earliest days of character encoding, but its behavior across browsers and operating systems is nuanced enough that many developers reach for CSS alternatives instead. This guide covers both options and explains when each is appropriate.

What Is the Soft Hyphen?

The soft hyphen (U+00AD, SOFT HYPHEN) is an invisible character that marks a valid word-break point with hyphenation. Its behavior:

  • When the word fits on one line: The soft hyphen is completely invisible. It renders as zero width and has no effect on the text.
  • When the line needs to break at or near the soft hyphen's position: A visible hyphen (-) appears at the break point, and the word splits across lines.

This makes the soft hyphen a "conditional hyphen" — it is advisory information to the renderer about where hyphenation is permissible.

The HTML entity for the soft hyphen is &shy; (short for "soft hyphen"):

<p>
  This is a very long word: anti&shy;dis&shy;establish&shy;ment&shy;arian&shy;ism
</p>

If "antidisestablishmentarianism" fits on one line, you see it whole. If the viewport is narrow, it might break as "antidisestablish-\nmentarianism" or "antidisestablishment-\narianism" depending on where the line fills.
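The conditional behavior described above can be imitated in a few lines of Python. This is only a rough sketch: it assumes a monospace font where every character is one column wide, which is not how browsers measure text, and wrap_at_soft_hyphens is a hypothetical helper written for illustration, not a browser or library API.

```python
SHY = "\u00AD"  # U+00AD SOFT HYPHEN


def wrap_at_soft_hyphens(word: str, width: int) -> list:
    """Greedily break one word at its soft hyphens so lines fit `width` columns.

    A visible "-" is added only to lines that end at a soft hyphen break.
    A segment longer than `width` is emitted as-is (it simply overflows).
    """
    segments = word.split(SHY)
    lines = []
    start = 0
    while start < len(segments):
        rest = "".join(segments[start:])
        # Everything left fits, or there is no break point left: no hyphen shown.
        if len(rest) <= width or start == len(segments) - 1:
            lines.append(rest)
            break
        # Take as many segments as fit next to the visible hyphen.
        end = start + 1
        while end < len(segments) and len("".join(segments[start:end + 1])) + 1 <= width:
            end += 1
        lines.append("".join(segments[start:end]) + "-")
        start = end
    return lines


word = "anti\u00ADdis\u00ADestablish\u00ADment\u00ADarian\u00ADism"
print(wrap_at_soft_hyphens(word, 40))  # ['antidisestablishmentarianism']
print(wrap_at_soft_hyphens(word, 20))  # ['antidisestablish-', 'mentarianism']
```

With a width of 21 columns the same word breaks one syllable group later, as "antidisestablishment-" followed by "arianism".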

The code point U+00AD can also be included directly in HTML source or in JavaScript strings:

<!-- HTML entity -->
antiestablish&shy;mentarianism

<!-- Direct Unicode (UTF-8 source) -->
antiestablish­mentarianism

// JavaScript string
const word = 'antiestablish\u00ADmentarianism';
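A quick way to convince yourself that these forms are the same character is to decode them and compare. The sketch below uses only Python's standard library; Python is used here simply because the article already uses it for text processing.

```python
import html

SHY = "\u00AD"  # U+00AD SOFT HYPHEN

# The named entity and both numeric references decode to the same code point.
decoded = [html.unescape(ref) for ref in ("&shy;", "&#173;", "&#xAD;")]
print(all(ch == SHY for ch in decoded))  # True

# In a UTF-8 file the character occupies two bytes, C2 AD.
utf8_bytes = SHY.encode("utf-8")
print(utf8_bytes.hex())  # c2ad
```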

The Soft Hyphen in Unicode History

U+00AD has a slightly complicated Unicode history. Originally, its behavior was specified as "display as a hyphen only when used for line breaking" — which is the behavior described above. However, there was a period where some interpretations defined it as always visible, or as invisible regardless of line-breaking context.

The current Unicode standard (since Unicode 4.0) specifies U+00AD as a format character (General Category Cf) with line-break class BA (Break After), meaning:

  • It is not visible when it does not cause a break
  • It may cause a visible hyphen when it does cause a break

Modern browsers all follow this specification, so the historical confusion is no longer a practical concern for web development.
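The format-character classification can be checked against the Unicode data that ships with Python. This is a small sketch using the standard unicodedata module; note that the module does not expose the line-break class, so only the name and general category are shown.

```python
import unicodedata

SHY = "\u00AD"

shy_name = unicodedata.name(SHY)
shy_category = unicodedata.category(SHY)
hyphen_category = unicodedata.category("-")  # U+002D HYPHEN-MINUS, for comparison

print(shy_name)         # SOFT HYPHEN
print(shy_category)     # Cf (format character)
print(hyphen_category)  # Pd (dash punctuation)
```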

The CSS hyphens Property

The CSS hyphens property provides automatic hyphenation without requiring manual soft hyphen placement:

p {
  hyphens: auto;          /* Browser handles hyphenation automatically */
  hyphens: manual;        /* Only break at soft hyphens and <wbr> elements */
  hyphens: none;          /* No hyphenation, even at soft hyphens */
}

hyphens: none

Disables all hyphenation. Words are never hyphenated, even if that makes them overflow the container, and soft hyphens in the text are ignored. Lines can still wrap at spaces and other ordinary break opportunities. Use this where a hyphen would be misleading, for example in code blocks or in proper names that should not be hyphenated.

hyphens: manual

Hyphenates only at explicitly marked positions, that is, at &shy; characters in the text. (<wbr> elements remain valid break points but never produce a hyphen.) The browser does not apply any automatic hyphenation algorithm. This is the property's initial value, and it gives you full control at the cost of marking every valid break point by hand.

hyphens: auto

The browser applies a language-aware hyphenation algorithm to determine valid break points, in addition to honoring any &shy; marks in the text. This requires the lang attribute on the element (or its ancestor) to be set correctly:

<html lang="en">
  <!-- hyphens: auto will use English hyphenation rules -->
</html>

p {
  -webkit-hyphens: auto;  /* Prefixed form, needed by older Safari versions */
  hyphens: auto;
}

hyphens: auto is the most powerful option but has important caveats (see Browser Support below).

<wbr>: The Soft Hyphen's Visual-Free Cousin

The <wbr> element (Word Break Opportunity) marks a valid break point where no hyphen character should appear when the line breaks:

<!-- Break a URL without showing a hyphen -->
<a href="...">https://www.example.com/<wbr>very-long-path/<wbr>with-many-segments/</a>

<!-- Break a camelCase identifier without hyphen -->
getUser<wbr>Profile<wbr>Settings

<wbr> is ideal for:

  • URLs: Breaking at slash boundaries is natural, and adding a hyphen would imply it is part of the URL
  • Technical identifiers: CamelCase names, file paths, configuration keys
  • Email addresses: Breaking between the local part and domain, or within the domain

The soft hyphen is better for:

  • Natural language words: Where a hyphen is typographically expected when a word breaks
  • Compound words: German, Dutch, and Finnish compound words that need hyphenation
  • Long single words: Medical, scientific, or legal terminology
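When URLs and identifiers come from data rather than hand-written markup, the <wbr> elements can be added by a small helper before the text is rendered. The following is a minimal sketch under stated assumptions: both function names are hypothetical, the input is plain text that has not been HTML-escaped yet, and the identifier helper breaks at every lowercase-to-uppercase boundary, which is slightly more often than the hand-marked example above.

```python
import html
import re


def add_wbr_to_url(url: str) -> str:
    """Escape a URL for display and allow line breaks after path slashes."""
    escaped = html.escape(url)
    # Match a "/" that is not part of "://" and is followed by more text.
    return re.sub(r"(?<![:/])/(?=[^/])", "/<wbr>", escaped)


def add_wbr_to_identifier(name: str) -> str:
    """Allow line breaks at lowercase-to-uppercase boundaries in camelCase."""
    escaped = html.escape(name)
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "<wbr>", escaped)


print(add_wbr_to_url("https://www.example.com/very-long-path/with-many-segments/"))
# https://www.example.com/<wbr>very-long-path/<wbr>with-many-segments/

print(add_wbr_to_identifier("getUserProfileSettings"))
# get<wbr>User<wbr>Profile<wbr>Settings
```

Escaping before inserting the tags keeps the <wbr> markup intact while still neutralizing any HTML in the original string.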

Browser Support and Behavior Differences

Soft hyphen (&shy;) browser support

Support for &shy; is universal across modern browsers, but there are subtle rendering differences:

Browser/Platform | &shy; (manual) | hyphens: auto
Chrome (Windows) | ✅             | ✅ (most languages)
Chrome (macOS)   | ✅             | ✅
Firefox          | ✅             | ✅
Safari           | ✅             | ✅ (older versions require -webkit-hyphens)
Edge             | ✅             | ✅
Chrome (Android) | ✅             | ✅
Safari (iOS)     | ✅             | ✅ (older versions require -webkit-hyphens)

hyphens: auto language support

hyphens: auto requires the browser to have a hyphenation dictionary for the document's language. Support varies:

  • English: Universally supported
  • German, French, Spanish: Well supported
  • Finnish, Dutch: Supported in most modern browsers
  • CJK languages: Not applicable (CJK text has different line-breaking rules)
  • Rare languages: May not be supported; falls back to hyphens: manual behavior

Always test hyphens: auto with your actual content language, and set the lang attribute correctly:

<p lang="de">
  Donaudampfschifffahrtsgesellschaft ist ein langes deutsches Wort.
</p>

p[lang="de"] {
  -webkit-hyphens: auto;
  hyphens: auto;
}

The "always visible" bug

Some older browsers (primarily IE and early Edge) rendered U+00AD as always visible — showing a hyphen glyph even when the word was not broken. If you are supporting very old browsers, test &shy; carefully. All modern browsers have correct behavior.

Practical Typography Patterns

Long technical words

For documentation sites or any page with specialized vocabulary:

<p>
  The process involves de&shy;serial&shy;ization of the
  hyper&shy;text transfer protocol buffer.
</p>

Or with CSS automation:

.documentation p {
  -webkit-hyphens: auto;
  hyphens: auto;
  /* CSS has no lang property: set lang="en" on the HTML element instead */
}
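For the manual approach, marking every occurrence by hand does not scale well once the vocabulary grows. One option is a build-time or server-side pass that inserts soft hyphens into known terms. The sketch below assumes a hand-maintained glossary; BREAK_POINTS and add_soft_hyphens are hypothetical names, and the break points are supplied by you rather than computed.

```python
import re

SHY = "\u00AD"  # U+00AD SOFT HYPHEN

# Hypothetical glossary: each known long word mapped to its hyphenation groups.
BREAK_POINTS = {
    "deserialization": ["de", "serial", "ization"],
    "hypertext": ["hyper", "text"],
}

_WORDS = re.compile(
    r"\b(?:" + "|".join(map(re.escape, BREAK_POINTS)) + r")\b",
    re.IGNORECASE,
)


def add_soft_hyphens(text: str) -> str:
    """Insert U+00AD into glossary words, preserving the original capitalization."""

    def replace(match):
        word = match.group(0)
        pieces, pos = [], 0
        # Slice the matched word by group length so its casing is kept.
        for part in BREAK_POINTS[word.lower()]:
            pieces.append(word[pos:pos + len(part)])
            pos += len(part)
        return SHY.join(pieces)

    return _WORDS.sub(replace, text)


result = add_soft_hyphens("Deserialization of the hypertext buffer")
print(result == "De\u00ADserial\u00ADization of the hyper\u00ADtext buffer")  # True
```

The output contains the raw U+00AD character, which in a UTF-8 page behaves the same as writing &shy; in the markup.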

URLs in flowing text

A broken URL should never show a visible hyphen, because readers could mistake it for part of the address:

<p>
  Visit the documentation at
  <a href="https://example.com/docs/api/reference">
    https://example.com/<wbr>docs/<wbr>api/<wbr>reference
  </a>
</p>

German compound words

German is the canonical use case for soft hyphens. Compound words like "Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz" (a real but repealed German law about beef labeling) need break points:

<span lang="de">
  Rind&shy;fleisch&shy;etikettierungs&shy;überwachungs&shy;aufgaben&shy;übertragungsgesetz
</span>

With hyphens: auto and lang="de", the browser's German hyphenation dictionary handles this automatically.

Narrow containers (cards, table cells)

When text must fit in narrow containers like dashboard cards or table cells:

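/* Strategies 1 and 2 combine: hyphenate where possible,
   then break anywhere as a last resort */
.card-title {
  -webkit-hyphens: auto;
  hyphens: auto;
  overflow-wrap: break-word;
}

/* Strategy 3 is an alternative, not an addition: white-space: nowrap
   disables wrapping, so hyphenation would never apply.
   (.card-title--truncate is a hypothetical modifier class.) */
.card-title--truncate {
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
}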

word-break and overflow-wrap as Alternatives

When you cannot control the content (user-generated text, dynamic data), CSS provides blunt-force alternatives to soft hyphens:

overflow-wrap: break-word

Breaks long words at arbitrary points (without a hyphen) only when they would otherwise overflow:

p {
  overflow-wrap: break-word;  /* Previously: word-wrap: break-word */
}

This is the safest option for user-generated content — it prevents overflow without affecting normal text, only kicking in when a word truly cannot fit.

word-break: break-all

Breaks at any character boundary, even within normal words, to ensure no overflow. This is aggressive and usually produces poor typography:

/* Only for extreme cases like debug output or code */
.log-output {
  word-break: break-all;
}

Comparison table

Technique                 | Adds hyphen? | Language-aware? | Manual control? | Use case
&shy;                     | Yes          | No (manual)     | Full            | Long known words
<wbr>                     | No           | No (manual)     | Full            | URLs, identifiers
hyphens: auto             | Yes          | Yes             | None            | Body text, articles
hyphens: manual           | Yes          | No              | Via &shy;       | Controlled content
overflow-wrap: break-word | No           | No              | None            | User content fallback
word-break: break-all     | No           | No              | None            | Last resort

Detecting Soft Hyphens in Code

Because U+00AD is invisible, it can be confusing in text processing:

// Soft hyphens are invisible but present
const word = 'anti\u00ADdis\u00ADestablish';
word.length;             // 18 (16 letters plus 2 soft hyphens)
word.includes('\u00AD'); // true

// Strip soft hyphens for processing
const clean = word.replace(/\u00AD/g, '');
clean.length;            // 16

// Count visible characters (excluding soft hyphens)
function visibleLength(str) {
  return str.replace(/\u00AD/g, '').length;
}

# Python
word = 'anti\u00ADdis\u00ADestablish'
len(word)              # 18
clean = word.replace('\u00AD', '')
len(clean)             # 16

If you are storing user-submitted content and want consistent behavior, either preserve soft hyphens (they carry typographic intent) or strip them (if you plan to handle hyphenation via CSS). Be consistent — do not accidentally strip them during some processing steps but not others, which leads to some users seeing hyphens and others not.
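One way to stay consistent is to route every comparison through a single normalization helper, whichever storage policy you choose. The sketch below assumes you preserve soft hyphens in stored text and strip them only when searching; the function names are hypothetical.

```python
SHY = "\u00AD"  # U+00AD SOFT HYPHEN


def strip_soft_hyphens(text: str) -> str:
    """Return the text with every U+00AD removed."""
    return text.replace(SHY, "")


def contains_ignoring_soft_hyphens(haystack: str, needle: str) -> bool:
    """Substring search that treats soft hyphens as if they were absent."""
    return strip_soft_hyphens(needle) in strip_soft_hyphens(haystack)


stored = "anti\u00ADdis\u00ADestablish"

# A naive search misses the word because of the invisible character.
print("disestablish" in stored)                                  # False
print(contains_ignoring_soft_hyphens(stored, "disestablish"))    # True
```

Keeping the stripping in one place makes it harder for one code path to remove soft hyphens while another keeps them.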

Use the SymbolFYI Encoding Converter tool to inspect the exact bytes in a string that may contain soft hyphens, and the Character Counter tool to count code points and identify invisible characters.
