String Length Calculator

Analyze text length in characters, bytes, grapheme clusters, and more.

JS code units (UTF-16)

Unicode codepoints

Grapheme clusters

UTF-8 bytes

Words

Lines

Why are the numbers different?

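JS code units: .length in JavaScript. Counts UTF-16 code units (most emoji = 2).
Unicode codepoints: [...str].length. Each codepoint counts as one item, which is closer to "real" characters.
Grapheme clusters: what a person perceives as a single character. Flags, skin-tone emoji, and letters with combining marks each count as one.
UTF-8 bytes: the storage size when the text is encoded as UTF-8. ASCII = 1 byte, most emoji = 4 bytes.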
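
As a rough illustration of the four metrics above, the sketch below measures a few sample strings. The helper name `measure` and the sample strings are assumptions made for this example; this is not the calculator's actual source.

```javascript
// Sketch: the four length metrics for one string.
// `measure` is a hypothetical helper, not the calculator's real code.
function measure(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: str.length,                          // UTF-16 code units
    codePoints: [...str].length,                    // string iteration yields codepoints
    graphemes: [...segmenter.segment(str)].length,  // user-perceived characters
    utf8Bytes: new TextEncoder().encode(str).length // size when encoded as UTF-8
  };
}

const flag = "\u{1F1F0}\u{1F1F7}";   // flag: two regional indicator symbols
const thumbs = "\u{1F44D}\u{1F3FD}"; // thumbs up + skin-tone modifier
const accented = "e\u0301";          // "e" + combining acute accent

console.log(measure("abc"));    // { codeUnits: 3, codePoints: 3, graphemes: 3, utf8Bytes: 3 }
console.log(measure(flag));     // { codeUnits: 4, codePoints: 2, graphemes: 1, utf8Bytes: 8 }
console.log(measure(thumbs));   // { codeUnits: 4, codePoints: 2, graphemes: 1, utf8Bytes: 8 }
console.log(measure(accented)); // { codeUnits: 2, codePoints: 2, graphemes: 1, utf8Bytes: 3 }
```

Only plain ASCII text gives the same number for all four metrics.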

Frequently asked questions

The String Length Calculator reports six distinct length metrics for any text: JavaScript code units (UTF-16 .length), Unicode codepoints, grapheme clusters (perceived characters), UTF-8 byte count, word count, and line count.

JavaScript measures string length in UTF-16 code units, so each character above U+FFFF (which includes most emoji) counts as 2. Python 3 counts Unicode codepoints, so the same character counts as 1. The String Length Calculator shows both metrics side by side so you can see the discrepancy for any input.

A grapheme cluster is what a human perceives as a single character. For example, a flag emoji (two regional indicator symbols) and a skin-tone emoji (a base emoji plus a modifier) each appear as one character but span multiple codepoints. The String Length Calculator uses the browser's Intl.Segmenter API to count grapheme clusters.
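
To make the idea concrete, here is a minimal sketch that lists each grapheme cluster in a string together with the codepoints inside it. It uses the standard Intl.Segmenter API; the helper name `graphemeClusters` and the sample text are assumptions for illustration only.

```javascript
// Sketch: list each grapheme cluster and the codepoints it contains.
// `graphemeClusters` is a hypothetical helper name.
function graphemeClusters(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return [...segmenter.segment(str)].map(({ segment }) => ({
    text: segment,
    codePoints: [...segment].map(
      (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
    ),
  }));
}

// "A", a flag (two regional indicators), and a waving hand with a skin-tone modifier
const sample = "A\u{1F1EB}\u{1F1F7}\u{1F44B}\u{1F3FB}";

for (const cluster of graphemeClusters(sample)) {
  console.log(cluster.text, cluster.codePoints.join(" "));
}
// Three clusters are listed even though the string holds five codepoints.
```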

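Paste your text into the String Length Calculator and read the UTF-8 Bytes metric. In UTF-8, ASCII characters cost 1 byte each; codepoints from U+0080 to U+07FF (Latin-extended letters, Greek, Cyrillic) cost 2; the rest of the Basic Multilingual Plane, including CJK characters and symbols such as €, costs 3; and codepoints above U+FFFF, which covers most emoji, cost 4.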
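
The byte costs can be checked directly with the standard TextEncoder API. This is a small sketch; the helper name `utf8ByteLength` is an assumption for the example.

```javascript
// Sketch: UTF-8 size of a string, using the standard TextEncoder.
// `utf8ByteLength` is a hypothetical helper name.
const utf8Encoder = new TextEncoder();

function utf8ByteLength(str) {
  return utf8Encoder.encode(str).length;
}

console.log(utf8ByteLength("A"));         // 1  (ASCII)
console.log(utf8ByteLength("\u00E9"));    // 2  (é, Latin-1 Supplement)
console.log(utf8ByteLength("\u20AC"));    // 3  (€, euro sign)
console.log(utf8ByteLength("\uD55C"));    // 3  (한, Hangul syllable)
console.log(utf8ByteLength("\u{1F600}")); // 4  (grinning face emoji)
```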

Many systems enforce byte-based limits rather than character limits — database column sizes, HTTP header values, message queue payloads, and API fields are common examples. The String Length Calculator makes it easy to check whether your text fits within such constraints.
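
When the limit is in bytes, a check in code might look like the sketch below. The function names `fitsInBytes` and `truncateToBytes` are hypothetical, and the truncation strategy (stop before the first codepoint that would exceed the limit) is one reasonable choice among several.

```javascript
// Sketch: test a UTF-8 byte limit and truncate without cutting a codepoint in half.
// Both function names are hypothetical helpers for this example.
function fitsInBytes(str, maxBytes) {
  return new TextEncoder().encode(str).length <= maxBytes;
}

function truncateToBytes(str, maxBytes) {
  const encoder = new TextEncoder();
  let used = 0;
  let out = "";
  for (const ch of str) {                 // for...of walks codepoints, not code units
    const size = encoder.encode(ch).length;
    if (used + size > maxBytes) break;    // the next codepoint would not fit
    used += size;
    out += ch;
  }
  return out;
}

console.log(fitsInBytes("\uD55C\uAE00", 6));       // true  (two Hangul syllables, 3 bytes each)
console.log(fitsInBytes("\uD55C\uAE00", 5));       // false
console.log(truncateToBytes("ab\u{1F600}c", 5));   // "ab" (the 4-byte emoji does not fit)
```

This keeps every codepoint whole but can still split a grapheme cluster such as a flag; iterating over Intl.Segmenter segments instead would avoid that.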

JavaScript strings are encoded as UTF-16, and characters above U+FFFF require two code units, called a surrogate pair. The .length property counts code units, not characters, so a single emoji can return 2. To count Unicode codepoints instead, spread the string into an array and take its length: [...str].length.
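
The surrogate pair can be inspected directly. The sketch below uses only standard string methods; `toSurrogatePair` is a hypothetical helper that applies the UTF-16 encoding arithmetic to a supplementary-plane codepoint.

```javascript
// Sketch: how one codepoint above U+FFFF becomes two UTF-16 code units.
const grin = "\u{1F600}"; // grinning face emoji

console.log(grin.length);                      // 2 code units
console.log([...grin].length);                 // 1 codepoint
console.log(grin.charCodeAt(0).toString(16));  // "d83d" (high surrogate)
console.log(grin.charCodeAt(1).toString(16));  // "de00" (low surrogate)
console.log(grin.codePointAt(0).toString(16)); // "1f600"

// Hypothetical helper: split a codepoint in U+10000..U+10FFFF into its pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;     // 20-bit value
  return [
    0xd800 + (offset >> 10),              // top 10 bits -> high surrogate
    0xdc00 + (offset & 0x3ff),            // bottom 10 bits -> low surrogate
  ];
}

console.log(toSurrogatePair(0x1f600).map((u) => u.toString(16))); // [ 'd83d', 'de00' ]
```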

Yes. The calculator handles all Unicode text, including emoji, family sequences, flag sequences, and combining characters. The grapheme cluster count comes from the browser's built-in Intl.Segmenter API, so it follows the browser's own Unicode segmentation rules.

Yes — all whitespace characters (spaces, tabs, newlines) are included in the character and byte counts. Words are counted by splitting on whitespace, and lines are counted by newline characters, matching typical editor behavior.
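
A minimal sketch of whitespace-based word counting and newline-based line counting follows. The function names are hypothetical, and the edge-case choices (empty text has 0 words and 1 line; \r\n counts as a single line break) are assumptions that the calculator itself may handle differently.

```javascript
// Sketch: word and line counts. Names and edge-case rules are assumptions.
function countWords(str) {
  const trimmed = str.trim();
  // Split on runs of whitespace; an all-whitespace string has no words.
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function countLines(str) {
  // Treat \r\n, \r, and \n each as one line break; an editor shows
  // one more line than there are breaks.
  return str.split(/\r\n|\r|\n/).length;
}

console.log(countWords("  hello   world\n")); // 2
console.log(countWords(""));                  // 0
console.log(countLines("a\nb\r\nc"));         // 3
console.log(countLines(""));                  // 1
```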

UTF-16 byte size equals the JavaScript code unit count multiplied by two, since each UTF-16 code unit is 2 bytes. Characters in the supplementary planes (above U+FFFF) use two code units and therefore cost 4 bytes in UTF-16.
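
In code, that rule is a one-liner. The helper name `utf16ByteLength` is an assumption for this sketch, and the result excludes any byte order mark a file might carry.

```javascript
// Sketch: UTF-16 byte size is code units times two (no byte order mark included).
// `utf16ByteLength` is a hypothetical helper name.
function utf16ByteLength(str) {
  return str.length * 2;
}

console.log(utf16ByteLength("A"));         // 2
console.log(utf16ByteLength("\uD55C"));    // 2  (Hangul syllable, one code unit)
console.log(utf16ByteLength("\u{1F600}")); // 4  (surrogate pair, two code units)
```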

Yes, with caveats. Twitter's count is based on Unicode codepoints but applies special rules, such as counting every URL at a fixed length. SMS allows 160 characters per message in the GSM-7 alphabet but only 70 UTF-16 code units once a message needs Unicode. The calculator shows all relevant metrics so you can apply the platform's specific counting rules.
