Sequence Analyzer
Breaks ZWJ sequences, skin tone modifiers, keycap sequences, and flag pairs down into their individual components.
How to Use
1. Paste an emoji sequence
Paste any emoji — including complex ZWJ sequences, skin-tone-modified emoji, flag sequences, or keycap sequences — into the analyzer field. The tool accepts raw emoji characters, Unicode escape sequences (\u{1F468}), and mixed strings.
2. Read the codepoint breakdown
Review the full decomposition showing each codepoint in the sequence, its Unicode name, its hex value, its UTF-8 byte representation, and its role in the sequence (base character, modifier, ZWJ, variation selector, or tag character).
3. Check sequence type and RGI status
Verify whether the sequence is classified as fully qualified, minimally qualified, or unqualified in the Unicode emoji-test.txt file, and whether it appears in the official RGI (Recommended for General Interchange) list. Non-RGI sequences may not render as a single glyph on all platforms.
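The three steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names (`normalize_input`, `role_of`, `breakdown`, `parse_status_table`, `qualification`) are hypothetical, the "combining keycap" role label is an addition to the roles listed above, and the sample data is a handful of abbreviated lines modeled on the emoji-test.txt format rather than the full file.

```python
import re
import unicodedata

# Step 1: accept raw emoji, \u{...} escapes, or a mix of both.
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")

def normalize_input(text: str) -> str:
    """Replace \\u{...} escapes with the characters they denote."""
    def decode(match):
        value = int(match.group(1), 16)
        # Leave out-of-range and surrogate values untouched rather than guess.
        if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
            return match.group(0)
        return chr(value)
    return _ESCAPE.sub(decode, text)

# Step 2: describe every codepoint and its role in the sequence.
def role_of(cp: int) -> str:
    if cp == 0x200D:
        return "ZWJ"
    if cp in (0xFE0E, 0xFE0F):
        return "variation selector"
    if 0x1F3FB <= cp <= 0x1F3FF:
        return "modifier"
    if 0xE0020 <= cp <= 0xE007F:
        return "tag character"
    if cp == 0x20E3:
        return "combining keycap"  # label assumed for this sketch
    return "base character"

def breakdown(text: str) -> list[dict]:
    rows = []
    for ch in text:
        cp = ord(ch)
        rows.append({
            "hex": f"U+{cp:04X}",
            "name": unicodedata.name(ch, "<unnamed>"),
            "utf8": ch.encode("utf-8").hex(" ").upper(),
            "role": role_of(cp),
        })
    return rows

# Step 3: look up the qualification status in emoji-test.txt style data.
SAMPLE_EMOJI_TEST = """\
1F3F3 FE0F 200D 1F308 ; fully-qualified     # rainbow flag
1F3F3 200D 1F308      ; unqualified         # rainbow flag
1F468 200D 2695 FE0F  ; fully-qualified     # man health worker
1F468 200D 2695       ; minimally-qualified # man health worker
"""

def parse_status_table(data: str) -> dict[str, str]:
    table = {}
    for line in data.splitlines():
        body = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not body:
            continue
        codepoints, status = (part.strip() for part in body.split(";", 1))
        sequence = "".join(chr(int(cp, 16)) for cp in codepoints.split())
        table[sequence] = status
    return table

def qualification(sequence: str, table: dict[str, str]) -> str:
    # Sequences absent from the table are reported as "not listed".
    return table.get(sequence, "not listed")
```

For example, `breakdown(normalize_input(r"\u{1F44D}\u{1F3FD}"))` yields two rows, a base character followed by a modifier. A real checker would load the complete emoji-test.txt for the Unicode version it targets instead of the embedded sample.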
About
Modern emoji are frequently not single codepoints but sequences: ordered combinations of multiple Unicode characters that a text rendering engine combines into a single displayed glyph. Understanding emoji at the sequence level is essential for any application that processes, stores, counts, or manipulates emoji-containing text. Unicode Technical Standard #51 defines several sequence types, including modifier sequences (emoji + skin tone modifier), flag sequences (two Regional Indicator symbols), tag sequences (black flag + tag characters + a terminating cancel tag), keycap sequences (a digit, # or * + U+FE0F + U+20E3), and ZWJ sequences (emoji + Zero Width Joiner + emoji, repeated as needed).
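As a rough illustration of how these sequence types differ structurally, the sketch below classifies a sequence purely by the shape of its codepoints. The function name `sequence_type` is hypothetical, and the checks are deliberately shallow: they do not verify that a modifier's base is a valid modifier base or that a flag pair corresponds to a real region code.

```python
def sequence_type(seq: str) -> str:
    """Classify an emoji sequence by codepoint shape only (no validity checks)."""
    cps = [ord(ch) for ch in seq]
    if not cps:
        return "empty"
    if 0x200D in cps:
        return "ZWJ sequence"
    if len(cps) == 2 and all(0x1F1E6 <= cp <= 0x1F1FF for cp in cps):
        return "flag sequence"
    if (len(cps) > 2 and cps[0] == 0x1F3F4 and cps[-1] == 0xE007F
            and all(0xE0020 <= cp <= 0xE007E for cp in cps[1:-1])):
        return "tag sequence"
    if len(cps) > 1 and cps[-1] == 0x20E3:
        return "keycap sequence"
    if len(cps) == 2 and 0x1F3FB <= cps[1] <= 0x1F3FF:
        return "modifier sequence"
    if len(cps) == 1:
        return "single codepoint"
    return "other"
```

The ZWJ check comes first because a ZWJ sequence may itself contain modifiers or variation selectors, so the presence of U+200D is the more specific signal.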
The Zero Width Joiner (U+200D) is the most flexible composition mechanism in the emoji system. Its original purpose was to request a more connected rendering of adjacent characters, such as joining forms in Arabic and conjuncts in Indic scripts, but the emoji ecosystem adopted it to create composite meanings from existing characters without requiring new codepoints. The Unicode Emoji Subcommittee maintains a curated list of RGI (Recommended for General Interchange) ZWJ sequences in emoji-zwj-sequences.txt; these are the combinations that vendors are expected to support as unified glyphs, although actual rendering still depends on the platform and its font version. Unofficial ZWJ combinations are technically valid Unicode but fall back to a row of separate emoji on most systems.
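Because a ZWJ sequence is just existing emoji glued together with U+200D, taking one apart or building one is plain string manipulation. The helper names below (`zwj_components`, `zwj_join`) are hypothetical and shown only as a sketch.

```python
ZWJ = "\u200D"

def zwj_components(sequence: str) -> list[str]:
    """Split a ZWJ sequence into the emoji that were joined together."""
    return [part for part in sequence.split(ZWJ) if part]

def zwj_join(parts: list[str]) -> str:
    """Join emoji with U+200D; whether it renders as one glyph depends on the platform."""
    return ZWJ.join(parts)

# Man + ZWJ + woman + ZWJ + girl (the family sequence).
family = zwj_join(["\U0001F468", "\U0001F469", "\U0001F467"])
```

Here `zwj_components(family)` gives back the three individual emoji. Joining arbitrary emoji this way always produces valid Unicode, but only listed combinations can be expected to display as a single glyph.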
For developers, correct sequence handling requires a Unicode-aware grapheme cluster segmentation implementation, as defined in Unicode Standard Annex #29 (UAX #29, Unicode Text Segmentation). Naively splitting emoji strings by codepoint or UTF-16 code unit will incorrectly fragment sequences, producing broken emoji or incorrect character counts. A string containing the rainbow flag 🏳️‍🌈 (U+1F3F3 U+FE0F U+200D U+1F308) has 4 codepoints and 6 UTF-16 code units but should be treated as exactly 1 grapheme cluster for purposes of cursor movement, selection, copy/paste, and character counting in user-facing interfaces.
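The gap between these counts can be demonstrated with the Python standard library alone. The sketch below uses a deliberately simplified, emoji-only clustering rule and is not a full UAX #29 implementation: it attaches anything that follows a ZWJ, treats variation selectors, skin tone modifiers, tag characters, and U+20E3 as extenders, and pairs Regional Indicators two at a time. The names `utf16_units` and `emoji_clusters` are hypothetical.

```python
def utf16_units(text: str) -> int:
    """Number of UTF-16 code units (what JavaScript's .length reports)."""
    return len(text.encode("utf-16-le")) // 2

def _is_extender(cp: int) -> bool:
    return (cp in (0xFE0E, 0xFE0F, 0x20E3)
            or 0x1F3FB <= cp <= 0x1F3FF      # skin tone modifiers
            or 0xE0020 <= cp <= 0xE007F)     # tag characters

def _is_regional_indicator(cp: int) -> bool:
    return 0x1F1E6 <= cp <= 0x1F1FF

def emoji_clusters(text: str) -> list[str]:
    """Group codepoints into emoji clusters (simplified, emoji-only rules)."""
    clusters: list[str] = []
    for ch in text:
        cp = ord(ch)
        attach = False
        if clusters:
            current = clusters[-1]
            prev = ord(current[-1])
            attach = (
                cp == 0x200D
                or _is_extender(cp)
                or prev == 0x200D
                # Pair a Regional Indicator with a single preceding one.
                or (_is_regional_indicator(cp) and len(current) == 1
                    and _is_regional_indicator(prev))
            )
        if attach:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

rainbow = "\U0001F3F3\uFE0F\u200D\U0001F308"
counts = (len(rainbow), utf16_units(rainbow), len(emoji_clusters(rainbow)))
```

For the rainbow flag, `counts` holds the codepoint, UTF-16 code unit, and cluster counts side by side. For production text that mixes emoji with other scripts, a complete UAX #29 segmenter is the safer choice, for instance the `\X` pattern of the third-party `regex` module.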