Code Unit
The minimal bit combination that represents a unit of encoded text in a given encoding form: 8 bits in UTF-8, 16 bits in UTF-16, and 32 bits in UTF-32.
A code unit is the fundamental building block of a Unicode encoding form. It's important to distinguish code units from code points: a single code point may require one or more code units, depending on the encoding.
In UTF-8, a code unit is 8 bits (1 byte), and the emoji 😀 (U+1F600) requires 4 code units. In UTF-16, a code unit is 16 bits (2 bytes), and the same emoji requires 2 code units (a surrogate pair, D83D DE00). In UTF-32, a code unit is 32 bits (4 bytes), and the emoji is a single code unit.
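The counts above can be checked with a short Python sketch. The helper name `code_unit_counts` is made up for this illustration. It relies only on Python's built-in codecs and uses the little-endian variants so that no byte order mark is added to the output.

```python
def code_unit_counts(text):
    """Count the code units needed for `text` in each Unicode encoding form.

    Hypothetical helper for illustration: encode the text, then divide
    the byte length by the size of one code unit in that encoding.
    """
    return {
        "UTF-8": len(text.encode("utf-8")),            # 1 byte per code unit
        "UTF-16": len(text.encode("utf-16-le")) // 2,  # 2 bytes per code unit
        "UTF-32": len(text.encode("utf-32-le")) // 4,  # 4 bytes per code unit
    }


grinning_face = "\U0001F600"  # 😀
print(code_unit_counts(grinning_face))
# → {'UTF-8': 4, 'UTF-16': 2, 'UTF-32': 1}
```

For a plain ASCII letter such as "A", all three counts are 1; the encodings differ only in how many bytes that single code unit occupies.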
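Many programming language string APIs operate on code units rather than code points, which is why string length calculations can be confusing with emoji. JavaScript and Java strings, for example, report length in UTF-16 code units, so a string containing only 😀 has a length of 2, not 1.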
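As a rough illustration of that mismatch, the sketch below compares three ways of measuring the same string. Python's `len` counts code points, so the hypothetical helper `utf16_length` is used here to emulate what a UTF-16-based string API would report.

```python
def utf16_length(text):
    """Length of `text` in UTF-16 code units (hypothetical helper).

    This approximates what a UTF-16-based string API reports as length.
    """
    return len(text.encode("utf-16-le")) // 2


message = "Hi \U0001F600"  # "Hi 😀"

code_points = len(message)                   # Python counts code points
utf16_units = utf16_length(message)          # the emoji counts as 2 here
utf8_units = len(message.encode("utf-8"))    # the emoji counts as 4 here

print(code_points, utf16_units, utf8_units)
# → 4 5 7
```

The three numbers describe the same text, so code that truncates, indexes, or validates string lengths needs to be clear about which unit it is counting.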