Emojis in Python
Python 3's native str type is UnicodeUnicode
Universal character encoding standard that assigns a unique number to every character across all writing systems and symbol sets, including emoji., which means emojis are first-class string citizens. You can print them, store them, search for them, and process them with no special setup โ as long as you understand a few key concepts about how Unicode and emojiEmoji
A Japanese word (็ตตๆๅญ) meaning 'picture character' โ small graphical symbols used in digital communication to express ideas, emotions, and objects. encoding work.
This guide covers every aspect of working with emojis in Python, from simple printing to complex sequence handling and the emoji library.
Basic Emoji Output
The simplest way to include an emoji in Python is to paste the character directly into a string literal:
print("Hello ๐")
print("Build complete โ
")
message = f"Deployment successful ๐ โ {version}"
Python 3 strings are Unicode by default. Every emoji character is a valid Unicode code pointCode Point
A unique numerical value assigned to each character in the Unicode standard, written in the format U+XXXX (e.g., U+1F600 for ๐). and can appear in any string literal, variable, or f-string as long as your source file is saved as UTF-8UTF-8
A variable-width Unicode encoding that uses 1 to 4 bytes per character, dominant on the web (used by 98%+ of websites). (the default for Python 3).
Unicode Escape Sequences
You can also reference emojis by their Unicode code point:
# U+1F525 FIRE
fire = "\U0001F525"
print(fire) # ๐ฅ
# U+1F600 GRINNING FACE
grin = "\U0001F600"
print(grin) # ๐
# U+2764 HEAVY BLACK HEART (followed by U+FE0F variation selectorVariation Selector (VS)
Unicode characters (VS-15 U+FE0E and VS-16 U+FE0F) that modify whether a character renders in text (monochrome) or emoji (colorful) presentation.)
heart = "\u2764\uFE0F"
print(heart) # โค๏ธ
The format is \UXXXXXXXX for code points above U+FFFF (8 hex digits) and \uXXXX for code points at or below U+FFFF (4 hex digits).
The emoji Library
For emoji-heavy applications, the third-party emoji library provides a convenient set of utilities:
pip install emoji
Shortcode to Emoji
import emoji
# Convert shortcodes to emoji characters
text = emoji.emojize("I love :fire: and :heart:")
print(text) # I love ๐ฅ and โค๏ธ
# Language options: 'alias' (Slack/GitHub style), 'en', 'es', etc.
text = emoji.emojize(":thumbs_up:", language="alias")
print(text) # ๐
Emoji to Shortcode (Demojize)
import emoji
text = emoji.demojize("Hello ๐ from Python ๐")
print(text) # Hello :globe_showing_Europe-Africa: from Python :snake:
# Custom delimiters
text = emoji.demojize("๐ฅ", delimiters=("[", "]"))
print(text) # [fire]
Getting Emoji Metadata
import emoji
info = emoji.emoji_list("I ๐ฅ love Python ๐")
# Returns a list of dicts with emoji characters and positions
for item in info:
print(item)
# {'match_start': 2, 'match_end': 3, 'emoji': '๐ฅ'}
# {'match_start': 14, 'match_end': 15, 'emoji': '๐'}
Counting Emojis
import emoji
text = "Great job! ๐๐โจ"
count = emoji.emoji_count(text)
print(count) # 3
Checking if a String Contains Emojis
import emoji
def has_emoji(text: str) -> bool:
return emoji.emoji_count(text) > 0
print(has_emoji("Hello ๐")) # True
print(has_emoji("Hello World")) # False
String Length and Emoji Grapheme Clusters
One of the most common Python emoji pitfalls: len() counts code points, not visible characters. Many emojis consist of multiple code points joined by invisible characters.
# Simple emoji: 1 code point
fire = "๐ฅ"
print(len(fire)) # 1 โ
# Skin tone modifierSkin Tone Modifier
Five Unicode modifier characters based on the Fitzpatrick scale that change the skin color of human emoji (U+1F3FB to U+1F3FF).: 2 code points (base + modifier)
thumbs_up = "๐๐ฝ"
print(len(thumbs_up)) # 2 โ but renders as 1 visible emoji
# ZWJZero Width Joiner (ZWJ)
An invisible Unicode character (U+200D) used to join multiple emoji into a single composite emoji, such as combining people and objects into profession emoji. sequence: 3 code points (woman + ZWJ + laptop)
woman_tech = "๐ฉโ๐ป"
print(len(woman_tech)) # 3 โ but renders as 1 visible emoji
# Family emoji: 7 code points
family = "๐จโ๐ฉโ๐งโ๐ฆ"
print(len(family)) # 7 โ but renders as 1 visible emoji
To count by grapheme clusters (visible characters), use the grapheme library:
pip install grapheme
import grapheme
text = "Hi ๐ฉโ๐ป!"
print(len(text)) # 8 (code points)
print(grapheme.length(text)) # 5 (visible characters: H, i, space, ๐ฉโ๐ป, !)
Unicode Normalization
Some emoji can be represented in multiple equivalent Unicode forms. Normalization ensures consistent behavior when comparing or storing emoji text:
import unicodedata
# NFC normalization (recommended for storage and comparison)
text = "Hello ๐ฅ"
normalized = unicodedata.normalize("NFC", text)
For database storage, always normalize to NFC before inserting emoji text. Python's unicodedata.normalize("NFC", s) handles this.
Regex with Emojis
Python's re module works with emoji characters, but you need to use the re.UNICODE flag (the default in Python 3) and be careful with multi-codepoint sequences.
Matching Any Emoji
import re
# Match basic emoji in the Miscellaneous Symbols and Pictographs block
emoji_pattern = re.compile(
"[\U0001F300-\U0001F9FF" # Misc symbols and pictographs
"\U0001FA00-\U0001FA9F" # Chess symbols
"\U0001FAA0-\U0001FAFF" # Symbols and pictographs extended
"\u2600-\u26FF" # Misc symbols
"\u2700-\u27BF" # Dingbats
"]+",
flags=re.UNICODE
)
text = "Hello ๐ฅ world! โญ How are you? ๐"
emojis = emoji_pattern.findall(text)
print(emojis) # ['๐ฅ', 'โญ', '๐']
Removing All Emojis
import re
def remove_emojis(text: str) -> str:
emoji_pattern = re.compile(
"[\U0001F600-\U0001F64F"
"\U0001F300-\U0001F5FF"
"\U0001F680-\U0001F6FF"
"\U0001F1E0-\U0001F1FF"
"\U00002500-\U00002BEF"
"\U00002702-\U000027B0"
"\U000024C2-\U0001F251"
"]+",
flags=re.UNICODE
)
return emoji_pattern.sub("", text)
print(remove_emojis("Hello ๐ฅ World! ๐")) # "Hello World! "
For production use, the emoji library's approach is more accurate than regex because it uses the actual Unicode emoji list:
import emoji
import re
def remove_emojis_accurate(text: str) -> str:
return emoji.replace_emoji(text, replace="")
Encoding and Decoding Emojis
When reading from or writing to files, databases, or APIs, you may need to handle encoding explicitly.
Writing Emoji to a File
# Always use UTF-8 when writing files with emoji
with open("output.txt", "w", encoding="utf-8") as f:
f.write("Hello ๐\n")
f.write("Fire: ๐ฅ\n")
# Reading back
with open("output.txt", "r", encoding="utf-8") as f:
content = f.read()
print(content)
JSON with Emojis
Python's json module handles emojis correctly but escapes them as Unicode by default:
import json
data = {"message": "Hello ๐", "emoji": "๐ฅ"}
# Default: escapes non-ASCII
json_str = json.dumps(data)
print(json_str) # {"message": "Hello \ud83c\udf0d", "emoji": "\ud83d\udd25"}
# ensure_ascii=False: keeps literal emoji characters
json_str = json.dumps(data, ensure_ascii=False)
print(json_str) # {"message": "Hello ๐", "emoji": "๐ฅ"}
Both are valid JSON โ parsers will decode the escaped form back to the emoji character automatically.
Emoji in Django / SQLAlchemy
PostgreSQL handles emoji in UTF-8 columns natively. MySQL requires utf8mb4 character set (the standard utf8 in MySQL only supports 3-byte UTF-8, which can't store most emojis).
# Django: ensure your database uses UTF-8 / utf8mb4
# In settings.py for MySQL:
DATABASES = {
"default": {
"ENGINE": "django.db.backends.mysql",
"OPTIONS": {"charset": "utf8mb4"},
}
}
For PostgreSQL (the recommended database for emoji support), no special settings are needed.
Practical Examples
Adding Emoji to Log Messages
import logging
logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger(__name__)
logger.info("๐ Server starting...")
logger.info("โ
Database connected")
logger.warning("โ ๏ธ Rate limit approaching")
logger.error("โ Payment processing failed")
Emoji-Based Status Output in CLI Tools
def print_status(message: str, status: str) -> None:
icons = {
"success": "โ
",
"error": "โ",
"warning": "โ ๏ธ",
"info": "โน๏ธ",
"running": "๐",
}
icon = icons.get(status, "โข")
print(f"{icon} {message}")
print_status("Tests passed", "success") # โ
Tests passed
print_status("Build failed", "error") # โ Build failed
print_status("Cache cleared", "info") # โน๏ธ Cache cleared
Explore More on EmojiFYI
- Look up Unicode code points for any emoji with the Sequence Analyzer
- Browse and copy emojis for your Python strings with the Emoji Keyboard
- Search emojis by name at EmojiFYI Search
- Learn about grapheme clusters and ZWJ sequences in the Glossary