2 Answers
Contributor with 1895 experience points, over 7 upvotes
Try the following for the function (remove_emoji).
First install the emoji library:
pip install emoji
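import re
import emoji
import numpy as np  # needed for np.nan below

# Keep a comment unless nothing is left once every :shortcode: from demojize() is stripped
df.Comments.apply(lambda x: x if (re.sub(r'(:[!_\-\w]+:)', '', emoji.demojize(x)) != "") else np.nan)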
0 nice
1 Insane3
2 NaN
3 @bertelsen1986
4 Luckily I have one to 🔥💪🏻
Name: a, dtype: object
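To see why comparing against the empty string works, the regex step can be tried on its own. The sketch below uses only the standard library and hand-written shortcode strings in the style that emoji.demojize() produces; the sample strings and the helper names strip_shortcodes and is_only_shortcodes are assumptions for illustration, not part of the answer. The .strip() call is also an addition, so that emoji separated by spaces still count as emoji-only.

```python
import re

# Same pattern as in the answer: matches one :shortcode: token.
SHORTCODE = re.compile(r'(:[!_\-\w]+:)')

def strip_shortcodes(demojized: str) -> str:
    """Remove every :shortcode: token from an already demojized string."""
    return SHORTCODE.sub('', demojized)

def is_only_shortcodes(demojized: str) -> bool:
    """True when nothing but whitespace is left after stripping shortcodes.

    Note: an empty input string also returns True, matching the answer's
    behaviour of turning empty comments into NaN.
    """
    return strip_shortcodes(demojized).strip() == ''

# Hand-written stand-ins for emoji.demojize() output (assumed, not generated).
samples = [
    "nice",
    ":fire::flexed_biceps_light_skin_tone:",
    "Luckily I have one to :fire:",
    ":fire: :fire:",
]

for s in samples:
    print(repr(s), '->', repr(strip_shortcodes(s)), is_only_shortcodes(s))
```

Only rows for which the check is true would be replaced with NaN; rows that mix text and emoji are left untouched, as in the output above.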
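Contributor with 1942 experience points, over 3 upvotes
You can detect rows that contain only emoji by iterating over the Unicode characters in each row (using the emoji and unicodedata packages):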
import unicodedata
import numpy as np
# Assumes an emoji release in which UNICODE_EMOJI is a flat dict keyed by emoji characters
from emoji import UNICODE_EMOJI

df = {}
df['Comments'] = ["Test", "Hello 😉", "😉😉😉"]

for i in range(len(df['Comments'])):
    pure_emoji = True
    # A single non-emoji character is enough to keep the row
    for unicode_char in unicodedata.normalize('NFC', df['Comments'][i]):
        if unicode_char not in UNICODE_EMOJI:
            pure_emoji = False
            break
    if pure_emoji:
        df['Comments'][i] = np.nan
print(df['Comments'])
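A per-character check like the one above can misjudge emoji that are built from several code points, because the zero-width joiner (U+200D) and variation selector-16 (U+FE0F) are not emoji on their own. A minimal sketch of the same idea as a reusable function is shown below. It uses only the standard library; SAMPLE_EMOJI is a small hypothetical stand-in for the emoji package's lookup table, and is_pure_emoji is an illustrative name, not an API of any library.

```python
import math

# Hypothetical stand-in for the emoji package's table of emoji characters.
SAMPLE_EMOJI = {
    "\U0001F609",  # winking face
    "\U0001F525",  # fire
    "\U0001F4AA",  # flexed biceps
    "\U0001F3FB",  # light skin tone modifier
}

# Code points that glue emoji sequences together but are not emoji themselves.
JOINERS = {"\u200d", "\ufe0f"}

def is_pure_emoji(text, emoji_chars=SAMPLE_EMOJI):
    """True if text has at least one emoji and nothing else
    (whitespace and joiner code points are ignored)."""
    chars = [c for c in text if c not in JOINERS and not c.isspace()]
    return bool(chars) and all(c in emoji_chars for c in chars)

comments = ["Test", "Hello \U0001F609", "\U0001F609\U0001F609\U0001F609"]

# Replace emoji-only comments with NaN, keep everything else unchanged.
cleaned = [math.nan if is_pure_emoji(c) else c for c in comments]
print(cleaned)
```

With a real emoji table passed as emoji_chars, the same function could be applied to a DataFrame column through apply, in the way the first answer does.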