4 Answers
Contributed 1872 experience points, received 3+ likes
You can do:
splits = [str(p).split(".") for p in df["Category"]]
df["Digits"] = [p[1] if len(p)>1 else "-" for p in splits]
For example:
import pandas as pd
df = pd.DataFrame({"Category":["5050.88","5051.90","B5050.97","5051.23B","5051.78E",
"B5050.11","5051.09","Z5052"]})
#df
# Category
# 0 5050.88
# 1 5051.90
# 2 B5050.97
# 3 5051.23B
# 4 5051.78E
# 5 B5050.11
# 6 5051.09
# 7 Z5052
splits = [str(p).split(".") for p in df["Category"]]
splits
# [['5050', '88'],
# ['5051', '90'],
# ['B5050', '97'],
# ['5051', '23B'],
# ['5051', '78E'],
# ['B5050', '11'],
# ['5051', '09'],
# ['Z5052']]
df["Digits"] = [p[1] if len(p)>1 else "-" for p in splits]
df
# Category Digits
# 0 5050.88 88
# 1 5051.90 90
# 2 B5050.97 97
# 3 5051.23B 23B
# 4 5051.78E 78E
# 5 B5050.11 11
# 6 5051.09 09
# 7 Z5052 -
Not very pretty, but it works.
Edit:
Now fills in "-" instead of NaN, and added the code snippet.
Contributed 1793 experience points, received 6+ likes
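The per-value logic above can also be sketched as a small helper. This is only an illustration: the name digits_after_dot is hypothetical, and it swaps split for str.partition, which returns everything after the first dot (the same result as the list comprehension whenever a value contains at most one dot).

```python
def digits_after_dot(value, default="-"):
    """Return the text after the first '.', or `default` when there is no '.'."""
    _, sep, tail = str(value).partition(".")
    # `sep` is an empty string when no '.' was found
    return tail if sep else default

# A few values taken from the sample data above
categories = ["5050.88", "5051.23B", "Z5052"]
digits = [digits_after_dot(c) for c in categories]
print(digits)  # ['88', '23B', '-']
```

If this suits your data, it could be applied to the column with something like df["Category"].map(digits_after_dot).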
Try the code below:
df['Category'].apply(lambda x: x.split(".")[-1] if "." in x else "-")
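One difference worth noting: this answer takes the last piece ([-1]), while the first answer takes the second piece ([1]). The two agree on the sample data, but not on a value with more than one dot. A minimal sketch, using a hypothetical value that is not in the original data:

```python
# Hypothetical value with two dots; not part of the original sample data
value = "5050.88.1"
parts = value.split(".")

# Second piece, as in the first answer
second_piece = parts[1] if len(parts) > 1 else "-"
# Last piece, as in this answer
last_piece = parts[-1] if "." in value else "-"

print(second_piece)  # 88
print(last_piece)    # 1
```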
Contributed 1825 experience points, received 4+ likes
Another way:
df.Category.str.split(r'\.').str[1]
0 88
1 90
2 97Q
3 23B
4 78E
5 11
6 09
7 NaN
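Or:
df.Category.str.extract(r'(?<=[.])(\w+)', expand=False)
Contributed 1936 experience points, received 6+ likes
You need to escape the . in the pattern and then call fillna:
df["Digits"] = df["Category"].astype(str).str.extract(r"\.(.*)", expand=False).fillna("-")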
print(df)
Output:
Category Digits
0 B5050.88 88
1 5051.90 90
2 B5050.97Q 97Q
3 5051.23B 23B
4 5051.78E 78E
5 B5050.11 11
6 5051.09 09
7 Z5052 -
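The two regex patterns in this thread are not interchangeable: (\w+) stops at the first non-word character, while \.(.*) captures everything to the end of the value. A rough sketch with the standard re module, assuming str.extract takes the first match per value the way re.search does; the helper first_group and the value "5051.23-B" are hypothetical:

```python
import re

# "5051.23-B" is a hypothetical value, not in the original data
values = ["B5050.97Q", "Z5052", "5051.23-B"]

def first_group(pattern, text, default="-"):
    """Return the first capture group of `pattern` in `text`, or `default`."""
    match = re.search(pattern, text)
    return match.group(1) if match else default

# Lookbehind pattern: word characters right after a dot
word_only = [first_group(r"(?<=[.])(\w+)", v) for v in values]
# Escaped-dot pattern: everything after the first dot
to_end = [first_group(r"\.(.*)", v) for v in values]

print(word_only)  # ['97Q', '-', '23']
print(to_end)     # ['97Q', '-', '23-B']
```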