pd.cut returns a categorical object whose values are Interval objects. We use their .left and .right attributes to build the requested strings.
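As a quick illustration of those two attributes, here is a minimal sketch on a single hand-built interval. The bounds are copied from the first bin shown further down, and the names `iv` and `label` are only for illustration:

```python
import pandas as pd

# A single right-closed interval, like the ones pd.cut produces by default.
iv = pd.Interval(1910.976, 1913.4, closed="right")

print(iv.left, iv.right)   # the lower and upper bound of the interval
print(1911 in iv)          # membership test; the left bound itself is excluded

# Round both bounds to whole years and format them as a range string.
label = f"({round(iv.left)} - {round(iv.right)})"
print(label)
```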
import string
import numpy as np
import pandas as pd
# test data:
df = pd.DataFrame({"year": [1911, 1923, 1935, 1911], "other_cols": ["abc", "def", "ghi", "jkl"]})
Out:
   year other_cols
0  1911        abc
1  1923        def
2  1935        ghi
3  1911        jkl
# create the intervals (10 equal-width bins):
cats = pd.cut(df.year, 10)
Out: cats.dtypes.categories
IntervalIndex([(1910.976, 1913.4], (1913.4, 1915.8], (1915.8, 1918.2],...
# letter generator:
gchar = iter(string.ascii_uppercase)
# map every interval (bin) to a letter, in bin order:
dlbls = {iv: next(gchar) for iv in cats.dtypes.categories}
# get the intervals and convert them to the specified strings:
df["year_gr"] = [f"{dlbls[iv]} ({int(np.round(iv.left))} - {int(np.round(iv.right))})" for iv in cats]
Out:
   year other_cols          year_gr
0  1911        abc  A (1911 - 1913)
1  1923        def  E (1921 - 1923)
2  1935        ghi  J (1933 - 1935)
3  1911        jkl  A (1911 - 1913)
# reorder the columns:
df = df.reindex(["year_gr", "year", "other_cols"], axis=1)
Out:
           year_gr  year other_cols
0  A (1911 - 1913)  1911        abc
1  E (1921 - 1923)  1923        def
2  J (1933 - 1935)  1935        ghi
3  A (1911 - 1913)  1911        jkl
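For reuse, the steps above could be wrapped in a small helper. This is only a sketch: the function name `add_year_groups` and its parameters are hypothetical, and it assumes at most 26 bins (one uppercase letter per bin) and no missing values in the year column.

```python
import string
import numpy as np
import pandas as pd

def add_year_groups(df, col="year", bins=10):
    """Return a copy of df with a 'year_gr' column placed first."""
    cats = pd.cut(df[col], bins)
    # One letter per bin, in bin order (assumption: bins <= 26).
    letters = dict(zip(cats.cat.categories, string.ascii_uppercase))
    out = df.copy()
    # Build "<letter> (<left> - <right>)" from each row's interval.
    out["year_gr"] = [
        f"{letters[iv]} ({int(np.round(iv.left))} - {int(np.round(iv.right))})"
        for iv in cats
    ]
    # Move the new column to the front, keep the others in their order.
    return out[["year_gr"] + list(df.columns)]

df = pd.DataFrame({"year": [1911, 1923, 1935, 1911],
                   "other_cols": ["abc", "def", "ghi", "jkl"]})
result = add_year_groups(df)
print(result)
```

Letters are assigned to every bin, including empty ones, so the letters that actually appear in the column need not be consecutive.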