Here is one way to do it using some of pandas' built-in tools:
import numpy as np
import pandas as pd

# Set the random number seed and create a dummy dataframe with two columns
np.random.seed(123)
df = pd.DataFrame({'activity': np.random.choice([*'ABCDE'], 40),
                   'TOTAL_ED_LDS': np.random.randint(50, 500, 40)})
# Reshape the dataframe to get one column per activity,
# then take the output of describe() and transpose it
df_out = df.set_index([df.groupby('activity').cumcount(), 'activity'])['TOTAL_ED_LDS']\
           .unstack().describe().T
# Calculate each activity's count as a percentage of the total count
df_out['% of Total'] = df_out['count'] / df_out['count'].sum() * 100.
df_out
Output:
          count        mean         std    min     25%    50%     75%    max  % of Total
activity
A           8.0  213.125000  106.810162   93.0  159.50  200.0  231.75  421.0        20.0
B          10.0  308.200000  116.105125   68.0  240.75  324.5  376.25  461.0        25.0
C           6.0  277.666667  117.188168  114.0  193.25  311.5  352.50  409.0        15.0
D           7.0  370.285714  124.724649  120.0  337.50  407.0  456.00  478.0        17.5
E           9.0  297.000000  160.812002   51.0  233.00  294.0  415.00  488.0        22.5
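As a possible shortcut, the same summary can probably be obtained without the reshape step by calling describe() directly on the grouped column. This is a minimal sketch and not part of the original answer; it reuses the same dummy data, and the name `grouped` is only illustrative.

```python
import numpy as np
import pandas as pd

# Same dummy data as in the answer above
np.random.seed(123)
df = pd.DataFrame({'activity': np.random.choice([*'ABCDE'], 40),
                   'TOTAL_ED_LDS': np.random.randint(50, 500, 40)})

# describe() on a grouped Series returns one row of summary statistics
# per group, so no set_index/unstack/transpose is needed
grouped = df.groupby('activity')['TOTAL_ED_LDS'].describe()

# Each activity's row count as a percentage of all rows
grouped['% of Total'] = grouped['count'] / grouped['count'].sum() * 100

print(grouped)
```

The reshape approach pads shorter activity columns with NaN, which describe() skips, so both routes should yield the same statistics; the grouped version just skips the intermediate wide frame.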