3 Answers
Contributed 1815 experience points, received over 6 upvotes
In the linked question, the better-performing solution is:
df.apply(lambda row: row.value_counts(dropna=False), axis=1).fillna(0)
This may already be enough for your purposes; however, if you only need a few values, the following may be faster:
counts = pd.Series({key: (df == key).values.sum() for key in ['yes_1', 'no_51']})
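To make the comparison concrete, here is a minimal sketch of both approaches on a small made-up frame. The sample data (the `ID`, `class1`, `class2`, `class3` columns and their values) is an assumption chosen to resemble the output shown elsewhere in this thread, not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data (not taken from the original question).
df = pd.DataFrame({
    'ID': ['xyz_1', 'xyz_2', 'xyz_3'],
    'class1': ['yes_1', 'no_7', 'no_7'],
    'class2': ['no_9', 'no_9', 'no_9'],
    'class3': ['no_2', 'no_51', 'yes_1'],
})

# Leave the ID column out so only the class values are counted.
values = df.drop(columns='ID')

# Approach 1: per-row counts of every distinct value; values absent from a row become 0.
per_row = values.apply(lambda row: row.value_counts(dropna=False), axis=1).fillna(0)

# Approach 2: total occurrences of only the keys of interest, one entry per key.
counts = pd.Series({key: (values == key).values.sum() for key in ['yes_1', 'no_51']})

print(per_row)
print(counts)
```

Note that the dict comprehension (`{key: ...}`) is what labels each count with its key; a set comprehension would lose the labels and cannot be passed to `pd.Series`.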
Contributed 1802 experience points, received over 5 upvotes
I don't know whether it is better than your technique, but I suggest it as a solution worth testing:
(
    pd
    .melt(df, id_vars=['ID'])
    .assign(yes_1=lambda x: np.where(x['value'] == 'yes_1', 1, 0))
    .assign(no_51=lambda x: np.where(x['value'] == 'no_51', 1, 0))
    [['yes_1', 'no_51']]  # keep only the indicator columns so sum() skips the string columns
    .sum()
)
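The melt-based idea can be sketched end to end as follows. The sample frame is hypothetical (the same made-up shape of `ID` plus `class1` to `class3` columns), and the per-ID `groupby` at the end is an added variation, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not taken from the original question).
df = pd.DataFrame({
    'ID': ['xyz_1', 'xyz_2', 'xyz_3'],
    'class1': ['yes_1', 'no_7', 'no_7'],
    'class2': ['no_9', 'no_9', 'no_9'],
    'class3': ['no_2', 'no_51', 'yes_1'],
})

# Reshape to long format: one row per (ID, variable, value) triple,
# then add a 0/1 indicator column for each key of interest.
flagged = (
    pd.melt(df, id_vars=['ID'])
    .assign(yes_1=lambda x: np.where(x['value'] == 'yes_1', 1, 0))
    .assign(no_51=lambda x: np.where(x['value'] == 'no_51', 1, 0))
)

# Overall totals across the whole frame.
totals = flagged[['yes_1', 'no_51']].sum()

# Variation (an assumption about what may be useful): totals per ID.
per_id = flagged.groupby('ID')[['yes_1', 'no_51']].sum()

print(totals)
print(per_id)
```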
Contributed 1829 experience points, received over 6 upvotes
df.set_index('ID', inplace=True)  # Set ID as the index
df[~df.isin(['yes_1', 'no_51'])] = np.nan  # Set anything not in the list to NaN
pd.get_dummies(df.stack().unstack())  # Get dummies from a DataFrame whose all-NaN columns have been dropped
       class1_yes_1  class3_no_51  class3_yes_1
ID
xyz_1             1             0             0
xyz_2             0             1             0
xyz_3             0             0             1
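For readers who want to reproduce a table like this, here is a minimal self-contained sketch. It makes several assumptions: the input frame is made up, `DataFrame.where` is used instead of in-place assignment, `dropna(axis=1, how='all')` is swapped in for the `stack().unstack()` round trip (whose NaN handling can differ between pandas versions), and `dtype=int` is passed so the dummies print as 0/1 rather than booleans.

```python
import pandas as pd

# Hypothetical sample data (not taken from the original question).
df = pd.DataFrame({
    'ID': ['xyz_1', 'xyz_2', 'xyz_3'],
    'class1': ['yes_1', 'no_7', 'no_7'],
    'class2': ['no_9', 'no_9', 'no_9'],
    'class3': ['no_2', 'no_51', 'yes_1'],
})

indexed = df.set_index('ID')

# Keep only the values of interest; everything else becomes NaN.
kept = indexed.where(indexed.isin(['yes_1', 'no_51']))

# Drop columns that contain nothing but NaN (class2 in this sample).
kept = kept.dropna(axis=1, how='all')

# One indicator column per (column, value) pair; NaN cells yield all zeros.
dummies = pd.get_dummies(kept, dtype=int)

print(dummies)
```

Each resulting column is named `<original column>_<value>`, which is where names such as `class1_yes_1` come from.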