1 Answer
In case you want to rely only on pivot_table, you can do this:
# Use a temporary column with values one, pivot and fill nan with 0
new = df.assign(val=1).pivot_table(columns='answer_id',index=['cluster_id','movie_id'],values='val',fill_value=0).reset_index()
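Alternatively, you can use get_dummies, since it is faster than pivot_table, i.e.:
# axis must be passed by keyword in recent pandas versions
new = pd.concat([df[['movie_id','cluster_id']],pd.get_dummies(df['answer_id'])],axis=1)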
    movie_id  cluster_id  1  2  3  4  5
0         73           1  0  0  0  1  0
1         80           1  0  0  0  0  1
4         81           1  0  1  0  0  0
7         84           1  1  0  0  0  0
10        88           1  1  0  0  0  0
11        83           1  0  0  0  1  0
13        85           1  1  0  0  0  0
16        54           1  1  0  0  0  0
22        79           1  0  0  1  0  0
23        87           1  1  0  0  0  0
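For anyone who wants to try both approaches side by side, here is a minimal self-contained sketch. The sample DataFrame is made up for illustration (only the column names come from the question), and passing dtype=int to get_dummies is my own addition: newer pandas versions return boolean columns by default, so this is just one way to get the 0/1 output shown above.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "movie_id":   [73, 80, 81, 84],
    "cluster_id": [1, 1, 1, 1],
    "answer_id":  [4, 5, 2, 1],
})

# Approach 1: pivot_table over a temporary column of ones, missing cells filled with 0.
pivoted = (
    df.assign(val=1)
      .pivot_table(index=["cluster_id", "movie_id"], columns="answer_id",
                   values="val", fill_value=0)
      .reset_index()
)

# Approach 2: one-hot encode answer_id and glue it onto the id columns.
# dtype=int is an assumption added here to force 0/1 instead of True/False.
dummies = pd.concat(
    [df[["movie_id", "cluster_id"]], pd.get_dummies(df["answer_id"], dtype=int)],
    axis=1,
)

print(pivoted)
print(dummies)
```

Note that the two are not strictly equivalent: pivot_table sorts by the index columns and aggregates duplicate (cluster_id, movie_id) pairs (mean by default), while get_dummies keeps one output row per input row along with the original index.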