I have a dataset:

In:
import pandas as pd
df = pd.DataFrame({'id': [23, 23, 23, 43, 43],
                   'data_1': ['20170503', '20170503', '20170503', '20170602', '20170602'],
                   'units': [10, 10, 10, 5, 5],
                   'data_2': ['20170104', '20170503', '20170503', '20170605', '20170602'],
                   'code': ["s", "r", "s", "s", "r"],
                   'units_2': [20, 10, 10, 8, 5]})
print(df)

Out:
   id    data_1  units    data_2 code  units_2
0  23  20170503     10  20170104    s       20
1  23  20170503     10  20170503    r       10
2  23  20170503     10  20170503    s       10
3  43  20170602      5  20170605    s        8
4  43  20170602      5  20170602    r        5

I need to group by "id" and check whether data_2 holds a date that matches data_1 on a row whose code is "s". A column could be added to flag these matches, so the final output would look like this:

   id    data_1  units    data_2 code  units_2  new_column
0  23  20170503     10  20170104    s       20           0
1  23  20170503     10  20170503    r       10           0
2  23  20170503     10  20170503    s       10           1
3  43  20170602      5  20170605    s        8           0
4  43  20170602      5  20170602    r        5           0

Thanks for any help.
1 Answer

摇曳的蔷薇
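groupby is not needed here, because the values are neither changed nor counted per group: the check compares data_1 and data_2 within each row.
Use: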
df['new_column'] = (df.data_1.eq(df.data_2) & df.code.eq('s')).astype(int)
# or: df['new_column'] = (df.data_1.eq(df.data_2) & df.code.eq('s')).map({True: 1, False: 0})
# or (requires `import numpy as np`): df['new_column'] = np.where(df.data_1.eq(df.data_2) & df.code.eq('s'), 1, 0)
print(df)
   id    data_1  units    data_2 code  units_2  new_column
0  23  20170503     10  20170104    s       20           0
1  23  20170503     10  20170503    r       10           0
2  23  20170503     10  20170503    s       10           1
3  43  20170602      5  20170605    s        8           0
4  43  20170602      5  20170602    r        5           0
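
For anyone who wants to run the answer end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question, computes the row-wise flag once as a boolean mask, and derives the three variants from that mask. The names `mask`, `via_astype`, `via_map`, `via_where` and the extra column `group_has_match` are illustrative and do not appear in the original post. The last step is an assumption about a different reading of the question (flag every row of an id group that contains at least one match); it is probably not what was asked, since it does not reproduce the expected output, but it shows the case where groupby would actually matter.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [23, 23, 23, 43, 43],
    'data_1': ['20170503', '20170503', '20170503', '20170602', '20170602'],
    'units': [10, 10, 10, 5, 5],
    'data_2': ['20170104', '20170503', '20170503', '20170605', '20170602'],
    'code': ['s', 'r', 's', 's', 'r'],
    'units_2': [20, 10, 10, 8, 5],
})

# Row-wise condition: the two dates match AND the code is "s".
mask = df['data_1'].eq(df['data_2']) & df['code'].eq('s')

# Three equivalent ways to turn the boolean mask into 0/1.
via_astype = mask.astype(int)
via_map = mask.map({True: 1, False: 0})
via_where = pd.Series(np.where(mask, 1, 0), index=df.index)

df['new_column'] = via_astype

# Hypothetical alternative reading (an assumption, not part of the answer):
# mark every row of an id group that has at least one matching row.
df['group_has_match'] = mask.groupby(df['id']).transform('any').astype(int)

print(df)
```

Building the mask once keeps the three variants easy to compare. If the date columns were real datetimes rather than strings, the same `eq` comparison should still apply, though that case is not covered in the original post.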