Use GroupBy.transform with GroupBy.last for each column generated by Index.difference:
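# Parse the timestamps so they are real datetimes
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%m/%d/%y')
# Loop over every column except the grouping key and the timestamp itself
for c in df.columns.difference(['project_id','timestamp']):
    # Replace each value with the last timestamp of its (project_id, value) group
    df[c] = df.groupby(['project_id',c], sort=False)['timestamp'].transform('last')
print(df)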
project_id project_name region style effect representative \
0 1 2020-09-05 2020-04-06 2019-10-15 2020-09-05 2020-04-06
1 1 2020-09-05 2020-04-06 2019-10-15 2020-09-05 2020-04-06
2 1 2020-08-20 2020-04-06 2019-10-15 2019-10-15 2020-04-06
3 1 2019-10-15 2020-04-06 2019-10-15 2019-10-15 2020-04-06
4 1 2019-10-15 2019-10-15 2019-10-15 2019-10-15 2019-10-15
5 1 2019-10-15 2019-10-15 2019-10-15 2019-10-15 2019-10-15
timestamp
0 2020-10-01
1 2020-09-05
2 2020-08-20
3 2020-04-06
4 2019-12-31
5 2019-10-15
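To make the mechanics easier to follow, here is a minimal self-contained sketch on made-up data. The region and style values are hypothetical and are not the asker's data. It assumes, as in the output above, that rows are ordered newest first, so 'last' picks the oldest timestamp in each group. The result is written to a copy so that the columns used for grouping stay untouched.

```python
import pandas as pd

# Hypothetical toy data, newest row first (an assumption for illustration)
df = pd.DataFrame({
    "project_id": [1, 1, 1, 1],
    "region": ["west", "west", "east", "east"],
    "style": ["a", "b", "b", "b"],
    "timestamp": ["10/01/20", "09/05/20", "08/20/20", "04/06/20"],
})
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%m/%d/%y")

# Write into a copy so the original values remain available for grouping
result = df.copy()
for c in df.columns.difference(["project_id", "timestamp"]):
    # Last timestamp of each (project_id, value) group, broadcast to every row
    result[c] = (
        df.groupby(["project_id", c], sort=False)["timestamp"].transform("last")
    )

print(result)
```

Note that groupby collects all rows with the same value, not only consecutive runs, so a value that disappears and later comes back would share a single group.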
If you need the original format, add Series.dt.strftime:
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%m/%d/%y')
for c in df.columns.difference(['project_id','timestamp']):
    # Same transform as before, then format the datetimes back to MM/DD/YY strings
    df[c] = (df.groupby(['project_id',c], sort=False)['timestamp'].transform('last')
               .dt.strftime('%m/%d/%y'))
print(df)
project_id project_name region style effect representative \
0 1 09/05/20 04/06/20 10/15/19 09/05/20 04/06/20
1 1 09/05/20 04/06/20 10/15/19 09/05/20 04/06/20
2 1 08/20/20 04/06/20 10/15/19 10/15/19 04/06/20
3 1 10/15/19 04/06/20 10/15/19 10/15/19 04/06/20
4 1 10/15/19 10/15/19 10/15/19 10/15/19 10/15/19
5 1 10/15/19 10/15/19 10/15/19 10/15/19 10/15/19
timestamp
0 2020-10-01
1 2020-09-05
2 2020-08-20
3 2020-04-06
4 2019-12-31
5 2019-10-15
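One thing to keep in mind with strftime is that the formatted columns hold plain strings, not datetimes. The short sketch below uses two made-up dates to show that strings in MM/DD/YY form sort alphabetically rather than chronologically, so it may be safer to format only at the very end.

```python
import pandas as pd

# Two hypothetical dates; the first one is chronologically earlier
s = pd.Series(pd.to_datetime(["10/15/19", "04/06/20"], format="%m/%d/%y"))

# strftime returns strings in the requested layout
text = s.dt.strftime("%m/%d/%y")

# Sorting the strings compares characters, so "04/06/20" comes first
sorted_text = sorted(text.tolist())

# Sorting the datetimes keeps the true chronological order
earliest = s.min()

print(sorted_text)
print(earliest)
```

If the strings need to become datetimes again, pd.to_datetime with the same format parses them back.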
Edit: add fillna with the minimum timestamp:
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%m/%d/%y')
# Earliest timestamp in the data, used as the fallback value
min1 = df['timestamp'].min()
for c in df.columns.difference(['project_id','timestamp']):
    # Rows that get no group result are filled with the minimum timestamp
    df[c] = df.groupby(['project_id',c], sort=False)['timestamp'].transform('last').fillna(min1)
print(df)
project_id project_name region style effect representative \
0 1 2020-09-05 2020-04-06 2019-10-15 2020-09-05 2020-04-06
1 1 2020-09-05 2020-04-06 2019-10-15 2020-09-05 2020-04-06
2 1 2020-08-20 2020-04-06 2019-10-15 2019-10-15 2020-04-06
3 1 2019-10-15 2020-04-06 2019-10-15 2019-10-15 2020-04-06
4 1 2019-10-15 2019-10-15 2019-10-15 2019-10-15 2019-10-15
5 1 2019-10-15 2019-10-15 2019-10-15 2019-10-15 2019-10-15
lazy timestamp
0 2020-09-05 2020-10-01
1 2020-09-05 2020-09-05
2 2019-10-15 2020-08-20
3 2019-10-15 2020-04-06
4 2019-10-15 2019-12-31
5 2019-10-15 2019-10-15
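The reason fillna can be needed is probably that a column contains missing values: with the default dropna=True, groupby leaves rows with a missing group key out of every group, and in recent pandas versions transform returns a missing value for those rows. The sketch below illustrates this under that assumption; the data and the values in the lazy column are made up.

```python
import pandas as pd

# Hypothetical data: the last row has no value in the 'lazy' column
df = pd.DataFrame({
    "project_id": [1, 1, 1],
    "lazy": ["yes", "yes", None],
    "timestamp": ["10/01/20", "09/05/20", "10/15/19"],
})
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%m/%d/%y")

# Fallback value: the earliest timestamp in the data
min1 = df["timestamp"].min()

# Rows whose group key is missing belong to no group
raw = df.groupby(["project_id", "lazy"], sort=False)["timestamp"].transform("last")

# Fill whatever is missing with the fallback
filled = raw.fillna(min1)

print(raw)
print(filled)
```

Whether the minimum timestamp is the right fallback depends on what a missing value means in the data; it is used here only because the answer above uses it.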