2 Answers
User with 1850 contribution points, over 11 upvotes
First create an interval column, then use pandas.DataFrame.groupby:
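import pandas as pd

df['date_time'] = pd.to_datetime(df['date_time'])
df = df.set_index('date_time', drop=True).sort_index()
# Width of one interval; adding a plain integer to a Timestamp is not supported in current pandas
step = pd.Timedelta(minutes=10)
# Label every row with the 10-minute interval that contains its timestamp
# (half-open, so a boundary timestamp matches exactly one interval)
df['intervals'] = ["%s - %s" % (i, i + step)
                   for i in pd.date_range('2005-08-20', '2005-08-21', freq='10min')
                   for d in df.index if i <= d < i + step]
df.groupby(['id1', 'intervals'])['adress'].count().reset_index()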
Output:
    id1                                  intervals  adress
0  penr  2005-08-20 17:00:00 - 2005-08-20 17:10:00       1
1  penr  2005-08-20 19:10:00 - 2005-08-20 19:20:00       1
2  reom  2005-08-20 11:20:00 - 2005-08-20 11:30:00       2
3  reom  2005-08-20 22:50:00 - 2005-08-20 23:00:00       2
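One detail worth checking in the interval condition: if both ends of the comparison are closed, a timestamp that falls exactly on a boundary matches two intervals, and the generated list then has more entries than the DataFrame has rows. A minimal sketch with made-up timestamps (not the asker's data) that compares the closed and the half-open condition:

```python
import pandas as pd

step = pd.Timedelta(minutes=10)
# Interval starts: 17:00, 17:10, 17:20
starts = pd.date_range('2005-08-20 17:00', '2005-08-20 17:20', freq='10min')
# Hypothetical sample timestamps; the second one sits exactly on a boundary
stamps = pd.to_datetime(['2005-08-20 17:03:00', '2005-08-20 17:10:00'])

# Closed on both ends: 17:10:00 matches 17:00-17:10 and 17:10-17:20
closed = [(s, d) for s in starts for d in stamps if s <= d <= s + step]
# Half-open [start, end): every timestamp matches exactly one interval
half_open = [(s, d) for s in starts for d in stamps if s <= d < s + step]

print(len(closed), len(half_open))  # 3 2
```

With the half-open form the number of labels always equals the number of rows, provided every timestamp lies inside the generated date range.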
User with 1856 contribution points, over 5 upvotes
First, aggregate the counts with GroupBy.size, flooring the timestamps with Series.dt.floor:
df['date_time'] = pd.to_datetime(df['date_time'])
# Round each timestamp down to the start of its 10-minute bucket, then count rows per (id1, bucket)
df = df.groupby(['id1', df['date_time'].dt.floor('10min')]).size().reset_index(name='adress')
print(df)
    id1            date_time  adress
0  penr  2005-08-20 17:00:00       1
1  penr  2005-08-20 19:10:00       1
2  reom  2005-08-20 11:20:00       2
3  reom  2005-08-20 22:50:00       2
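For reference, a self-contained sketch of this step. The input is not shown in the thread, so the sample rows below are invented for illustration; they are chosen so that the counts match the output above.

```python
import pandas as pd

# Hypothetical sample data; column names follow the thread, values are made up
sample = pd.DataFrame({
    'id1': ['reom', 'reom', 'penr', 'penr', 'reom', 'reom'],
    'date_time': ['2005-08-20 11:21:00', '2005-08-20 11:25:30',
                  '2005-08-20 17:03:00', '2005-08-20 19:14:00',
                  '2005-08-20 22:51:00', '2005-08-20 22:58:00'],
    'adress': ['a', 'b', 'c', 'd', 'e', 'f'],
})
sample['date_time'] = pd.to_datetime(sample['date_time'])

# Floor each timestamp to the start of its 10-minute bucket
bucket = sample['date_time'].dt.floor('10min')

# Count rows per (id1, bucket); the result is a new frame, so the
# count column can reuse the name 'adress' without a clash
counts = sample.groupby(['id1', bucket]).size().reset_index(name='adress')
print(counts)
```

Keeping the result in a separate variable (`counts` here) leaves the original rows available, whereas the snippet above overwrites `df`.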
Then change the datetime format with Series.dt.strftime, appending the time 10 minutes later as the end of each interval:
df['date_time'] = (df['date_time'].dt.strftime('%Y-%m-%d %H:%M:%S') +
                   (df['date_time'] + pd.Timedelta(10, unit='min')).dt.strftime(' - %H:%M:%S'))
print(df)
    id1                       date_time  adress
0  penr  2005-08-20 17:00:00 - 17:10:00       1
1  penr  2005-08-20 19:10:00 - 19:20:00       1
2  reom  2005-08-20 11:20:00 - 11:30:00       2
3  reom  2005-08-20 22:50:00 - 23:00:00       2
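Or, to include the date on the end of the interval as well (an alternative to the previous step, applied to the floored datetime column rather than to the already formatted strings):
df['date_time'] = (df['date_time'].dt.strftime('%Y-%m-%d %H:%M:%S') +
                   (df['date_time'] + pd.Timedelta(10, unit='min'))
                   .dt.strftime(' - %Y-%m-%d %H:%M:%S'))
print(df)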
    id1                                  date_time  adress
0  penr  2005-08-20 17:00:00 - 2005-08-20 17:10:00       1
1  penr  2005-08-20 19:10:00 - 2005-08-20 19:20:00       1
2  reom  2005-08-20 11:20:00 - 2005-08-20 11:30:00       2
3  reom  2005-08-20 22:50:00 - 2005-08-20 23:00:00       2
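The two label formats differ mainly when an interval crosses midnight. A small sketch that wraps the formatting in a helper; `interval_labels` and its parameters are hypothetical names introduced here for illustration, not part of pandas or of the answer above:

```python
import pandas as pd

def interval_labels(starts, minutes=10, end_fmt='%Y-%m-%d %H:%M:%S'):
    """Build 'start - end' labels from a datetime Series of interval starts."""
    ends = starts + pd.Timedelta(minutes=minutes)
    return starts.dt.strftime('%Y-%m-%d %H:%M:%S') + ' - ' + ends.dt.strftime(end_fmt)

# Made-up interval starts; the second interval ends on the next day
starts = pd.Series(pd.to_datetime(['2005-08-20 22:50:00', '2005-08-20 23:50:00']))

short_labels = interval_labels(starts, end_fmt='%H:%M:%S')
full_labels = interval_labels(starts)

print(short_labels.tolist())
# ['2005-08-20 22:50:00 - 23:00:00', '2005-08-20 23:50:00 - 00:00:00']
print(full_labels.tolist())
# ['2005-08-20 22:50:00 - 2005-08-20 23:00:00', '2005-08-20 23:50:00 - 2005-08-21 00:00:00']
```

The short format hides the date change in the last interval, so the full format is probably the safer choice when the data can span more than one day.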