1 Answer
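This is essentially another problem that calls for the "diff-cumsum" trick: count how many times the value has gone backwards so far by accumulating the negative changes. In this case, however, .diff() is not applied to the datetime column directly, so it takes a little more work: the timestamps are first converted to integer nanoseconds.
Here is a quick and dirty demonstration on df["Date_and_time"]. You should do something similar for the other columns involved.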
from datetime import timedelta
import numpy as np
import pandas as pd
#df = pd.read_clipboard(sep=r"\s{2,}")
df["Date_and_time"] = pd.to_datetime(df["Date_and_time"])
# get timestamp in nanoseconds
df["ns"] = df["Date_and_time"].values.astype(np.int64)
# detect reversed time change and accumulate days
df["days"] = (df["ns"].diff() < 0).cumsum()
# add the days found
df["Date_and_time_new"] = df.apply(lambda row: row["Date_and_time"] + timedelta(days=row["days"]), axis=1)
df
Out[76]:
Date Date_and_time ... days Date_and_time_new
0 2020/08/02 2020-08-02 21:21:46.000000 ... 0 2020-08-02 21:21:46.000000
1 2020/08/02 2020-08-02 21:21:46.082191 ... 0 2020-08-02 21:21:46.082191
2 2020/08/02 2020-08-02 21:21:46.164383 ... 0 2020-08-02 21:21:46.164383
3 2020/08/02 2020-08-02 21:21:46.246575 ... 0 2020-08-02 21:21:46.246575
4 2020/08/02 2020-08-02 21:21:46.328767 ... 0 2020-08-02 21:21:46.328767
5 2020/08/02 2020-08-02 00:00:00.000000 ... 1 2020-08-03 00:00:00.000000
6 2020/08/02 2020-08-02 00:00:00.082191 ... 1 2020-08-03 00:00:00.082191
7 2020/08/02 2020-08-02 00:00:00.164383 ... 1 2020-08-03 00:00:00.164383
8 2020/08/02 2020-08-02 00:00:00.246575 ... 1 2020-08-03 00:00:00.246575
9 2020/08/03 2020-08-03 00:00:16.082191 ... 1 2020-08-04 00:00:16.082191
10 2020/08/03 2020-08-03 03:00:33.000000 ... 1 2020-08-04 03:00:33.000000
11 2020/08/03 2020-08-03 03:00:33.040513 ... 1 2020-08-04 03:00:33.040513
12 2020/08/03 2020-08-03 03:00:33.081026 ... 1 2020-08-04 03:00:33.081026
13 2020/08/03 2020-08-03 03:00:33.121539 ... 1 2020-08-04 03:00:33.121539
14 2020/08/03 2020-08-03 03:00:33.162052 ... 1 2020-08-04 03:00:33.162052
[15 rows x 5 columns]
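For reference, the same diff-cumsum idea can be sketched without the integer-nanosecond column or the row-wise apply. This is only a minimal sketch, assuming a recent pandas version in which Series.diff() on a datetime64 column returns timedeltas. The helper name add_rollover_days and the four-row sample are made up for illustration and are not part of the original data.

```python
import pandas as pd

def add_rollover_days(times: pd.Series) -> pd.Series:
    """Shift each timestamp forward by one day per backward jump seen so far.

    Hypothetical helper for illustration; not from the original answer.
    """
    times = pd.to_datetime(times)
    # Assumption: diff() on a datetime64 Series yields timedeltas. The first
    # row is NaT, which compares as False and so counts as "no jump".
    went_back = times.diff() < pd.Timedelta(0)
    # Running count of backward jumps = number of days to add to each row.
    days = went_back.cumsum()
    # Vectorized addition instead of a row-wise apply.
    return times + pd.to_timedelta(days, unit="D")

# Made-up sample: the clock jumps backwards at the third row.
sample = pd.Series([
    "2020-08-02 21:21:46",
    "2020-08-02 21:21:47",
    "2020-08-02 00:00:00",
    "2020-08-02 00:00:01",
])
fixed = add_rollover_days(sample)
print(fixed)
```

Applied to the frame above, this would be something like df["Date_and_time_new"] = add_rollover_days(df["Date_and_time"]). The vectorized addition also avoids passing row values into datetime.timedelta one by one.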