I am trying to update selected datetime64 values in a pandas DataFrame, using loc to select the rows that satisfy a condition. However, instead of assigning the new datetime values, the assignment produces NaT. Here is a simplified version of my code that shows the problem:

import pandas as pd
import numpy as np

datetime = (np.datetime64('1899-12-30'), np.datetime64('1989-12-30'), np.datetime64('2199-12-30'))
select = (0, 1, 0)
df = pd.DataFrame(list(zip(datetime, select)), columns=['date_time', 'select'])

# create a new column by subtracting 180 days
df['new_date'] = df['date_time'] - pd.Timedelta(180, unit='d')

# replace datetime with new date where select is true
df.loc[(df['select'] == 1), ['date_time']] = df.loc[(df['select'] == 1), ['new_date']]
print(df)

# the second element of the date_time column is "NaT", but this is not the desired outcome.
# the desired behaviour is for it to be the same as the second element in the new_date column.

Any ideas on how to do this, or why it does not work as expected?
1 Answer
慕森卡
You should remove the [] around the column names:
df.loc[(df['select'] == 1), 'date_time'] = df.loc[(df['select'] == 1), 'new_date']
You can also drop the second boolean index:
df.loc[(df['select'] == 1), 'date_time'] = df['new_date']
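The shorter form works because a Series on the right-hand side is matched to the selected rows by index label. The sketch below is only an illustration: the string row labels and the reversed right-hand side are assumptions added to make the label matching visible, and `pd.Timedelta(days=180)` is used in place of the question's `unit='d'` spelling.

```python
import pandas as pd

# Same data as in the question, but with string row labels (an assumption
# made only to make label-based alignment visible).
df = pd.DataFrame(
    {
        "date_time": pd.to_datetime(["1899-12-30", "1989-12-30", "2199-12-30"]),
        "select": [0, 1, 0],
    },
    index=["a", "b", "c"],
)
df["new_date"] = df["date_time"] - pd.Timedelta(days=180)

# Full column on the right, deliberately reversed: pandas matches rows by
# label, not by position, and only the masked rows are written.
reversed_new = df["new_date"].iloc[::-1]
df.loc[df["select"] == 1, "date_time"] = reversed_new

print(df)
```

Only row "b" changes, and it receives the new_date value that carries the same label, even though the right-hand side was in a different order.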
Alternatively, use np.where:
df['date_time'] = np.where(df['select']==1, df['new_date'], df['date_time'])
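A quick way to check the np.where version is to compare it with a pandas-native equivalent. The sketch below uses `Series.mask`, which is not part of the answer above and is shown only as a possible alternative; the variable names `via_where` and `via_mask` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "date_time": pd.to_datetime(["1899-12-30", "1989-12-30", "2199-12-30"]),
        "select": [0, 1, 0],
    }
)
df["new_date"] = df["date_time"] - pd.Timedelta(days=180)
mask = df["select"] == 1

# np.where works element by element and returns a plain NumPy array,
# so it is wrapped in a Series here to keep the original index.
via_where = pd.Series(
    np.where(mask, df["new_date"], df["date_time"]), index=df.index
)

# Series.mask replaces the values where the condition is True.
via_mask = df["date_time"].mask(mask, df["new_date"])

print(via_where.tolist() == via_mask.tolist())
```

Both approaches pick new_date where select is 1 and keep date_time elsewhere. np.where ignores index labels entirely, so it assumes the three inputs are already in the same row order.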
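Explanation: df.loc[s, ['col_name']] slices a DataFrame, while df.loc[s, 'col_name'] slices a Series. When you write
dataframe_slice = another_dataframe_slice
pandas tries to align the index and columns of the two DataFrames. Here the two slices have no column in common, so the rows where select == 1 are filled with NaN, which a datetime column displays as NaT.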
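The difference between the two slice types can be reproduced side by side. This is a minimal sketch under the same data as the question; the names `broken` and `fixed` are illustrative, and the frame is built with `pd.to_datetime` for brevity.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date_time": pd.to_datetime(["1899-12-30", "1989-12-30", "2199-12-30"]),
        "select": [0, 1, 0],
    }
)
df["new_date"] = df["date_time"] - pd.Timedelta(days=180)
mask = df["select"] == 1

# DataFrame on the right: its only column is 'new_date', so aligning it
# with the target column 'date_time' finds no match and writes a missing value.
broken = df.copy()
broken.loc[mask, ["date_time"]] = broken.loc[mask, ["new_date"]]

# Series on the right: only the row index is aligned, so the value is copied.
fixed = df.copy()
fixed.loc[mask, "date_time"] = fixed.loc[mask, "new_date"]

print(broken["date_time"].isna().tolist())
print(fixed["date_time"].isna().tolist())
```

In `broken` the selected row ends up missing, while in `fixed` it holds the new_date value.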