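For a given DataFrame column, I want to randomly select roughly 60% of the values per day and put them in a new column, put the remaining 40% in another column, multiply the 40% column by (-1), and then create a new column that merges the two back together, so that each day has a 60/40 split.

I asked the same question without the per-day requirement: Randomly selection rows from dataframe column

The example below illustrates this (although my ratio is not exactly 60/40):

import pandas as pd

# Starting data
dict0 = {'date': ['1/1/2019', '1/1/2019', '1/1/2019', '1/2/2019', '1/1/2019', '1/2/2019'], 'x1': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(dict0)
# df['date'] = pd.to_datetime(df['date']).dt.date

# Step 1: x2 holds the ~60% sample, x3 holds the remaining ~40%
dict1 = {'date': ['1/1/2019', '1/1/2019', '1/1/2019', '1/2/2019', '1/1/2019', '1/2/2019'], 'x1': [1, 2, 3, 4, 5, 6], 'x2': [1, 'nan', 3, 'nan', 5, 6], 'x3': ['nan', 2, 'nan', 4, 'nan', 'nan']}
df = pd.DataFrame(dict1)
# df['date'] = pd.to_datetime(df['date']).dt.date

# Step 2: x3 multiplied by (-1)
dict2 = {'date': ['1/1/2019', '1/1/2019', '1/1/2019', '1/2/2019', '1/1/2019', '1/2/2019'], 'x1': [1, 2, 3, 4, 5, 6], 'x2': [1, 'nan', 3, 'nan', 5, 6], 'x3': ['nan', -2, 'nan', -4, 'nan', 'nan']}
df = pd.DataFrame(dict2)
# df['date'] = pd.to_datetime(df['date']).dt.date

# Step 3: x4 merges x2 and x3 back together
dict3 = {'date': ['1/1/2019', '1/1/2019', '1/1/2019', '1/2/2019', '1/1/2019', '1/2/2019'], 'x1': [1, 2, 3, 4, 5, 6], 'x2': [1, 'nan', 3, 'nan', 5, 6], 'x3': ['nan', -2, 'nan', -4, 'nan', 'nan'], 'x4': [1, -2, 3, -4, 5, 6]}
df = pd.DataFrame(dict3)
# df['date'] = pd.to_datetime(df['date']).dt.date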
1 Answer
千万里不及你
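You can use groupby and sample to get the index values of the sampled rows, then use loc to create column x4 from them, and finally fillna the remaining rows with x1 multiplied by -1, like this:
# Sample 60% of the rows within each date; level 1 of the resulting MultiIndex is the original row index
idx = df.groupby('date').apply(lambda x: x.sample(frac=0.6)).index.get_level_values(1)
# Sampled rows keep their x1 value; all other rows of x4 start out as NaN
df.loc[idx, 'x4'] = df.loc[idx, 'x1']
# The remaining rows get the negated x1 value
df['x4'] = df['x4'].fillna(-df['x1'])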
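As a self-contained sketch of the same idea, the version below also builds the intermediate columns x2 and x3 from the question. The helper name split_by_day and its frac and seed parameters are made up for illustration. It assumes pandas 1.1 or later (for sample on a groupby object) and a unique row index.

```python
import pandas as pd


def split_by_day(df, frac=0.6, seed=None):
    """Hypothetical helper: return a copy of df with x2, x3 and x4 added."""
    out = df.copy()
    # Sample roughly `frac` of the rows within each date.
    # The result keeps the original index labels (assumed unique).
    idx = out.groupby('date')['x1'].sample(frac=frac, random_state=seed).index
    picked = out.index.isin(idx)
    out['x2'] = out['x1'].where(picked)       # sampled rows, NaN elsewhere
    out['x3'] = (-out['x1']).where(~picked)   # remaining rows, negated
    out['x4'] = out['x2'].fillna(out['x3'])   # the two parts recombined
    return out


df = pd.DataFrame({
    'date': ['1/1/2019', '1/1/2019', '1/1/2019', '1/2/2019', '1/1/2019', '1/2/2019'],
    'x1': [1, 2, 3, 4, 5, 6],
})
result = split_by_day(df, frac=0.6, seed=0)
print(result)
```

Because the per-day sample size is rounded to a whole number of rows, small groups only approximate the 60/40 split. Passing a seed makes the selection reproducible.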