2 回答
TA贡献1890条经验 获得超9个赞
不幸的是,示例输入数据不包含na
将被计算值替换的值(或大于一项的组)。因此,新列是原始列的简单副本。
第一个条件可以测试并np.where
应用于每一行transform
df[['delivery_ratio_filled','case_ratio_filled']] = (
df.groupby(['Node', 'DAY', 'COMMODITY_CODE'])[['deliveries_ratio','case_ratio']]
.transform(
lambda x: np.where(x.isna(), x.mean(), x)))
第二个条件不需要分组
df['delivery_ratio_filled'] = (
np.where(df['delivery_ratio_filled'].isna(),
1 / df['window_count'],
df['delivery_ratio_filled']))
df['case_ratio_filled'] = (
np.where(df['case_ratio_filled'].isna(),
1 / df['window_count'],
df['case_ratio_filled']))
df
出去:
Node COMMODITY_CODE ... delivery_ratio_filled case_ratio_filled
0 7014.0 SCFZ ... 0.354839 0.357854
1 7014.0 SCFZ ... 0.423077 0.457945
2 7014.0 SCFZ ... 0.258621 0.283379
3 7030.0 SCDD ... 0.300000 0.316505
4 7030.0 SCDD ... 0.232558 0.236513
TA贡献1772条经验 获得超5个赞
也可以用dplyrpython的方式实现:
>>> from datar.all import f, tribble, group_by, mutate, if_else, is_na, mean
>>>
>>> df = tribble(
... f.Node, f.COMMODITY_CODE, f.DAY, f.Capacity_Case, f.Capacity_Delivery, f.case_ratio, f.delivery_ratio, f.window_count,
... 7014.0, "SCFZ", 1, 26610.0, 12.0, 0.357854, 0.354839, 3,
... 7014.0, "SCFZ", 2, 25551.0, 11.0, 0.457945, 0.423077, 3,
... 7014.0, "SCFZ", 3, 30669.0, 13.0, 0.283379, 0.258621, 3,
... 7030.0, "SCDD", 1, 34244.0, 16.0, 0.316505, 0.300000, 4,
... 7030.0, "SCDD", 2, 25954.0, 13.0, 0.236513, 0.232558, 4,
... )
>>>
>>>
>>> df >> \
... group_by(f.Node, f.DAY, f.COMMODITY_CODE) >> \
... mutate(delivery_ratio_filled = if_else(~is_na(f.delivery_ratio),
... f.delivery_ratio,
... mean(f.delivery_ratio)),
... case_ratio_filled = if_else(~is_na(f.case_ratio),
... f.case_ratio,
... mean(f.case_ratio))) >> \
... mutate(delivery_ratio_filled = if_else(~is_na(f.delivery_ratio_filled),
... f.delivery_ratio_filled,
... 1.0 / f.window_count),
... case_ratio_filled = if_else(~is_na(f.case_ratio_filled),
... f.case_ratio_filled,
... 1.0 / f.window_count))
Node COMMODITY_CODE DAY Capacity_Case Capacity_Delivery case_ratio delivery_ratio window_count delivery_ratio_filled case_ratio_filled
<float64> <object> <int64> <float64> <float64> <float64> <float64> <int64> <float64> <float64>
0 7014.0 SCFZ 1 26610.0 12.0 0.357854 0.354839 3 0.354839 0.357854
1 7014.0 SCFZ 2 25551.0 11.0 0.457945 0.423077 3 0.423077 0.457945
2 7014.0 SCFZ 3 30669.0 13.0 0.283379 0.258621 3 0.258621 0.283379
3 7030.0 SCDD 1 34244.0 16.0 0.316505 0.300000 4 0.300000 0.316505
4 7030.0 SCDD 2 25954.0 13.0 0.236513 0.232558 4 0.232558 0.236513
[Groups: Node, DAY, COMMODITY_CODE (n=5)]
添加回答
举报