
How do I replace null values in a data frame with the median of values grouped by two columns?

Python

慕妹3242003 2022-04-23 21:50:27

I have a data frame in Python that records how often each person eats certain foods over one week. I want to clean the data frame and replace the null values with the median frequency of each food category for each person. How can I replace the null values with the median of each food category for each person?

user  ffq  food    food-category
1     1    apple   fruit
1     3    banana  fruit
1     2    tomato  vegetables
1     nan  carrot  vegetables
1     3    potato  vegetables
1     nan  peach   fruit
2     3    apple   fruit
2     nan  banana  fruit
2     2    tomato  vegetables
2     nan  carrot  vegetables
2     3    peach   fruit

The result should look like this:

user  ffq      food    food-category
1     1        apple   fruit
1     3        banana  fruit
1     2        tomato  vegetables
1     **2.5**  carrot  vegetables
1     3        potato  vegetables
1     **2**    peach   fruit
2     3        apple   fruit
2     **3**    banana  fruit
2     2        tomato  vegetables
2     **2**    carrot  vegetables
2     3        peach   fruit

I would appreciate any help.


2 Answers

侃侃无极

2051 experience points contributed, more than 10 upvotes

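I am guessing you want to fill the missing values with the group mean rather than the median. (For this data the two coincide in every group.) We can do this in one line of code by combining .fillna() with .groupby() and .transform(). First, let's create a DataFrame with the required columns.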

import numpy as np
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'user': ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '2'],
                   'ffq': [1, 3, 2, np.nan, 3, np.nan, 3, np.nan, 2, np.nan, 3],
                   'food-category': ['fruit', 'fruit', 'vegetables', 'vegetables',
                                     'vegetables', 'fruit', 'fruit', 'fruit', 'vegetables',
                                     'vegetables', 'fruit']})

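We can now fill the missing values with whichever imputation method we need, such as the mean, median, or mode. The imputation below uses the mean, which yields the result given in the question.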

# Apply fillna within each (user, food-category) group, on the 'ffq' column only
df['ffq'] = df.groupby(['user', 'food-category'])['ffq'].transform(lambda x: x.fillna(x.mean()))

   user  ffq food-category
0     1  1.0         fruit
1     1  3.0         fruit
2     1  2.0    vegetables
3     1  2.5    vegetables
4     1  3.0    vegetables
5     1  2.0         fruit
6     2  3.0         fruit
7     2  3.0         fruit
8     2  2.0    vegetables
9     2  2.0    vegetables
10    2  3.0         fruit

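The .transform() method performs a group-specific computation, the mean in this example, and returns an object indexed the same way as the input. See the user guide for details.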
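Since the question asks for the median, the same pattern can be sketched with the median in place of the mean. This is a minimal sketch, assuming the sample data from the question; the variable name `group_median` is only illustrative. For this particular data the mean and the median of every group are equal, so the filled values come out the same.

```python
import numpy as np
import pandas as pd

# Sample data from the question (the 'food' column is omitted, as in the answer above)
df = pd.DataFrame({
    'user': ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '2'],
    'ffq': [1, 3, 2, np.nan, 3, np.nan, 3, np.nan, 2, np.nan, 3],
    'food-category': ['fruit', 'fruit', 'vegetables', 'vegetables', 'vegetables',
                      'fruit', 'fruit', 'fruit', 'vegetables', 'vegetables', 'fruit'],
})

# Median of 'ffq' per (user, food-category), broadcast back to every row;
# NaN values are skipped when the median is computed
group_median = df.groupby(['user', 'food-category'])['ffq'].transform('median')

# Replace only the missing entries with their group's median
df['ffq'] = df['ffq'].fillna(group_median)
print(df)
```

If a group has no observed values at all, its median is itself NaN, so those rows would stay missing and need a separate fallback.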

2022-04-23

梦里花落0921

1772 experience points contributed, more than 6 upvotes

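Here is how you can do it. First, we sort the values so that they come out in the right order when we use groupby. Next we compute the mean, and then we fill the NaNs with the series we extracted.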

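# Sort rows so they follow the same order as the groupby result
df = df.sort_values(['user','food-category'])
# Mean of the observed values per (user, food-category) group
srs = df.dropna().groupby(['user','food-category']).agg({'ffq':'mean'})['ffq']
# Relabel the group means with the row labels of the missing values.
# NOTE: this only lines up when every group has exactly one NaN, as in this data.
srs.index = df[df['ffq'].isnull()].index
df['ffq'] = df['ffq'].fillna(value=srs)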

Result

df.sort_index()

    user  ffq    food food-category
0      1  1.0   apple         fruit
1      1  3.0  banana         fruit
2      1  2.0  tomato    vegetables
3      1  2.5  carrot    vegetables
4      1  3.0  potato    vegetables
5      1  2.0   peach         fruit
6      2  3.0   apple         fruit
7      2  3.0  banana         fruit
8      2  2.0  tomato    vegetables
9      2  2.0  carrot    vegetables
10     2  3.0   peach         fruit
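The index reassignment used in this answer depends on each group containing exactly one missing value. A variant that looks up each row's group statistic by key avoids that dependence. The following is a minimal sketch rather than the answerer's method: it assumes the question's data, uses the median as the question requests, and the names `keys`, `medians`, `ffq_median`, and `looked_up` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question, including the 'food' column
df = pd.DataFrame({
    'user': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    'ffq': [1, 3, 2, np.nan, 3, np.nan, 3, np.nan, 2, np.nan, 3],
    'food': ['apple', 'banana', 'tomato', 'carrot', 'potato', 'peach',
             'apple', 'banana', 'tomato', 'carrot', 'peach'],
    'food-category': ['fruit', 'fruit', 'vegetables', 'vegetables', 'vegetables',
                      'fruit', 'fruit', 'fruit', 'vegetables', 'vegetables', 'fruit'],
})

keys = ['user', 'food-category']

# One median per group; the result is a Series indexed by (user, food-category)
medians = df.groupby(keys)['ffq'].median().rename('ffq_median')

# Look up each row's group median by key; the original row index is preserved
looked_up = df.join(medians, on=keys)['ffq_median']

# Fill only the missing entries
df['ffq'] = df['ffq'].fillna(looked_up)
print(df)
```

Because the lookup is driven by the group keys, no sorting is needed, and a group may contain several missing values or none.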

2022-04-23
