I have 2 dataframes, shown below.

The dogs dataframe is:

    DogID  PuppyName1  PuppyName2  PuppyName3  PuppyName4  DogWeight
    Dog1   Nick        NaN         NaN         NaN         12.7
    Dog2   Jack        Fox         Rex         NaN         15.5
    Dog3   Snack       NaN         NaN         NaN         10.2
    Dog4   Yosee       Petty       NaN         NaN         16.9

The puppyWeights dataframe is:

    PuppyName  Jan17  Jun18  Dec18  April19
    Nick       0.8    1.7    3.7    4.6
    Jack       0.6    1.3    2.8    3.5
    Fox        0.9    1.7    3.4    4.3
    Rex        1.0    2.3    3.0    4.2
    Snack      0.8    1.7    2.8    4.4
    Yosee      0.6    1.2    3.1    4.3
    Petty      0.5    1.3    2.8    3.5

I need to add the puppies' weight information by month to the dogs dataframe, based on the puppyWeights dataframe. If a dog has more than 1 puppy, for example Dog2, Dog3, I need to average the weight values for each month across its puppies (PuppyName). For example, Dog2 should be the average of the values for Jack and Fox in the puppyWeights table:

    DogID  Jan17  Jun18  Dec18  April19  DogWeight
    Dog2   0.75   1.5    3.1    3.9      15.5

I tried using the melt function to turn the ['PuppyName1', 'PuppyName2', 'PuppyName3', 'PuppyName4'] columns into rows. However, I don't know how to add the month information to the dogs dataframe with aggregation when a dog has more than one puppy.

    df2 = dogs.melt(id_vars=['DogID','DogWeight'], var_name="Puppies", value_name='PuppyName')

The expected output is:

    DogID  Jan17  Jun18  Dec18  April19  DogWeight
    Dog1   0.8    1.7    3.7    4.6      12.7
    Dog2   0.75   1.5    3.1    3.9      15.5
    Dog3   0.8    1.7    2.8    4.4      10.2
    Dog4   0.55   1.25   2.95   3.9      16.9

How can I add the weight information by month to the dogs dataframe? I would appreciate any ideas. Thanks)
1 Answer
慕工程0101907
Contributed 1887 experience points, received more than 5 likes
This is a melt on dogs, followed by a merge and a groupby:
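# Reshape the PuppyName1..PuppyName4 columns into rows and drop the empty slots
df2 = dogs.melt(id_vars=['DogID','DogWeight'], var_name="Puppies", value_name='PuppyName').dropna()
# Look up each puppy's monthly weights, then average the numeric columns per dog
df2.merge(puppyWeights, on='PuppyName', how='left').groupby('DogID').mean(numeric_only=True)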
Out[423]:
DogWeight Jan17 Jun18 Dec18 April19
DogID
Dog1 12.7 0.800000 1.700000 3.700000 4.6
Dog2 15.5 0.833333 1.766667 3.066667 4.0
Dog3 10.2 0.800000 1.700000 2.800000 4.4
Dog4 16.9 0.550000 1.250000 2.950000 3.9
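
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same melt, merge, groupby idea. It rebuilds both dataframes from the tables in the question; the names `months`, `long_df` and `result` are introduced here only for illustration, and selecting the columns before `mean()` is just one way to get the column order shown in the expected output.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
dogs = pd.DataFrame({
    "DogID": ["Dog1", "Dog2", "Dog3", "Dog4"],
    "PuppyName1": ["Nick", "Jack", "Snack", "Yosee"],
    "PuppyName2": [np.nan, "Fox", np.nan, "Petty"],
    "PuppyName3": [np.nan, "Rex", np.nan, np.nan],
    "PuppyName4": [np.nan, np.nan, np.nan, np.nan],
    "DogWeight": [12.7, 15.5, 10.2, 16.9],
})
puppyWeights = pd.DataFrame({
    "PuppyName": ["Nick", "Jack", "Fox", "Rex", "Snack", "Yosee", "Petty"],
    "Jan17": [0.8, 0.6, 0.9, 1.0, 0.8, 0.6, 0.5],
    "Jun18": [1.7, 1.3, 1.7, 2.3, 1.7, 1.2, 1.3],
    "Dec18": [3.7, 2.8, 3.4, 3.0, 2.8, 3.1, 2.8],
    "April19": [4.6, 3.5, 4.3, 4.2, 4.4, 4.3, 3.5],
})

# Hypothetical helper name: the month columns to average
months = ["Jan17", "Jun18", "Dec18", "April19"]

# One row per (dog, puppy); empty puppy slots are dropped
long_df = (
    dogs.melt(id_vars=["DogID", "DogWeight"], value_name="PuppyName")
        .dropna(subset=["PuppyName"])
)

# Attach each puppy's monthly weights, then average per dog.
# Selecting the columns explicitly keeps text columns out of mean()
# and puts DogWeight last, as in the expected output.
result = (
    long_df.merge(puppyWeights, on="PuppyName", how="left")
           .groupby("DogID", as_index=False)[months + ["DogWeight"]]
           .mean()
)
print(result)
```

Note that Dog2 has three puppies in the dogs table (Jack, Fox and Rex), so its averages come out as 0.833, 1.767, 3.067 and 4.0, as in the output above, rather than the 0.75, 1.5, 3.1 and 3.9 in the question, which average only Jack and Fox.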