I have a data frame like the one shown below:

+---+-------------+---------+---------------+---------------+---------+------+--------------------------+-----+---------+
|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI  | DiabetesPedigreeFunction | Age | Outcome |
+---+-------------+---------+---------------+---------------+---------+------+--------------------------+-----+---------+
| 0 | 6           | 148.0   | 72.0          | 35.0          | 125.0   | 33.6 | 0.627                    | 50  | 1       |
| 1 | 1           | 85.0    | 66.0          | 29.0          | 125.0   | 26.6 | 0.351                    | 31  | 0       |
| 2 | 8           | 183.0   | 64.0          | 29.0          | 125.0   | 23.3 | 0.672                    | 32  | 1       |
| 3 | 1           | 89.0    | 66.0          | 23.0          | 94.0    | 28.1 | 0.167                    | 21  | 0       |
| 4 | 0           | 137.0   | 40.0          | 35.0          | 168.0   | 43.1 | 2.288                    | 33  | 1       |
+---+-------------+---------+---------------+---------------+---------+------+--------------------------+-----+---------+

After looking at the box plot of each variable, I found that they contain outliers.

So in every column except Outcome, I want to replace values greater than the 95th percentile with the value at the 75th percentile, and values less than the 5th percentile with the value at the 25th percentile.

For example, in the Glucose column, any value above the 95th percentile should be replaced with the 75th percentile value of the Glucose column.

How can I do this with pandas filtering and percentile functions?

Any help with this would be greatly appreciated.
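One possible way to express the replacement rule described above is sketched here. It is only a sketch under some assumptions: the helper name `replace_outliers` and its `exclude` parameter are made up for illustration, percentiles are computed per column with `Series.quantile` (default linear interpolation), "greater than" and "less than" are treated as strict comparisons, and each processed column is cast to float so that a fractional percentile value can be stored in columns such as Pregnancies or Age.

```python
import pandas as pd


def replace_outliers(df, exclude=("Outcome",)):
    """Return a copy of df with per-column outliers replaced.

    Values above the 95th percentile become the 75th percentile value,
    values below the 5th percentile become the 25th percentile value.
    The function name and the `exclude` parameter are hypothetical.
    """
    out = df.copy()
    for col in out.columns:
        if col in exclude:
            continue
        # Cast to float so fractional percentile values fit in the column.
        s = out[col].astype(float)
        q05, q25, q75, q95 = s.quantile([0.05, 0.25, 0.75, 0.95])
        # Both conditions are evaluated on the original values in s.
        out[col] = s.mask(s > q95, q75).mask(s < q05, q25)
    return out


# The five sample rows from the question.
sample = pd.DataFrame(
    {
        "Pregnancies": [6, 1, 8, 1, 0],
        "Glucose": [148.0, 85.0, 183.0, 89.0, 137.0],
        "BloodPressure": [72.0, 66.0, 64.0, 66.0, 40.0],
        "SkinThickness": [35.0, 29.0, 29.0, 23.0, 35.0],
        "Insulin": [125.0, 125.0, 125.0, 94.0, 168.0],
        "BMI": [33.6, 26.6, 23.3, 28.1, 43.1],
        "DiabetesPedigreeFunction": [0.627, 0.351, 0.672, 0.167, 2.288],
        "Age": [50, 31, 32, 21, 33],
        "Outcome": [1, 0, 1, 0, 1],
    }
)
cleaned = replace_outliers(sample)
print(cleaned)
```

The original frame is left untouched because the function works on a copy. With only five rows the percentiles are heavily interpolated, so the result is more meaningful on the full data set.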