2 Answers

Contributed 1820 experience points, received 9+ likes
Here is one way to do it:
import pandas as pd

df = pd.DataFrame({'Model': ['audi', 'audi', 'bmw', 'bmw', 'ford', 'ford'], 'Age': [1, 2, 1, 2, 1, 2], 'Fraud': [1, 1, 0, 0, 1, 0]})
# group df by Age; select 'Fraud' so the string column 'Model' is not averaged
# (recent pandas versions raise an error instead of silently dropping it)
grouped_age = df.groupby('Age', as_index=False)['Fraud'].mean()
merged_df = pd.merge(df, grouped_age, on=['Age'], how='inner')
# Fraud_x is the original flag, Fraud_y the per-Age mean, which replaces the Age column
df = merged_df.rename({'Age': 'x', 'Fraud_x': 'Fraud', 'Fraud_y': 'Age'}, axis='columns')
df = df.drop('x', axis=1)
# group df by Model
grouped_df = df.groupby('Model', as_index=False).mean()
merged_df = pd.merge(df, grouped_df, on=['Model'], how='inner')
# some display corrections: the per-Model mean of Fraud replaces the Model column
df = merged_df.rename({'Model': 'x', 'Fraud_x': 'Fraud', 'Fraud_y': 'Model', 'Age_x': 'Age'}, axis='columns')
df = df.drop(['x', 'Age_y'], axis=1)
df = df[['Model', 'Age', 'Fraud']]
# convert the fractions to percentages
df['Model'] = df['Model'] * 100
df['Age'] = (df['Age'] * 100).round(0)
Output:
   Model   Age  Fraud
0  100.0  67.0      1
1  100.0  33.0      1
2    0.0  67.0      0
3    0.0  33.0      0
4   50.0  67.0      1
5   50.0  33.0      0

Contributed 1829 experience points, received 9+ likes
I'm not sure I fully understand your code, but here is how I would do it:
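# loop over every column except the last one (Fraud)
for col in df.iloc[:, :-1]:
    # mean of Fraud per distinct value of col, as a percentage
    group_df = df.groupby(col).mean() * 100
    # replace each value of col with the Fraud percentage of its group
    df[col] = df[col].map(group_df['Fraud'])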
Result:
   Model        Age  Fraud
0  100.0  66.666667      1
1  100.0  33.333333      1
2    0.0  66.666667      0
3    0.0  33.333333      0
4   50.0  66.666667      1
5   50.0  33.333333      0
This assumes that the Fraud column is the last column.
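
If relying on column position is a concern, the same idea can be written with the target column named explicitly. The following is only a minimal sketch, not part of either answer: the names `target`, `features`, and `encoded` are made up for illustration, and it swaps the `map` step for `groupby(...).transform('mean')`, which returns the group mean aligned to the original rows.

```python
import pandas as pd

df = pd.DataFrame({
    'Model': ['audi', 'audi', 'bmw', 'bmw', 'ford', 'ford'],
    'Age': [1, 2, 1, 2, 1, 2],
    'Fraud': [1, 1, 0, 0, 1, 0],
})

# Hypothetical names: the target is picked by name, not by position.
target = 'Fraud'
features = [c for c in df.columns if c != target]

# Write into a copy so every group is computed from the original values.
encoded = df.copy()
for col in features:
    # Mean of the target within each group of `col`, broadcast back to
    # every row of that group, then scaled to a percentage.
    encoded[col] = df.groupby(col)[target].transform('mean') * 100

print(encoded)
```

Because the grouping always reads from the untouched `df`, the result does not depend on the order in which the columns are processed, and the Fraud column can sit anywhere in the frame.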