I have a table of borrower person IDs (BwrPersonId) and loan IDs (LoanId).

BwrPersonId  LoanId
113225       16330
113225       27073
113225       68842
113253       16341
113269       16348
113285       16354
113289       26768
113297       16360
113299       16361
113319       16369
113418       16403
113418       26854

I want to know which loans belong to the same borrower, so I tried a groupby on BwrPersonId and LoanId. What I am hoping for is one row per BwrPersonId, with that borrower's loans in the columns Loan1 through Loan5. This is my code, but it doesn't work.

grouped = pd.DataFrame()
unique = loan['BwrPersonId'].unique()
grouped['BwrPersonId'] = ''*len(loan['BwrPersonId'].unique())
grouped['Loan1'] = ''
grouped['Loan2'] = ''
grouped['Loan3'] = ''
grouped['Loan4'] = ''
grouped['Loan5'] = ''
grouped.iloc[:,0] = unique
for i in grouped.index:
    idloan = loan.loc[loan['BwrPersonId'] == unique[i], 'LoanId']
    grouped.iloc[i,1:len(idloan)+1] = idloan
    print(i)

What should I do now? Is there another way to simplify the code? Thank you very much for your help.
1 Answer
一只甜甜圈
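Basically, you need to build a temporary array with one row per borrower Id and one column per loan position. Give each row of the original table a row index (which borrower it belongs to) and a column index (its position among that borrower's loans), then write each LoanId into the matching cell.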
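import pandas as pd
import numpy as np
from collections import defaultdict
from itertools import count
# `loan` is the DataFrame from the question, with columns BwrPersonId and LoanId
counters = defaultdict(count)  # one running counter per borrower
codes, names = pd.factorize(loan['BwrPersonId'])  # codes: output row for each input row
positions = np.array([next(counters[c]) for c in codes])  # 0, 1, 2, ... within each borrower
n_rows, n_cols = len(names), positions.max() + 1
temp = np.empty((n_rows, n_cols), dtype=object)  # np.object was removed in NumPy 1.24; empty cells are None
temp[codes, positions] = loan['LoanId'].to_numpy()
df1 = pd.DataFrame({'BwrPersonId': names})
# Name the columns from the actual maximum loan count instead of hard-coding five
df2 = pd.DataFrame(temp, columns=[f'Loan{k + 1}' for k in range(n_cols)])
final = df1.join(df2)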
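The question also asks whether the code can be simplified. One possible sketch, which is not part of the original answer, numbers each borrower's loans with groupby(...).cumcount() and then reshapes with pivot. The sample frame below is rebuilt from the table in the question; the names `loan_no`, `n`, and `wide` are only illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question.
loan = pd.DataFrame({
    "BwrPersonId": [113225, 113225, 113225, 113253, 113269, 113285,
                    113289, 113297, 113299, 113319, 113418, 113418],
    "LoanId": [16330, 27073, 68842, 16341, 16348, 16354,
               26768, 16360, 16361, 16369, 16403, 26854],
})

# Number each borrower's loans 1, 2, 3, ... in order of appearance.
loan_no = loan.groupby("BwrPersonId").cumcount() + 1

wide = (
    loan.assign(n=loan_no)                       # helper column (illustrative name)
        .pivot(index="BwrPersonId", columns="n", values="LoanId")
        .add_prefix("Loan")                      # columns 1, 2, 3 -> Loan1, Loan2, Loan3
        .rename_axis(columns=None)               # drop the leftover "n" axis label
        .reset_index()
)
print(wide)
```

Borrowers with fewer loans than the maximum get NaN in the unused columns, so the loan columns come back as floats; if integer output matters, casting those columns to the nullable "Int64" dtype is one option.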