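The culprit seems to be pandas trying to convert the values to floats.
[int(''.join(row)) for row in columns.itertuples(index=False)]
works, but converting the result to a Series with pd.Series does not. I don't know why pandas tries to convert the ints to floats.
The workaround is to process the data so that pandas never gets a chance to convert the ints to floats. Each value in dfg[0] is a list of ints.
The following code also works when 'ServiceSubCodeKey' equals 99999.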
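To illustrate why a float conversion is harmful here, the following is a minimal pure-Python sketch, not the code from the question. The width of 400 columns and the variable names are assumptions chosen for illustration. It shows that a Python int holds a long one-hot digit string exactly, while the same value either overflows or loses digits as a float.

```python
# Hypothetical one-hot row: 400 columns, with the "hot" digit at position 2.
width = 400
row_digits = "0" + "1" + "0" * (width - 2)

# Python ints have arbitrary precision, so this value is stored exactly.
value = int(row_digits)

# A 64-bit float cannot represent a number this large at all.
try:
    float(value)
    overflowed = False
except OverflowError:
    overflowed = True

# Even much smaller ints lose digits once they pass 2**53 as floats.
small = 2**53 + 1
precision_lost = int(float(small)) != small

print(overflowed, precision_lost)
```

Either failure would corrupt the digit pattern that the final string is built from, which is why the workaround keeps the sums in plain Python.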
import pandas as pd
# df and columns come from the question; build one int per row by joining that row's digit strings
codes_values = [int(''.join(r)) for r in columns.itertuples(index=False)]
codes = pd.Series({'test': codes_values}).explode()
codes.index = df.index
# groupby and aggregate the values into lists
dfg = codes.groupby(df.Id).agg(list).reset_index()
# sum the lists; doing this with a pandas function also does not work, so no .sum or .apply
summed_lists = list()
for _, v in dfg.iterrows():
    # v[0] is a plain Python list of ints, so sum() stays in exact integer arithmetic
    summed_lists.append(str(sum(v[0])))
# assign the list of strings to a column
dfg['sums'] = summed_lists
# perform the remainder of the functions on the sums column
dfg['final'] = dfg.sums.str.pad(width=columns.shape[1], fillchar='0').str.rstrip('0')
# display(dfg.final)
0 0101
1 01
2 000000001
3 01
4 1011001
5 0000000000000000000000000000000000000000000000...
Name: final, dtype: object
# merge df and dfg.final
dfm = pd.merge(df, dfg[['Id', 'final']], on='Id')
# display(dfm)
        Id  ServiceSubCodeKey   PrintDate         final
0  1895650                  2  2018-07-27          0101
1  1895650                  4  2018-08-13          0101
2  1896355                  2  2018-08-10            01
3  1897675                  9  2018-08-13     000000001
4  1897843                  2  2018-08-10            01
5  2178737                  3  2019-06-14       1011001
6  2178737                  4  2019-06-14       1011001
7  2178737                  7  2019-06-14       1011001
8  2178737                  1  2019-06-14       1011001
9  2178750              99999  2019-06-14  ...000000001
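The sum, pad, and strip steps can also be sketched without pandas, which makes the arithmetic easier to follow. This is only an illustration under stated assumptions: encode_groups is a hypothetical helper, the width of 9 is an assumed column count, and the rows are the first nine (Id, ServiceSubCodeKey) pairs from the output above. It assumes each key k maps to a 1 in position k of a one-hot digit string.

```python
from collections import defaultdict


def encode_groups(rows, width):
    """Return {group_id: flag string} for (group_id, key) pairs, 1 <= key <= width."""
    totals = defaultdict(int)
    for group_id, key in rows:
        # A one-hot string of length `width` with a 1 at position `key`,
        # read as a decimal number, equals 10 ** (width - key).
        totals[group_id] += 10 ** (width - key)
    # Restore the leading zeros lost by the int, then drop the trailing zeros.
    return {
        group_id: str(total).rjust(width, "0").rstrip("0")
        for group_id, total in totals.items()
    }


rows = [
    (1895650, 2), (1895650, 4),
    (1896355, 2),
    (1897675, 9),
    (1897843, 2),
    (2178737, 3), (2178737, 4), (2178737, 7), (2178737, 1),
]
result = encode_groups(rows, width=9)
print(result)
```

As with the pandas version, a key that repeats within one group would add up to a digit of 2 rather than 1. Also note that on Python 3.11 and later, converting an int with more than 4300 digits to or from a string raises ValueError by default, so a width such as 99999 would need the limit raised first with sys.set_int_max_str_digits.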