
Why am I getting an "int too large to convert to float" error in my Python script?

Python

白板的微信 2023-04-11 15:49:54

I'm new to Python and I'm trying to resolve an error. I have a data frame (reprex):

import pandas as pd

df
Out[29]:
        Id  ServiceSubCodeKey   PrintDate
0  1895650                  2  2018-07-27
1  1895650                  4  2018-08-13
2  1896355                  2  2018-08-10
3  1897675                  9  2018-08-13
4  1897843                  2  2018-08-10
5  2178737                  3  2019-06-14
6  2178737                  4  2019-06-14
7  2178737                  7  2019-06-14
8  2178737                  1  2019-06-14
9  2178750                699  2019-06-14

columns = (
    pd.get_dummies(df["ServiceSubCodeKey"])
    .reindex(range(df.ServiceSubCodeKey.min(),
                   df.ServiceSubCodeKey.max()+1), axis=1, fill_value=0)
    # now it has all digits
    .astype(str)
)

codes = pd.Series(
    [int(''.join(row)) for row in columns.itertuples(index=False)],
    index=df.index)

codes = (
    codes.groupby(df.Id).transform('sum').astype('str')
    .str.pad(width=columns.shape[1], fillchar='0')
    .str.rstrip('0')  # this will remove trailing 0's
)

print(codes)

df = df.assign(one_hot_ssc=codes)

OverflowError: int too large to convert to float

When I tried to troubleshoot it, I found that the error occurs in this part:

codes = pd.Series(
    [int(''.join(row)) for row in columns.itertuples(index=False)],
    index=df.index)

If I change the last service subcode to 60 or a lower number instead of 699, the error goes away. Is there a fix for this error? I would like it to work even for 5-digit codes. I'm looking for a permanent solution.


1 Answer

墨色风雨


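The culprit appears to be pandas trying to convert the values to floats.

[int(''.join(row)) for row in columns.itertuples(index=False)] works on its own, but passing the result to pd.Series does not.
I don't know why pandas tries to convert the ints to floats.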
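One part of this can be checked without pandas. Python ints have arbitrary precision, but a float tops out near 1.8e308, so an integer with more than about 309 digits cannot be converted. A key of 699 produces a 699-character one-hot string, while a key of 60 produces only 60 characters, which matches the behaviour described in the question. The sketch below only illustrates that limit; the helper name `fits_in_float` is made up for this example, and it does not explain why pandas attempts the conversion in the first place.

```python
import sys


def fits_in_float(n: int) -> bool:
    """Return True if the integer can be converted to a float without overflow."""
    try:
        float(n)
        return True
    except OverflowError:
        return False


# Same shape as the one-hot integers in the question: a 1 followed by zeros.
small = int("1" + "0" * 59)    # 60 digits, as with a largest key of 60
large = int("1" + "0" * 698)   # 699 digits, as with a largest key of 699

print(sys.float_info.max)      # largest finite float, roughly 1.8e308
print(fits_in_float(small))    # True
print(fits_in_float(large))    # False
```

Any approach that keeps the values as Python ints or as strings, and never lets them become floats, sidesteps this limit.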

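The workaround is to build the Series in a way that gives pandas no chance to convert the ints to floats.
Each entry of dfg[0] is a list of ints.
The following code also works when 'ServiceSubCodeKey' equals 99999.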

import pandas as pd

# df and columns are the objects defined in the question

# this will create codes
codes_values = [int(''.join(r)) for r in columns.itertuples(index=False)]

# wrap the list in a one-element Series and explode it,
# instead of passing the list to pd.Series directly
codes = pd.Series({'test': codes_values}).explode()
codes.index = df.index

# groupby and aggregate the values into lists
dfg = codes.groupby(df.Id).agg(list).reset_index()

# sum the lists with Python's built-in sum; doing this with a pandas
# function also does not work, so no .sum or .apply
summed_lists = list()
for r, v in dfg.iterrows():
    summed_lists.append(str(sum(v[0])))

# assign the list of strings to a column
dfg['sums'] = summed_lists

# perform the remainder of the functions on the sums column
dfg['final'] = dfg.sums.str.pad(width=columns.shape[1], fillchar='0').str.rstrip('0')

# display(dfg.final)
0    0101
1    01
2    000000001
3    01
4    1011001
5    0000000000000000000000000000000000000000000000...
Name: final, dtype: object
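The pad and rstrip steps are needed because turning a one-hot string into an int drops its leading zeros. Here is a pure-Python sketch of the same arithmetic for a single Id, assuming 9 columns (codes 1 to 9). The helper `one_hot` and the variable names are made up for illustration and are not part of the answer's code.

```python
# Codes recorded for Id 2178737 in the question's data.
width = 9
codes_for_id = [3, 4, 7, 1]


def one_hot(code: int, width: int) -> str:
    """Return a string of zeros with a single '1' at 1-based position `code`."""
    return "0" * (code - 1) + "1" + "0" * (width - code)


as_ints = [int(one_hot(c, width)) for c in codes_for_id]  # leading zeros vanish here
total = sum(as_ints)                                      # built-in sum keeps exact ints
restored = str(total).rjust(width, "0")                   # left-pad with zeros, like .str.pad
final = restored.rstrip("0")                              # drop trailing zeros

print(final)  # 1011001
```

The result matches the value shown for that Id in the output above.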

# merge df and dfg.final
dfm = pd.merge(df, dfg[['Id', 'final']], on='Id')

# display(dfm)
        Id  ServiceSubCodeKey   PrintDate         final
0  1895650                  2  2018-07-27          0101
1  1895650                  4  2018-08-13          0101
2  1896355                  2  2018-08-10            01
3  1897675                  9  2018-08-13     000000001
4  1897843                  2  2018-08-10            01
5  2178737                  3  2019-06-14       1011001
6  2178737                  4  2019-06-14       1011001
7  2178737                  7  2019-06-14       1011001
8  2178737                  1  2019-06-14       1011001
9  2178750              99999  2019-06-14  ...000000001
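As an alternative to the approach above (not the answerer's method), the encoded string can be built directly, so no large integer is ever created. The sketch below is plain Python under two assumptions: the rows are available as (Id, ServiceSubCodeKey) pairs, and no code repeats 10 or more times within one Id, since the integer sum would carry into the next digit in that case. The function name `encode_by_id` and the `rows` list are hypothetical.

```python
from collections import Counter, defaultdict

# (Id, ServiceSubCodeKey) pairs copied from the question's sample data.
rows = [
    (1895650, 2), (1895650, 4), (1896355, 2), (1897675, 9), (1897843, 2),
    (2178737, 3), (2178737, 4), (2178737, 7), (2178737, 1), (2178750, 699),
]


def encode_by_id(rows):
    """Map each Id to a digit string: position i holds the count of code lowest + i."""
    lowest = min(code for _, code in rows)
    counts = defaultdict(Counter)
    for id_, code in rows:
        counts[id_][code] += 1

    encoded = {}
    for id_, counter in counts.items():
        highest = max(counter)
        # Stopping at the highest code for this Id leaves no trailing zeros to strip.
        encoded[id_] = "".join(str(counter[c]) for c in range(lowest, highest + 1))
    return encoded


encoded = encode_by_id(rows)
print(encoded[1895650])       # 0101
print(encoded[2178737])       # 1011001
print(len(encoded[2178750]))  # 699
```

In a pandas workflow, the resulting dictionary could then be mapped onto the Id column. Because the values are strings from start to finish, the size of the largest code only affects the string length.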

2023-04-11
