I am reading a pipe-delimited file with no header row, using Pandas 0.24.2. This is public data, so confidentiality is not a concern. The data looks like this:

999778247820|R|JPMORGAN CHASE BANK, NATIONAL ASSOCIATION|7.375|113000|360|02/2001|04/2001|95|95|1|52|665|Y|P|SF|1|P|IL|601|30|FRM||1|N
999783196683|R|OTHER|7.25|59000|360|01/2001|04/2001|97|97|2|43|682|Y|P|PU|1|P|HI|967|30|FRM|676|1|N
999783470376|C|BANK OF AMERICA, N.A.|7.875|110000|360|12/2000|02/2001|74|74|2|26|700|N|P|SF|1|P|NY|125||FRM|698||N
999786911479|C|BANK OF AMERICA, N.A.|7.5|57000|360|12/2000|02/2001|90|90|1|28|699|N|P|SF|1|P|TX|781|25|FRM||1|N
999786913710|R|JPMORGAN CHASE BANK, NA|7.125|114000|360|01/2001|04/2001|73|73|2|16|745|N|C|SF|1|P|WA|992||FRM|||N
999788833695|B|OTHER|9|50000|360|10/2000|12/2000|90|90|2|40|674|N|P|SF|2|I|WI|535|25|FRM|737|1|N

This is the code I am using:

    import glob

    orig_files_fnma = glob.glob("/...1/Acquisition*.txt")
    col_names = [
        "loan_id", "origination_channel", "seller_name",
        "original_interest_rate", "original_upb", "original_loan_term",
        "origination_date", "first_payment_date", "original_ltv",
        "original_cltv", "number_of_borrowers", "original_dti",
        "borrower_fico_at_origination", "first_time_home_buyer_indicator",
        "loan_purpose", "property_type", "number_of_units", "occupancy_type",
        "property_state", "zip_code_short",
        "primary_mortgage_insurance_percent", "product_type",
        "coborrower_fico_at_origination", "mortgage_insurance_type",
        "relocation_mortgage_indicator",
    ]

I always get the following error:

    Filled 1 NA values in column original_ltv
    Filled 52 NA values in column original_cltv
    ValueError: Unable to convert column number_of_borrowers to type int

I did find that it works if I do not predefine the dtype and instead use .astype to change the data types after loading. My question is whether I can predefine the data types up front, as in the code above. I would also like to limit the object (string) columns to a length of 20. What is the correct code for this?
1 Answer

慕桂英3389331
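I got a different error:

    ValueError: Unable to convert column coborrower_fico_at_origination to type int

Import the data into Excel and you will see that this column is blank in 3 rows. The int type cannot represent blanks. Change the column to float, and the blanks become NaN:

    col_type = {..., "coborrower_fico_at_origination": "float", ...}

After that change, the command succeeds.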
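To illustrate the point, here is a minimal sketch using two rows copied from the question's sample data. The question's full dtype mapping is not shown, so the partial `col_type` below is an assumption for illustration only, and `io.StringIO` stands in for the real file. It requests an integer dtype first to reproduce the failure, then reads the columns that contain blanks as float:

```python
import io

import pandas as pd

# Two rows from the question's sample; the second row has blank fields.
sample = (
    "999783196683|R|OTHER|7.25|59000|360|01/2001|04/2001|97|97|2|43|682|Y|P|PU|1|P|HI|967|30|FRM|676|1|N\n"
    "999786913710|R|JPMORGAN CHASE BANK, NA|7.125|114000|360|01/2001|04/2001|73|73|2|16|745|N|C|SF|1|P|WA|992||FRM|||N\n"
)

col_names = [
    "loan_id", "origination_channel", "seller_name",
    "original_interest_rate", "original_upb", "original_loan_term",
    "origination_date", "first_payment_date", "original_ltv",
    "original_cltv", "number_of_borrowers", "original_dti",
    "borrower_fico_at_origination", "first_time_home_buyer_indicator",
    "loan_purpose", "property_type", "number_of_units", "occupancy_type",
    "property_state", "zip_code_short",
    "primary_mortgage_insurance_percent", "product_type",
    "coborrower_fico_at_origination", "mortgage_insurance_type",
    "relocation_mortgage_indicator",
]

# Requesting an integer dtype for a column with blanks raises ValueError.
try:
    pd.read_csv(io.StringIO(sample), sep="|", header=None, names=col_names,
                dtype={"coborrower_fico_at_origination": "int64"})
    int_failed = False
except ValueError:
    int_failed = True

# Hypothetical partial mapping: columns that can be blank are read as float.
col_type = {
    "loan_id": "int64",
    "number_of_borrowers": "int64",
    "primary_mortgage_insurance_percent": "float",
    "coborrower_fico_at_origination": "float",
    "mortgage_insurance_type": "float",
}

df = pd.read_csv(io.StringIO(sample), sep="|", header=None,
                 names=col_names, dtype=col_type)
print(df[list(col_type)].dtypes)
```

Columns left out of `col_type` fall back to the types pandas infers, so in this sketch only the columns known to contain blanks need an explicit float entry.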