2 Answers
Contributor with 1777 experience points, 10+ likes
Try using len(df_final) == 0 instead of df_final != None.
Also, in the pd.concat call, try passing the arguments as a list, i.e.:
df_final = pd.concat([df_final, df])
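Both suggestions can be combined in a small helper. The sketch below is only an illustration: the name append_frame is hypothetical, and the sample rows are taken from the merged output shown further down.

```python
import pandas as pd

COLUMNS = ["Year", "Population", "Municipality"]


def append_frame(df_final, df):
    """Return df_final with the rows of df appended (hypothetical helper)."""
    if len(df_final) == 0:
        # Nothing accumulated yet, so the new frame is the whole result so far.
        return df
    # pd.concat expects a single iterable of frames, not separate arguments.
    return pd.concat([df_final, df])


df_final = pd.DataFrame(columns=COLUMNS)
first = pd.DataFrame([[1970, 10193, "Cape Coral"]], columns=COLUMNS)
second = pd.DataFrame([[1900, 383, "Clearwater"]], columns=COLUMNS)

df_final = append_frame(df_final, first)
df_final = append_frame(df_final, second)
print(df_final)
```

Passing a bare DataFrame as the first argument, as in pd.concat(first), raises a TypeError, which is why the frames are wrapped in a list.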
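Contributor with 1809 experience points, 8+ likes
Following Sayan's suggestion of len(df_final) == 0,
I had an idea: would it make a difference whether I initially set df_final to None or to an empty DataFrame with the same columns?
It turns out that it does.
Here is the new code: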
import re

import pandas as pd
import requests
from lxml import html


def visit_table_links():
    # grab_initial_links() is defined elsewhere; obi[0] is used as the
    # municipality name and obi[1] as the page URL
    links = grab_initial_links()
    # Start from an empty frame with the final columns instead of None
    df_final = pd.DataFrame(columns=['Year', 'Population', 'Municipality'])
    for obi in links:
        resp = requests.get(obi[1])
        tree = html.fromstring(resp.content)
        dflist = []
        for attr in tree.xpath('//th[contains(normalize-space(text()), "sometext")]/ancestor::table/tbody/tr'):
            population = attr.xpath('normalize-space(string(.//td[2]))')
            try:
                population = population.replace(',', '')
                population = int(population)
                year = attr.xpath('normalize-space(string(.//td[1]))')
                year = re.findall(r'\d+', year)
                year = ''.join(year)
                year = int(year)
                dflist.append([year, population, obi[0]])
            except ValueError:
                # Skip rows whose cells do not hold numbers
                continue
        df = pd.DataFrame(dflist, columns=['Year', 'Population', 'Municipality'])
        df_final = pd.concat([df_final, df])
    return df_final


print(visit_table_links())
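For some reason, setting df_final = None made pandas throw that error, even though in the first iteration I assigned df_final = df when df_final was None.
So by the next iteration it should not matter what df_final was initially.
For some reason, it does matter.
So using the line df_final = pd.DataFrame(columns=['Year', 'Population', 'Municipality']) instead of df_final = None solved the problem.
Here is the merged DataFrame: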
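One way to avoid depending on the initial value at all is to collect the per-page frames in a list and concatenate once at the end. This is an alternative sketch, not the code from this answer; combine_tables and its input format are assumptions made for illustration, and the sample rows are copied from the merged output.

```python
import pandas as pd

COLUMNS = ["Year", "Population", "Municipality"]


def combine_tables(rows_per_page):
    """Build one DataFrame per page, then concatenate them in one call."""
    frames = [pd.DataFrame(rows, columns=COLUMNS) for rows in rows_per_page]
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame.
        return pd.DataFrame(columns=COLUMNS)
    return pd.concat(frames)


pages = [
    [[1970, 10193, "Cape Coral"], [1980, 32103, "Cape Coral"]],
    [[1900, 383, "Clearwater"]],
]
combined = combine_tables(pages)
print(combined)
```

With this layout there is no accumulator to initialize, so the None versus empty DataFrame question does not come up.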
    Year  Population   Municipality
0   1970       10193     Cape Coral
1   1980       32103     Cape Coral
2   1990       74991     Cape Coral
3   2000      102286     Cape Coral
4   2010      154305     Cape Coral
5   2018      189343     Cape Coral
0   1900         383     Clearwater
1   1910        1171     Clearwater
2   1920        2427     Clearwater
3   1930        7607     Clearwater
4   1940       10136     Clearwater
5   1950       15581     Clearwater
6   1960       34653     Clearwater
7   1970       52074     Clearwater
8   1980       85170     Clearwater
9   1990       98669     Clearwater
10  2000      108787     Clearwater
11  2010      107685     Clearwater
12  2018      116478     Clearwater
0   1970        1489  Coral Springs
1   1980       37349  Coral Springs
2   1990       79443  Coral Springs
3   2000      117549  Coral Springs
4   2010      121096  Coral Springs
5   2018      133507  Coral Springs
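The row labels restart at 0 for each municipality because pd.concat keeps every frame's own index by default. If a single running index is preferred, ignore_index=True renumbers the rows. A minimal sketch using a few rows from the output above:

```python
import pandas as pd

COLUMNS = ["Year", "Population", "Municipality"]

cape_coral = pd.DataFrame(
    [[2010, 154305, "Cape Coral"], [2018, 189343, "Cape Coral"]],
    columns=COLUMNS,
)
coral_springs = pd.DataFrame(
    [[2010, 121096, "Coral Springs"], [2018, 133507, "Coral Springs"]],
    columns=COLUMNS,
)

# Default behaviour: each frame keeps its own 0-based labels.
kept = pd.concat([cape_coral, coral_springs])
# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([cape_coral, coral_springs], ignore_index=True)

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```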