2 Answers
Contributor with 1875 experience points · 3+ upvotes
I downloaded the file and saved it on my computer as a CSV file. Then I ran the following code.
import pandas as pd

# Load the saved CSV file
df = pd.read_csv('parse_this.csv')
# Strip the surrounding brackets and split the string into a list
df['tracts'] = df['tracts'].apply(lambda x: x.strip('][').split(','))
# Explode the lists so that each element gets its own row
df = df.explode('tracts')
# Reset the index and rename the column
df = df.reset_index(drop=True)
df = df.rename(columns={'tracts': 'geoid'})
# Strip the surrounding characters so that only the geoid number remains
# (the space in the character set also covers entries separated by ', ')
df['geoid'] = df['geoid'].apply(lambda x: x.strip('geoid{}:" '))
df
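If the `tracts` column holds valid JSON text, parsing it may be more robust than stripping characters. The following is only a sketch: the sample rows and the `name` column are hypothetical stand-ins for `parse_this.csv`, and the `[{"geoid": "..."}]` layout is an assumption inferred from the characters stripped above.

```python
import json

import pandas as pd

# Hypothetical sample standing in for parse_this.csv; the real layout is assumed.
df = pd.DataFrame({
    "name": ["a", "b"],
    "tracts": [
        '[{"geoid": "01001020100"}, {"geoid": "01001020200"}]',
        '[{"geoid": "01001020300"}]',
    ],
})

# Parse each JSON string into a Python list of dicts.
df["tracts"] = df["tracts"].apply(json.loads)

# Give each dict its own row, then pull the geoid value out of it.
df = df.explode("tracts").reset_index(drop=True)
df["geoid"] = df["tracts"].apply(lambda d: d["geoid"])
df = df.drop(columns="tracts")
```

This keeps the geoid values as strings, so leading zeros are preserved. It will raise an error if a cell is not valid JSON, in which case the strip-based approach above may be the safer choice.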
Contributor with 1851 experience points · 5+ upvotes
I hope this example helps:
import pandas as pd

# Create an example DataFrame:
d = [{'A': 3, 'B': [{'id': '001'}, {'id': '002'}]},
     {'A': 4, 'B': [{'id': '003'}, {'id': '004'}]},
     {'A': 5, 'B': [{'id': '005'}, {'id': '006'}]},
     {'A': 6, 'B': [{'id': '007'}, {'id': '008'}]}]
df = pd.DataFrame(d)
df
   A                               B
0  3  [{'id': '001'}, {'id': '002'}]
1  4  [{'id': '003'}, {'id': '004'}]
2  5  [{'id': '005'}, {'id': '006'}]
3  6  [{'id': '007'}, {'id': '008'}]
# Explode column B so that each dict gets its own row, then reset the index
df = df.explode('B')
df = df.reset_index(drop=True)
# The DataFrame now looks like this:
df
   A              B
0  3  {'id': '001'}
1  3  {'id': '002'}
2  4  {'id': '003'}
3  4  {'id': '004'}
4  5  {'id': '005'}
5  5  {'id': '006'}
6  6  {'id': '007'}
7  6  {'id': '008'}
# Extract the 'id' value from each dict and rename column B to id
df['B'] = df['B'].apply(lambda x: x['id'])
df = df.rename(columns={'B': 'id'})
# This is the final result:
df
   A   id
0  3  001
1  3  002
2  4  003
3  4  004
4  5  005
5  5  006
6  6  007
7  6  008
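As an alternative to the explode-then-extract steps above, `pd.json_normalize` can flatten the nested records in one call when you still have the raw list of dicts. This is a sketch of a different technique, not the method used in the answer. It reuses the same example data, and the name `flat` is only illustrative.

```python
import pandas as pd

# Same example data as above.
d = [{'A': 3, 'B': [{'id': '001'}, {'id': '002'}]},
     {'A': 4, 'B': [{'id': '003'}, {'id': '004'}]},
     {'A': 5, 'B': [{'id': '005'}, {'id': '006'}]},
     {'A': 6, 'B': [{'id': '007'}, {'id': '008'}]}]

# record_path names the nested list to expand into rows;
# meta lists the parent-level fields to repeat on each row.
flat = pd.json_normalize(d, record_path='B', meta=['A'])

# Put the columns in the same order as the result above.
flat = flat[['A', 'id']]
```

Note that this works on the original Python objects, not on a DataFrame that has already been built, so it fits best when the data arrives as parsed JSON.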