
Parsing JSON-formatted elements in a DataFrame

Python

Smart猫小萌 2023-04-11 14:42:39

I have a DataFrame with geoids that looks like this. What I want to do is put every geoid number for each msaid into a list. Ideally, I would end up with a DataFrame that looks like this. I hope this makes sense. Any help would be appreciated. Here are two examples:

159 [{"geoid":"02020000101"},{"geoid":"02020000204"},{"geoid":"02020000300"},{"geoid":"02020000400"},{"geoid":"02020000500"},{"geoid":"02020000600"},{"geoid":"02020000802"},{"geoid":"02020000901"},{"geoid":"02020000902"},{"geoid":"02020001000"},{"geoid":"02020001500"},{"geoid":"02020001601"},{"geoid":"02020001602"},{"geoid":"02020001701"},{"geoid":"02020001802"},{"geoid":"02020001900"},{"geoid":"02020002000"},{"geoid":"02020002100"},{"geoid":"02020002201"},{"geoid":"02020002400"},{"geoid":"02020002501"},{"geoid":"02020002502"},{"geoid":"02020002601"},{"geoid":"02020002712"},{"geoid":"02020002811"},{"geoid":"02020002812"},{"geoid":"02020002813"},{"geoid":"02122000100"},{"geoid":"02122000300"},{"geoid":"02170001300"},{"geoid":"02170000300"},{"geoid":"02170001100"},{"geoid":"02170000800"},{"geoid":"02261000300"},{"geoid":"02290000400"},{"geoid":"02240000400"},{"geoid":"02170000102"},{"geoid":"02170000402"},{"geoid":"02170000101"},{"geoid":"02170001201"},{"geoid":"02170001001"},{"geoid":"02170000706"},{"geoid":"02170001202"},{"geoid":"02170001004"},{"geoid":"02170000705"},{"geoid":"02170000603"},{"geoid":"02020000102"},{"geoid":"02020000201"},{"geoid":"02020000202"},{"geoid":"02020000203"},{"geoid":"02020000701"},{"geoid":"02020000702"},{"geoid":"02020000703"},{"geoid":"02020000801"},{"geoid":"02020001100"},{"geoid":"02020001200"},


2 Answers

翻过高山走不出你


I downloaded the file and saved it as a CSV file on my computer. Then I ran the following code.

import pandas as pd

df = pd.read_csv('parse_this.csv')

# Strip the surrounding brackets and split the string into a list of items
df['tracts'] = df['tracts'].apply(lambda x: x.strip('][').split(','))

# Explode the lists so that each item gets its own row
df = df.explode('tracts')

# Reset the index and rename the column
df = df.reset_index(drop=True)
df = df.rename(columns={'tracts': 'geoid'})

# Strip the extra characters, keeping only the geoid number
df['geoid'] = df['geoid'].apply(lambda x: x.strip('geoid{}:""'))
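The string stripping above depends on the exact characters in each cell. If every cell in the column is a complete, valid JSON array (an assumption, since the sample in the question is cut off), a minimal sketch with `json.loads` might be less fragile, and it yields one list of geoids per msaid, which is what the question asks for. The column names `msaid` and `tracts`, the second row, and the helper `extract_geoids` are hypothetical and only serve as illustration.

```python
import json

import pandas as pd

# Hypothetical sample; the column names msaid and tracts are assumed,
# and the row for msaid 160 is made up for illustration.
df = pd.DataFrame({
    "msaid": [159, 160],
    "tracts": [
        '[{"geoid":"02020000101"},{"geoid":"02020000204"}]',
        '[{"geoid":"02170000102"}]',
    ],
})


def extract_geoids(cell):
    """Parse one JSON array string and return its geoid values as a list."""
    return [item["geoid"] for item in json.loads(cell)]


# One list of geoid strings per msaid; leading zeros survive because
# the values stay strings.
df["geoids"] = df["tracts"].apply(extract_geoids)
result = df[["msaid", "geoids"]]
print(result)
```

If a long format with one geoid per row is preferred instead, `result.explode("geoids")` should give the same shape as the code above.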

2023-04-11

江户川乱折腾


I hope this example helps:

import pandas as pd

# Create a DataFrame to use as an example
d = [{'A': 3, 'B': [{'id': '001'}, {'id': '002'}]},
     {'A': 4, 'B': [{'id': '003'}, {'id': '004'}]},
     {'A': 5, 'B': [{'id': '005'}, {'id': '006'}]},
     {'A': 6, 'B': [{'id': '007'}, {'id': '008'}]}]
df = pd.DataFrame(d)

   A                               B
0  3  [{'id': '001'}, {'id': '002'}]
1  4  [{'id': '003'}, {'id': '004'}]
2  5  [{'id': '005'}, {'id': '006'}]
3  6  [{'id': '007'}, {'id': '008'}]

# Explode column B so each dict gets its own row, then reset the index
df = df.explode('B')
df = df.reset_index(drop=True)

# Now it looks like this:

   A              B
0  3  {'id': '001'}
1  3  {'id': '002'}
2  4  {'id': '003'}
3  4  {'id': '004'}
4  5  {'id': '005'}
5  5  {'id': '006'}
6  6  {'id': '007'}
7  6  {'id': '008'}

# Pull the value out of each dict and rename the column from B to id
df['B'] = df['B'].apply(lambda x: x['id'])
df = df.rename(columns={'B': 'id'})

# This is the final result:

   A   id
0  3  001
1  3  002
2  4  003
3  4  004
4  5  005
5  5  006
6  6  007
7  6  008
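The result above has one id per row, whereas the question asks for one list per key. As a rough sketch on a shortened copy of the same example data, the lists can either be built directly from column B or collected back from the long format with `groupby`. The names `ids`, `long_df` and `grouped` are introduced here purely for illustration.

```python
import pandas as pd

# Shortened copy of the example data used in this answer
d = [{'A': 3, 'B': [{'id': '001'}, {'id': '002'}]},
     {'A': 4, 'B': [{'id': '003'}, {'id': '004'}]}]
df = pd.DataFrame(d)

# Option 1: build the list of ids per row directly, without exploding
df['ids'] = df['B'].apply(lambda items: [item['id'] for item in items])

# Option 2: start from the exploded long format and collect the ids per A
long_df = df[['A', 'B']].explode('B').reset_index(drop=True)
long_df['id'] = long_df['B'].apply(lambda x: x['id'])
grouped = long_df.groupby('A')['id'].agg(list).reset_index()

print(df[['A', 'ids']])
print(grouped)
```

Option 1 is probably the simpler choice when the data is already in one row per key. Option 2 may be useful when the long format is needed for other steps anyway.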

2023-04-11
