Calling a traffic API with Python returns incorrectly formatted data

    #!/usr/bin/env python
    # make sure to install these packages before running:
    # pip install pandas
    # pip install sodapy
    import pandas as pd
    from sodapy import Socrata

    # Unauthenticated client only works with public data sets. Note 'None'
    # in place of application token, and no username or password:
    client = Socrata("data.pa.gov", None)

    # Example authenticated client (needed for non-public datasets):
    # client = Socrata(data.pa.gov,
    #                  MyAppToken,
    #                  username="user@example.com",
    #                  password="AFakePassword")

    # Up to 50000 results, returned as JSON from API / converted to Python list of
    # dictionaries by sodapy.
    results = client.get("dc5b-gebx", limit=50000)

    # Convert to pandas DataFrame
    results_df = pd.DataFrame.from_records(results)
    results_df.latitude

The output looks like this:

       latitude
    0  40 36:56.627

This is obviously not a decimal latitude. I assume it is caused by how the API call is handled?

There is also another column, location_1, which holds string data like this:

                                             location_1
    0  {'latitude': '40.6157', 'longitude': '-75.4621'}
    1  {'latitude': '40.4587', 'longitude': '-79.9985'}
    2  {'latitude': '39.9328', 'longitude': '-75.2891'}
    3  {'latitude': '40.4435', 'longitude': '-80.0046'}
    4  {'latitude': '40.5994', 'longitude': '-75.4703'}

I need the lat and lon as separate columns.

I am very confused about the best way to do this. My current idea feels clumsy: I am thinking of simply pulling the values out of the dataframe like this,

    list(df.location_1.values)

and then looping over the inner values:

    dict = {}
    n = 0
    for x in list:
        n += 1
        append(x.strip())
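A note on the latitude value: "40 36:56.627" looks like degrees, minutes and seconds rather than a corrupted number. Read that way it works out to roughly 40.6157, which matches the latitude in the first location_1 row. The sketch below assumes the layout is always "DD MM:SS.sss"; the helper name dms_to_decimal is hypothetical and is not part of sodapy or pandas.

```python
def dms_to_decimal(text):
    """Convert a 'DD MM:SS.sss' string to decimal degrees.

    Assumes the layout seen in the question's single sample value;
    other rows in the dataset may be formatted differently.
    """
    degrees, rest = text.split()
    minutes, seconds = rest.split(":")
    # Keep the sign separate so that values such as "-0 30:00" stay negative.
    sign = -1.0 if degrees.startswith("-") else 1.0
    magnitude = abs(float(degrees)) + float(minutes) / 60 + float(seconds) / 3600
    return sign * magnitude


lat = dms_to_decimal("40 36:56.627")
print(round(lat, 4))
```

If the assumption holds, something like results_df["latitude"].map(dms_to_decimal) would convert the whole column, though the location_1 column already carries decimal values.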
1 Answer
精慕HU
Try this:
    df["location_1"] = df["location_1"].apply(lambda x : dict(eval(x)))
    df2 = df['location_1'].apply(pd.Series)
df2 will contain your lon and lat. You can then merge or concatenate df2 with df.
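As a variation on the same idea, here is a minimal sketch that avoids eval and copes with either case: location_1 entries that are already dicts (which is what sodapy may hand back) or entries stored as strings. The sample rows and the helper name to_dict are assumptions for illustration; ast.literal_eval is swapped in for eval because it only parses Python literals.

```python
import ast

import pandas as pd


def to_dict(value):
    """Return value as a dict, parsing it safely if it is a string."""
    if isinstance(value, dict):
        return value
    return ast.literal_eval(value)


# Hypothetical sample mirroring the first two rows shown in the question:
# one entry is already a dict, the other is its string representation.
df = pd.DataFrame({
    "location_1": [
        {"latitude": "40.6157", "longitude": "-75.4621"},
        "{'latitude': '40.4587', 'longitude': '-79.9985'}",
    ]
})

# Expand each dict into its own columns, keeping the original row index.
coords = pd.DataFrame(df["location_1"].map(to_dict).tolist(), index=df.index)

# The API returns the numbers as strings, so convert them to floats.
coords = coords.astype(float).rename(
    columns={"latitude": "lat", "longitude": "lon"}
)

# Attach the new columns to the original frame.
df = df.join(coords)
print(df[["lat", "lon"]])
```

Building the frame from a list of dicts is usually faster than apply(pd.Series) on large inputs, and join lines rows up by index so no explicit merge key is needed.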