My Pandas DataFrame has a column named "State" that contains US state abbreviations. I have hard-coded regions, and I want to create a new column holding the region for each state. I used pd.Series.apply(), but I wonder whether there is a better practice for this kind of mapping. Any suggestions on how to improve my code? Here is my current code, which works; I am simply open to advice on best practice.

def get_region(s, *regions):
    if s in regions[0]:
        return 'west'
    elif s in regions[1]:
        return 'midwest'
    elif s in regions[2]:
        return 'south'
    elif s in regions[3]:
        return 'northeast'
    else:
        return None

west = ['WA','OR','CA','ID','NV','MT','WY','UT','AZ','CO','NM']
midwest = ['ND','MN','WI','MI','SD','NE','KS','IA','MO','IL','IN','OH']
south = ['TX','OK','AR','LA','MS','TN','KY','AL','GA','FL','SC','NC','VA','WV','MD','DE']
northeast = ['PA','NJ','NY','CT','MA','RI','VT','NH','ME']
regions = [west,midwest,south,northeast]
full_df['Region'] = full_df['State'].apply(get_region, args=regions)
full_df['Region'].head(15)

Out:
0          west
1       midwest
2         south
3         south
4       midwest
5          west
6         south
7         south
8          west
9       midwest
10        south
11    northeast
12    northeast
13         west
14         west
Name: Region, dtype: object
2 Answers
侃侃尔雅
Contributed 1801 experience points · received over 16 likes
Take a look at map: build a state-to-region lookup and pass it to Series.map.
s=pd.DataFrame([west,midwest,south,northeast],index=['west','midwest','south','northeast'])
s=s.reset_index().melt('index')
full_df['Region'] = full_df['State'].map(dict(zip(s['value'],s['index'])))
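To make the intermediate steps of the answer above easier to follow, here is a minimal self-contained sketch. The shortened region lists, the sample `full_df`, and the name `state_to_region` are assumptions for illustration only. Because the region lists differ in length, the wide frame is padded with missing values; the extra `dropna` call (not part of the original answer) keeps that padding out of the lookup.

```python
import pandas as pd

# Shortened region lists and a tiny sample frame, both made up for illustration.
west = ['WA', 'OR', 'CA']
midwest = ['ND', 'MN']
full_df = pd.DataFrame({'State': ['CA', 'MN', 'WA', 'TX']})

# One row per region; shorter lists are padded with missing values.
s = pd.DataFrame([west, midwest], index=['west', 'midwest'])

# Long format with columns 'index' (region), 'variable' (position), 'value' (state).
s = s.reset_index().melt('index')

# Drop the padding rows so the lookup holds only real state codes.
s = s.dropna(subset=['value'])

# state -> region lookup, then a vectorised map over the column.
state_to_region = dict(zip(s['value'], s['index']))
full_df['Region'] = full_df['State'].map(state_to_region)
```

In this sketch 'TX' is not in either shortened list, so its Region comes back as a missing value rather than raising an error.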
慕桂英4014372
Contributed 1871 experience points · received over 13 likes
You can try building a dict and mapping it onto the column:
west_dict = {i:"west" for i in west}
midwest_dict = {i:"midwest" for i in midwest}
south_dict = {i:"south" for i in south}
northeast_dict = {i:"northeast" for i in northeast}
d = {**west_dict, **midwest_dict, **south_dict, **northeast_dict}
full_df['Region'] = full_df['State'].map(d)
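A possible variation on the same idea, sketched under assumptions: keep the regions in one dict keyed by region name and invert it with a single comprehension, instead of building four separate dicts. The `regions` dict layout, the shortened state lists, the sample `full_df`, and the names `state_to_region` and `unmapped` are all hypothetical.

```python
import pandas as pd

# Hypothetical container: region name -> list of state codes (shortened here).
regions = {
    'west': ['WA', 'OR', 'CA'],
    'south': ['TX', 'FL'],
}

# Invert to state -> region in a single comprehension.
state_to_region = {
    state: region
    for region, states in regions.items()
    for state in states
}

# Made-up sample data; 'PR' is deliberately absent from the lookup.
full_df = pd.DataFrame({'State': ['CA', 'TX', 'PR']})
full_df['Region'] = full_df['State'].map(state_to_region)

# States missing from the dict come back as NaN rather than raising,
# so they can be listed and checked afterwards.
unmapped = full_df.loc[full_df['Region'].isna(), 'State'].tolist()
```

Note that map yields NaN for states missing from the dict, whereas the apply-based function in the question returns None; both count as missing values in pandas.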