1 Answer
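Here is my solution: I use the geopy library to compute the distances.
You can compute the distance with either geodesic() or great_circle(); the function distance is an alias for geodesic.
You can change the unit from .km to .miles, .m, or .ft if you prefer a different unit.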
from geopy.distance import lonlat, distance, great_circle, geodesic

dmin = []
for index, r in df_actual.iterrows():
    # distance in km from this actual point to every fleet point; keep the smallest
    valmin = df_fleet.apply(
        lambda x: distance(lonlat(x['Longitude'], x['Latitude']),
                           lonlat(r['Longitude'], r['Latitude'])).km,
        axis=1).min()
    dmin.append(valmin)
df_actual['nearest to fleet(km)'] = dmin
print(df_actual)
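As a rough illustration of what a great-circle distance is, the haversine formula can be written in plain Python. This is only a sketch, not geopy's implementation: the names haversine_km, EARTH_RADIUS_KM and KM_PER_MILE are made up for the example, and a spherical Earth of radius 6371 km is assumed. geodesic() uses an ellipsoidal model instead, so its results differ slightly from a spherical calculation like this one.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0   # assumed mean radius (spherical model)
KM_PER_MILE = 1.609344     # definition of the international mile

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

d_km = haversine_km(0.0, 0.0, 0.0, 1.0)  # one degree of longitude on the equator
d_miles = d_km / KM_PER_MILE             # same distance expressed in miles
print(f"{d_km:.2f} km = {d_miles:.2f} miles")
```

Changing the unit is just a conversion of the same distance, which is what switching between .km and .miles does in the geopy version.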
If you want all fleet points within 100 m of each actual point, do this:
for ai, a in df_actual.iterrows():
    actual = lonlat(a['Longitude'], a['Latitude'])
    # boolean mask: True for fleet points closer than 100 m to this actual point
    mask = df_fleet.apply(
        lambda x: distance(lonlat(x['Longitude'], x['Latitude']), actual).meters < 100,
        axis=1)
    print(f"for {(a['Longitude'], a['Latitude'])}")
    print(df_fleet[mask])
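The last solution is based on a tree search, which I think is very, very fast. I am using scipy.spatial, which finds the nearest points in space and returns the Euclidean distance. I simply convert (lat, lon) to points in x, y, z space so that the result is correct: the returned value is the straight-line distance in km, which closely approximates the geodesic or haversine distance for nearby points. Here I generate two (latitude, longitude) DataFrames of 15000 and 10000 rows, and for each point of df1 I search for the five nearest points in df2.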
from random import uniform
from math import radians, sin, cos
from scipy.spatial import cKDTree
import pandas as pd
import numpy as np

def to_cartesian(lat, lon):
    # convert latitude/longitude in degrees to x, y, z on a sphere of radius R (km)
    lat = radians(lat); lon = radians(lon)
    R = 6371
    x = R * cos(lat) * cos(lon)
    y = R * cos(lat) * sin(lon)
    z = R * sin(lat)
    return x, y, z

def newpoint():
    # random point: first value in [23, 24], second value in [66, 67]
    return uniform(23, 24), uniform(66, 67)
def ckdnearest(gdA, gdB, bcol):
    nA = np.array(list(zip(gdA.x, gdA.y, gdA.z)))
    nB = np.array(list(zip(gdB.x, gdB.y, gdB.z)))
    btree = cKDTree(nB)
    # search the 5 (k=5) nearest points of df2 for each point of df1
    dist, idx = btree.query(nA, k=5)
    # split the (n, 5) arrays into one array per row so each fits in a DataFrame cell
    dist = [d for d in dist]
    idx = [s for s in idx]
    df = pd.DataFrame.from_dict({'distance': dist,
                                 'index of df2': idx})
    return df
# create the first df (actual)
n = 15000
lon, lat = [], []
for x, y in (newpoint() for _ in range(n)):
    lon += [x]; lat += [y]
df1 = pd.DataFrame({'lat': lat, 'lon': lon})
df1['x'], df1['y'], df1['z'] = zip(*map(to_cartesian, df1.lat, df1.lon))
#-----------------------
# create the second df (fleet)
n = 10000
lon, lat = [], []
for x, y in (newpoint() for _ in range(n)):
    lon += [x]; lat += [y]
df2 = pd.DataFrame({'lat': lat, 'lon': lon})
df2['x'], df2['y'], df2['z'] = zip(*map(to_cartesian, df2.lat, df2.lon))
#-----------------------
df = ckdnearest(df1, df2, 'unused')
print(df)
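The distances returned by the tree are straight-line (chord) distances through the sphere, in km, because the points were converted with R = 6371. For points a few kilometres apart the chord and the arc along the surface are nearly identical, but if surface distances are wanted the chord can be converted. A minimal sketch, assuming a spherical Earth (the helper name chord_to_arc_km is hypothetical):

```python
from math import asin, sqrt

R = 6371  # same sphere radius (km) as in to_cartesian

def chord_to_arc_km(chord_km, radius_km=R):
    """Convert a straight-line chord length to the arc length on the sphere."""
    # min() guards against a ratio slightly above 1 caused by rounding
    return 2 * radius_km * asin(min(1.0, chord_km / (2 * radius_km)))

short = chord_to_arc_km(10.0)            # nearby points: almost no difference
quarter = chord_to_arc_km(R * sqrt(2))   # points 90 degrees apart on the sphere
print(short, quarter)
```

Applied to the 'distance' column of the result, this would turn each tree distance into a great-circle distance on the assumed sphere.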
If you only want the single nearest point, without Cartesian coordinates:
def ckdnearest(gdA, gdB, bcol):
    nA = np.array(list(zip(gdA.lat, gdA.lon)))
    nB = np.array(list(zip(gdB.lat, gdB.lon)))
    btree = cKDTree(nB)
    # search the single nearest point of df2 for each point of df1
    dist, idx = btree.query(nA, k=1)
    df = pd.DataFrame.from_dict({'distance': dist, 'index of df2': idx})
    return df
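Note that with raw (lat, lon) pairs the tree measures Euclidean distance in degrees, not kilometres, and a degree of longitude covers less ground as latitude increases. The small check below, in plain Python, illustrates the effect. It repeats the to_cartesian conversion from the earlier solution so that it runs on its own; chord_km is a hypothetical helper name, and a spherical Earth of radius 6371 km is assumed.

```python
from math import radians, sin, cos, dist

def to_cartesian(lat, lon, R=6371):
    """Degrees to x, y, z in km on a sphere of radius R."""
    lat, lon = radians(lat), radians(lon)
    return R * cos(lat) * cos(lon), R * cos(lat) * sin(lon), R * sin(lat)

def chord_km(p, q):
    """Straight-line distance in km between two (lat, lon) points in degrees."""
    return dist(to_cartesian(*p), to_cartesian(*q))

# Both pairs are one degree of longitude apart when measured in raw degrees...
deg_equator = dist((0, 0), (0, 1))
deg_north = dist((60, 0), (60, 1))
# ...but the separation in km at 60 degrees latitude is about half as large.
km_equator = chord_km((0, 0), (0, 1))
km_north = chord_km((60, 0), (60, 1))
print(deg_equator, deg_north, km_equator, km_north)
```

For data far from the equator, the neighbour that is nearest in degrees may therefore differ from the one that is nearest on the ground, so the Cartesian version is the safer choice when real distances matter.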