How do I create a loop and a new dataset from a function I created?

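I have the following real estate data:

    neighborhood  type_property  type_negotiation  price
    Smallville    house          rent              2000
    Oakville      apartment      for sale          100000
    King Bay      house          for sale          250000
    ...

I created a function that filters this large dataset down to the houses for sale in the neighborhood you pass in, then returns the 10th and 90th percentiles of their prices. Here it is:

    def foo(string):
        # Keep only houses that are for sale in the given neighborhood
        a = df[(df.type_negotiation == 'for sale')
               & (df.type_property == 'house')
               & (df.neighborhood == string)]
        # One-row summary: 10th percentile, 90th percentile, row count
        b = pd.DataFrame([[a.price.quantile(0.1), a.price.quantile(0.9), len(a.index)]],
                         columns=('tenthpercentile', 'ninetiethpercentile', 'Quantity'))
        return b

    print(foo('King Bay'))
       tenthpercentile  ninetiethpercentile  Quantity
    0         250000.0             250000.0         1

I want to write a loop that does this for a list of neighborhoods I have and then compiles each returned value into a new DataFrame. It would look like this:

                tenthpercentile  ninetiethpercentile  Quantity
    King Bay           250000.0             250000.0         1
    Smallville          99000.0             120000.0         8
    Oakville            45000.0             160000.0         6

Thanks in advance.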


1 Answer

慕尼黑的夜晚无繁华


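When working with DataFrames, it is generally best to avoid explicit loops where you can and to use the optimized methods that pandas provides. In your case, you can eliminate the loop by using groupby with describe, passing the percentiles you want to the percentiles parameter. Then just select the columns you need and rename them appropriately: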

    new_df = (df.groupby('neighborhood')
                .describe(percentiles=[0.1, 0.9])
                ['price'][['10%', '90%', 'count']]
                .rename(columns={'count': 'Quantity',
                                 '10%': 'tenthpercentile',
                                 '90%': 'ninetiethpercentile'}))

With your sample data (where each neighborhood has only one row), this gives:

    >>> new_df
                  tenthpercentile  ninetiethpercentile  Quantity
    neighborhood
    King Bay             250000.0             250000.0       1.0
    Oakville             100000.0             100000.0       1.0
    Smallville             2000.0               2000.0       1.0
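To see where the '10%', '90%' and 'count' labels come from, here is a minimal sketch that rebuilds the three sample rows from the question and inspects what describe returns per neighborhood. It selects price before calling describe, which is a small variation on the chain above rather than the exact code from the answer.

```python
import pandas as pd

# The three sample rows shown in the question.
df = pd.DataFrame({
    "neighborhood": ["Smallville", "Oakville", "King Bay"],
    "type_property": ["house", "apartment", "house"],
    "type_negotiation": ["rent", "for sale", "for sale"],
    "price": [2000, 100000, 250000],
})

# Selecting 'price' first means describe only ever sees the numeric column.
stats = df.groupby("neighborhood")["price"].describe(percentiles=[0.1, 0.9])

# describe always adds the median ('50%') next to the requested percentiles.
print(list(stats.columns))

# 'count' is returned as a float; cast it if whole numbers are preferred.
quantity = stats["count"].astype(int)
print(quantity)
```

Because count comes back as a float, Quantity is displayed as 1.0 in the output above; casting that column with astype(int) is one way to get whole numbers.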

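[Edit]: I just noticed that your function only looks at rows where (df.type_negotiation == 'for sale') & (df.type_property == 'house'). To handle that, just add a loc that filters the DataFrame by those conditions first: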

    new_df = (df.loc[(df.type_negotiation == 'for sale')
                     & (df.type_property == 'house')]
                .groupby('neighborhood')
                .describe(percentiles=[0.1, 0.9])
                ['price'][['10%', '90%', 'count']]
                .rename(columns={'count': 'Quantity',
                                 '10%': 'tenthpercentile',
                                 '90%': 'ninetiethpercentile'}))
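The same filter-then-group idea can also be sketched with named aggregation, which produces the final column names directly and keeps Quantity as an integer. This is an alternative to the describe chain, not the answer's own code. The sample rows and the helper names p10 and p90 are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, made up for this sketch (not from the question).
df = pd.DataFrame({
    "neighborhood": ["Smallville", "Oakville", "King Bay", "King Bay", "Oakville"],
    "type_property": ["house", "apartment", "house", "house", "house"],
    "type_negotiation": ["rent", "for sale", "for sale", "for sale", "for sale"],
    "price": [2000, 100000, 250000, 350000, 80000],
})


def p10(prices):
    """10th percentile of a price Series (hypothetical helper)."""
    return prices.quantile(0.1)


def p90(prices):
    """90th percentile of a price Series (hypothetical helper)."""
    return prices.quantile(0.9)


# Keep only houses that are for sale, as in the question's function.
houses_for_sale = df.loc[
    (df["type_negotiation"] == "for sale") & (df["type_property"] == "house")
]

# Named aggregation: output_column=(input_column, function).
new_df = houses_for_sale.groupby("neighborhood").agg(
    tenthpercentile=("price", p10),
    ninetiethpercentile=("price", p90),
    Quantity=("price", "size"),
)
print(new_df)
```

Note that a neighborhood with no houses for sale (Smallville in this made-up data) has no rows left after the filter, so it does not appear in the grouped result at all.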

Alternatively, if you are set on using your function and a loop (not what I would recommend), you can do the following:

    pd.concat([foo(i) for i in df.neighborhood.unique()])
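As written, the concatenated rows would all carry index 0, because each call to foo returns a one-row frame. One possible way to label the rows by neighborhood, as in the layout the question asks for, is to pass keys to pd.concat and drop the inner index level. The sketch below assumes the question's foo (with columns given as a list) and uses made-up sample rows and a hypothetical neighborhoods list.

```python
import pandas as pd

# Hypothetical sample data, made up for this sketch (not from the question).
df = pd.DataFrame({
    "neighborhood": ["Smallville", "Oakville", "King Bay", "King Bay", "Oakville"],
    "type_property": ["house", "apartment", "house", "house", "house"],
    "type_negotiation": ["rent", "for sale", "for sale", "for sale", "for sale"],
    "price": [2000, 100000, 250000, 350000, 80000],
})


def foo(string):
    # Houses for sale in the given neighborhood, as in the question.
    a = df[(df.type_negotiation == "for sale")
           & (df.type_property == "house")
           & (df.neighborhood == string)]
    # One-row summary frame.
    return pd.DataFrame(
        [[a.price.quantile(0.1), a.price.quantile(0.9), len(a.index)]],
        columns=["tenthpercentile", "ninetiethpercentile", "Quantity"],
    )


# Assumed list of neighborhoods to summarize.
neighborhoods = ["King Bay", "Smallville", "Oakville"]

# keys= labels each one-row frame; droplevel(1) removes the inner 0 index.
result = pd.concat([foo(name) for name in neighborhoods], keys=neighborhoods).droplevel(1)
print(result)
```

Unlike the groupby version, the loop keeps neighborhoods that have no matching rows: they show up with NaN percentiles and a Quantity of 0.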

2021-05-25
