2 Answers
Contributor with 1799 experience points, more than 9 upvotes
Regarding the function you mentioned: once you restructure the original DataFrame with a join(), this may be closer to what you are after:
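# df is the question's long-format DataFrame ('date', 'variable', 'value'); put each item's row next to the 'ref' value for the same date
combined = df[df['variable'] != 'ref'].set_index('date').join(df[df['variable'] == 'ref'].set_index('date'), lsuffix='', rsuffix='_ref').drop('variable_ref', axis=1)

def func(series, ref_series):
    # As an example: ratio of the two window means
    return series.mean() / ref_series.mean()

# Apply func over a 12-row rolling window within each variable, looking up the matching 'value_ref' rows by date
combined.groupby('variable').rolling(12).apply(lambda x: func(x, combined.loc[x.index]['value_ref']), raw=False).drop('value_ref', axis=1)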
This example produces the following result (the NaN values are due to gaps in your sample data):
                        value
variable date
item1    2014-01-31       NaN
         2014-02-28       NaN
         2014-03-31       NaN
         2014-04-30       NaN
         2014-05-31       NaN
         2014-06-30       NaN
         2014-07-31       NaN
         2014-08-31       NaN
         2014-09-30       NaN
         2014-10-31       NaN
         2014-11-30       NaN
         2014-12-31       NaN
         2015-01-31       NaN
         2015-02-28       NaN
         2015-03-31       NaN
         2015-04-30       NaN
         2015-05-31       NaN
         2015-06-30       NaN
         2015-07-31       NaN
         2015-08-31       NaN
         2015-09-30  1.912186
         2015-10-31  1.793184
         2015-11-30  2.609254
         2015-12-31  3.009455
         2016-01-31  3.123833
         2016-02-29  2.910599
         2016-03-31  2.708132
         2016-04-30  2.497878
         2016-05-31  2.561760
         2016-06-30  2.681712
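For the specific example func used here (a ratio of window means), the same kind of result can probably be reached without a Python-level apply, by dividing two per-group rolling means. The following is only a minimal sketch under assumptions: the sample data is made up, the helper name rolling_mean_ratio is hypothetical, and merge() is swapped in for join() to keep the column names explicit. A different func would not necessarily reduce to this form.

```python
import pandas as pd

# Hypothetical sample data in the same long format: 'date', 'variable', 'value'.
dates = pd.to_datetime([
    "2014-01-31", "2014-02-28", "2014-03-31",
    "2014-04-30", "2014-05-31", "2014-06-30",
])
df = pd.DataFrame({
    "date": list(dates) * 3,
    "variable": ["item1"] * 6 + ["item2"] * 6 + ["ref"] * 6,
    "value": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0,   # item1 is always 2 * ref
              1.0, 1.0, 1.0, 1.0, 1.0, 1.0,     # item2 is constant
              1.0, 2.0, 3.0, 4.0, 5.0, 6.0],    # ref
})


def rolling_mean_ratio(frame, window):
    """Rolling mean of each item's value divided by the rolling mean of 'ref'.

    Hypothetical helper: assumes one 'ref' row per date.
    """
    items = frame[frame["variable"] != "ref"]
    ref = (
        frame.loc[frame["variable"] == "ref", ["date", "value"]]
        .rename(columns={"value": "value_ref"})
    )
    # Attach the reference value for the same date to every item row.
    combined = items.merge(ref, on="date", how="left").sort_values(["variable", "date"])
    grouped = combined.groupby("variable")
    # Rolling means are computed separately inside each variable group.
    num = grouped["value"].transform(lambda s: s.rolling(window).mean())
    den = grouped["value_ref"].transform(lambda s: s.rolling(window).mean())
    combined["ratio"] = num / den
    return combined.set_index(["variable", "date"])["ratio"]


ratios = rolling_mean_ratio(df, window=3)
print(ratios)
```

With this made-up data, the first two rows of each item are NaN because the 3-row window is not yet full, and every valid item1 ratio is 2.0 because item1 is exactly twice ref.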
Contributor with 1788 experience points, more than 4 upvotes
You can do this without regular expressions or importing any modules:
text = ['Hello @handle1', '@handle2 @handle3 hello', 'words3', '@handle4']
handles = [word[1:] for word_group in text for word in word_group.split() if word.startswith('@')]
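To make the behaviour of that comprehension easier to check, here is the same logic wrapped in a small function. This is just a sketch: the function name extract_handles is hypothetical, and the second call is an added illustration of a limitation, not something from the answer.

```python
def extract_handles(texts):
    """Return every whitespace-separated word starting with '@', minus the '@'."""
    return [
        word[1:]
        for line in texts
        for word in line.split()
        if word.startswith("@")
    ]


text = ['Hello @handle1', '@handle2 @handle3 hello', 'words3', '@handle4']
handles = extract_handles(text)
print(handles)  # ['handle1', 'handle2', 'handle3', 'handle4']

# Splitting on whitespace keeps trailing punctuation attached to the handle.
print(extract_handles(["hi @bob, ok"]))  # ['bob,']
```

Because the split is on whitespace only, punctuation stuck to a handle stays in the result; if that matters for your data, a regular expression may be the better fit after all.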