Repeat DataFrame rows based on a condition

I'm looking for a way to insert duplicate rows based on a value condition. The input dataset contains customer prices together with each price's validity period in weeks, given by 'price_start_week' and 'price_end_week'. The idea is to expand the DataFrame by adding a new column holding the actual week number and repeating each row once per valid week.

Input:

+---------------+------------------+----------------+-------------+
| customer_name | price_start_week | price_end_week | price_value |
+---------------+------------------+----------------+-------------+
| A             | 4                | 7              | 500         |
| B             | 3                | 6              | 600         |
| C             | 2                | 4              | 700         |
+---------------+------------------+----------------+-------------+

Output:

+---------------+------------------+----------------+-------------+-------------+
| customer_name | price_start_week | price_end_week | actual week | price_value |
+---------------+------------------+----------------+-------------+-------------+
| A             | 4                | 7              | 4           | 500         |
| A             | 4                | 7              | 5           | 500         |
| A             | 4                | 7              | 6           | 500         |
| A             | 4                | 7              | 7           | 500         |
| B             | 3                | 6              | 3           | 600         |
| B             | 3                | 6              | 4           | 600         |
| B             | 3                | 6              | 5           | 600         |
| B             | 3                | 6              | 6           | 600         |
| C             | 2                | 4              | 2           | 700         |
| C             | 2                | 4              | 3           | 700         |
| C             | 2                | 4              | 4           | 700         |
+---------------+------------------+----------------+-------------+-------------+

What is the best way to do this? I was thinking of using apply with a function, something like this:

    def repeat(a):
        if (a['price_start_week']>a['price_end_week']):
            return a['price_start_week']-a['price_end_week']
        ...

    df['actual_week']=df.apply(repeat, axis=0)


1 Answer

梦里花落0921


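Use Index.repeat with the number of weeks in each validity period (end week minus start week, plus one) to duplicate the rows, then add a per-group counter from GroupBy.cumcount to the start week: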

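# Number of weeks each price is valid for (both ends inclusive)
a = df['price_end_week'] - df['price_start_week'] + 1
# Repeat each row a[i] times, then rebuild a clean 0..n-1 index
df = df.loc[df.index.repeat(a)].reset_index(drop=True)
# Position within each customer's block (0, 1, 2, ...) added to the start week
df['actual week'] = df.groupby('customer_name').cumcount() + df['price_start_week']
print(df)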

    customer_name  price_start_week  price_end_week  price_value  actual week
0               A                 4               7          500            4
1               A                 4               7          500            5
2               A                 4               7          500            6
3               A                 4               7          500            7
4               B                 3               6          600            3
5               B                 3               6          600            4
6               B                 3               6          600            5
7               B                 3               6          600            6
8               C                 2               4          700            2
9               C                 2               4          700            3
10              C                 2               4          700            4
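
For reference, here is a self-contained sketch of the same Index.repeat idea that can be run as-is. The helper name expand_weeks and the hand-built sample frame are illustrative and not part of the answer above. It assumes the input index is unique and that price_end_week is never smaller than price_start_week. One deliberate difference: the counter is taken per original row (groupby(level=0) on the repeated index) rather than per customer_name, so it should also cope with a customer who has more than one price period.

```python
import pandas as pd


def expand_weeks(df: pd.DataFrame) -> pd.DataFrame:
    """Repeat each row once per week in [price_start_week, price_end_week].

    Hypothetical helper for illustration; assumes a unique index and
    price_end_week >= price_start_week in every row.
    """
    # Number of weeks each price is valid for, both ends inclusive.
    n_weeks = df["price_end_week"] - df["price_start_week"] + 1
    # Repeat the index labels; duplicated labels mark copies of the same source row.
    out = df.loc[df.index.repeat(n_weeks)]
    # Count 0, 1, 2, ... within each source row (index level 0).
    offset = out.groupby(level=0).cumcount().to_numpy()
    # Add the counter to the start week to get the actual week number.
    out = out.assign(**{"actual week": out["price_start_week"].to_numpy() + offset})
    return out.reset_index(drop=True)


# Sample data copied from the question.
df = pd.DataFrame({
    "customer_name": ["A", "B", "C"],
    "price_start_week": [4, 3, 2],
    "price_end_week": [7, 6, 4],
    "price_value": [500, 600, 700],
})

result = expand_weeks(df)
print(result)
```

If every customer appears in exactly one input row, as in the question, counting per customer_name and counting per original row give the same result.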

2021-04-27
