
Delete rows from a pandas df

Python

GCT1015 2021-05-09 16:21:45

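I am trying to delete rows from a pandas df, specifically when the row beneath X in Col A is empty. If the row beneath the string X in Col A is empty, I want to delete that X row and all the empty rows under it, and keep going until I reach an X that has a value beneath it.

import pandas as pd

d = {
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
}
df = pd.DataFrame(data=d)

Output:

      A    B    C
0     X  Val  Val
1          1    2
2          3    4
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2

I tried:

df = df[~(df['A'] == 'X').shift().fillna(False)]

But this deletes every row that directly follows an X. I only want to delete rows when the row beneath X is empty.

Intended:

     A    B    C
0    X  Val  Val
1  Foo    1    2
2         3    4
3    X  Val  Val
4  Fou    1    2
5         3    4
6    X  Val  Val
7  Bar    1    2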


3 Answers

慕娘9325324

Contributed 1783 experience points · received over 4 likes

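Use:

# Flag the marker rows, then number the blocks so that each block starts at an 'X'
m1 = df['A'] == 'X'
g = m1.cumsum()
# True for rows that are either a marker or empty
m = (df['A'] == '') | m1
# Drop every block that holds nothing but a marker and empty rows
df = df[~m.groupby(g).transform('all')]
print(df)

      A    B    C
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2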

Details:

m1 = df['A'] == 'X'
g = m1.cumsum()
m = (df['A'] == '') | m1

# Show every intermediate step side by side (one object per key)
print(pd.concat([df,
                 df['A'] == 'X',
                 m1.cumsum(),
                 (df['A'] == ''),
                 m,
                 m.groupby(g).transform('all'),
                 ~m.groupby(g).transform('all')], axis=1,
                keys=['orig', '==X', 'g', '==space', 'm', 'all', 'inverted all']))

   orig               ==X  g  ==space      m    all  inverted all
      A    B    C      A  A        A      A      A             A
0     X  Val  Val   True  1    False   True   True         False
1          1    2  False  1     True   True   True         False
2          3    4  False  1     True   True   True         False
3     X  Val  Val   True  2    False   True  False          True
4   Foo    1    2  False  2    False  False  False          True
5          3    4  False  2     True   True  False          True
6     X  Val  Val   True  3    False   True  False          True
7   Fou    1    2  False  3    False  False  False          True
8          3    4  False  3     True   True  False          True
9     X  Val  Val   True  4    False   True  False          True
10  Bar    1    2  False  4    False  False  False          True

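Explanation:

Compare with X and take the cumulative sum to label the groups, each group starting at an X, and store the labels in g
Chain 2 boolean masks, one comparing with X and one comparing with the empty string, into m
Use groupby with transform and DataFrameGroupBy.all to return True for every row of a group that contains only True values
Finally, invert the mask and filter with boolean indexing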
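The steps above can be wrapped in a small helper that also resets the index, so the result lines up with the "Intended" frame in the question. This is only a sketch: the function name `drop_empty_x_blocks` and its parameters are hypothetical, and it assumes that a block consisting of nothing but a marker and empty rows (including a lone trailing X) should be removed.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
})


def drop_empty_x_blocks(frame, col='A', marker='X'):
    """Hypothetical helper: drop blocks that hold only a marker and empty rows."""
    is_marker = frame[col] == marker
    # Each marker row opens a new block; cumsum gives every row its block number
    block_id = is_marker.cumsum()
    # Rows that are either the marker itself or an empty string
    marker_or_empty = is_marker | (frame[col] == '')
    # True for every row of a block in which all rows are marker-or-empty
    drop = marker_or_empty.groupby(block_id).transform('all')
    return frame[~drop].reset_index(drop=True)


result = drop_empty_x_blocks(df)
print(result)
```

Rows that appear before the first marker all share block number 0, so they would be dropped only if every one of them is empty.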

Replied 2021-05-18

三国纷争

Contributed 1804 experience points · received over 7 likes

Starting from your solution:

(df['A'] == 'X').shift()

0       NaN
1      True
2     False
3     False
4      True
5     False
6     False
7      True
8     False
9     False
10     True
Name: A, dtype: object

In [15]:

(df['A'] == '')

Out[15]:

0     False
1      True
2      True
3     False
4     False
5      True
6     False
7     False
8      True
9     False
10    False
Name: A, dtype: bool

In [14]:

((df['A'] == '') & (df['A'] == 'X').shift())

Out[14]:

0     False
1      True
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
Name: A, dtype: bool

The result is:

df[~((df['A'] == '') & (df['A'] == 'X').shift())]

Out[16]:

      A    B    C
0     X  Val  Val
2          3    4
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2

Edit: if needed, you can do this in a while loop.

old_size_df = df.size
new_size_df = 0

# Each pass drops the empty row directly below an X; stop once a pass removes nothing
while old_size_df != new_size_df:
    old_size_df = df.size
    df = df[~((df['A'] == '') & (df['A'] == 'X').shift())]
    new_size_df = df.size

      A    B    C
0     X  Val  Val
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2
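The loop above keeps filtering until nothing changes. As a rough alternative, the same rows can probably be found in a single pass by forward-filling the last non-empty value of A. This is a different technique from the answer's loop, not the answer's own method, and the names `last_non_empty` and `drop` are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
})

col = df['A']
# Blank out the empty strings, then carry the last non-empty value downwards
last_non_empty = col.where(col != '').ffill()
# An empty row is dropped when the nearest non-empty row above it is an 'X'
drop = (col == '') & (last_non_empty == 'X')
result = df[~drop]
print(result)
```

Like the loop, this keeps the X row itself and removes only the run of empty rows directly beneath it.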

Replied 2021-05-18

慕的地8271018

Contributed 1796 experience points · received over 4 likes

Here is a solution with a custom apply function:

import pandas as pd

d = {
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
}
df = pd.DataFrame(data=d)

# Remembers whether the most recent non-empty row was an 'X'
is_x = False

def fill_empty_a(row):
    global is_x
    if row['A'] == '' and is_x:
        # Empty row in a run directly below an 'X': mark it for dropna()
        row['A'] = None
    else:
        is_x = row['A'] == 'X'
    return row

(df.apply(fill_empty_a, axis=1)
   .dropna()
   .reset_index(drop=True))

#      A    B    C
# 0    X  Val  Val
# 1    X  Val  Val
# 2  Foo    1    2
# 3         3    4
# 4    X  Val  Val
# 5  Fou    1    2
# 6         3    4
# 7    X  Val  Val
# 8  Bar    1    2
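The apply-based version relies on a module-level `global` flag, and its `dropna()` call would also remove rows that have missing values in B or C. A minimal sketch of the same row-by-row logic without shared state is shown below. The helper name `rows_to_keep` is hypothetical and is not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
})


def rows_to_keep(values, marker='X'):
    """Hypothetical helper: build a keep/drop flag per row in one pass."""
    keep = []
    after_marker = False  # True while we are in the empty run below a marker
    for value in values:
        if value == '' and after_marker:
            keep.append(False)
        else:
            keep.append(True)
            after_marker = value == marker
    return keep


# Filter with the boolean list, then renumber the remaining rows
result = df[rows_to_keep(df['A'])].reset_index(drop=True)
print(result)
```

Because the flag lives inside the function, calling it twice on the same frame gives the same answer, which is not guaranteed with a global that is never reset.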

Replied 2021-05-18
