Using Python to remove words that contain a character or a string of letters from a text file

I have several lines of text and want to remove any word that contains a special character or a given fixed string (in Python). Example:

in_lines = ['this is go:od',
            'that example is bad',
            'amp is a word']

# remove any word with {'amp', ':'}

out_lines = ['this is',
             'that is bad',
             'is a word']

I know how to remove words that appear in a given list, but not words that contain a special character or a short run of letters. Let me know if anything is missing and I will add more information. This is what I use to remove selected words:

def remove_stop_words(lines):
    stop_words = ['am', 'is', 'are']
    results = []
    for text in lines:
        tmp = text.split(' ')
        for stop_word in stop_words:
            for x in range(0, len(tmp)):
                if tmp[x] == stop_word:
                    tmp[x] = ''
        results.append(" ".join(tmp))
    return results

out_lines = remove_stop_words(in_lines)

2 Answers

慕桂英4014372

Contributed 1871 experience points, received over 13 likes

in_lines = ['this is go:od',
            'that example is bad',
            'amp is a word']

def remove_words(in_list, bad_list):
    out_list = []
    for line in in_list:
        # Keep a word only if none of the "bad" phrases occurs inside it.
        words = ' '.join([word for word in line.split() if not any([phrase in word for phrase in bad_list])])
        out_list.append(words)
    return out_list

out_lines = remove_words(in_lines, ['amp', ':'])
print(out_lines)

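Strange as it may look, the expression

word for word in line.split() if not any([phrase in word for phrase in bad_list])

does all the hard work here in one go. For a single word, it builds a list of True/False values, one for each phrase in the "bad" list, recording whether that phrase occurs in the word. The any function then collapses this temporary list into a single True/False value. If that value is False, the word is safe to copy into the output for the current line.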
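To make that intermediate step visible, here is a minimal sketch that evaluates the inner list for individual words. The helper name flags_for is made up for illustration and is not part of the answer above.

```python
def flags_for(word, bad_list):
    # One True/False entry per phrase: is the phrase a substring of the word?
    return [phrase in word for phrase in bad_list]

bad_list = ['amp', ':']

print(flags_for('example', bad_list))       # [True, False]
print(flags_for('go:od', bad_list))         # [False, True]
print(flags_for('bad', bad_list))           # [False, False]

# any() reduces each list to a single decision: True means "drop the word".
print(any(flags_for('example', bad_list)))  # True
print(any(flags_for('bad', bad_list)))      # False
```

Note that the test is a substring test, which is why 'example' is dropped for the phrase 'amp'.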

For example, removing every word that contains the letter a gives the following result:

>>> remove_words(in_lines, ['a'])
['this is go:od', 'is', 'is word']

(The for line in .. loop could be folded into the comprehension as well. At that point, though, readability really starts to suffer.)
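For reference, a sketch of what that loop-free version might look like. The name remove_words_compact is hypothetical, and generator expressions are used in place of the temporary lists, which does not change the result.

```python
def remove_words_compact(in_list, bad_list):
    # Outer comprehension: one cleaned string per input line.
    # Inner generator: the words of that line that contain no bad phrase.
    return [
        ' '.join(word for word in line.split()
                 if not any(phrase in word for phrase in bad_list))
        for line in in_list
    ]

in_lines = ['this is go:od',
            'that example is bad',
            'amp is a word']

print(remove_words_compact(in_lines, ['amp', ':']))
# ['this is', 'that is bad', 'is a word']
```

Whether this is an improvement is a matter of taste; the explicit loop is arguably easier to step through in a debugger.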

2021-07-13

FFIVE

Contributed 1797 experience points, received over 6 likes

This matches your expected output:

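def remove_stop_words(lines):
    # 'amp' is the string given in the question; the shorter 'am' would
    # also remove unrelated words such as 'same'.
    stop_words = ['amp', ':']
    results = []
    for text in lines:
        tmp = text.split(' ')
        for x in range(0, len(tmp)):
            for st_w in stop_words:
                # Substring test: blank the word if it contains the stop string.
                if st_w in tmp[x]:
                    tmp[x] = ''
        # Skip the blanked entries so no leading, trailing, or doubled spaces remain.
        results.append(" ".join(word for word in tmp if word))
    return results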
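Since the title mentions a text file, the same filter can be applied line by line to an open file. The following is only a sketch under assumptions: filter_lines is a hypothetical helper, and io.StringIO stands in for a real file so that the example is self-contained.

```python
import io

def filter_lines(lines, bad_list):
    # Yield each line with the unwanted words removed, one line at a time,
    # so a large file never has to be held in memory as a whole.
    for line in lines:
        kept = [word for word in line.split()
                if not any(phrase in word for phrase in bad_list)]
        yield ' '.join(kept)

# With a file on disk, replace this with something like:
#     with open(path, encoding='utf-8') as handle:
handle = io.StringIO('this is go:od\nthat example is bad\namp is a word\n')

for cleaned in filter_lines(handle, ['amp', ':']):
    print(cleaned)
```

Because line.split() is called without an argument, the trailing newline of each line and any repeated spaces are discarded along the way.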

2021-07-13
