4 Answers
User with 1804 experience points, over 2 upvotes
There is no need to create the bad_words list explicitly, and you can also make repeater a variable:
    repeater = 3
    newlist = []
    with open('input.txt') as f:
        x = f.readlines()
    for val in x:
        word = val.split('\n')[0]  # drop the trailing newline
        flag = True
        for letter in word:
            # A run of `repeater` copies of the uppercased letter marks a bad word
            if letter.upper() * repeater in word:
                flag = False
                break
        if flag:
            newlist.append(word)
    newlist = list(set(newlist))  # removes duplicates; original order is not kept
    with open('output.txt', mode='w', encoding='utf-8') as newfile:
        for value in newlist:
            newfile.write(value + "\n")
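The run check above can also be pulled into a small helper. The following is only a sketch: `has_run` is a hypothetical name, it uses `itertools.groupby` instead of building the repeated string, and unlike the code above it is case-sensitive and flags a run of any character, not only uppercase letters.

```python
from itertools import groupby

def has_run(word, repeater=3):
    """Return True if any character appears `repeater` or more times in a row."""
    # groupby yields one group per run of identical adjacent characters
    return any(len(list(group)) >= repeater for _, group in groupby(word))

# Hypothetical sample words, used in place of input.txt
words = ["AAAB", "ABAB", "ABBBA", "AABB"]
kept = [w for w in words if not has_run(w)]
print(kept)
```

A list comprehension keeps the input order, whereas `list(set(...))` does not.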
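User with 1811 experience points, over 5 upvotes
You can create a function that checks whether any character occurs more than 3 times, and then call it in your code: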
    def letter_count(text):
        # Tally how many times each character occurs in the text
        counts = dict()
        for l in text:
            if l in counts:
                counts[l] += 1
            else:
                counts[l] = 1
        # True if the most frequent character occurs more than 3 times
        return counts[max(counts, key=lambda x: counts[x])] > 3
And call it in your code like this:
    with open('7.csv') as oldfile, open('new7.csv', 'w') as newfile:
        for line in oldfile:
            # Keep only lines in which no character occurs more than 3 times
            if not letter_count(line):
                newfile.write(line)
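To see where the limit falls, here is a compact variant of the same check run on a few made-up strings. It is a sketch, not the exact function above: it uses `dict.get` for the tally and returns False for empty text rather than raising.

```python
def letter_count(text):
    """Return True if any single character occurs more than 3 times in text."""
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    # max() on an empty dict would raise, so empty text counts as "no repeats"
    return bool(counts) and max(counts.values()) > 3

print(letter_count("ABCA"))   # A occurs twice
print(letter_count("AAAB"))   # A occurs three times, still within the limit
print(letter_count("AABAA"))  # A occurs four times, over the limit
```

Note that the occurrences do not have to be adjacent: "AABAA" is rejected even though it contains no run of four.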
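User with 1824 experience points, over 6 upvotes
You can use a Counter to check the frequency of the different characters in each line, and then write the line only if none of them reaches the threshold: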
    from collections import Counter

    threshold = 3
    with open('7.csv') as oldfile, open('new7.csv', 'w') as newfile:
        for line in oldfile:
            counts = Counter(line)
            if all(count < threshold for count in counts.values()):
                newfile.write(line)
This uses the all() function to make sure that every character's count stays below the threshold.
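One thing to watch: `Counter(line)` counts every character, including commas and the trailing newline, so a CSV row with three commas would already be dropped at a threshold of 3. If only letters should count (an assumption about the data, not something stated in the question), a sketch like this filters them first. `keep_line` and the sample rows are hypothetical.

```python
from collections import Counter

threshold = 3

def keep_line(line, threshold=threshold):
    """Return True if every letter in line occurs fewer than threshold times."""
    # Count letters only, so commas, digits and the newline are ignored
    counts = Counter(ch for ch in line if ch.isalpha())
    return all(count < threshold for count in counts.values())

# Hypothetical rows, used in place of 7.csv
sample = ["A,B,C,D\n", "A,A,B,A\n", "A,A,B,B\n"]
kept = [line for line in sample if keep_line(line)]
print(kept)
```

The first row has three commas but is kept, because only the letters are counted.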
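User with 1963 experience points, over 6 upvotes
Use a list of single characters instead of triples, together with the string method count(). Writing a small function to encapsulate the filtering logic may also be a good option.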
    def f(line, chars, limit):
        # Reject the line as soon as one character occurs more than `limit` times
        for char in chars:
            if line.count(char) > limit:
                return False
        return True

    bad_chars = ['A','B', ...]
    with open('7.csv', 'r') as oldfile, open('new7.csv', 'w') as newfile:
        for line in oldfile:
            if f(line, bad_chars, 3):
                newfile.write(line)
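If the characters to check are simply all uppercase ASCII letters (an assumption, since the list above is elided), `string.ascii_uppercase` saves typing them out. A minimal sketch, with `within_limit` as a hypothetical name for the same filter written with all():

```python
import string

def within_limit(line, chars, limit):
    """Return True if no character in chars occurs more than limit times."""
    return all(line.count(char) <= limit for char in chars)

# Assumption: every uppercase ASCII letter counts as a character to check
bad_chars = list(string.ascii_uppercase)

# Hypothetical lines, used in place of 7.csv
lines = ["ABCD\n", "AAAB\n", "AAAAB\n"]
kept = [line for line in lines if within_limit(line, bad_chars, 3)]
print(kept)
```

With a limit of 3, a line with exactly three A's is kept and one with four is dropped.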