3 Answers

Contributed 1801 experience points · received more than 8 likes
txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']
for i, x in enumerate(pattern):
    txt = txt.replace(x, REPLACEMENT[i], 1)
Interestingly, here is a timing test, since the question asks for the most efficient approach.
# Python 2 (xrange, print statement)
import re
import time

pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']

# 1) zip
t = time.time()
for z in xrange(1000000):
    txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
    for a, b in zip(pattern, REPLACEMENT):
        txt = txt.replace(a, b, 1)
print time.time() - t

# 2) enumerate
t = time.time()
for z in xrange(1000000):
    txt2 = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
    for i, x in enumerate(pattern):
        txt2 = txt2.replace(x, REPLACEMENT[i], 1)
print time.time() - t

# 3) dict built from the reversed lists (keeps the first pair for each pattern)
t = time.time()
for z in xrange(1000000):
    txt3 = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
    x = dict(zip(reversed(pattern), reversed(REPLACEMENT)))
    for k in x:
        txt3 = txt3.replace(k, x[k], 1)
print time.time() - t

# 4) re.sub with an iterator over the replacements
# Note: '\b' in a non-raw string is a backspace character, not a word boundary.
t = time.time()
for z in xrange(1000000):
    txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
    new_d = iter(REPLACEMENT)
    new_result = re.sub('\b' + '|'.join(pattern) + '\b', lambda _: next(new_d), txt)
print time.time() - t
The results (in seconds, in the order above) are:
2.57099986076
2.48500013351
3.50499987602
4.23699998856
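As you can see, enumerate came out slightly faster than zip in this run (about 2.49 s versus 2.57 s), while the other two are not in the same range.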
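One detail of the re.sub variant may be worth checking: in a regular (non-raw) Python string, '\b' is the backspace character rather than the regex word-boundary token. The following is a minimal Python 3 sketch of what happens with that expression on the sample input. The function name replace_in_order is made up for illustration and does not appear in the answer.

```python
import re

def replace_in_order(pattern, replacement, txt):
    """Replace each regex match with the next item taken from `replacement`."""
    new_d = iter(replacement)
    # Same expression as in the timing test: '\b' here is a backspace (\x08),
    # so only the first and last alternatives carry it.
    regex = '\b' + '|'.join(pattern) + '\b'
    return re.sub(regex, lambda _: next(new_d), txt)

pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']
txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'

backspace_is_not_boundary = ('\b' == '\x08')
result = replace_in_order(pattern, REPLACEMENT, txt)
print(backspace_is_not_boundary)  # True
print(result)                     # 132whyasmnopewokdsllatersdwkaha239genes
```

The substitution still works on this input because the middle alternatives (HOME, NOW, GO) contain no backspace and match on their own. Note that next(new_d) raises StopIteration if the text contains more matches than there are replacements.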

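Contributed 1873 experience points · received more than 9 likes
You can iterate over both lists at the same time, replacing only the first instance of the pattern each time: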
for a, b in zip(pattern, REPLACEMENT):
    txt = txt.replace(a, b, 1)
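To make the effect of the count argument visible, here is a small Python 3 sketch that wraps the loop above in a function and records the string after every step. The function name replace_sequentially and the steps list are illustrative additions, not part of the answer.

```python
def replace_sequentially(txt, patterns, replacements):
    """Apply each (pattern, replacement) pair once, in list order.

    Returns the final string and the intermediate string after each step.
    """
    steps = []
    for a, b in zip(patterns, replacements):
        # count=1: only the first remaining occurrence of `a` is replaced
        txt = txt.replace(a, b, 1)
        steps.append(txt)
    return txt, steps

pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']
final, steps = replace_sequentially('132GOasmHOMEwokdslNOWsdwkGO239NOW',
                                    pattern, REPLACEMENT)
for s in steps:
    print(s)
# 132whyasmHOMEwokdslNOWsdwkGO239NOW
# 132whyasmnopewokdslNOWsdwkGO239NOW
# 132whyasmnopewokdsllatersdwkGO239NOW
# 132whyasmnopewokdsllatersdwkaha239NOW
# 132whyasmnopewokdsllatersdwkaha239genes
```

Because the first GO has already been rewritten by the time the second ('GO', 'aha') pair is applied, that pair lands on the next remaining GO in the text.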

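Contributed 1850 experience points · received more than 11 likes
Using a dict reduces the number of items you need to iterate over, which can be valuable for some long inputs.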
txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']
x = dict(zip(reversed(pattern), reversed(REPLACEMENT)))
for k in x:
    txt = txt.replace(k, x[k], 1)
print(txt)
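It may help to look at what the dict actually contains and what the loop produces. The Python 3 sketch below runs the dict version next to the pairwise zip loop on the same sample input. The helper names with_dict and with_zip are made up for this comparison.

```python
def with_dict(txt, pattern, replacement):
    # Reversing both lists means the FIRST pair for each pattern is written
    # last and therefore wins; repeated patterns collapse into one entry.
    x = dict(zip(reversed(pattern), reversed(replacement)))
    for k in x:
        txt = txt.replace(k, x[k], 1)
    return txt, x

def with_zip(txt, pattern, replacement):
    for a, b in zip(pattern, replacement):
        txt = txt.replace(a, b, 1)
    return txt

txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']

dict_result, mapping = with_dict(txt, pattern, REPLACEMENT)
zip_result = with_zip(txt, pattern, REPLACEMENT)
print(mapping)      # three entries: NOW -> later, GO -> why, HOME -> nope
print(dict_result)  # 132whyasmnopewokdsllatersdwkGO239NOW
print(zip_result)   # 132whyasmnopewokdsllatersdwkaha239genes
```

On this input the dict version applies three replacements instead of five, so the second GO and the second NOW stay in the text. Whether that is acceptable depends on whether the repeated pairs ('GO', 'aha') and ('NOW', 'genes') are meant to be applied.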
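Edit: for fun, I added a benchmark to back this up, showing that reducing the number of items to iterate over can be valuable for some long inputs. When you test with a trivial data set, the most efficient method is not always obvious.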
#! /usr/bin/env python
# -*- coding: UTF8 -*-
import re
import timeit


def alpha(pattern, REPLACEMENT, txt):
    # zip over both lists
    for a, b in zip(pattern, REPLACEMENT):
        txt = txt.replace(a, b, 1)


def beta(pattern, REPLACEMENT, txt):
    # enumerate and index into REPLACEMENT
    for i, x in enumerate(pattern):
        txt = txt.replace(x, REPLACEMENT[i], 1)


def gamma(pattern, REPLACEMENT, txt):
    # dict built from the reversed lists: one entry per distinct pattern
    x = dict(zip(reversed(pattern), reversed(REPLACEMENT)))
    for k in x:
        txt = txt.replace(k, x[k], 1)


def delta(pattern, REPLACEMENT, txt):
    # re.sub with an iterator over the replacements
    new_d = iter(REPLACEMENT)
    new_result = re.sub('\b' + '|'.join(pattern) + '\b', lambda _: next(new_d), txt)


if __name__ == '__main__':
    # timeit.timeit runs each statement 1,000,000 times by default
    txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
    pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
    REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']
    print("Trivial inputs: len(pattern): {}, len(REPLACEMENT): {}, len(txt): {}".format(len(pattern), len(REPLACEMENT), len(txt)))
    print("alpha: ", timeit.timeit("alpha(pattern, REPLACEMENT, txt)", setup="from __main__ import alpha, txt, pattern, REPLACEMENT"))
    print("beta: ", timeit.timeit("beta(pattern, REPLACEMENT, txt)", setup="from __main__ import beta, txt, pattern, REPLACEMENT"))
    print("gamma: ", timeit.timeit("gamma(pattern, REPLACEMENT, txt)", setup="from __main__ import gamma, txt, pattern, REPLACEMENT"))
    print("delta: ", timeit.timeit("delta(pattern, REPLACEMENT, txt)", setup="from __main__ import delta, txt, pattern, REPLACEMENT"))
    print("")

    # Small inputs: the original data plus three more copies
    txtcopy = txt
    patterncopy = pattern.copy()
    REPLACEMENTcopy = REPLACEMENT.copy()
    for _ in range(3):
        txt = txt + txtcopy
        pattern.extend(patterncopy)
        REPLACEMENT.extend(REPLACEMENTcopy)
    print("Small inputs: len(pattern): {}, len(REPLACEMENT): {}, len(txt): {}".format(len(pattern), len(REPLACEMENT), len(txt)))
    print("alpha: ", timeit.timeit("alpha(pattern, REPLACEMENT, txt)", setup="from __main__ import alpha, txt, pattern, REPLACEMENT"))
    print("beta: ", timeit.timeit("beta(pattern, REPLACEMENT, txt)", setup="from __main__ import beta, txt, pattern, REPLACEMENT"))
    print("gamma: ", timeit.timeit("gamma(pattern, REPLACEMENT, txt)", setup="from __main__ import gamma, txt, pattern, REPLACEMENT"))
    print("delta: ", timeit.timeit("delta(pattern, REPLACEMENT, txt)", setup="from __main__ import delta, txt, pattern, REPLACEMENT"))
    print("")

    # Larger inputs: the original data plus 300 more copies
    txt = txtcopy
    pattern = patterncopy.copy()
    REPLACEMENT = REPLACEMENTcopy.copy()
    for _ in range(300):
        txt = txt + txtcopy
        pattern.extend(patterncopy)
        REPLACEMENT.extend(REPLACEMENTcopy)
    print("Larger inputs: len(pattern): {}, len(REPLACEMENT): {}, len(txt): {}".format(len(pattern), len(REPLACEMENT), len(txt)))
    print("alpha: ", timeit.timeit("alpha(pattern, REPLACEMENT, txt)", setup="from __main__ import alpha, txt, pattern, REPLACEMENT"))
    print("beta: ", timeit.timeit("beta(pattern, REPLACEMENT, txt)", setup="from __main__ import beta, txt, pattern, REPLACEMENT"))
    print("gamma: ", timeit.timeit("gamma(pattern, REPLACEMENT, txt)", setup="from __main__ import gamma, txt, pattern, REPLACEMENT"))
    print("delta: ", timeit.timeit("delta(pattern, REPLACEMENT, txt)", setup="from __main__ import delta, txt, pattern, REPLACEMENT"))
Results:
Trivial inputs: len(pattern): 5, len(REPLACEMENT): 5, len(txt): 33
alpha: 4.60048107800003
beta: 4.169088881999869
gamma: 5.7612637450001785
delta: 11.371387353000046
Small inputs: len(pattern): 20, len(REPLACEMENT): 20, len(txt): 132
alpha: 17.281149661999734
beta: 15.131949634000193
gamma: 7.339897444000144
delta: 26.50896787900001
Larger inputs: len(pattern): 1505, len(REPLACEMENT): 1505, len(txt): 9933
alpha: 18766.660852467998
beta: 17640.960064803
gamma: 64.01868645999639
delta: 901.3577002189995
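So for trivial inputs, the enumerate solution (beta) is a bit faster than zip (alpha), and much faster than the iter-based re.sub version (delta). As the input length increases slightly, the cost of not removing duplicates starts to show, and my solution (gamma) runs in less than half the time. When running long inputs with many duplicates, @eatmeimadanish's solution takes about 27555% of the time needed when duplicates are removed, roughly 275 times as long. Ouch.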
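If the repeated pairs have to be applied rather than dropped, one possible way to avoid rescanning the text for every pair is a single re.sub pass that keeps a queue of replacements per pattern. This is only a sketch in Python 3 and is not taken from any of the answers: the name replace_single_pass is hypothetical, it assumes the patterns are literal strings that do not overlap and that no replacement creates a new occurrence of a pattern, and it has not been benchmarked against the functions above.

```python
import re
from collections import defaultdict, deque

def replace_sequentially(txt, patterns, replacements):
    """Reference behaviour: one str.replace(..., 1) per pair, in order."""
    for a, b in zip(patterns, replacements):
        txt = txt.replace(a, b, 1)
    return txt

def replace_single_pass(txt, patterns, replacements):
    """Hypothetical single-pass variant (see the assumptions in the text)."""
    # Queue the replacements for each distinct pattern in list order.
    queues = defaultdict(deque)
    for p, r in zip(patterns, replacements):
        queues[p].append(r)
    # Escape the literals; try longer patterns first so a pattern that is a
    # prefix of another one does not shadow it.
    alternatives = sorted(queues, key=len, reverse=True)
    regex = re.compile('|'.join(re.escape(p) for p in alternatives))

    def pick(match):
        q = queues[match.group(0)]
        # Once a pattern's queue is empty, leave further occurrences alone.
        return q.popleft() if q else match.group(0)

    return regex.sub(pick, txt)

txt = '132GOasmHOMEwokdslNOWsdwkGO239NOW'
pattern = ['GO', 'HOME', 'NOW', 'GO', 'NOW']
REPLACEMENT = ['why', 'nope', 'later', 'aha', 'genes']

single = replace_single_pass(txt, pattern, REPLACEMENT)
same_on_repeated_input = (
    replace_single_pass(txt * 4, pattern * 4, REPLACEMENT * 4)
    == replace_sequentially(txt * 4, pattern * 4, REPLACEMENT * 4)
)
print(single)                  # 132whyasmnopewokdsllatersdwkaha239genes
print(same_on_repeated_input)  # True
```

Under the stated assumptions the k-th pair for a given pattern ends up on the k-th occurrence of that pattern, which is what the sequential loop does as well. Inputs that break those assumptions can produce different results, so comparing outputs on real data before relying on it would be prudent.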