1 Answer
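By reducing the number of indexing operations, you can save roughly 30% of the time (depending on the data), but given the huge number of combinations you are generating, I don't see how to make it much faster: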
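from collections import defaultdict
from itertools import product, repeat

d = defaultdict(lambda: defaultdict(int))
for (events, features), count in help_d.items():
    counts = d[events]  # look up the inner dict once per entry
    # each feature is either kept or replaced by the wildcard ''
    for combo in product(*zip(features, repeat(''))):
        counts[combo] += count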
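To make the inner loop concrete, here is a small sketch of what `product(*zip(features, repeat('')))` generates for a single feature tuple. The tuple used here is a made-up example, not data from the question.

```python
from itertools import product, repeat

# Hypothetical feature tuple, chosen only for illustration.
features = ('Feature A', 'Feature B')

# zip pairs each feature with the wildcard '', giving
# ('Feature A', '') and ('Feature B', ''); product then picks
# one element from each pair.
combos = list(product(*zip(features, repeat(''))))
print(combos)
# [('Feature A', 'Feature B'), ('Feature A', ''), ('', 'Feature B'), ('', '')]
```

A tuple of n features therefore expands into 2**n patterns, which is why the total number of combinations grows so quickly.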
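However, depending on how you use the dictionary afterwards, it may be more efficient to generate counts only when they are needed. You can do this by writing a class or function that computes the count for a given event pair and feature combination on demand.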
from collections import defaultdict

help_events = defaultdict(list)  # list of feature patterns for each event pair
for (event, features), count in help_d.items():
    help_events[event].append(features)

help_cache = dict()  # cached results

def getFeatureCount(events, pattern):
    # check cache first
    if (events, pattern) in help_cache:
        return help_cache[(events, pattern)]
    # compute total of matching feature patterns ('' matches any feature)
    result = 0
    for eventFeatures in help_events[events]:
        if all(e == f or f == "" for e, f in zip(eventFeatures, pattern)):
            result += help_d[(events, eventFeatures)]
    # save to cache and return result
    help_cache[(events, pattern)] = result
    return result
Usage:
getFeatureCount(('Event 1', 'Event 2'), ('Feature A', ''))  # --> 30
# which is equivalent to d[(('Event 1', 'Event 2'),('Feature A', ''))]