2 Answers

Contributor: 1836 experience points, 4+ upvotes
For some reason, people often ask how to do this without defaultdict:
>>> text= "I say what I mean. I mean what I say. i do."
>>> sentences = text.lower().split('.')
>>> dic = {}
>>> for i, sen in enumerate(sentences):
...     for word in sen.split():
...         if word not in dic:      # you just need these
...             dic[word] = set()    # two extra lines
...         dic[word].add(i)
...
>>> dic
{'i': set([0, 1, 2]), 'do': set([2]), 'say': set([0, 1]), 'what': set([0, 1]), 'mean': set([0, 1])}
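For anyone on Python 3, here is a minimal sketch of the same idea as a script rather than a REPL session. The function name `build_index` is my own and not from the answer. Note that Python 3 prints sets as `{0, 1, 2}` instead of `set([0, 1, 2])`.

```python
def build_index(text):
    """Map each lower-cased word to the set of sentence indices containing it."""
    index = {}
    for i, sentence in enumerate(text.lower().split('.')):
        for word in sentence.split():
            if word not in index:   # first sighting: start with an empty set
                index[word] = set()
            index[word].add(i)      # adding an index twice has no effect on a set
    return index


index = build_index("I say what I mean. I mean what I say. i do.")
print(index)
```

The trailing empty string produced by `split('.')` is harmless here, because `''.split()` returns an empty list and the inner loop simply does not run.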
If you really do want lists, you can get them with the following modification:
>>> text= "I say what I mean. I mean what I say. i do."
>>> sentences = text.lower().split('.')
>>> dic = {}
>>> for i, sen in enumerate(sentences):
...     for word in sen.split():
...         if word not in dic:
...             dic[word] = [i]
...         elif dic[word][-1] != i:  # this prevents duplicate entries
...             dic[word].append(i)
...
>>> dic
{'i': [0, 1, 2], 'do': [2], 'say': [0, 1], 'what': [0, 1], 'mean': [0, 1]}
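Why is comparing against only the last element enough? The sentence index never decreases, so if `i` is already in a word's list it has to be the final entry. The sketch below (Python 3, with helper names `build_list_index` and `set_index` that I made up for illustration) cross-checks the list version against a set-based one.

```python
def build_list_index(text):
    """Map each word to an ascending list of sentence indices, without duplicates."""
    index = {}
    for i, sentence in enumerate(text.lower().split('.')):
        for word in sentence.split():
            if word not in index:
                index[word] = [i]
            elif index[word][-1] != i:  # i only grows, so a duplicate can only be last
                index[word].append(i)
    return index


text = "I say what I mean. I mean what I say. i do."
list_index = build_list_index(text)

# Build the same index with sets, then compare after sorting each set.
set_index = {}
for i, sentence in enumerate(text.lower().split('.')):
    for word in sentence.split():
        set_index.setdefault(word, set()).add(i)

assert list_index == {word: sorted(ids) for word, ids in set_index.items()}
print(list_index)
```

This shortcut would stop working if the sentences were visited out of order. In that case a set, or an explicit `i not in index[word]` check, would be needed.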
If you are not even allowed to use enumerate:
>>> text= "I say what I mean. I mean what I say. i do."
>>> sentences = text.lower().split('.')
>>> dic = {}
>>> i = -1
>>> for sen in sentences:
...     i += 1
...     for word in sen.split():
...         if word not in dic:
...             dic[word] = [i]
...         elif dic[word][-1] != i:  # this prevents duplicate entries
...             dic[word].append(i)
...
>>> dic
{'i': [0, 1, 2], 'do': [2], 'say': [0, 1], 'what': [0, 1], 'mean': [0, 1]}

Contributor: 1853 experience points, 9+ upvotes
You can use collections.defaultdict here:
>>> from collections import defaultdict
>>> text = "I say what I mean. I mean what I say. i do."
>>> # convert the text to lower-case and split at '.' to get the sentences
>>> sentences = text.lower().split('.')
>>> dic = defaultdict(set)  # sets contain only unique items
>>> for i, sen in enumerate(sentences):  # use enumerate to get the sentence as well as its index
...     for word in sen.split():  # split the sentence at whitespace to get words
...         dic[word].add(i)
...
>>> dic
defaultdict(<type 'set'>,
{'i': set([0, 1, 2]),
'do': set([2]),
'say': set([0, 1]),
'what': set([0, 1]),
'mean': set([0, 1])})
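One thing to watch with defaultdict: subscripting a key that is missing silently inserts it. The Python 3 sketch below shows that behaviour and two ways around it. The variable names and the probe words `'hello'` and `'absent'` are only for illustration.

```python
from collections import defaultdict

text = "I say what I mean. I mean what I say. i do."
dic = defaultdict(set)
for i, sentence in enumerate(text.lower().split('.')):
    for word in sentence.split():
        dic[word].add(i)

before_lookup = 'hello' in dic      # False: a membership test never creates keys
missing = dic['hello']              # subscripting inserts 'hello' -> set()
after_lookup = 'hello' in dic       # True: the key now exists with an empty set

safe = dic.get('absent', set())     # .get() returns the fallback without inserting

del dic['hello']                    # drop the accidental entry
frozen = dict(dic)                  # plain dict: a missing key now raises KeyError
print(frozen)
```

Converting with `dict(dic)` once the index is built is a reasonable choice if later code should fail loudly on unknown words instead of quietly growing the dictionary.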
Using a plain dict:
>>> dic = {}
>>> for i, sen in enumerate(sentences):
...     for word in sen.split():
...         dic.setdefault(word, set()).add(i)
...
>>> dic
{'i': set([0, 1, 2]),
'do': set([2]),
'say': set([0, 1]),
'what': set([0, 1]),
'mean': set([0, 1])}
Without enumerate:
>>> dic = {}
>>> index = 0
>>> for sen in sentences:
...     for word in sen.split():
...         dic.setdefault(word, set()).add(index)
...     index += 1
...
>>> dic
{'i': set([0, 1, 2]), 'do': set([2]), 'say': set([0, 1]), 'what': set([0, 1]), 'mean': set([0, 1])}