3 Answers
Contributor with 1852 experience points, received 1+ upvotes
OK, I see now. Your df['Text'] column consists of lists of words, so you can do this:
import pandas as pd

full_list = []  # list containing all words of all texts
for elmnt in df['Text']:  # loop over lists in df
    full_list += elmnt  # append elements of lists to full list
val_counts = pd.Series(full_list).value_counts()  # make temporary Series to count
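This solution avoids heavy use of list comprehensions, which keeps the code easy to read and understand. It also needs no extra modules such as re or collections.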
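To see the loop in action, here is a minimal, self-contained sketch. The sample DataFrame and its words are made up for illustration, and it assumes every cell of df['Text'] already holds a Python list (not a string and not NaN).

```python
import pandas as pd

# Hypothetical sample data: each cell of 'Text' is already a list of words.
df = pd.DataFrame(
    {"Text": [["apple", "banana"], ["apple", "cherry"], ["apple", "banana"]]}
)

full_list = []  # all words from all rows
for words in df["Text"]:
    full_list += words  # extend the flat list with this row's words

# value_counts() sorts by frequency, most common first
val_counts = pd.Series(full_list).value_counts()
print(val_counts.head(2))  # the two most frequent words
```

If the column can contain missing values, the loop would fail on them, so they would need to be dropped first (for example with df['Text'].dropna()).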
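Contributor with 1784 experience points, received 9+ upvotes
Here is my version: I convert the column values to a list, flatten it into a single list of words while cleaning out the missing values, and then you have your counter: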
import ast
import collections

your_text_list = df['Text'].tolist()
# drop missing values (NaN)
your_text_list_nan_rm = [x for x in your_text_list if str(x) != 'nan']
# parse each string such as "['a', 'b']" into a list and flatten
flat_list = [inner for item in your_text_list_nan_rm for inner in ast.literal_eval(item)]
counter = collections.Counter(flat_list)
top_words = counter.most_common(100)
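A small runnable sketch of the same steps, under the assumption that the cells store lists as strings (which is why ast.literal_eval is needed) and that missing cells are NaN. The plain list below is a made-up stand-in for df['Text'].tolist(), so the example runs without pandas.

```python
import ast
import collections

# Hypothetical stand-in for df['Text'].tolist():
# lists stored as strings, plus one missing value (NaN).
your_text_list = ['["apple", "banana"]', float("nan"), '["apple", "cherry"]']

# Drop NaN entries; str(float("nan")) is 'nan'.
your_text_list_nan_rm = [x for x in your_text_list if str(x) != "nan"]

# Parse each string into a real list and flatten into one list of words.
flat_list = [
    inner
    for item in your_text_list_nan_rm
    for inner in ast.literal_eval(item)
]

counter = collections.Counter(flat_list)
top_words = counter.most_common(1)  # list of (word, count) pairs
print(top_words)
```

Note that ast.literal_eval only accepts strings (or AST nodes), so if the column already holds real lists, as in the first answer, that call should be dropped and the items iterated directly.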