How to print the frequency of each unique word in a string using a for loop in Python

Python

Helenr 2021-07-01 14:01:11

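The paragraph is meant to contain spaces and random punctuation, which I removed by calling .replace inside a for loop. I then turned the paragraph into a list with .split() to get ['the', 'title', 'etc']. Next I wrote two functions. count_words counts a given word, but I did not want it to count every occurrence separately, so I wrote another function that builds a list of unique words. What I still need is a for loop that prints each word together with the number of times it was said, with output like this:

The word The appears 2 times in the paragraph.
The word titled appears 1 times in the paragraph.
The word track appears 1 times in the paragraph.

I also have a hard time understanding the nature of for loops. I read that we should only use for loops for counting and that while loops can be used for anything else, but while loops can also be used for counting.

paragraph = """ The titled track “Heart Attack” does not interpret the feelings of being in love in a serious way, but with Chuu’s own adorable emoticon like ways. The music video has references to historical and fictional figures such as the artist Rene Magritte!!.... """

# remove commas, exclamation marks, periods and spaces (as posted)
for r in ((",", ""), ("!", ""), (".", ""), (" ", "")):
    paragraph = paragraph.replace(*r)

paragraph_list = paragraph.split()

def count_words(word, word_list):
    # count how many entries of word_list equal word
    word_count = 0
    for i in range(len(word_list)):
        if word_list[i] == word:
            word_count += 1
    return word_count

def unique(word):
    # keep the first occurrence of each entry, in order
    result = []
    for f in word:
        if f not in result:
            result.append(f)
    return result

unique_list = unique(paragraph_list)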
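A minimal sketch of the missing loop, assuming the words are already in a list: iterate over the unique words and call the counting helper for each one. `count_words` and `unique` follow the question's definitions but iterate over items directly. `count_words_while`, `frequency_lines` and the short `sample` input are hypothetical names introduced only for illustration.

```python
def count_words(word, word_list):
    """Count how many times `word` occurs in `word_list`."""
    word_count = 0
    for candidate in word_list:      # a for loop visits each item in turn
        if candidate == word:
            word_count += 1
    return word_count


def count_words_while(word, word_list):
    """Same count with a while loop: the index is managed by hand."""
    word_count = 0
    i = 0
    while i < len(word_list):
        if word_list[i] == word:
            word_count += 1
        i += 1
    return word_count


def unique(words):
    """Return the words in first-seen order, without duplicates."""
    result = []
    for w in words:
        if w not in result:
            result.append(w)
    return result


def frequency_lines(word_list):
    """Build one output line per unique word (hypothetical helper)."""
    lines = []
    for word in unique(word_list):
        n = count_words(word, word_list)
        lines.append(f"The word {word} appears {n} times in the paragraph.")
    return lines


# Illustrative input, not the question's full paragraph
sample = "The titled track The".split()
for line in frequency_lines(sample):
    print(line)
```

Both loop kinds can count. A for loop is the natural fit when walking over a collection that already exists, while a while loop suits cases where the stopping point depends on a condition checked on each pass.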


2 Answers

慕侠2389804

Contributed 1719 experience points, received over 6 likes

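Note that your example text is simple, but punctuation rules can be complex, and they are not always followed. What about text containing 2 adjacent spaces (yes, it is incorrect, but it is frequent)? What if the author is more used to French and writes spaces before and after colons or semicolons?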
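The whitespace cases can be sketched as follows. The sample string is made up for illustration. It shows that `split(" ")` produces empty strings for doubled spaces, that `split()` with no argument does not, and that punctuation set off by spaces survives as separate tokens unless it is filtered out.

```python
text = "Hello  world : this is  fine ; really"

# split(" ") keeps an empty string for every extra space
by_space = text.split(" ")

# split() with no argument treats any run of whitespace as one separator
by_whitespace = text.split()

# Punctuation surrounded by spaces survives as stand-alone tokens,
# so drop tokens that contain no letter or digit at all
words = [tok for tok in by_whitespace if any(ch.isalnum() for ch in tok)]

print(by_space)
print(by_whitespace)
print(words)
```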

I think the 's construct needs special handling. What about: """John has a bicycle. Mary says that her one is nicer than John's.""" IMHO, the word John appears twice here, whereas your algorithm would see 1 John and 1 Johns.
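A small sketch of the difference, assuming a regular expression is an acceptable way to drop the possessive. The pattern `(\w)'s\b` is one possible choice, not the only one, and it also shortens contractions such as "it's" to "it".

```python
import re
from collections import Counter

text = "John has a bicycle. Mary says that her one is nicer than John's."

# Naive approach: delete punctuation characters, so "John's" becomes "Johns"
naive = re.sub(r"[.,!']", "", text).split()

# Strip a trailing 's from each word first, then delete the remaining punctuation
no_possessive = re.sub(r"(\w)'s\b", r"\1", text)
careful = re.sub(r"[.,!']", "", no_possessive).split()

naive_counts = Counter(naive)
careful_counts = Counter(careful)

print(naive_counts["John"], naive_counts["Johns"])  # 1 1
print(careful_counts["John"])                       # 2
```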

In addition, since Unicode text is now common on web pages, you should be prepared to encounter high-code equivalents of spaces and punctuation:

“ U+201C LEFT DOUBLE QUOTATION MARK

” U+201D RIGHT DOUBLE QUOTATION MARK

’ U+2019 RIGHT SINGLE QUOTATION MARK

‘ U+2018 LEFT SINGLE QUOTATION MARK

U+00A0 NO-BREAK SPACE
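As a rough illustration of why these characters matter: `string.punctuation` lists ASCII punctuation only, so curly quotes slip through it. One alternative, sketched below, is to test each character's Unicode category with `unicodedata.category`. The helper name `strip_punctuation` and the sample string are hypothetical.

```python
import string
import unicodedata


def strip_punctuation(text):
    """Drop every character whose Unicode category starts with 'P'."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


sample = "“Heart Attack” isn’t\u00a0plain ASCII!"

# string.punctuation only lists ASCII punctuation
print("“" in string.punctuation)  # False

cleaned = strip_punctuation(sample)

# split() with no argument also splits on the no-break space
print(cleaned.split())
```

Note that this removes apostrophes inside words as well, and that symbols such as `$` or `+` fall in the Unicode symbol categories rather than punctuation, so they are left in place here.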

Moreover, according to this older question, the best way to remove punctuation is translate. The linked question uses Python 2 syntax, but in Python 3 you can do the following:

import re
import string

paragraph = paragraph.strip()  # remove leading and trailing whitespace
paragraph = paragraph.translate(str.maketrans('“”’‘\xa0', '""\'\' '))  # map high-code punctuation to ASCII
paragraph = re.sub(r"(\w)'s\b", r"\1", paragraph)  # remove possessive 's, keeping the word itself
paragraph = paragraph.translate(str.maketrans('', '', string.punctuation))  # remove punctuation
words = paragraph.split()
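Putting the cleaning steps together with the counting loop the question asks for might look like the sketch below. It uses `re.sub` for the possessive and empty strings as the first two arguments of `str.maketrans`. The function name `word_frequencies` and the shortened sample paragraph are assumptions made for illustration, and a dictionary replaces the separate unique-list and count helpers.

```python
import re
import string


def word_frequencies(paragraph):
    """Clean the text, then count each word with a single for loop."""
    paragraph = paragraph.strip()
    # Map curly quotes and the no-break space to ASCII equivalents
    paragraph = paragraph.translate(str.maketrans('“”’‘\xa0', '""\'\' '))
    # Drop possessive 's while keeping the word it is attached to
    paragraph = re.sub(r"(\w)'s\b", r"\1", paragraph)
    # Delete all remaining ASCII punctuation
    paragraph = paragraph.translate(str.maketrans('', '', string.punctuation))

    counts = {}
    for word in paragraph.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# Shortened, illustrative paragraph
paragraph = "The titled track “Heart Attack” shows Chuu’s own ways. The video!!...."
for word, n in word_frequencies(paragraph).items():
    print(f"The word {word} appears {n} times in the paragraph.")
```

Because dictionaries preserve insertion order, the words are printed in the order they first appear. The count is case-sensitive, so "The" and "the" would be reported separately unless the text is lowercased first.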

2021-07-13
