
Counting a list of specific words in a text file - Python

人到中年有点甜 2022-06-14 17:05:01
I want to count the occurrences of specific words (conjunctions: "also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of", "or", "since", "that", "though", "until", "when", "whenever", "whereas", "which", "while", "yet") as well as punctuation marks in a txt file.

This is what I did:

def count(fname, words_list):
    if fname:
        try:
            file = open(str(fname), 'r')
            full_text = file.readlines()
            file.close()
            count_result = dict()
            for word in words_list:
                for text in full_text:
                    if word in count_result:
                        count_result[word] = count_result[word] + text.count(word)
                    else:
                        count_result[word] = text.count(word)
            return count_result
        except:
            print('Something really bad just happened!')

print(count('sample2.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of",
                            "or", "since", "that", "though", "until", "when", "whenever", "whereas",
                            "which", "while", "yet", ",", ";", "-", "'"]))

But what it does is count "was" as "as". How can I fix this, or is there another way to achieve it? Thanks.

The expected output is something like:

{'also': 0, 'although': 0, 'and': 27, 'as': 2, 'because': 0, 'before': 2, 'but': 4, 'for': 2, 'if': 2, 'nor': 0, 'of': 13, 'or': 2, 'since': 0, 'that': 10, 'though': 2, 'until': 0, 'when': 3, 'whenever': 0, 'whereas': 0, 'which': 0, 'while': 0, 'yet': 0, ',': 41, ';': 3, '-': 1, "'": 17, 'words_per_sentence': 25.4286, 'sentences_per_par': 1.75}
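The overcount comes from str.count, which counts substring matches rather than whole words. A minimal illustration follows; the sample sentence is made up for this sketch and is not from the asker's file:

```python
# Made-up sample sentence, used only to illustrate the behavior.
text = "It was as easy as it has been."

# str.count matches substrings, so "was", "easy" and "has" each add one hit
# on top of the two standalone occurrences of "as".
substring_hits = text.count("as")

# Comparing whole tokens (after stripping edge punctuation) counts only the word itself.
whole_word_hits = sum(1 for w in text.lower().split() if w.strip(".,;:!?") == "as")

print(substring_hits, whole_word_hits)  # 5 2
```

Any fix therefore needs to compare complete tokens instead of searching inside the raw line.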

2 Answers

料青山看我应如是


import string


def word_count(fname, word_list):
    # Start every target word at zero so absent words still appear in the result.
    count_w = dict.fromkeys(word_list, 0)
    with open(fname) as input_text:
        text = input_text.read()
    # Compare whole lowercase tokens, so "was" is never counted as "as".
    for word in text.lower().split():
        # Strip surrounding punctuation, e.g. "and," becomes "and".
        _word = word.strip(string.punctuation)
        if _word in count_w:
            count_w[_word] += 1
    return count_w


def punctuation_count(fname, punctuation):
    count_p = dict.fromkeys(punctuation, 0)
    with open(fname) as input_text:
        # Punctuation marks are single characters, so scan character by character.
        for c in input_text.read():
            if c in count_p:
                count_p[c] += 1
    return count_p


print(word_count('c_prog.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of", "or", "since", "that",
                                "though", "until", "when", "whenever", "whereas", "which", "while", "yet"]))

print(punctuation_count('c_prog.txt', [",", ";", "-", "'"]))

If you want to do this in a single function:


import string


def word_count(fname, word_list, punctuation):
    # Start every word and punctuation mark at zero so absent items still appear.
    count_w = dict.fromkeys(word_list, 0)
    count_p = dict.fromkeys(punctuation, 0)

    with open(fname) as input_text:
        text = input_text.read()

    # Words: compare whole lowercase tokens with edge punctuation stripped.
    for word in text.lower().split():
        _word = word.strip(string.punctuation)
        if _word in count_w:
            count_w[_word] += 1

    # Punctuation: scan the raw text character by character.
    for c in text:
        if c in count_p:
            count_p[c] += 1

    # Merge both tallies into one dictionary.
    count_w.update(count_p)
    return count_w


print(word_count('c_prog.txt', ["also", "although", "and", "as", "because", "before", "but", "for", "if", "nor", "of", "or", "since", "that",
                                "though", "until", "when", "whenever", "whereas", "which", "while", "yet"], [",", ";", "-", "'"]))



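The expected output in the question also lists words_per_sentence and sentences_per_par, which the functions above do not compute. A rough sketch of one way to get them, under two assumptions that the question does not confirm: paragraphs are separated by blank lines, and a sentence ends at '.', '!' or '?'. The helper name text_stats is hypothetical.

```python
import re


def text_stats(text):
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Assumption: a sentence ends at '.', '!' or '?'.
    sentences = [s for p in paragraphs for s in re.split(r"[.!?]+", p) if s.strip()]
    # Words are whitespace-separated tokens within each sentence.
    words = [w for s in sentences for w in s.split()]
    return {
        "words_per_sentence": round(len(words) / len(sentences), 4) if sentences else 0.0,
        "sentences_per_par": round(len(sentences) / len(paragraphs), 4) if paragraphs else 0.0,
    }


sample = "One two three. Four five!\n\nSix seven eight nine?"
stats = text_stats(sample)
print(stats)  # {'words_per_sentence': 3.0, 'sentences_per_par': 1.5}
```

The result can be merged into the word counts with dict.update, as the single-function version does for punctuation. Abbreviations such as "e.g." would be split into extra sentences by this simple rule, so the numbers are only approximate.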
2022-06-14
Cats萌萌


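Python 2.7 and 3.1 introduced a special dict for exactly what you are trying to achieve: collections.Counter.

Since you haven't posted any sample input, I'd like to show you an approach you can use. Maintain a list and append each word you care about as you come across it. For example, when you reach the word "also", append it to the list.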


>>> l = []
>>> l.append("also")
>>> l
['also']

Similarly, when you encounter the word "although", the list becomes:


>>> l.append("although")
>>> l
['also', 'although']

If you encounter "also" again, append it to the list once more.

The list becomes:


['also', 'although', 'also']

Now use Counter to count the occurrences of the list elements:


>>> from collections import Counter
>>> l = ['also', 'although', 'also']
>>> result = Counter(l)
>>> l
['also', 'although', 'also']
>>> result
Counter({'also': 2, 'although': 1})
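Applied to the original question, the same idea might look like the sketch below. It is only an illustration: the helper name count_targets and the regex tokenization are assumptions, not part of this answer, and the sample sentence is made up.

```python
import re
from collections import Counter


def count_targets(text, targets):
    # Assumption: a word is a run of letters and apostrophes, compared in lowercase.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # A Counter returns 0 for missing keys, so absent target words are still reported.
    return {word: counts[word] for word in targets}


sample = "Although it was late, and as cold as ever, he also stayed."
result = count_targets(sample, ["also", "although", "and", "as", "yet"])
print(result)  # {'also': 1, 'although': 1, 'and': 1, 'as': 2, 'yet': 0}
```

Because whole tokens are counted, "was" does not contribute to "as", and there is no need to build the list by hand with repeated append calls.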


2022-06-14