
Changing a list to a string to remove characters

Python

守着星空守着你 2022-07-05 18:50:27

I have a file that I'm trying to build a word-frequency list from, but I'm running into trouble with lists and strings. I converted the file to a string so I could strip the digits out of it, but that ends up breaking the tokenization. The expected output is a word count for the file I opened, excluding numbers, but what I get is the following:

Counter({'<_io.TextIOWrapper': 1, "name='german/test/polarity/negative/neg_word_list.txt'": 1, "mode='r'": 1, "encoding='cp'>": 1})
done

Here is the code:

import re
from collections import Counter

def word_freq(file_tokens):
    global count
    for word in file_tokens:
        count = Counter(file_tokens)
    return count

f = open("german/test/polarity/negative/neg_word_list.txt")
clean = re.sub(r'[0-9]', '', str(f))
file_tokens = clean.split()
print(word_freq(file_tokens))
print("done")
f.close()


2 Answers

慕村225694

Contributed 1880 experience points, received over 4 likes

This is what ended up working, thanks to Rakesh:

import re
from collections import Counter

def word_freq(file_tokens):
    # Counter tallies every token in one pass; no loop or global is needed
    return Counter(file_tokens)

# "with" closes the file automatically, even if an error occurs
with open("german/test/polarity/negative/neg_word_list.txt") as f:
    # f.read() returns the file's contents; str(f) only describes the file object
    clean = re.sub(r'[0-9]', '', f.read())

file_tokens = clean.split()
print(word_freq(file_tokens))
print("done")
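The cleaning and counting steps can also be packaged so they are easy to check without a file on disk. This is only a minimal sketch, assuming the same whitespace tokenization as the code above; the function name `count_words` and the sample text are hypothetical and do not come from the original post.

```python
import re
from collections import Counter


def count_words(text):
    """Return word frequencies for text, with all digits removed first."""
    # Strip every digit, using the same pattern as the code in this thread.
    cleaned = re.sub(r"[0-9]", "", text)
    # split() with no arguments splits on any run of whitespace.
    return Counter(cleaned.split())


# Hypothetical sample standing in for the contents of the word list file.
sample = "schlecht 1\nfurchtbar 2\nschlecht 3\n"
counts = count_words(sample)
print(counts.most_common(1))  # [('schlecht', 2)]
```

With a real file, the string returned by `f.read()` would be passed in place of `sample`.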

Answered 2022-07-05

蝴蝶刀刀

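Contributed 1801 experience points, received over 8 likes

Reading further, I noticed that you never actually read the file; you only opened it.

If you just print the opened file object: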

f = open("german/test/polarity/negative/neg_word_list.txt")

print(f)

You'll see that it tells you what the object is, an "io.TextIOWrapper", rather than what the file contains. So you need to read it:

f_path = open("german/test/polarity/negative/neg_word_list.txt")

f = f_path.read()

f_path.close() # don't forget to do this to clear stuff

print(f)

# >>> what's really inside the file
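To see the difference between converting the file object to a string and reading it, the two can be compared side by side. This is a rough sketch that writes its own throwaway file so it runs anywhere; the file contents and the variable names `as_repr` and `as_text` are made up for illustration.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("schlecht 1\n")
    path = tmp.name

with open(path, encoding="utf-8") as fh:
    as_repr = str(fh)    # a description of the file object, not its contents
    as_text = fh.read()  # the actual text stored in the file

os.remove(path)  # clean up the throwaway file

print(as_repr)
print(as_text)
```

The first print shows the wrapper description (name, mode, encoding), which is exactly what got tokenized in the question; the second shows the file's text.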

Or another way, which does not need "close()":

# adjust the encoding to match your file
with open("german/test/polarity/negative/neg_word_list.txt", encoding="utf-8") as r:
    f = r.read()

Doing it this way gives you the contents as plain text rather than as a list, so you can instead iterate over each line:

list_of_lines = []

# adjust the encoding to match your file
with open("german/test/polarity/negative/neg_word_list.txt", encoding="utf-8") as r:
    # read each line and append it to the list
    for line in r:
        list_of_lines.append(line)
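Line-by-line iteration can also feed the word count directly, without holding the whole file in memory. The following is a sketch under the same assumptions as the question (digits removed, whitespace tokenization); `count_words_by_line` is a hypothetical helper, and `io.StringIO` stands in for an opened file.

```python
import io
import re
from collections import Counter


def count_words_by_line(lines):
    """Count words across an iterable of lines, ignoring digits."""
    counts = Counter()
    for line in lines:
        # Remove digits from this line, then add its words to the tally.
        cleaned = re.sub(r"[0-9]", "", line)
        counts.update(cleaned.split())
    return counts


# io.StringIO stands in for an opened text file here; an object returned by
# open(...) can be passed the same way, since both yield one line at a time.
fake_file = io.StringIO("gut 1\nschlecht 2\ngut 3\n")
line_counts = count_words_by_line(fake_file)
print(line_counts["gut"])  # 2
```

Passing the `r` object from a `with open(...) as r:` block in place of `fake_file` should work the same way.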

Answered 2022-07-05
