I'm watching this series, https://www.youtube.com/watch?v=wlnx-7cm4Gg&list=PL5tcWHG-UPH2zBfOz40HSzcGUPAVOOnu1, which is about mining tweets with tweepy (Python). The author saves each tweet with all of its fields (e.g. created_at, id, id_str, text) and then uses DataFrames in pandas to keep only the text. Is this approach efficient? How can I store only the "text" in the JSON file instead of all the other details?

Code:

ACCESS_TOKEN = "xxxxxxxxxxxxxxxxxxxxx"
ACCESS_TOKEN_SECRET = "xxxxxxxxxxxxxxxxxxxxxxxxx"
CONSUMER_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
CONSUMER_SECRET = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

import tweepy
import numpy as np
import pandas as pd

# import twitter_credentials

class TwitterAuthenticator():

    def authenticate_twitter_app(self):
        auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
        auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
        return auth

class TwitterStreamer():
    """
    Class for streaming and processing live tweets.
    """
    def __init__(self):
        self.twitter_authenticator = TwitterAuthenticator()

    def stream_tweets(self, fetched_tweets_filename, hash_tag):
        # This handles Twitter authentication and the connection to Twitter Streaming API
        listener = TwitterListener(fetched_tweets_filename)
        auth = self.twitter_authenticator.authenticate_twitter_app()
        # api = tweepy.API(auth)
        stream = tweepy.Stream(auth, listener)
        stream.filter(track=hash_tag)

class TwitterListener(tweepy.StreamListener):
    """
    This is a basic listener class that just prints received tweets to stdout.
    """
    def __init__(self, fetched_tweets_filename):
        self.fetched_tweets_filename = fetched_tweets_filename

    def on_data(self, data):
        try:
            print(data)
            with open(self.fetched_tweets_filename, 'a') as tf:
                tf.write(data)
            return True
        except BaseException as e:
            print("Error on_data %s" % str(e))
        return True

If the question is unclear, please leave a comment and I will try to edit it.
2 answers
噜噜哒
If you only want to save the "text" field in the JSON file, you can adjust the definition of the TwitterListener.on_data method:
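import json

# Drop-in replacement for TwitterListener.on_data that keeps only the tweet text.
def on_data(self, data):
    try:
        print(data)
        tweet = json.loads(data)
        # Not every stream message is a tweet, so skip those without a 'text' field.
        if 'text' in tweet:
            with open(self.fetched_tweets_filename, 'a') as tf:
                # One JSON object per line, so the file can be parsed line by line later.
                tf.write(json.dumps({'text': tweet['text']}) + '\n')
        return True
    except Exception as e:
        print("Error on_data %s" % str(e))
    return True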
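Fair warning: I don't have tweepy installed or set up, so I could only test a version of the code above against the JSON you posted. If you run into any errors, let me know and I'll see what I can do.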
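For anyone who wants to try the idea without Twitter credentials, here is a minimal sketch that uses only the standard library. The helper names (text_only_line, load_texts) and the sample payloads are made up for illustration; they are not part of tweepy or of the original code. It assumes each stream message arrives as a JSON string and that the output file holds one JSON object per line.

```python
import json
import os
import tempfile


def text_only_line(raw):
    """Return a one-line JSON string holding only the tweet text, or None."""
    message = json.loads(raw)
    # Messages without a 'text' field (assumed here to be non-tweets) are skipped.
    if "text" not in message:
        return None
    return json.dumps({"text": message["text"]})


def load_texts(path):
    """Read a file with one JSON object per line and return the text values."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line)["text"] for line in fh if line.strip()]


# Made-up payloads standing in for what on_data would receive.
sample_messages = [
    json.dumps({"created_at": "Mon Jan 01 00:00:00 +0000 2018",
                "id": 1, "id_str": "1", "text": "first tweet"}),
    json.dumps({"limit": {"track": 5}}),
    json.dumps({"created_at": "Mon Jan 01 00:00:01 +0000 2018",
                "id": 2, "id_str": "2", "text": "second tweet"}),
]

# Append the text-only records to a temporary file, one per line.
path = os.path.join(tempfile.mkdtemp(), "tweets.json")
with open(path, "a", encoding="utf-8") as out:
    for raw in sample_messages:
        line = text_only_line(raw)
        if line is not None:
            out.write(line + "\n")

texts = load_texts(path)
print(texts)
```

If you still want a DataFrame afterwards, the resulting list can be passed to pandas, for example pd.DataFrame({"text": texts}), or the file can be read directly with pd.read_json(path, lines=True).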