I'm trying to take the Netscape HTTP cookie file that curl spits out and convert it into a cookie jar that the Requests library can use. In my Python script I have a variable, netscapeCookieString, that looks like this:

    # Netscape HTTP Cookie File
    # https://curl.haxx.se/docs/http-cookies.html
    # This file was generated by libcurl! Edit at your own risk.
    .miami.edu TRUE / TRUE 0 PS_LASTSITE https://canelink.miami.edu/psc/PUMI2J/

Since I don't want to parse the cookie file myself, I'd like to use cookielib. Sadly, that means I have to write to disk, because cookielib.MozillaCookieJar() won't take a string as input: it has to take a file. So I'm using NamedTemporaryFile (I couldn't get SpooledTemporaryFile to work; again, I'd like to do all of this in memory if possible).

    tempCookieFile = tempfile.NamedTemporaryFile()
    # now take the contents of the cookie string and put it into this in memory file
    # that cookielib will read from. There are a couple quirks though.
    for line in netscapeCookieString.splitlines():
        # cookielib doesn't know how to handle httpOnly cookies correctly
        # so we have to do some pre-processing to make sure they make it into
        # the cookielib. Basically just removing the httpOnly prefix which is honestly
        # an abuse of the RFC in the first place. note: httpOnly actually refers to
        # cookies that javascript can't access, as in only http protocol can
        # access them, it has nothing to do with http vs https. it's purely
        # to protect against XSS a bit better. These cookies may actually end up
        # being the most critical of all cookies in a given set.
        # https://stackoverflow.com/a/53384267/2611730
        if line.startswith("#HttpOnly_"):
            # this is actually how the curl library removes the httpOnly, by doing length
            line = line[len("#HttpOnly_"):]
        tempCookieFile.write(line)
    tempCookieFile.flush()

But the problem is, this doesn't work! print tempCookieFile.read() prints an empty line. As a result, pprint.pprint(cookieJar) prints an empty cookie jar. How can I actually write to a NamedTemporaryFile?
1 Answer
小唯快跑啊
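After you write to the file, the file pointer sits just past the data you wrote (that is, at the end of the file), so a read returns an empty string: there is no more data after the end of the file. Just seek to 0 before reading.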
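>>> import tempfile
>>> tempCookieFile = tempfile.NamedTemporaryFile()
>>> # Python 2 shown; on Python 3 the default mode is binary, so write b"hey"
>>> tempCookieFile.write("hey")
>>> tempCookieFile.seek(0)  # rewind to the start of the file before reading
>>> print(tempCookieFile.read())
hey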
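Applied to the cookie file from the question, the idea might look like the sketch below. It rests on several assumptions that are not in the original post: Python 3, where cookielib is named http.cookiejar; tab-separated fields in the cookie line, as curl writes them; and a hypothetical helper name, load_cookie_jar. Two details appear to matter beyond the file pointer. First, splitlines() strips the line endings, so each line is written back with a newline. Second, an expiry of 0 looks like an already expired cookie to the jar, so the load call passes ignore_expires=True. Because MozillaCookieJar.load() reopens the file by name, the flush() is what makes the data visible to it here, while seek(0) is only needed when reading back through the same file object.

```python
import http.cookiejar  # Python 3 name for Python 2's cookielib (assumption: Python 3)
import tempfile

# Sample data modelled on the question; tab separators are an assumption.
netscape_cookie_string = (
    "# Netscape HTTP Cookie File\n"
    "# https://curl.haxx.se/docs/http-cookies.html\n"
    "# This file was generated by libcurl! Edit at your own risk.\n"
    "\n"
    ".miami.edu\tTRUE\t/\tTRUE\t0\tPS_LASTSITE\t"
    "https://canelink.miami.edu/psc/PUMI2J/\n"
)


def load_cookie_jar(cookie_text):
    """Hypothetical helper: parse Netscape-format cookie text into a jar."""
    jar = http.cookiejar.MozillaCookieJar()
    # Text mode ("w+") so that str lines can be written directly.
    with tempfile.NamedTemporaryFile(mode="w+", suffix=".txt") as tmp:
        for line in cookie_text.splitlines():
            # Strip curl's "#HttpOnly_" marker so the line is not
            # mistaken for a comment.
            if line.startswith("#HttpOnly_"):
                line = line[len("#HttpOnly_"):]
            # splitlines() removed the newline, so put it back.
            tmp.write(line + "\n")
        # Push buffered data to disk before the jar reopens the file by name.
        tmp.flush()
        # Keep session cookies and cookies whose expiry is 0.
        jar.load(tmp.name, ignore_discard=True, ignore_expires=True)
    return jar


cookie_jar = load_cookie_jar(netscape_cookie_string)
for cookie in cookie_jar:
    print(cookie.name, cookie.value)
```

The resulting jar can then be handed to Requests through the cookies argument, for example requests.get(url, cookies=cookie_jar). On Windows, reopening a NamedTemporaryFile by name while it is still open may fail, so this sketch is likely limited to Unix-like systems as written.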