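You get the Unauthorized error because the site uses a cookie to store information about your session. Specifically, the cookie is named Sdarot. I used the requests library to download and save the video.
The key point is that opening the URL with selenium works because selenium drives the same HTTP client (the browser), which already holds the cookie. When you call urllib, you are using a different HTTP client, so the server sees a brand-new request with no session attached. To get around this, you have to supply enough session information, just as the browser does; in this case that information lives in the cookie.
See how I extract the value of the Sdarot cookie and pass it to requests.get. You can also do this with urllib.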
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import requests


def load(driver, url):
    driver.get(url)  # open the page in the browser
    try:
        # wait for the episode to "load"
        # if something is wrong and the episode doesn't load after 45 seconds,
        # the function will call itself again and try to load again.
        continue_btn = WebDriverWait(driver, 45).until(
            EC.element_to_be_clickable((By.ID, "proceed"))
        )
        continue_btn.click()
    except TimeoutException:
        load(driver, url)  # retry; pass both driver and url (fixes the parameter error)


def save_video(driver, filename):
    # get the video element (find_element_by_tag_name was removed in Selenium 4)
    video_element = driver.find_element(By.TAG_NAME, "video")
    video_url = video_element.get_property('src')  # get the video url
    # iterate over all the browser cookies and keep only the one named Sdarot
    cookies = {}
    for entry in driver.get_cookies():
        if entry["name"] == 'Sdarot':
            cookies[entry["name"]] = entry["value"]
    # send the request with the session cookie attached
    r = requests.get(video_url, cookies=cookies, stream=True)
    # start download, writing the response in 1 MiB chunks
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            if chunk:
                f.write(chunk)


def main():
    URL = r'https://www.sdarot.dev/watch/339-%D7%94%D7%A4%D7%99%D7%92-%D7%9E%D7%95%D7%AA-ha-pijamot/season/1/episode/23'
    DRIVER = webdriver.Chrome()
    load(DRIVER, URL)
    save_video(DRIVER, "video.mp4")


if __name__ == "__main__":
    main()
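
As mentioned above, the same thing can be done with urllib instead of requests. Here is a minimal sketch of how that might look, assuming the server only needs the Sdarot cookie sent in a Cookie header. The helper names build_cookie_request and download are made up for illustration and are not part of the original answer.

```python
import shutil
import urllib.request


def build_cookie_request(video_url, selenium_cookies, name="Sdarot"):
    """Build a urllib Request carrying one cookie taken from Selenium.

    selenium_cookies is the list of dicts returned by driver.get_cookies().
    (Hypothetical helper; the name and signature are assumptions.)
    """
    for entry in selenium_cookies:
        if entry["name"] == name:
            header = f'{entry["name"]}={entry["value"]}'
            return urllib.request.Request(video_url, headers={"Cookie": header})
    raise LookupError(f"cookie {name!r} not found in the browser session")


def download(request, filename):
    """Stream the response body to disk without holding it all in memory."""
    with urllib.request.urlopen(request) as response, open(filename, "wb") as f:
        shutil.copyfileobj(response, f)
```

Inside save_video you would then call download(build_cookie_request(video_url, driver.get_cookies()), filename) in place of the requests.get call. If the server also checks other headers or cookies, those would have to be added the same way.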