How do I loop over dynamic href links with Selenium in Python?

I want to loop over dynamic href links. On each results page I download a batch of 100 text files, and I need to download 200,000 files in total, so I have to click the next button about 2,000 times. To do that, I looked at the href of the next button, but two parts of the link change from page to page: the page number (1, 2, 3, and so on) and a string of characters. Here are examples of how the next-button link changes:

https://search.proquest.com/something/E6981FD6D11F45E8PQ/2?accountid=12543#scrollTo
https://search.proquest.com/something/E6981FD6D11F45E8PQ/3?accountid=12543#scrollTo
https://search.proquest.com/something/61C27022597C4092PQ/4?accountid=12543#scrollTo
https://search.proquest.com/something/E431552DC6554BF7PQ/5?accountid=12543#scrollTo

I am new to Python and my level is low. This is what I have so far:

    # The Selenium setup for scraping goes before this point.
    n = 2000
    for i in range(1, n):
        href = "https://search.proquest.com/something/715376F5A5AF44BBPQ/" + str(i) + "?accountid=12543#scrollTo"
        driver.get(href)
        # Here I add the code that downloads the files for each page.
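To make the problem concrete, here is a minimal sketch that splits the example links into their two changing parts. The pattern is inferred only from the four links quoted in the question, so the real site may use other formats, and `split_result_url` is a hypothetical helper name.

```python
import re

# Assumption: the token is a run of uppercase hex digits followed by "PQ",
# as in the example links from the question.
RESULT_URL = re.compile(r"/(?P<token>[0-9A-F]+PQ)/(?P<page>\d+)\?")


def split_result_url(href):
    """Return (token, page_number) for a result-page link, or None if it does not match."""
    match = RESULT_URL.search(href)
    if match is None:
        return None
    return match.group("token"), int(match.group("page"))


links = [
    "https://search.proquest.com/something/E6981FD6D11F45E8PQ/2?accountid=12543#scrollTo",
    "https://search.proquest.com/something/E6981FD6D11F45E8PQ/3?accountid=12543#scrollTo",
    "https://search.proquest.com/something/61C27022597C4092PQ/4?accountid=12543#scrollTo",
    "https://search.proquest.com/something/E431552DC6554BF7PQ/5?accountid=12543#scrollTo",
]

for link in links:
    print(split_result_url(link))
```

The four links contain three different tokens, which suggests that a URL built from one fixed token plus a counter cannot be relied on, and that the link probably has to be read from each page as it loads.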


2 answers

收到一只叮咚


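The example links are not available to me (I cannot register).

First:

What is the "string"?

Is it a book number? Or a category number?

If it is just a random string, I think you should find another approach.

How about using ActionChains, or driver.execute_script()?

Above all, in my opinion, it is more important to find out what the string means (from the .js or the .html).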
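As a rough illustration of the driver.execute_script() suggestion, the sketch below clicks the next-page element through JavaScript instead of building its URL. `click_next_with_js` is a hypothetical helper, and the XPath passed to it is an assumption about the page. The string `"xpath"` is the value of Selenium's `By.XPATH` constant, used here so the function only needs a driver object that provides `find_elements` and `execute_script`.

```python
def click_next_with_js(driver, xpath):
    """Click the first element matching `xpath` via JavaScript.

    Returns True if an element was found and clicked, False otherwise.
    """
    # find_elements returns an empty list when nothing matches,
    # so no exception handling is needed here.
    matches = driver.find_elements("xpath", xpath)
    if not matches:
        return False
    # Scroll the element into view, then trigger the click in the browser.
    driver.execute_script(
        "arguments[0].scrollIntoView(); arguments[0].click();", matches[0]
    )
    return True


# Possible usage with a real Selenium driver (the XPath is an assumption):
# click_next_with_js(driver, "//a[@title='Next page']")
```

A JavaScript click can help when the element is covered by another element or lies outside the viewport, but it does not explain what the changing string means, which is still worth investigating.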

Answered 2022-01-05

肥皂起泡泡


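I need help identifying the XPath of the next-page button. My goal is to iterate over the pages with Selenium in Python. The picture below shows the markup of the next-page button as it appears when the page is inspected.

[Image: the next-page button in the browser inspector] //img1.sycdn.imooc.com//61d52bda0001344f17831169.jpg

I tried the following code with Selenium in Python to download the files page by page.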

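    import time

    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    # `driver` and `scraping()` are defined earlier in the script.
    while True:
        scraping()  # downloads the files on the current page
        try:
            # Check whether there is another page of links.
            next_link = driver.find_element(By.XPATH, "//*[@title='Page suivante']")
            driver.execute_script("arguments[0].scrollIntoView();", next_link)
            next_link.click()
            # Give the next page time to load.
            time.sleep(20)
        except NoSuchElementException:
            # No next-page button was found, so this was the last page.
            break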
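A variation on the loop above, as a minimal sketch: instead of clicking, it reads the href that the next-page element carries and navigates to it, so the changing string never has to be guessed. `crawl_pages`, `scrape_page`, and `max_pages` are hypothetical names, and the sketch assumes the next-page element exposes an `href` attribute, which the original post does not confirm. As before, `"xpath"` is the value of Selenium's `By.XPATH`.

```python
def crawl_pages(driver, scrape_page, next_xpath, max_pages=2000):
    """Scrape the current page, then follow the next-page link until none is left.

    Returns the number of pages scraped.
    """
    pages_done = 0
    while pages_done < max_pages:
        scrape_page()
        pages_done += 1
        if pages_done == max_pages:
            break  # page limit reached, do not navigate any further
        # An empty list means there is no next-page element on this page.
        matches = driver.find_elements("xpath", next_xpath)
        if not matches:
            break
        # Assumption: the element is a link whose href points at the next page.
        href = matches[0].get_attribute("href")
        if not href:
            break
        driver.get(href)
    return pages_done


# Possible usage with the names from the post:
# crawl_pages(driver, scraping, "//*[@title='Page suivante']")
```

The `max_pages` cap is a safety net so that a page whose next link never disappears cannot keep the loop running forever. A wait for the new page to finish loading may still be needed inside `scrape_page`.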

Answered 2022-01-05
