I am trying to scrape Google Maps. The phone and hours variables do not return any data, while the other variables work fine and return data. The XPath is correct, so I am not sure what is wrong here. Other selectors (name, address, title, website) return data as expected, but phone and hours return nothing. I would appreciate any answers.

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support.ui import Select
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.keys import Keys
    from scrapy.selector import Selector
    import csv
    from tqdm import tqdm
    import time

    driver = webdriver.Firefox()
    linksFile = open("links.txt", 'r')
    allLinks = linksFile.readlines()

    for link in tqdm(allLinks):
        try:
            driver.get(link)
        except Exception:
            print('Something went wrong with the URL: ')
        # time.sleep(15)
        while True:
            WebDriverWait(driver, 15).until(
                EC.presence_of_element_located((By.XPATH, '//div[contains(text(), "Directions")] | //div[contains(text(), "Website")]'))
            )
            results = driver.find_elements_by_xpath('//div[contains(text(), "Directions")] | //div[contains(text(), "Website")]')
            for result in results:
                # writing to the CSV file
                outFile = open("data.csv", 'a+', newline="")
                writer = csv.writer(outFile)
                business = driver.find_element_by_xpath('//div[@role="heading"]/div')
                business.click()
                # waiting for the page to load
                WebDriverWait(driver, 15).until(
                    EC.presence_of_element_located((By.XPATH, '//div[@class="immersive-container"]'))
                )
1 Answer
月关宝盒
Can you use JavaScript's outerHTML instead of page_source?

    response = Selector(text=driver.execute_script("return document.documentElement.outerHTML"))

There is also a problem in the XPath for hours:

    hours = response.xpath('//a[contains(text(), "Hours")]/parent::span/following-sibling::div/label/span//b/text()').get()