Scraping phone models with links in Python

呼啦一阵风 2021-12-09 10:56:43
I am trying to scrape a list of phone models from this website: https://www.m1.com.sg/personal/mobile/phones/filters/all-plans/all/all/0/1500/0/0/none

The page lists the models and their prices. I have the code below, but all the prices come out wrong; they should not be zero. What am I doing wrong? Also, is it possible, using only Beautiful Soup, to provide a clickable link (a "More Info" link that takes the user to a page with additional information about the phone model)? For example:

iPhone XR 128GB   $ 0    More Info

import urllib.request
from bs4 import BeautifulSoup
from html.parser import HTMLParser

url_toscrape = "https://www.m1.com.sg/personal/mobile/phones/filters/all-plans/all/all/0/1500/0/0/none"
response = urllib.request.urlopen(url_toscrape)
info_type = response.info()
responseData = response.read()
soup = BeautifulSoup(responseData, 'lxml')

Model_findall=soup.findAll("div",{"class":"td three title text-center"})
price_findall=soup.findAll("div",{"class":"td two price text-center"})
for models in Model_findall:
    print('*',models.text.strip())
    print(' ',price.text.strip())

3 Answers

守着星空守着你

Contributed 1799 experience points, received more than 8 likes

The following script should give you the desired output.


import requests
from bs4 import BeautifulSoup

url = "https://www.m1.com.sg/personal/mobile/phones/filters/all-plans/all/all/0/1500/0/0/none"

response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')

# Each phone sits in its own "phone-line" block, so the model and
# the price are read from the same block and stay paired.
for items in soup.find_all(class_="phone-line"):
    model = items.find(class_="title").text.strip()
    price = items.find(class_="light-blue").text.strip()
    print(model, price)
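
To try the row-by-row idea without requesting the live site, here is a minimal sketch that runs against a hand-written HTML fragment. The fragment, the model names, the prices, and the helper name `parse_phone_lines` are all made up for illustration; only the class names are copied from the script above, and the real page markup may differ. It uses the built-in `html.parser` so that `lxml` is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics the class names used in the script above.
# The model names and prices are invented, not taken from the real site.
SAMPLE_HTML = """
<div class="phone-line">
  <div class="td three title text-center">Phone A</div>
  <div class="td two price text-center"><span class="light-blue">$ 100</span></div>
</div>
<div class="phone-line">
  <div class="td three title text-center">Phone B</div>
  <div class="td two price text-center"><span class="light-blue">$ 250</span></div>
</div>
"""


def parse_phone_lines(html):
    """Return a list of (model, price) tuples, one per "phone-line" block."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.find_all(class_="phone-line"):
        title = row.find(class_="title")
        price = row.find(class_="light-blue")
        if title is None or price is None:
            continue  # skip blocks that lack either cell
        results.append((title.get_text(strip=True), price.get_text(strip=True)))
    return results


if __name__ == "__main__":
    for model, price in parse_phone_lines(SAMPLE_HTML):
        print(model, price)
```

Because both values are looked up inside the same block, a price can never be printed next to the wrong model.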


Replied 2021-12-09
牧羊人nacy

Contributed 1862 experience points, received more than 7 likes

Do you mean something like this?


import urllib.request
from bs4 import BeautifulSoup

url_toscrape = "https://www.m1.com.sg/personal/mobile/phones/filters/all-plans/all/all/0/1500/0/0/none"
response = urllib.request.urlopen(url_toscrape)
responseData = response.read()
soup = BeautifulSoup(responseData, 'lxml')

# Walk the table row by row so model, price and link come from the same row
for tr in soup.find_all("div", {"class": "tr middle"}):
    model = tr.find("div", {"class": "td three title text-center"})
    price = tr.find("div", {"class": "td two price text-center"})
    info = tr.find("div", {"class": "td two description"})
    link = info.find("a") if info else None
    if not (model and price and link):
        continue  # skip rows that lack any of the three cells
    # The href is site-relative; prefix the host and encode spaces
    more_info = info.text.strip() + ": https://www.m1.com.sg" + link['href'].replace(" ", "%20")
    print(model.text.strip(), price.text.strip(), more_info)
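
The link-building step can also be done with the standard library instead of string concatenation. This is only a sketch: the helper name `absolute_link` and the example path are hypothetical, and it assumes the `href` values on the page are site-relative, as the code above implies.

```python
from urllib.parse import quote, urljoin

BASE_URL = "https://www.m1.com.sg"


def absolute_link(href, base=BASE_URL):
    """Turn an href taken from the page into an absolute, percent-encoded URL."""
    # Keep URL delimiters (and existing %-escapes) intact; encode spaces etc.
    encoded = quote(href, safe="/:?=&#%")
    # urljoin resolves relative hrefs against the base and leaves
    # already-absolute URLs untouched.
    return urljoin(base, encoded)


if __name__ == "__main__":
    # Hypothetical path, used only to show the encoding of spaces.
    print(absolute_link("/personal/mobile/phones/example phone 64gb"))
```

Compared with `replace(" ", "%20")`, `quote` also handles other unsafe characters, and `urljoin` copes with hrefs that are already absolute.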


Replied 2021-12-09
开心每一天1111

Contributed 1836 experience points, received more than 13 likes

You can use the following CSS class and id selectors:


import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://www.m1.com.sg/personal/mobile/phones/filters/all-plans/all/all/0/1500/0/0/none"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')

# Collect model names and prices as two parallel lists
models = [item.text for item in soup.select('#PhoneListDiv .color-orange')]
prices = [item.text for item in soup.select('.price .light-blue')]

# Pair them up positionally and show them as a table
df = pd.DataFrame(list(zip(models, prices)), columns=['Model', 'Price'])
print(df)
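
One thing to watch with two parallel lists: `zip` pairs items by position, so if any phone on the page had no price element, every later price would shift up by one row. The sketch below illustrates this on a made-up fragment and shows a per-row alternative. The HTML, the model names, the prices, and the helper name `rows_by_line` are hypothetical, and it assumes the `phone-line` blocks sit inside `#PhoneListDiv`, which is not confirmed for the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the second phone deliberately has no price element.
SAMPLE_HTML = """
<div id="PhoneListDiv">
  <div class="phone-line">
    <div class="title color-orange">Phone A</div>
    <div class="price"><span class="light-blue">$ 100</span></div>
  </div>
  <div class="phone-line">
    <div class="title color-orange">Phone B</div>
  </div>
  <div class="phone-line">
    <div class="title color-orange">Phone C</div>
    <div class="price"><span class="light-blue">$ 300</span></div>
  </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Two flat lists zipped by position: Phone B picks up Phone C's price
models = [m.get_text(strip=True) for m in soup.select("#PhoneListDiv .color-orange")]
prices = [p.get_text(strip=True) for p in soup.select(".price .light-blue")]
flat_pairs = list(zip(models, prices))


def rows_by_line(soup):
    """Select model and price inside each block; a missing cell becomes None."""
    rows = []
    for line in soup.select("#PhoneListDiv .phone-line"):
        title = line.select_one(".color-orange")
        price = line.select_one(".price .light-blue")
        rows.append((
            title.get_text(strip=True) if title else None,
            price.get_text(strip=True) if price else None,
        ))
    return rows


if __name__ == "__main__":
    print(flat_pairs)
    print(rows_by_line(soup))
```

The list returned by `rows_by_line` can be passed to `pd.DataFrame(..., columns=['Model', 'Price'])` in the same way as the zipped list.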


Replied 2021-12-09