3 回答
TA贡献1906条经验 获得超10个赞
原问题:
在这种情况下,我将使用带有 contains 运算符的 attribute = value 选择器来href通过 string 定位属性mailto。添加 CSS 选择器:[href*=mailto]
如果使用,querySelectorAll("[href*=mailto]")您可以测试该.Length属性是否大于 0 或使用querySelector并测试if Not querySelector("[href*=mailto]") Is Nothing。
如果你设置一个变量
Dim ele As Object
Set ele = html.document.querySelector("[href*=mailto]")
If Not ele Is Nothing Then
Debug.Print ele.href 'do something with the href to parse out email
End If
更新的问题:
对于更新的问题,我会将 nodeList 中的当前节点转移outerHTML到代理HTMLDocument变量中,以便我可以querySelector再次利用方法。我会按类别定位电子邮件。
Option Explicit
Public Sub GetListingInfo()
Const URL As String = "https://www.gelbeseiten.de/Suche/Ambulante%20Pflegedienste/Bundesweit"
Dim http As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
Set http = New MSXML2.XMLHTTP60
Set html = New MSHTML.HTMLDocument
With http
.Open "Get", URL, False
.send
html.body.innerHTML = .responseText
End With
Dim post As Object, html2 As MSHTML.HTMLDocument
Set post = html.querySelectorAll(".mod-Treffer")
Set html2 = New MSHTML.HTMLDocument
Dim i As Long, emailNode As Object
With ActiveSheet
.Range("A1").Resize(1, 3).Value = Array("Title", "Phone", "Email")
For i = 0 To post.Length - 1
html2.body.innerHTML = post.Item(i).outerHTML
.Cells(i + 2, 1).Value = html2.querySelector("h2").innerText
.Cells(i + 2, 2).Value = html2.querySelector(".mod-AdresseKompakt__phoneNumber").innerText
Set emailNode = html2.querySelector(".contains-icon-email")
If Not emailNode Is Nothing Then .Cells(i + 2, 3).Value = Replace$(emailNode.href, "mailto:", vbNullString)
Next i
End With
End Sub
TA贡献1770条经验 获得超3个赞
<article class="mod mod-Treffer" data-teilnehmerid="122085958708">
<div data-wipe="{"listener": "click", "name": "Trefferliste Eintrag zur Detailseite", "id": "122085958708", "synchron": true}" data-realid="2aeca1d2-2bc5-4070-ac4d-e16b10badca5" data-tnid="122085958708" target="_self">
<div class="mod-hervorhebung">
<p class="mod-hervorhebung--partnerHervorhebung" data-hervorhebungsstufe="3">Silber Partner</p>
</div>
<picture class="trefferlisten_logo">
<source media="(min-width: 768px)" srcset="https://ies.v4all.de/0122/GS/0122/5/8335/49428335_310x190.png">
<img alt="" data-lazy-src="https://ies.v4all.de/0122/GS/0122/5/8335/49428335_310x190.png" src="https://ies.v4all.de/0122/GS/0122/5/8335/49428335_310x190.png">
</picture>
<h2 data-wipe-name="Titel">A & S Billing Pflege-Service GmbH</h2>
<p class="d-inline-block mod-Treffer--besteBranche">Ambulante Pflegedienste</p>
<div class="mod mod-Stars mod-Stars--" title="2.9/5" data-float="2,9">
<span class="mod-Stars__text" style="width: 58.000001907348632812500%;">2.9</span>
</div>
<span>2.9</span>
<span>(8)</span>
<address class="mod mod-AdresseKompakt">
<p data-wipe-name="Adresse">
Kirchenberg 2‑4,
<span class="nobr">
90482
Nürnberg
</span>
(Mögeldorf)
</p>
<p class="mod-AdresseKompakt__phoneNumber" data-hochgestellt-position="end" data-wipe-name="Kontaktdaten">(0911) 60 00 99 77</p>
</address>
</div>
<div class="aktionsleiste_kompakt">
<div class="mod-gsSlider mod-gsSlider--noneOnWhite">
<span class="mod-gsSlider__arrow mod-gsSlider__arrow--arrow" data-direction="left" data-show="false" data-wipe="{"listener":"click","name":"Trefferliste: Aktionleiste-button-links"}"></span>
<span class="mod-gsSlider__arrow mod-gsSlider__arrow--arrow" data-direction="right" data-show="false" data-wipe="{"listener":"click","name":"Trefferliste: Aktionleiste-button-rechts"}"></span>
<div class="mod-gsSlider__slider" data-initialized="true">
<a class="contains-icon-homepage gs-btn" target="_blank" rel=" noopener" href="http://www.as-billing.de" data-wipe="{"listener":"click", "name":"Trefferliste Webseite-Button", "id":"122085958708"}" data-isneededpromise="false">Webseite</a>
<a class="contains-icon-email gs-btn" href="mailto:info@as-billing.de?subject=Anfrage%20%C3%BCber%20Gelbe%20Seiten" data-wipe="{"listener":"click", "name":"Trefferliste Email-Button", "id":"122085958708"}" data-isneededpromise="false">E-Mail</a>
<span class="contains-icon-route_finden gs-btn" data-wipe="{"listener":"click", "name":"Trefferliste Navigation-Button", "id":"122085958708"}" data-parameters="{"partner": "googlemaps", "searchquery": "A%20%26%20S%20Billing%20Pflege-Service%20GmbH%20Kirchenberg%202-4%2090482%20N%C3%BCrnberg"}" data-target="_blank">Route</span>
<span class="contains-icon-details gs-btn" data-wipe="{"listener":"click", "name":"Trefferliste Actionbutton Mehr Details", "id":"122085958708"}" data-parameters="{"partner": "gs", "realId": "2aeca1d2-2bc5-4070-ac4d-e16b10badca5", "tnId": "122085958708"}">Mehr Details</span>
</div>
</div>
</div>
</article>
TA贡献1865条经验 获得超7个赞
我可以用这些线来弄清楚
If InStr(post.Item(i).getElementsByTagName("a")(1).href, "mailto:") Then
Debug.Print Split(Split(post.Item(i).getElementsByTagName("a")(1).href, "mailto:")(1), "?")(0)
End If
但我欢迎任何其他改进和了解更多的建议。* 经过测试,如果在元素中找不到电子邮件,我会遇到错误。如何避免错误呢?我可以用On Error Resume Next。但我希望处理该错误而不是跳过它。
** 编辑:我可以使用这个结构解决第二点
Dim emailObj As Object
Set emailObj = post.Item(i).getElementsByTagName("a")(1)
If Not emailObj Is Nothing Then
If InStr(post.Item(i).getElementsByTagName("a")(1).href, "mailto:") Then
Debug.Print Split(Split(post.Item(i).getElementsByTagName("a")(1).href, "mailto:")(1), "?")(0)
End If
该代码可以工作,但有时电子邮件无法正确抓取..这是因为这一行
Set emailObj = post.Item(i).getElementsByTagName("a")(1)
有时该对象未分配给 1。所以我的最后一个问题:无论分配的数字如何,如何获取电子邮件数据?
在循环中,我尝试了这条线并没有用
Set aNodeList = post.Item(i).querySelectorAll(".contains-icon-email")(0)
- 3 回答
- 0 关注
- 293 浏览
添加回答
举报