3 Answers
Contributor: 1876 experience points, 5+ likes
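When you search by attribute value, you should also provide the tag name. Try the code below. If there is only one element to find, use find. If there are several, use find_all and then loop over the results. Hope this helps.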
from bs4 import BeautifulSoup
html="""<html><td class="center iz" data-stat="age"></td>,
<td class="left " data-stat="team_id"><a href="/teams/BOS/">BOS</a></td>,
<td class="left " data-stat="lg_id">NBA</td>,
<td class="center iz" data-stat="pos"></td>,
<td class="right " data-stat="g">2</td>,
<td class="right incomplete iz" data-stat="gs"></td>,
<td class="right " data-stat="mp_per_g">12.0</td>,
<td class="right " data-stat="fg_per_g">1.5</td>,
<td class="right " data-stat="fga_per_g">6.5</td>,
<td class="right " data-stat="fg_pct">.231</td>,
<td class="right " data-stat="ft_per_g">1.0</td>,
<td class="right " data-stat="fta_per_g">1.5</td>,
<td class="right " data-stat="ft_pct">.667</td>,
<td class="right " data-stat="orb_per_g">3.0</td>,
<td class="right " data-stat="drb_per_g">4.5</td>,
<td class="right " data-stat="trb_per_g">7.5</td>,
<td class="right " data-stat="ast_per_g">1.5</td>,
<td class="right " data-stat="stl_per_g">0.5</td>,
<td class="right " data-stat="blk_per_g">0.5</td>,
<td class="right " data-stat="tov_per_g">1.5</td>,
<td class="right " data-stat="pf_per_g">2.0</td>,
<td class="right " data-stat="pts_per_g">4.0</td></html>"""
soup = BeautifulSoup(html,'html.parser')
findtag=soup.find('td',attrs={"data-stat" : "trb_per_g" })
print(findtag.text)
To search for multiple items, try this.
findtags=soup.find_all('td',attrs={"data-stat" : "trb_per_g" })
for tag in findtags:
    print(tag.text)
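One caveat with find: it returns None when nothing matches, so reading .text on the result raises AttributeError. Here is a minimal sketch of a guarded lookup. The helper name stat_text and the trimmed sample HTML are my own assumptions for illustration, not part of the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed sample in the same shape as the rows above.
html = """<table><tr>
<td class="right " data-stat="orb_per_g">3.0</td>
<td class="right " data-stat="trb_per_g">7.5</td>
</tr></table>"""

soup = BeautifulSoup(html, "html.parser")


def stat_text(soup, stat, default=None):
    """Return the text of the first <td> whose data-stat equals `stat`.

    Hypothetical helper: falls back to `default` when no cell matches.
    """
    tag = soup.find("td", attrs={"data-stat": stat})
    # find() returns None when there is no match, so guard before reading text.
    return tag.get_text(strip=True) if tag is not None else default


print(stat_text(soup, "trb_per_g"))
print(stat_text(soup, "no_such_stat", default="n/a"))
```

With find_all the guard is not needed, since an empty result just means the loop body never runs.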
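Contributor: 1839 experience points, 15+ likes
I think it would be faster to use a CSS selector combination that targets the cells of interest by table id plus attribute value.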
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
url = "https://www.basketball-reference.com/players/a/abdulza01.html"
soup = bs(requests.get(url).content, 'lxml')
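# Collect every element inside the #per_game table whose data-stat is trb_per_g
data = [item.text for item in soup.select('#per_game [data-stat=trb_per_g]')]
df = pd.DataFrame(data)
# Use the first row as the column name, then drop that row from the data
df.rename(columns=df.iloc[0], inplace = True)
df.drop(df.index[0], inplace = True)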
print(df)
df.to_csv(r'C:\Users\Users\Desktop\Data.csv', sep=',', encoding='utf-8',index = False )
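To see what that selector matches without hitting the network, here is a small sketch on a made-up page fragment. The table id per_game and the data-stat value come from the answer above. The second table, the header cells and all the numbers are assumptions for illustration only. I am also assuming the header cell carries the same data-stat attribute, which would explain why the code above promotes the first row to a column name.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: two tables, only one has id="per_game".
html = """
<table id="per_game">
  <thead><tr>
    <th data-stat="ast_per_g">AST</th><th data-stat="trb_per_g">TRB</th>
  </tr></thead>
  <tbody>
    <tr><td data-stat="ast_per_g">1.5</td><td data-stat="trb_per_g">7.5</td></tr>
    <tr><td data-stat="ast_per_g">0.8</td><td data-stat="trb_per_g">3.3</td></tr>
  </tbody>
</table>
<table id="totals">
  <tr><td data-stat="trb_per_g">99</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Descendant combinator: any element under #per_game with the matching attribute.
cells = [el.get_text(strip=True)
         for el in soup.select('#per_game [data-stat="trb_per_g"]')]

# The first match is the header cell here, the rest are data cells.
header, values = cells[0], cells[1:]
print(header, values)

# Adding the tag name to the selector skips the header cell entirely.
td_only = [el.get_text(strip=True)
           for el in soup.select('#per_game td[data-stat="trb_per_g"]')]
print(td_only)
```

The cell in the totals table is never picked up because the selector is scoped to #per_game.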