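3 Answers
Contributor: 1804 experience points, 7+ upvotes
The table on that site is loaded dynamically, so requests alone cannot retrieve it. You need selenium to render the page first. Here is the complete code: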
from bs4 import BeautifulSoup
from selenium import webdriver
import time
import pandas as pd

url = 'https://bscscan.com/tokenholdings?a=0x00a2c3d755c21bc837a3ca9a32279275eae9e3d6'

# Load the page in a real browser so the JavaScript that fills the table runs.
driver = webdriver.Chrome()
driver.get(url)
time.sleep(5)  # crude fixed wait for the table to finish loading
html = driver.page_source
driver.quit()  # end the browser session

# Parse the rendered HTML and locate the table body.
soup = BeautifulSoup(html, 'html5lib')
tbody = soup.find('tbody', id='tb1')
tr_tags = tbody.find_all('tr')

symbols = []
quantities = []
for tr in tr_tags:
    td_tags = tr.find_all('td')
    symbols.append(td_tags[2].text)     # third cell: token symbol
    quantities.append(td_tags[3].text)  # fourth cell: quantity

df = pd.DataFrame({'Symbol': symbols, 'Quantity': quantities})
print(df)
Output:
  Symbol      Quantity
0    BNB   17.98420742
1   Cake   19.76899295
2    ANY             1
3   FREE         1,502
4    LFI  326.87340092
5    LFI  326.87340092
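The Quantity column comes back as text, and a value such as 1,502 contains a thousands separator, so it needs cleaning before any arithmetic. A minimal sketch, assuming the separator is always a comma; parse_quantity is a hypothetical helper name, not part of the answer above:

```python
from decimal import Decimal


def parse_quantity(text: str) -> Decimal:
    """Convert a scraped quantity string such as '1,502' to a Decimal."""
    # Strip surrounding whitespace and drop thousands separators
    # (assumed here to be commas).
    cleaned = text.strip().replace(",", "")
    return Decimal(cleaned)


# Sample strings taken from the output shown above.
quantities = ["17.98420742", "1", "1,502"]
parsed = [parse_quantity(q) for q in quantities]
print(parsed)
```

Decimal is used instead of float so that token amounts with many decimal places are not rounded.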
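Contributor: 2021 experience points, 8+ upvotes
I recommend a very handy tool, the re module. You can use it to pull out the text that sits between two substrings, for example: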
import re

s = "<td>THERE IS TEXT I WANT TO GET</td>"
# Capture everything between the opening and closing tag.
result = re.search('<td>(.*)</td>', s)
print(result.group(1))
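One caveat with the pattern above: `.*` is greedy, so when the string holds more than one cell it matches from the first <td> to the last </td>. A sketch of the non-greedy form with re.findall, assuming the cells are not nested and carry no attributes (the sample string is built only for illustration):

```python
import re

s = "<td>BNB</td><td>17.98420742</td>"

# Greedy: a single match running from the first <td> to the last </td>.
greedy = re.search(r"<td>(.*)</td>", s).group(1)

# Non-greedy: each cell is matched separately.
cells = re.findall(r"<td>(.*?)</td>", s)

print(greedy)
print(cells)
```

Regular expressions tend to break on real-world HTML (attributes, nesting, line breaks), so this is best kept for short, predictable fragments.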
Contributor: 1824 experience points, 6+ upvotes
>>> from bs4 import BeautifulSoup
>>> html = "<td>THERE IS TEXT I WANT TO GET</td>\n<td>THERE IS TEXT I WANT TO GET</td>\n<td>THERE IS TEXT I WANT TO GET</td>\n<td>THERE IS TEXT I WANT TO GET</td>"
>>> soup = BeautifulSoup(html, 'html.parser')
>>> for td in soup.find_all('td'): print(td.text)
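If installing BeautifulSoup is not an option, the same idea can be sketched with the standard library's html.parser module. This is only an illustration, assuming flat, non-nested <td> cells; TdTextCollector is a hypothetical class name, and the sample markup is made up:

```python
from html.parser import HTMLParser


class TdTextCollector(HTMLParser):
    """Collect the text content of every <td> element."""

    def __init__(self):
        super().__init__()
        self.cells = []       # finished cell texts
        self._inside = False  # True while between <td> and </td>
        self._parts = []      # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._inside = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "td" and self._inside:
            self.cells.append("".join(self._parts))
            self._inside = False

    def handle_data(self, data):
        # Text outside a cell (such as the newline between cells) is ignored.
        if self._inside:
            self._parts.append(data)


markup = "<td>first</td>\n<td>second</td>"
collector = TdTextCollector()
collector.feed(markup)
collector.close()
print(collector.cells)
```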