Downloading a CSV file from bseindia with Python

扬帆大鱼 2023-03-30 16:15:38
I want to download Results.csv from "https://www.bseindia.com/corporates/Forth_Results.aspx" and, ultimately, load the data into a DataFrame. I downloaded a file with the code below, but it contains the wrong data.

import requests
import pandas as pd

bse_url = 'https://www.bseindia.com/corporates/Forth_Results.aspx'
r = requests.get(bse_url)
file_name = 'Results.csv'
with open(file_name, 'wb') as f:
    for chunk in r.iter_content():
        f.write(chunk)
        f.flush()
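The URL in the question points at an .aspx page, so the bytes saved to Results.csv are most likely the page's HTML rather than the CSV behind the download icon. A minimal sketch of a sanity check for the saved payload follows; `looks_like_html` and both sample payloads are hypothetical, made up for illustration.

```python
def looks_like_html(payload: bytes) -> bool:
    """Return True when the payload starts like an HTML document, not a CSV."""
    # Ignore leading whitespace and compare case-insensitively.
    head = payload.lstrip()[:100].lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


# Hypothetical payloads: an HTML page and a CSV with the expected header row.
html_page = b"\r\n<!DOCTYPE html>\r\n<html><head><title>Results</title></head></html>"
csv_file = (
    b"Security Code,Security Name,Company name,Result Date\n"
    b"500875,ITC,ITC LTD.,24 Jul 2020\n"
)

print(looks_like_html(html_page))  # True
print(looks_like_html(csv_file))   # False
```

Running the same check on `r.content` from the question's code would show whether the request returned the page itself instead of the file.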

2 Answers

湖上湖

2003 contribution points, more than 2 upvotes

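You can do this with the help of Selenium. Follow these steps:

Step 1: Check your Chrome version (browser menu (three vertical dots) -> Help -> About Google Chrome).

Step 2: Download the Chrome WebDriver (chromedriver) that matches your Chrome version (mine is 81.0.4044.138).

Step 3: After downloading, unzip the file and place chromedriver.exe in the same directory as your script.

Step 4: pip install selenium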
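Step 2 depends on the driver matching the browser. As a rough sketch, assuming that agreement on the major version (the number before the first dot) is the thing to check, the comparison could look like this. `major_version` and `driver_matches_browser` are hypothetical helper names, and the driver version strings are only illustrative.

```python
def major_version(version: str) -> int:
    """Extract the major version, e.g. 81 from '81.0.4044.138'."""
    return int(version.strip().split(".")[0])


def driver_matches_browser(chrome_version: str, driver_version: str) -> bool:
    """Compare only the major versions of Chrome and chromedriver."""
    return major_version(chrome_version) == major_version(driver_version)


# The Chrome version is the one quoted in Step 2; driver versions are examples.
print(driver_matches_browser("81.0.4044.138", "81.0.4044.69"))  # True
print(driver_matches_browser("81.0.4044.138", "83.0.4103.39"))  # False
```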

Now use the code below:

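from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
import pandas as pd

# your website url
site = 'https://www.bseindia.com/corporates/Forth_Results.aspx'

# your driver path (Selenium 4 takes the path through a Service object)
driver = webdriver.Chrome(service=Service(executable_path='chromedriver.exe'))

# passing website url
driver.get(site)

# wait until the whole site loads
time.sleep(5)

# click download icon using xpath (find_element_by_xpath was removed in Selenium 4)
driver.find_element(By.XPATH, "/html/body/div[1]/form/div[4]/div/div[2]/div/div/div[2]/a/i").click()

# give the download a few seconds to finish before closing the browser
time.sleep(5)

# closing browser
driver.close()

# reading Results.csv from the default download directory
df = pd.read_csv("c:/users/viupadhy/downloads/Results.csv")
df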

Output:


    Security Code  Security Name  Company name                               Result Date
0   542579         AGOL           Ashapuri Gold Ornament Ltd                 24 Jul 2020
1   500425         AMBUJACEM      AMBUJA CEMENTS LTD.                        24 Jul 2020
2   531223         ANJANI         ANJANI SYNTHETICS LTD.-$                   24 Jul 2020
3   500820         ASIANPAINT     ASIAN PAINTS LTD.                          24 Jul 2020
4   500027         ATUL           ATUL LTD.                                  24 Jul 2020
5   512063         AYOME          AYOKI MERCANTILE LTD.                      24 Jul 2020
6   517246         BCCFUBA        BCC FUBA INDIA LTD.                        24 Jul 2020
7   540700         BRNL           Bharat Road Network Ltd                    24 Jul 2020
8   519600         CCL            CCL PRODUCTS (INDIA) LTD.                  24 Jul 2020
9   531621         CENTERAC       CENTERAC TECHNOLOGIES LTD.                 24 Jul 2020
10  539991         CFEL           Confidence Futuristic Energetech Ltd       24 Jul 2020
11  500110         CHENNPETRO     CHENNAI PETROLEUM CORPORATION LTD.         24 Jul 2020
12  534691         COMCL          COMFORT COMMOTRADE LTD.                    24 Jul 2020
13  531216         COMFINTE       COMFORT INTECH LTD.-$                      24 Jul 2020
14  526829         CONFIPET       CONFIDENCE PETROLEUM INDIA LTD.            24 Jul 2020
15  506395         COROMANDEL     COROMANDEL INTERNATIONAL LTD.              24 Jul 2020
16  539876         CROMPTON       Crompton Greaves Consumer Electricals Ltd  24 Jul 2020
17  526269         CRSTCHM        CRESTCHEM LTD.                             24 Jul 2020
18  541546         GAYAHWS        Gayatri Highways Ltd                       24 Jul 2020
19  500171         GHCL           GHCL LTD.                                  24 Jul 2020
20  524590         HEMORGANIC     Hemo Organic Limited                       24 Jul 2020
21  505725         HINDEVER       HINDUSTAN EVEREST TOOLS LTD.               24 Jul 2020
22  501295         IITL           INDUSTRIAL INVESTMENT TRUST LTD.           24 Jul 2020
23  513295         IMEC           Imec Services Ltd                          24 Jul 2020
24  541300         INDINFR        IndInfravit Trust                          24 Jul 2020
25  500875         ITC            ITC LTD.                                   24 Jul 2020
26  509715         JAYSHREETEA    JAY SHREE TEA & INDUSTRIES LTD.            24 Jul 2020
27  500228         JSWSTEEL       JSW STEEL LTD.                             24 Jul 2020
28  506184         KANANIIND      KANANI INDUSTRIES LTD.                     24 Jul 2020
29  512036         KAPILCO        KAPIL COTEX LTD.                           24 Jul 2020
...
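The code above reads Results.csv from a hard-coded, user-specific Downloads folder. One possible way to make the location explicit is to set Chrome's `download.default_directory` preference through `ChromeOptions`. This is a sketch under assumptions, not part of the original answer: `build_download_prefs` and `make_driver` are hypothetical helper names, and the Selenium import is kept inside `make_driver` so the preference builder can be tried without a browser installed.

```python
from pathlib import Path


def build_download_prefs(download_dir):
    """Build Chrome preferences that send downloads to download_dir."""
    return {
        # Chrome expects an absolute path here.
        "download.default_directory": str(Path(download_dir).resolve()),
        # Save files directly instead of opening a "Save as" dialog.
        "download.prompt_for_download": False,
    }


def make_driver(download_dir, driver_path="chromedriver.exe"):
    """Start Chrome with the download folder applied (Selenium 4 API)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    return webdriver.Chrome(service=Service(executable_path=driver_path), options=options)


# Only the preference dictionary is built here; no browser is launched.
prefs = build_download_prefs("downloads")
print(prefs)
```

With a driver created this way, `pd.read_csv` can point at the chosen folder instead of a path tied to one Windows account.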


Answered 2023-03-30

ABOUTYOU

1812 contribution points, more than 5 upvotes

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import os
import time
import pathlib
import pandas as pd

# default download location of Results.csv
filename = 'C:/Users/Administrator/Downloads/Results.csv'
file = pathlib.Path(filename)

while True:  # retry until Results.csv has been downloaded
    # delete a stale copy so that a fresh download can be detected
    if file.exists():
        os.remove(filename)

    # your website url
    site = 'https://www.bseindia.com/corporates/Forth_Results.aspx'

    # your driver path (Selenium 4 takes the path through a Service object)
    driver = webdriver.Chrome(service=Service(executable_path='chromedriver.exe'))

    # passing website url
    driver.get(site)
    time.sleep(10)

    # wait up to 20 seconds for the download link to be present
    wait = WebDriverWait(driver, 20)
    wait.until(EC.presence_of_element_located((By.ID, 'ContentPlaceHolder1_lnkDownload')))

    # click download icon using xpath
    el = driver.find_element(By.XPATH, "/html/body/div[1]/form/div[4]/div/div[2]/div/div/div[2]/a/i")
    el.click()

    # give the download time to finish before closing the browser
    time.sleep(20)
    driver.close()

    if file.exists():
        break

df = pd.read_csv(filename)
print(df)
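The fixed 20-second sleep above can be replaced by polling for the file. The sketch below assumes that Chrome marks in-progress downloads with a `.crdownload` suffix in the same folder; `wait_for_download` is a hypothetical helper, and the demo uses a temporary directory rather than the real Downloads folder.

```python
import tempfile
import time
from pathlib import Path


def wait_for_download(path, timeout=30.0, poll_interval=0.5):
    """Poll until path exists and no .crdownload file remains beside it."""
    target = Path(path)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        in_progress = any(target.parent.glob("*.crdownload"))
        if target.exists() and not in_progress:
            return True
        time.sleep(poll_interval)
    return False


# Demo: the file is absent at first, then created by hand.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "Results.csv"
    found_before = wait_for_download(target, timeout=0.3, poll_interval=0.05)
    target.write_text("Security Code,Security Name\n")
    found_after = wait_for_download(target, timeout=0.3, poll_interval=0.05)

print(found_before, found_after)  # False True
```

In the loop above, a call such as `wait_for_download(filename, timeout=20)` could stand in for `time.sleep(20)` and return as soon as the file is complete.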


Answered 2023-03-30