Python - PyCharm - Error when using OCR code

哈士奇WWW 2022-09-06 16:57:43
I am trying to use the code from here: https://www.geeksforgeeks.org/python-reading-contents-of-pdf-using-ocr-optical-character-recognition/

    # Import libraries
    from PIL import Image
    import pytesseract
    import sys
    from pdf2image import convert_from_path
    import os

    # Path of the pdf
    PDF_file = "/Users/user1/Desktop/pdf1.pdf"

    '''
    Part #1 : Converting PDF to images
    '''

    # Store all the pages of the PDF in a variable
    pages = convert_from_path(PDF_file, 500)

    # Counter to store images of each page of PDF to image
    image_counter = 1

    # Iterate through all the pages stored above
    for page in pages:
        # Declaring filename for each page of PDF as JPG
        # For each page, filename will be:
        # PDF page 1 -> page_1.jpg
        # PDF page 2 -> page_2.jpg
        # PDF page 3 -> page_3.jpg
        # ....
        # PDF page n -> page_n.jpg
        filename = "page_" + str(image_counter) + ".jpg"

        # Save the image of the page in system
        page.save(filename, 'JPEG')

        # Increment the counter to update filename
        image_counter = image_counter + 1

    '''
    Part #2 - Recognizing text from the images using OCR
    '''

    # Variable to get count of total number of pages
    filelimit = image_counter - 1

    # Creating a text file to write the output
    outfile = "/Users/user1/Desktop/ocr/pdf1.txt"

    # Open the file in append mode so that
    # all contents of all images are added to the same file
    f = open(outfile, "a")

    # Iterate from 1 to total number of pages
    for i in range(1, filelimit + 1):
        # Set filename to recognize text from
        # Again, these files will be:
        # page_1.jpg
        # page_2.jpg
        # ....
        # page_n.jpg
        filename = "page_" + str(i) + ".jpg"

        # Recognize the text as string in image using pytesseract
        text = str(pytesseract.image_to_string(Image.open(filename)))

1 Answer

哆啦的时光机

You need to install Poppler and make sure it is on the Windows PATH. See "How to install Poppler on Windows".
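One way to confirm whether Poppler is actually reachable before running the OCR script is to look for its command-line tools on PATH. The snippet below is a minimal sketch using only the standard library. It assumes that pdf2image relies on the `pdfinfo` and `pdftoppm` executables; the helper names `find_poppler_tools` and `missing_poppler_tools` are hypothetical and are not part of pdf2image.

```python
import shutil

# Poppler command-line tools that pdf2image is assumed to call.
POPPLER_TOOLS = ("pdfinfo", "pdftoppm")


def find_poppler_tools(poppler_dir=None):
    """Map each tool name to its full path, or None if it is not found.

    If poppler_dir is given, only that directory is searched;
    otherwise the PATH environment variable is used.
    (Hypothetical helper, for illustration only.)
    """
    return {tool: shutil.which(tool, path=poppler_dir) for tool in POPPLER_TOOLS}


def missing_poppler_tools(poppler_dir=None):
    """Return the names of the tools that could not be located."""
    found = find_poppler_tools(poppler_dir)
    return [tool for tool, path in found.items() if path is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Poppler tools not found on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found on PATH.")
```

If the tools are reported missing even after installation, the directory containing them probably has not been added to PATH yet. As an alternative to editing PATH, `convert_from_path` also accepts a `poppler_path` argument pointing at that directory.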


2022-09-06