
Why does the output look abnormal when I finally use urlopen to read a PDF from an online URL?

The output is as follows:

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2096

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3237

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 884

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1528

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 703

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3344

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 4177

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1492

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 990

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2082

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 686

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 801

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 703

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2096

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3237

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 5196

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 933

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 884

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1528

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1492

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 990

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2082

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 686

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 801

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 4033

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 841

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 686

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1107

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 1625

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 683

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2201

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 3647

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 660

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2059

WARNING:pdfminer.converter:undefined: <PDFCIDFont: basefont='VIKMFH+MSungHK-Light', cidcoding='Adobe-CNS1'>, 2986

...

...

3 answers

WARNING:pdfminer.converter:undefined:

I tried this, and it works.

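import logging
# Attempt to turn off propagation (as written in the original answer).
logging.Logger.propagate = False
# Raise the root logger's threshold so that only ERROR and above get through.
logging.getLogger().setLevel(logging.ERROR)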

However, I don't know why!

-------------------------------------------------------------------------------------------------------------------------------------------

It sets the root logger's level to ERROR. This stops PDFMiner's warning output, since it logs to the root logger, but not your own logging.

I needed to set propagation to False because, after using PDFMiner, I had duplicate log entries. These were caused by the root logger.

Source: http://stackoverflow.com/questions/29762706/warnings-on-pdfminer
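A narrower variant of the same idea is to raise the level only on PDFMiner's own logger instead of the root logger. The sketch below is a minimal illustration, not the answerer's code: the logger name "pdfminer.converter" is taken from the prefix of the warnings shown in the question, and it assumes the installed PDFMiner version logs through the standard logging module under that name. The "myapp" logger and the in-memory stream are hypothetical stand-ins so the effect can be seen without a PDF. Note that this only hides the messages; it does not bring back the characters the warnings refer to.

```python
import io
import logging

# Send all log records to an in-memory stream so the effect is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)

# Raise the threshold only for the "pdfminer" logger hierarchy.
# "pdfminer.converter" has no level of its own here, so it inherits
# ERROR from its parent logger "pdfminer".
logging.getLogger("pdfminer").setLevel(logging.ERROR)

# Stand-in for the library's warning, and a hypothetical application logger.
logging.getLogger("pdfminer.converter").warning("undefined: <PDFCIDFont ...>, 2096")
logging.getLogger("myapp").warning("this warning is still shown")

captured = stream.getvalue()
print(captured)
```

Because only the "pdfminer" hierarchy is touched, warnings from other loggers keep their usual behavior.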

#1

原来我叫小土慕课网给我改了名字 (asker)

Thank you very much!
2016-11-16
#2

KeithTt

Impressive. The warnings are gone, but the Chinese text still isn't displayed.
2018-08-05

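Hmm, right. Removing the warnings isn't the goal; the goal is to display the Chinese text. The warnings are gone, but the Chinese still doesn't show up, so what's the point?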

#1

慕UI5065323

Did you manage to solve this problem?
2019-03-23

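Reply to 原来我叫小土慕课网给我改了名字:

As I kept working on this, I found that PDFs come in two kinds:

1. PDFs generated from text => process with pdfminer3k to convert back to txt

2. PDFs generated from images => process with Tesseract (an OCR library) to convert back to txt

So if the approach in the post above still produces no text, you can give Tesseract (an OCR library) a try.

In the end I used the following libraries to handle PDFs whose pages are images:

tesseract (OCR engine; its command runs outside Python)

pyocr (Python interface to tesseract)

pillow (the Python 3 fork of the Python Imaging Library, PIL)

imagemagick

wand (Python interface to imagemagick)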
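The two-way split described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual code: it calls the tesseract command-line program directly through subprocess rather than through pyocr, it assumes each PDF page has already been rendered to an image file (for example with imagemagick), and it assumes the Traditional Chinese language data "chi_tra" is installed. The function names and the file name in the usage note are hypothetical.

```python
import subprocess


def needs_ocr(extracted_text: str) -> bool:
    """Treat a PDF as image-based when text extraction returned only whitespace."""
    return not extracted_text.strip()


def build_tesseract_command(image_path: str, lang: str = "chi_tra") -> list:
    """Build the tesseract call; "stdout" as output base prints the text
    instead of writing it to a file."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path: str, lang: str = "chi_tra") -> str:
    """Run tesseract on one page image and return the recognized text.

    Requires the tesseract executable on PATH; raises CalledProcessError
    if tesseract exits with an error.
    """
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")
```

A caller would first run the normal text extraction and fall back to OCR only when nothing came out, for example `if needs_ocr(text): text = ocr_image("page-001.png")`.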

