2 Answers

Contributed 1828 experience points · received more than 3 likes
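Based on the sparse description of what you are trying to do, I suggest:
Read the text from the image
Replace all spaces with semicolons
String csvContent = imgData.replaceAll(" ",";");
Save the text to a CSV file
Open the CSV file in Excel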
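The following example assumes you have managed to retrieve the data, and then post-processes that data into CSV format. The content is written to a file that you can simply double-click to see whether the data has been split into columns the way you wanted.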
// needs java.io.BufferedWriter and java.io.FileWriter
String[] data = new String[] {
    "BOWLING O M R W ECON 0s 45 6", //notice that your OCR software does not properly recognise the string here
    "TABoult 4 0 3 0 925 M 2 3",
    "JETED 6 0 = 4 O 0 0"
};
// try-with-resources closes the writer even if a write fails
try( BufferedWriter writer = new BufferedWriter( new FileWriter( System.getProperty( "user.home" ) + System.getProperty( "file.separator" ) + "data.csv" ) ) ) {
    for( String record : data ) {
        writer.write( record.replaceAll( " ", ";" ) ); //every single space becomes a column separator
        writer.write( "\n" );
    }
}
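One thing to watch with replacing every single space: if the OCR output contains two or more spaces in a row, you get empty columns in Excel. A minimal sketch of a more forgiving conversion, assuming columns are separated by any run of whitespace (the class name `OcrCsvLine` and the method `toCsvLine` are made up for illustration):

```java
// Hypothetical helper; the names are assumptions, not part of the example above.
class OcrCsvLine {

    // Converts one whitespace-separated OCR line into one semicolon-separated record.
    static String toCsvLine(String ocrLine) {
        String trimmed = ocrLine.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        // "\\s+" matches one or more whitespace characters,
        // so a run of spaces produces a single separator.
        return String.join(";", trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // prints TABoult;4;0;3;0;925;M;2;3
        System.out.println(toCsvLine("TABoult  4 0   3 0 925 M 2 3"));
    }
}
```

This only tidies up the separators. It cannot repair values the OCR misread, and a name that itself contains a space would still be split into two columns.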
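As I said in my comment above, your OCR is not working properly. I suggest you look at the jsoup HTML parser to get the information and continue from there. Otherwise you will not be happy with the results.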

Contributed 1725 experience points · received more than 7 likes
driver.get("https://www.espncricinfo.com/series/8048/scorecard/1178425/chennai-super-kings-vs-delhi-capitals-50th-match-indian-premier-league-2019");
WebElement element = driver.findElement(By.xpath("//article[@class='sub-module scorecard'][1]"));
JavascriptExecutor js = (JavascriptExecutor) driver;
js.executeScript("arguments[0].scrollIntoView(true);", element);
File screen = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
File file = new File("C:\\Users\\user\\Desktop\\screenshot1\\screenshotOfElement2.png");
FileHandler.copy(screen, file);
ITesseract instance = new Tesseract();
instance.setDatapath("C:\\selenium_work\\ScrapingText.PDF\\tessdata");
String result = instance.doOCR(file);
//System.out.println(result);
String[] lines = result.split("\\n");
This is what I am trying.
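From the `result` string, one way to get to a CSV file could look like the sketch below. It assumes the OCR text has one table row per line with columns separated by whitespace. The class name `OcrResultToCsv`, the method `toCsvRecords`, the sample text and the temp-file target are all made up for illustration; in the real code `result` would come from `instance.doOCR(file)`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the names are assumptions, not part of the code above.
class OcrResultToCsv {

    // Turns the whole OCR text into a list of semicolon-separated records.
    static List<String> toCsvRecords(String ocrResult) {
        List<String> records = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n or \r), so Windows line endings are handled too.
        for (String line : ocrResult.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) { // skip blank lines in the OCR output
                records.add(trimmed.replaceAll("\\s+", ";"));
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample standing in for the text returned by doOCR.
        String result = "BOWLING O M R W ECON\r\n\r\nTABoult 4 0 3 0\n";
        Path target = Files.createTempFile("data", ".csv");
        // Files.write puts each record on its own line.
        Files.write(target, toCsvRecords(result), StandardCharsets.UTF_8);
        System.out.println("Wrote " + target);
    }
}
```

Splitting on `\\R` instead of `\\n` avoids a stray carriage return at the end of each row when the text uses Windows line endings.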