项目中使用java解析excel经常会遇见oom的问题,这里选择两个开源框架,进行简单的比较和测试。
EasyExcel
阿里开源(EasyExcel)
引入maven依赖
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>easyexcel</artifactId>
<version>1.1.2-beta5</version>
</dependency>
使用EasyExcel对我们解析excel提供了便利,但是测试的时候当excel达到100M左右
依然会发送OOM,下面看一下入门的使用方法
简单使用
public class ExcelListener extends AnalysisEventListener {
private List<Object> datas = new ArrayList<Object>();
public void invoke(Object object, AnalysisContext context) {
System.out.println("当前行:"+context.getCurrentRowNum());
System.out.println(object);
datas.add(object);//数据存储到list,供批量处理,或后续自己业务逻辑处理。
doSomething(object);//根据自己业务做处理
}
private void doSomething(Object object) {
//1、入库调用接口
}
public void doAfterAllAnalysed(AnalysisContext context) {
// datas.clear();//解析结束销毁不用的资源
}
public List<Object> getDatas() {
return datas;
}
public void setDatas(List<Object> datas) {
this.datas = datas;
}
}
读取:
public static void saxReadSheetsV2007() throws IOException {
InputStream inputStream = new FileInputStream(new File("F:/Excel/demo02.xlsx"));
BufferedInputStream inputStreams = new BufferedInputStream(inputStream);
ExcelListener excelListener = new ExcelListener();
ExcelReader excelReader = EasyExcelFactory.getReader(inputStreams,excelListener);
List<Sheet> sheets = excelReader.getSheets();
System.out.println();
for (Sheet sheet:sheets) {
if(sheet.getSheetNo() ==1) {
excelReader.read(sheet);
}else if(sheet.getSheetNo() ==2){
excelReader.read(sheet);
}else if(sheet.getSheetNo() ==3){
excelReader.read(sheet);
}
System.out.println(sheet.getSheetNo());
}
inputStream.close();
}
xlsx-streamer
**github地址:https://github.com/monitorjbl/excel-streaming-reader**
使用xlsx-streamer可以解决OMM的问题,它是一种流式的读取方式。
引入maven依赖:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-compress</artifactId>
<version>1.18</version>
</dependency>
<dependency>
<groupId>com.monitorjbl</groupId>
<artifactId>xlsx-streamer</artifactId>
<version>2.1.0</version>
</dependency>
简单使用
public static void main(String[] args) {
try (
InputStream is = new FileInputStream(new File("F:/Excel/sss.xlsx"));
Workbook workbook = StreamingReader.builder()
.rowCacheSize(100)
.bufferSize(4096)
.open(is)) {
for (Sheet sheet : workbook){
System.out.println(sheet.getSheetName());
for (Row r : sheet) {
for (Cell c : r) {
System.out.println(c.getStringCellValue());
}
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
问题
当使用xlsx-streamer读取excel的时候,如果文件达到500M左右,会抛出大概如下异常:
"Zip bomb detected! The file would exceed the max size of the expanded data in the zip-file.This may indicates that the file is used to inflate memory usage and thus could pose a security risk.You can adjust this limit via ZipSecureFile.setMaxEntrySize() if you need to work with files which are very large.
原因是poi进行了限制 [0-4GB]:
/**
* Sets the maximum file size of a single zip entry. It defaults to 4GB,
* i.e. the 32-bit zip format maximum.
*
* This can be used to limit memory consumption and protect against
* security vulnerabilities when documents are provided by users.
*
* @param maxEntrySize the max. file size of a single zip entry
*/
public static void setMaxEntrySize(long maxEntrySize) {
if (maxEntrySize < 0 || maxEntrySize > 0xFFFFFFFFL) { // don't use MAX_ENTRY_SIZE here!
throw new IllegalArgumentException("Max entry size is bounded [0-4GB], but had " + maxEntrySize);
}
MAX_ENTRY_SIZE = maxEntrySize;
}
点击查看更多内容
为 TA 点赞
评论
共同学习,写下你的评论
评论加载中...
作者其他优质文章
正在加载中
感谢您的支持,我会继续努力的~
扫码打赏,你说多少就多少
赞赏金额会直接到老师账户
支付方式
打开微信扫一扫,即可进行扫码打赏哦