为了账号安全,请及时绑定邮箱和手机立即绑定

Java解析Excel遇见的问题

标签:
Java
项目中使用java解析excel经常会遇见oom的问题,这里选择两个开源框架,进行简单的比较和测试。

EasyExcel

阿里开源(EasyExcel)
引入maven依赖
       <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>easyexcel</artifactId>
            <version>1.1.2-beta5</version>
        </dependency>
        
使用EasyExcel对我们解析excel提供了便利,但是测试的时候当excel达到100M左右
依然会发送OOM,下面看一下入门的使用方法

简单使用

public class ExcelListener extends AnalysisEventListener {

    private List<Object> datas = new ArrayList<Object>();
    public void invoke(Object object, AnalysisContext context) {
        System.out.println("当前行:"+context.getCurrentRowNum());
        System.out.println(object);
        datas.add(object);//数据存储到list,供批量处理,或后续自己业务逻辑处理。
        doSomething(object);//根据自己业务做处理
    }
    private void doSomething(Object object) {
        //1、入库调用接口
    }
    public void doAfterAllAnalysed(AnalysisContext context) {
        // datas.clear();//解析结束销毁不用的资源
    }
    public List<Object> getDatas() {
        return datas;
    }
    public void setDatas(List<Object> datas) {
        this.datas = datas;
    }

}

读取:
public static void saxReadSheetsV2007() throws IOException {
        InputStream inputStream = new FileInputStream(new File("F:/Excel/demo02.xlsx"));
        BufferedInputStream inputStreams = new BufferedInputStream(inputStream);

       ExcelListener excelListener = new ExcelListener();

        ExcelReader excelReader = EasyExcelFactory.getReader(inputStreams,excelListener);

        List<Sheet> sheets = excelReader.getSheets();
        System.out.println();
        for (Sheet sheet:sheets) {
            if(sheet.getSheetNo() ==1) {
                excelReader.read(sheet);
            }else if(sheet.getSheetNo() ==2){
                excelReader.read(sheet);
            }else if(sheet.getSheetNo() ==3){
                excelReader.read(sheet);
            }
           System.out.println(sheet.getSheetNo());
        }
        inputStream.close();
    }

xlsx-streamer

**github地址:https://github.com/monitorjbl/excel-streaming-reader**
使用xlsx-streamer可以解决OMM的问题,它是一种流式的读取方式。
引入maven依赖:
		       <dependency>
		            <groupId>org.apache.commons</groupId>
		            <artifactId>commons-compress</artifactId>
		            <version>1.18</version>
		        </dependency>

        <dependency>
            <groupId>com.monitorjbl</groupId>
            <artifactId>xlsx-streamer</artifactId>
            <version>2.1.0</version>
        </dependency>

简单使用

  public static void main(String[] args) {

        try (
                InputStream is = new FileInputStream(new File("F:/Excel/sss.xlsx"));
                Workbook workbook = StreamingReader.builder()
                        .rowCacheSize(100)
                        .bufferSize(4096)
                        .open(is)) {
            for (Sheet sheet : workbook){
                System.out.println(sheet.getSheetName());
                for (Row r : sheet) {
                    for (Cell c : r) {
                        System.out.println(c.getStringCellValue());
                    }
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

问题

当使用xlsx-streamer读取excel的时候,如果文件达到500M左右,会抛出大概如下异常:
 "Zip bomb detected! The file would exceed the max size of the expanded data in the zip-file.This may indicates that the file is used to inflate memory usage and thus could pose a security risk.You can adjust this limit via ZipSecureFile.setMaxEntrySize() if you need to work with files which are very large.


原因是poi进行了限制  [0-4GB]:
     /**
     * Sets the maximum file size of a single zip entry. It defaults to 4GB,
     * i.e. the 32-bit zip format maximum.
     * 
     * This can be used to limit memory consumption and protect against 
     * security vulnerabilities when documents are provided by users.
     *
     * @param maxEntrySize the max. file size of a single zip entry
     */
    public static void setMaxEntrySize(long maxEntrySize) {
        if (maxEntrySize < 0 || maxEntrySize > 0xFFFFFFFFL) {   // don't use MAX_ENTRY_SIZE here!
            throw new IllegalArgumentException("Max entry size is bounded [0-4GB], but had " + maxEntrySize);
        }
        MAX_ENTRY_SIZE = maxEntrySize;
    }

点击查看更多内容
TA 点赞

若觉得本文不错,就分享一下吧!

评论

作者其他优质文章

正在加载中
JAVA开发工程师
手记
粉丝
6394
获赞与收藏
157

关注作者,订阅最新文章

阅读免费教程

  • 推荐
  • 评论
  • 收藏
  • 共同学习,写下你的评论
感谢您的支持,我会继续努力的~
扫码打赏,你说多少就多少
赞赏金额会直接到老师账户
支付方式
打开微信扫一扫,即可进行扫码打赏哦
今天注册有机会得

100积分直接送

付费专栏免费学

大额优惠券免费领

立即参与 放弃机会
意见反馈 帮助中心 APP下载
官方微信

举报

0/150
提交
取消