
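Encountering ElasticSearch (4)

Tags:
Java
I hope this helps others who have run into ES too, and that it sparks more discussion between us.
Version used: 6.4.3
1-Using the Java client (part 2)
  Bulk insert
  Aggregation queries
  scroll-scan  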

Bulk insert

Bulk insert is the usual way to load data quickly, for example when reindexing.
    @Override
    public void bulk(List<CometIndex> list) {
        // Index all documents in a single bulk request
        BulkRequest request = new BulkRequest();
        try {
            for (CometIndex cometIndex : list) {
                // In 6.x the second constructor argument is the mapping type; here it reuses the index name
                request.add(new IndexRequest(CometIndexKey.INDEX_NAME, CometIndexKey.INDEX_NAME)
                        .source(objectMapper.writeValueAsString(cometIndex), XContentType.JSON));
            }

            BulkResponse bulkResponse = client.bulk(request, RequestOptions.DEFAULT);
            // hasFailures() returns true if one or more operations failed
            if (bulkResponse.hasFailures()) {
                log.warn("bulk had failures: {}", bulkResponse.buildFailureMessage());
            } else {
                log.info("all success");
            }
            TimeValue took = bulkResponse.getTook();
            log.info("[bulk insert took] {} ({} ms, {} s)", took, took.getMillis(), took.getSeconds());
            // To inspect every individual result, iterate over the response:
            /*for (BulkItemResponse bulkItemResponse : bulkResponse) {
                if (bulkItemResponse.isFailed()) {
                    BulkItemResponse.Failure failure = bulkItemResponse.getFailure();
                }
            }*/
        } catch (Exception e) {
            log.error("bulk insert failed", e);
        }
    }
    
    @Test
    public void bulkAdd(){
       List<CometIndex>list=new ArrayList<>();
       int count=0;
       for (int i=0;i<1000;i++){
           CometIndex cometIndex=new CometIndex();
           cometIndex.setCometId((long)i);
           cometIndex.setAuthor("心机boy");
           cometIndex.setCategory("movie");
           cometIndex.setContent("肖申克的救赎"+i);
           cometIndex.setDescription("肖申克的救赎满分");
           cometIndex.setEditor("cctv");
           cometIndex.setTitle("肖申克的救赎"+i);
           cometIndex.setCreateTime(new Date());
           list.add(cometIndex);
           count++;
           if (count % 100 == 0) { // flush every 100 documents
               searchService.bulk(list);
               list.clear();
           }
       }
    }
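
The test above flushes only when `count` is an exact multiple of 100. That works for 1000 documents, but a total such as 1050 would leave the last 50 unsent. Below is a minimal sketch of a batching helper that also emits the trailing partial batch. `BulkBatches` and `partition` are hypothetical names introduced for illustration; they are not part of the article's code or of the Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches of at most batchSize items.
final class BulkBatches {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch stays valid if the source list changes later.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1050; i++) {
            ids.add(i);
        }
        // Ten full batches of 100, then one final batch of 50.
        for (List<Integer> batch : partition(ids, 100)) {
            System.out.println("batch size: " + batch.size());
        }
    }
}
```

Each batch could then be handed to `searchService.bulk(batch)` in turn, so no document depends on the total being a multiple of the batch size.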

Aggregation queries

1-Metric aggregations
  Compute a value over a set of documents, like MIN(), MAX(), STDDEV(), or SUM() in MySQL.
Get the maximum value  
GET _search
{
    "aggs":{
        "max_id":{
            "max":{
                "field":"cometId" 
            }
        }
    }
}
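
To make the SQL analogy concrete, here is a plain-Java sketch of what a metric aggregation computes: a single number derived from one field across all matching documents. It runs entirely in memory and does not call Elasticsearch. `MetricSketch` and the sample ids are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;

// In-memory analogue of metric aggregations over a numeric field (not an Elasticsearch API).
final class MetricSketch {

    // Computes min, max, sum, count and average of the values in one pass.
    static LongSummaryStatistics summarize(List<Long> cometIds) {
        return cometIds.stream().mapToLong(Long::longValue).summaryStatistics();
    }

    public static void main(String[] args) {
        LongSummaryStatistics stats = summarize(Arrays.asList(3L, 17L, 8L, 42L));
        // prints: max=42, min=3, sum=70
        System.out.println("max=" + stats.getMax() + ", min=" + stats.getMin() + ", sum=" + stats.getSum());
    }
}
```

The `max` aggregation in the request above corresponds to `getMax()` here; the difference is that ES computes it per shard and merges the partial results.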
2-Bucketing aggregations
  Documents that match a given rule are placed into a bucket, and each bucket is associated with a key, much like GROUP BY in MySQL.
Aggregate by category  
GET _search
{
    "aggs" : {
        "category_agg" : {
            "terms" : {
                "field" : "category",
                "order" : { "_count" : "desc" }
            }
        }
    }
}
Aggregate by category, then by editor within each category
GET _search
{
    "aggs" : {
        "category_agg" : {
            "terms" : {
                "field" : "category",
                "order" : { "_count" : "desc" }
            },
            "aggs" : {
                "editor_agg" : {
                    "terms" : { "field" : "editor" }
                }
            }
        }
    }
}
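
The shape of a nested bucket aggregation is easier to see with a small in-memory analogue: outer buckets keyed by category, each containing inner buckets keyed by editor, each with a document count. This is only a sketch of the result structure, not Elasticsearch client code. `NestedBucketSketch`, `Doc`, and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogue of a terms aggregation with a nested terms sub-aggregation.
final class NestedBucketSketch {

    // Minimal stand-in for a document: only the two fields used for grouping.
    static final class Doc {
        final String category;
        final String editor;

        Doc(String category, String editor) {
            this.category = category;
            this.editor = editor;
        }
    }

    // Returns category -> (editor -> document count), with keys sorted alphabetically.
    static Map<String, Map<String, Long>> countByCategoryThenEditor(List<Doc> docs) {
        Map<String, Map<String, Long>> buckets = new TreeMap<>();
        for (Doc doc : docs) {
            buckets.computeIfAbsent(doc.category, key -> new TreeMap<>())
                   .merge(doc.editor, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("movie", "cctv"),
                new Doc("movie", "cctv"),
                new Doc("movie", "bbc"),
                new Doc("book", "cctv"));
        // prints: {book={cctv=1}, movie={bbc=1, cctv=2}}
        System.out.println(countByCategoryThenEditor(docs));
    }
}
```

Unlike this sketch, which sorts keys alphabetically, the request above orders the category buckets by `_count` descending.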
    @Override
    public Map<Object, Long> aggregateCategory() {
        // Aggregate by category and return the document count for each category
        Map<Object, Long> result = new HashMap<>();

        try {
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

            // Terms aggregation on the category field, buckets ordered by document count, descending
            TermsAggregationBuilder aggregation = AggregationBuilders.terms(CometIndexKey.CATEGORY_AGG)
                    .field(CometIndexKey.CATEGORY).order(BucketOrder.count(false));

            // size(0): return only the aggregation result, no search hits
            searchSourceBuilder.aggregation(aggregation).size(0);
            SearchRequest searchRequest = new SearchRequest();
            searchRequest.indices(CometIndexKey.INDEX_NAME);
            searchRequest.source(searchSourceBuilder);
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

            Aggregations aggregations = searchResponse.getAggregations();
            Terms byCategoryAggregation = aggregations.get(CometIndexKey.CATEGORY_AGG);

            if (byCategoryAggregation.getBuckets() != null && !byCategoryAggregation.getBuckets().isEmpty()) {
                for (Terms.Bucket bucket : byCategoryAggregation.getBuckets()) {
                    log.info("key:{},value:{}", bucket.getKey(), bucket.getDocCount());
                    result.put(bucket.getKey(), bucket.getDocCount());
                }
            }
        } catch (Exception e) {
            // Log the cause and fall through to return whatever was collected
            log.error("agg error", e);
        }
        return result;
    }

    @Test
    public void testAgg(){

        Map<Object,Long>result=searchService.aggregateCategory();

        for (Map.Entry<Object,Long> entry : result.entrySet()) {
            System.out.println("Key = " + entry.getKey() + ", Value = " + entry.getValue());
        }
    }
There are many kinds of aggregations; only a simple one is shown here. It is worth experimenting with others in dev_tools.

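scroll-scan

1-Limitation of from-size: the deeper the page, the less efficient the query becomes.
2-scroll:
A scroll search takes a point-in-time snapshot: an initial search captures the set of results that match the query, and later requests keep pulling batches from that snapshot until nothing is left. Changes made to the index afterwards do not affect the results.
3-scan:
The most expensive part of deep paging is the global sort of the results. If sorting is turned off, all documents can be returned with very little overhead.
The scan search type tells ES to skip sorting and simply return the next batch from every shard that still has results. Note that the scan search type was deprecated in 2.1 and removed in 5.0; on 6.4.3 the equivalent is a scroll request sorted by _doc.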
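
The cost of deep paging with from-size can be sketched with a little arithmetic. Assuming the usual query-then-fetch flow, every shard has to build its own top `from + size` hits, and the coordinating node then merges the hits from all shards just to keep `size` of them. `DeepPagingCost` and its method names are hypothetical, and the shard count is an example value.

```java
// Back-of-the-envelope model of how much work a from/size request causes (not an Elasticsearch API).
final class DeepPagingCost {

    // Each shard must collect its own top (from + size) hits.
    static long hitsPerShard(int from, int size) {
        return (long) from + size;
    }

    // The coordinating node merges the hits returned by every shard.
    static long hitsMergedByCoordinator(int shards, int from, int size) {
        return shards * hitsPerShard(from, size);
    }

    public static void main(String[] args) {
        // First page: 5 shards * 10 hits = 50 hits merged.
        System.out.println(hitsMergedByCoordinator(5, 0, 10));
        // Page starting at 10000: 5 shards * 10010 hits = 50050 hits merged to return 10.
        System.out.println(hitsMergedByCoordinator(5, 10000, 10));
    }
}
```

The work grows with `from` even though the page size stays at 10, which is why scroll is preferred for walking through a whole index.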
    @Override
    public void scrollScan() {
        // Scroll through every document in the index, 1000 at a time
        try {

            final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L)); // keep the search context alive for 1 minute
            SearchRequest searchRequest = new SearchRequest();
            searchRequest.indices(CometIndexKey.INDEX_NAME); // target index
            searchRequest.scroll(scroll);
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.query(QueryBuilders.matchAllQuery()); // match everything
            searchSourceBuilder.size(1000); // batch size per scroll round trip
            searchRequest.source(searchSourceBuilder);

            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            String scrollId = searchResponse.getScrollId(); // scroll id from the initial search

            SearchHits searchHits = searchResponse.getHits();
            log.info("scrollId:{},total:{}", scrollId, searchHits.getTotalHits());
            SearchHit[] hits = searchHits.getHits();

            // Keep pulling batches until one comes back empty
            while (hits != null && hits.length > 0) {

                for (SearchHit hit : hits) {
                    // do something with the SearchHit
                    Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                    log.info("title:{}", sourceAsMap.get(CometIndexKey.TITLE));
                }

                SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId); // fetch the next batch by scroll id
                scrollRequest.scroll(scroll);
                searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
                scrollId = searchResponse.getScrollId(); // the id may change between calls, so always use the latest
                log.info("scrollId:{}", scrollId);
                hits = searchResponse.getHits().getHits();

            }

            // Release the search context
            ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
            clearScrollRequest.addScrollId(scrollId);
            ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
            boolean succeeded = clearScrollResponse.isSucceeded();

            log.info("ClearScrollRequest result:{}", succeeded);
        } catch (Exception e) {
            log.error("scroll failed", e);
        }

    }

    @Test
    public void scrollScan(){
        searchService.scrollScan();
    }
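
The snapshot behaviour of scroll can be modelled without a cluster. The sketch below freezes a copy of the data when the scroll starts and then hands out fixed-size batches until it is exhausted, so later changes to the live data are not seen. It is only a model: as far as I know, Elasticsearch does not copy documents like this but keeps the segments that existed at search time alive while the scroll is open. `SnapshotScroll` is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of scroll semantics: a frozen view of the data, read in batches.
final class SnapshotScroll<T> {

    private final List<T> snapshot;
    private final int batchSize;
    private int position = 0;

    SnapshotScroll(List<T> liveData, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // Copy once, so later changes to liveData are invisible to this scroll.
        this.snapshot = new ArrayList<>(liveData);
        this.batchSize = batchSize;
    }

    // Returns the next batch, or an empty list once the snapshot is exhausted.
    List<T> next() {
        if (position >= snapshot.size()) {
            return Collections.emptyList();
        }
        int end = Math.min(position + batchSize, snapshot.size());
        List<T> batch = new ArrayList<>(snapshot.subList(position, end));
        position = end;
        return batch;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        index.add("doc-1");
        index.add("doc-2");
        index.add("doc-3");

        SnapshotScroll<String> scroll = new SnapshotScroll<>(index, 2);
        index.add("doc-4"); // indexed after the scroll started, so never returned

        List<String> batch = scroll.next();
        while (!batch.isEmpty()) {
            System.out.println(batch);
            batch = scroll.next();
        }
    }
}
```

The loop has the same shape as the `while (hits != null && hits.length > 0)` loop in `scrollScan()`: an empty batch is the signal to stop.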

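Now that we know these APIs, we can use the bulk insert API to generate data and then test against it.
Later posts will cover how we use it in practice.
  • Full code: a GitHub link will be provided once the series is complete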