Elasticsearch: Analytics

xiaoxiao  2021-02-28

Elasticsearch has a feature called aggregations, which lets you generate sophisticated analytics over your data. It is similar to GROUP BY in SQL, but much more powerful.

For example, let's find the most popular interests among all employees:

        GET /megacorp/employee/_search
        {
            "aggs" : {
                "all_interests" : {
                    "terms" : { "field" : "interests" }
                }
            }
        }

Ignore the syntax for now and just look at the results:

        {
            ...
            "hits" : {...},
            "aggregations" : {
                "all_interests" : {
                    "buckets" : [
                        {
                            "key" : "music",
                            "doc_count" : 2
                        },
                        {
                            "key" : "sports",
                            "doc_count" : 1
                        }
                    ]
                }
            }
        }

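We can see that two employees are interested in music and one likes sports. These figures are not precomputed; they are calculated on the fly from the documents that match the query.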
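To make the bucket counts concrete, the following Python sketch mimics what a terms aggregation does conceptually: it groups documents by each value of a field and counts the documents in each group. The two employee documents and the helper name terms_aggregation are hypothetical, chosen only so that the counts match the buckets shown above. The sketch runs locally and does not call Elasticsearch.

```python
from collections import Counter

# Hypothetical sample documents (not taken from the original index), chosen
# so that the counts match the buckets in the response shown above.
employees = [
    {"last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"last_name": "Smith", "age": 32, "interests": ["music"]},
]


def terms_aggregation(docs, field):
    """Count documents per distinct value of `field`, like a terms bucket list."""
    counts = Counter()
    for doc in docs:
        # A document with several values lands in several buckets,
        # but is counted at most once per bucket.
        for value in set(doc.get(field, [])):
            counts[value] += 1
    # Sort by descending doc_count, then by key, roughly like the default order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


buckets = terms_aggregation(employees, "interests")
print(buckets)
```

Because the counting happens over whatever documents are passed in, restricting the input list changes the buckets, which is the same reason the real aggregation reflects only the documents that match the query.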

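To find the most common interests among only the people whose last name is "Smith", add a query to the same request; the aggregation then runs over just the matching documents: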

        GET /megacorp/employee/_search
        {
            "query" : {
                "match" : {
                    "last_name" : "smith"
                }
            },
            "aggs" : {
                "all_interests" : {
                    "terms" : {
                        "field" : "interests"
                    }
                }
            }
        }
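When a request like this is generated from application code, the body is simply a nested JSON object. Here is a minimal Python sketch, in which build_interest_search is a hypothetical helper and not part of any Elasticsearch client. It only assembles the body and sends nothing over the network.

```python
import json


def build_interest_search(last_name=None, field="interests"):
    """Build a search body with a terms aggregation, optionally scoped by a match query."""
    body = {"aggs": {"all_interests": {"terms": {"field": field}}}}
    if last_name is not None:
        # Without a query, the aggregation covers every document in the index.
        body["query"] = {"match": {"last_name": last_name}}
    return body


body = build_interest_search(last_name="smith")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be serialized and sent as the JSON body of the _search request with any HTTP client.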

Aggregations also allow hierarchical rollups. For example, let's find the average age of the employees who share each interest:

        GET /megacorp/employee/_search
        {
            "aggs" : {
                "all_interests" : {
                    "terms" : { "field" : "interests" },
                    "aggs" : {
                        "avg_age" : {
                            "avg" : { "field" : "age" }
                        }
                    }
                }
            }
        }

The aggregations returned this time are somewhat more complicated, but still easy to understand:

        ...
        "all_interests": {
            "buckets": [
                {
                    "key": "music",
                    "doc_count": 2,
                    "avg_age": {
                        "value": 28.5
                    }
                },
                {
                    "key": "sports",
                    "doc_count": 1,
                    "avg_age": {
                        "value": 25
                    }
                }
            ]
        }

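This result is richer than the previous one. We still get the list of interests and their counts (the number of employees with each interest), but each interest now also carries an avg_age field showing the average age of the employees who share it.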
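Reading such a nested result from code comes down to walking the buckets list and picking the sub-aggregation out of each bucket. The Python sketch below assumes the response has already been decoded into a dictionary. The dictionary mirrors the buckets shown above, and summarize_interests is a hypothetical helper name.

```python
# Aggregation part of a decoded response, mirroring the buckets shown above.
response = {
    "aggregations": {
        "all_interests": {
            "buckets": [
                {"key": "music", "doc_count": 2, "avg_age": {"value": 28.5}},
                {"key": "sports", "doc_count": 1, "avg_age": {"value": 25}},
            ]
        }
    }
}


def summarize_interests(resp, agg_name="all_interests", metric="avg_age"):
    """Flatten each bucket into a (key, doc_count, metric value) tuple."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"], b[metric]["value"]) for b in buckets]


for key, count, avg_age in summarize_interests(response):
    print(f"{key}: {count} employee(s), average age {avg_age}")
```

The sub-aggregation appears under the name given in the request (avg_age here), so renaming it in the request changes the key to read in each bucket.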

When reposting, please cite the original URL: https://www.6miu.com/read-30575.html
