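Elasticsearch Essentials: The Basics

2023-01-10 14:11:22 Source: 51CTO Blog
Business Development and Functional Technology Department, Experience Assurance R&D Group: Kang Rui, Yao Zaiyi, Li Zhen, Liu Bin, Wang Beiyong

Note: everything below is based on Elasticsearch 8.1.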

I. Index definition

Official documentation:



The big picture of an index

Elasticsearch | MySQL
Index | Table
Type (obsolete) | Table (obsolete)
Document | Row
Field | Column
Mapping | Schema
Everything is indexed | Index
Query DSL | SQL
GET http://... | select * from
POST http://... | update table set ...
Aggregations | group by, sum
cardinality | distinct (deduplication)
reindex | data migration

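Definition of an index

Definition:
- A collection of documents that share the same document structure (mapping)
- Identified by a unique index name
- A cluster can hold many indices
- Different indices hold data for different business domains

Notes:
- Index names must be lowercase
- Index names can be at most 255 bytes long (multi-byte characters use up the limit faster)
- Field names may contain uppercase letters, but using lowercase throughout is recommended

Creating an index

index-settings parameters explained

Official documentation:

Note: static settings cannot be changed once the index has been created; dynamic settings can.

Questions to consider:

1. Why can the number of primary shards not be changed after the index is created?

A document is routed to a particular shard in an index using the following formula:

routing_factor = num_routing_shards / num_primary_shards
shard_num = (hash(_routing) % num_routing_shards) / routing_factor

The default value used for _routing is the document's _id.

When data is written, Elasticsearch uses this formula to decide which shard stores the document, and later reads use the same formula. Once the number of shards changes, the data can no longer be found. Put simply: hash the ID, divide by the number of primary shards and take the remainder. If the divisor changes, the result changes too.

2. What if the business really does need more primary shards?

Use reindex to migrate the data into another index.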
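The effect of changing the shard count can be illustrated with a small Python sketch. It is only a model under stated assumptions: it uses the simplified rule "hash the ID and take the remainder", `zlib.crc32` is a stand-in for the Murmur3 hash that Elasticsearch uses, and `shard_for` is a hypothetical helper name.

```python
import zlib

def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Simplified routing: hash(_routing) % num_primary_shards.
    crc32 is a stand-in hash; Elasticsearch itself uses Murmur3."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

doc_ids = [f"doc-{i}" for i in range(1000)]
before = {d: shard_for(d, 5) for d in doc_ids}  # index created with 5 primaries
after = {d: shard_for(d, 6) for d in doc_ids}   # hypothetical change to 6 primaries
moved = sum(1 for d in doc_ids if before[d] != after[d])
print(f"{moved} of {len(doc_ids)} documents would be looked up on the wrong shard")
```

Because most documents would resolve to a different shard, the only safe way to change the count is to write the data again, which is what reindex does.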

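Basic index operations


II. Mapping parameter: dynamic

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/dynamic.html

Core functionality

Fields are added automatically once their type has been detected. In other words, even if a field is not defined in the mapping, Elasticsearch detects its type dynamically and adds it.

A first look at dynamic

# Delete the test01 index so that we start from a clean state
DELETE test01

# Insert a document directly, without defining a mapping
POST test01/_doc/1
{
  "name": "kangrui10"
}

# Inspect the mapping of test01 to see which type was assigned to name
GET test01/_mapping

# Response: name is mapped as text with a keyword sub-field, which is not what we want here.
# In our business queries name is never searched by analyzed terms; it is matched exactly (and name = xxx).
# With this mapping an exact match has to go through name.keyword = xxx, which is more cumbersome.
{
  "test01" : {
    "mappings" : {
      "properties" : {
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}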

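Possible values of dynamic

Value | Description (official docs) | Explanation
true | New fields are added to the mapping (default). | If dynamic is not specified when the mapping is created, it defaults to true: any field without an explicitly defined type has its type detected by Elasticsearch.
false | New fields are ignored. These fields will not be indexed or searchable, but will still appear in the _source field of returned hits. These fields will not be added to the mapping, and new fields must be added explicitly. | A field that is not defined in the mapping can still be written, but it cannot be searched and does not appear in the mapping. In other words, no index is built for it.
strict | If new fields are detected, an exception is thrown and the document is rejected. New fields must be explicitly added to the mapping. | Writing a field that is not defined in the mapping fails with an error. Recommended for production because it is stricter: to add a field, you must add it to the mapping by hand.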
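The three values can be compared with a toy Python model. This is a sketch of the behaviour described in the table, not Elasticsearch code: `index_document` is a hypothetical function, the type detection is deliberately crude, and the error text only imitates the real exception.

```python
def index_document(mapping: dict, doc: dict, dynamic: str = "true") -> dict:
    """Toy model of `dynamic`. `mapping` maps field name -> type and is updated
    in place when dynamic == "true". Returns the fields that end up searchable."""
    for field, value in doc.items():
        if field in mapping:
            continue
        if dynamic == "strict":
            raise ValueError(f"strict mapping: new field [{field}] is not allowed")
        if dynamic == "true":
            # Crude type detection; the real rules are richer.
            mapping[field] = "long" if isinstance(value, int) else "text"
        # dynamic == "false": the field stays in _source only, mapping untouched
    return {f: v for f, v in doc.items() if f in mapping}

doc = {"name": "kangrui10", "age": 3}
print(index_document({"name": "keyword"}, doc, "true"))   # name and age searchable
print(index_document({"name": "keyword"}, doc, "false"))  # only name searchable
```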

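Drawbacks of dynamic mapping

- Type detection is fairly accurate, but not necessarily what the user expects. For example, a text field is given the default standard analyzer, whereas we usually need the ik Chinese analyzer.
- It wastes storage. Strings are mapped as both text and keyword, which takes up more space.
- Mapping explosion. If a query is written incorrectly, for example PUT is used where GET was meant, many fields are created by mistake.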

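III. Mapping parameter: doc_values

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/doc-values.html

Core functionality

Doc values are an ordered forward index (a list mapping document => field value) that Lucene builds in addition to the inverted index at index time. They are essentially a serialized columnar store, a structure well suited to aggregations, sorting, and script access to field values. The format also compresses well, especially for numeric types, which reduces disk usage and speeds up access. Almost all field types support doc values, with the exception of text and annotated_text.

What is a forward index

A forward index resembles a database table: data is associated with an id, and looking up a document id returns the corresponding data.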
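The difference between the two structures can be shown with a minimal Python sketch. The documents and variable names are made up for illustration, and the whitespace split is a stand-in for real analysis.

```python
from collections import defaultdict

docs = {1: "quick brown fox", 2: "quick red fox", 3: "lazy dog"}

# Forward index: document id -> field value (conceptually what doc values provide)
forward = dict(docs)

# Inverted index: term -> list of document ids containing the term
inverted = defaultdict(list)
for doc_id, text in sorted(docs.items()):
    for term in sorted(set(text.split())):
        inverted[term].append(doc_id)

print(inverted["quick"])                     # search: which documents contain "quick"?
print([forward[d] for d in inverted["fox"]]) # sort/aggregate: values of the matching documents
```

Searching goes from term to documents, while sorting and aggregating go from document to value, which is why both structures are kept.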

Possible values of doc_values

- true: the default; doc values are enabled.
- false: must be set explicitly. Once set to false, sorting, aggregating, and accessing the field from a script no longer work, but disk space is saved.

Practice exam question

# Create an index test03 whose fields satisfy:
#   1. speaker: keyword
#   2. line_id: keyword and not aggregatable
#   3. speech_number: integer
PUT test03
{
  "mappings": {
    "properties": {
      "speaker": {
        "type": "keyword"
      },
      "line_id": {
        "type": "keyword",
        "doc_values": false
      },
      "speech_number": {
        "type": "integer"
      }
    }
  }
}

IV. Analyzers

Installing the ik Chinese analyzer

What is an inverted index

The indexing process

Types of analyzers

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/analysis-analyzers.html


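V. Custom analyzers

The three stages of a custom analyzer

1. Character filters

Official documentation:
Zero or more character filters can be configured.

HTML Strip Character Filter: strips HTML elements such as <b> and decodes HTML entities such as &amp;.

Mapping Character Filter: replaces the specified characters.

Pattern Replace Character Filter: replaces characters that match a regular expression.

2. Tokenizer: splitting text into tokens

Official documentation:
Exactly one tokenizer can be configured. It splits the text into tokens.

3. Token filters: filtering after tokenization

Official documentation:
Zero or more token filters can be configured. They post-process the tokens, for example lowercasing them, removing stop words, or adding synonyms.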
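The three stages can be sketched as a plain Python pipeline. This is an illustration of the order in which the stages run, not how Elasticsearch implements them; the function names, the whitespace tokenizer, and the stop-word list are assumptions made for the example.

```python
def char_filter(text: str, mappings: dict) -> str:
    """Stage 1: rewrite the raw text, similar to a `mapping` character filter."""
    for src, dst in mappings.items():
        text = text.replace(src, dst)
    return text

def tokenizer(text: str) -> list:
    """Stage 2: split the text into tokens (here a plain whitespace split)."""
    return text.split()

def token_filter(tokens: list, stopwords: set) -> list:
    """Stage 3: post-process tokens (here lowercase, then stop-word removal)."""
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in stopwords]

def analyze(text: str) -> list:
    return token_filter(tokenizer(char_filter(text, {"&": "and"})), {"the"})

print(analyze("The Doc & Cat"))
```

Because the character filter runs before tokenization, "&" has already become "and" by the time tokens are produced, so both spellings yield the same token sequence.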

Practice exam question

Given a document whose content looks like dag & cat, index it so that a match_phrase query for either dag & cat or dag and cat finds it.

Analysis:
1. What match_phrase does: it analyzes the search text into terms. All of those terms must appear in the analyzed field, in the same order, and by default adjacent to each other.
2. For & and "and" to be equivalent at query time, a custom analyzer is needed.
3. How to define a custom analyzer: see the three stages described above.
4. Solution 1 relies on the Mapping Character Filter.
5. Solution 2 relies on the synonym token filter.

Solution 1

# Create the index
PUT /test01
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "char_filter": [
            "my_mappings_char_filter"
          ],
          "tokenizer": "standard"
        }
      },
      "char_filter": {
        "my_mappings_char_filter": {
          "type": "mapping",
          "mappings": [
            "& => and"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}

# Notes
# Stage 1, character filters: a char_filter performs the text replacement
# Stage 2, tokenizer: the standard tokenizer is used
# Stage 3, token filters: none configured
# The content field uses the custom analyzer my_analyzer

# Load test data
PUT test01/_bulk
{"index":{"_id":1}}
{"content":"doc & cat"}
{"index":{"_id":2}}
{"content":"doc and cat"}

# Run the test: searching for either doc & cat or doc and cat returns both documents
POST test01/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "content": "doc & cat"
          }
        }
      ]
    }
  }
}

Solution 2

# Approach: declare & and "and" as synonyms using a token filter
# Create the index
PUT /test02
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonym_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "my_synonym"
          ]
        }
      },
      "filter": {
        "my_synonym": {
          "type": "synonym",
          "lenient": true,
          "synonyms": [
            "& => and"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_synonym_analyzer"
      }
    }
  }
}

# Notes
# Stage 1, character filters: none configured
# Stage 2, tokenizer: the whitespace tokenizer is used. Why not the default tokenizer?
#   Because it drops & during tokenization, leaving nothing for the token filter to work on
# Stage 3, token filters: a synonym token filter treats & and "and" as synonyms
# The content field uses the custom analyzer my_synonym_analyzer

# Load test data
PUT test02/_bulk
{"index":{"_id":1}}
{"content":"doc & cat"}
{"index":{"_id":2}}
{"content":"doc and cat"}

# Run the test
POST test02/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "content": "doc & cat"
          }
        }
      ]
    }
  }
}

VI. multi-fields

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/multi-fields.html

# One field, several types: for example, a field that should be analyzed with two different analyzers
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "city": {
        "type": "text",
        "analyzer": "standard",
        "fields": {
          "fieldText": {
            "type": "text",
            "analyzer": "ik_smart"
          }
        }
      }
    }
  }
}

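VII. runtime_field: runtime fields

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/runtime.html

Background

Suppose the business needs to sort by the difference between two numeric fields, that is, by a field that does not exist. What then? You could backfill the data and add a field holding the difference, but what if backfilling a new field is not allowed?

Solution

Use cases

- Add fields to existing documents without reindexing
- Work with data without understanding its structure
- Override the value returned from an indexed field at query time
- Define a field for a specific purpose without modifying the underlying schema

Characteristics

- Lucene is completely unaware of runtime fields: they are not indexed and have no doc_values
- Scoring is not supported, because there is no inverted index
- They break with the traditional define-before-use approach
- They can prevent mapping explosion
- They make the API more flexible
- Note: they make searches slower

Usage

- Defined at search time, so they can be used in the search phase (even if the mapping has no such field, it can still be queried)
- Defined in a dynamic or static mapping, so they can be used in the mapping phase (a runtime field is added to the mapping)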

Practice exam question 1

# Assume the following index and data
PUT test03
{
  "mappings": {
    "properties": {
      "emotion": {
        "type": "integer"
      }
    }
  }
}

POST test03/_bulk
{"index":{"_id":1}}
{"emotion":2}
{"index":{"_id":2}}
{"emotion":5}
{"index":{"_id":3}}
{"emotion":10}
{"index":{"_id":4}}
{"emotion":3}

# Requirement: emotion > 5, return emotion_falg = "1"
# Requirement: emotion < 5, return emotion_falg = "-1"
# Requirement: emotion = 5, return emotion_falg = "0"

Solution 1

Define the runtime field at search time. The field does not really exist, so the search request must include "fields": ["*"] for it to be returned.

GET test03/_search
{
  "fields": [
    "*"
  ],
  "runtime_mappings": {
    "emotion_falg": {
      "type": "keyword",
      "script": {
        "source": """
          if(doc["emotion"].value>5)emit("1");
          if(doc["emotion"].value<5)emit("-1");
          if(doc["emotion"].value==5)emit("0");
          """
      }
    }
  }
}

Solution 2

Define the runtime field when the index is created. With this approach the runtime field can also be used in searches.

# Create the index and define the runtime field
PUT test03_01
{
  "mappings": {
    "runtime": {
      "emotion_falg": {
        "type": "keyword",
        "script": {
          "source": """
          if(doc["emotion"].value>5)emit("1");
          if(doc["emotion"].value<5)emit("-1");
          if(doc["emotion"].value==5)emit("0");
          """
        }
      }
    },
    "properties": {
      "emotion": {
        "type": "integer"
      }
    }
  }
}

# Load test data
POST test03_01/_bulk
{"index":{"_id":1}}
{"emotion":2}
{"index":{"_id":2}}
{"emotion":5}
{"index":{"_id":3}}
{"emotion":10}
{"index":{"_id":4}}
{"emotion":3}

# Query test
GET test03_01/_search
{
  "fields": [
    "*"
  ]
}

Practice exam question 2

# Given the following index and data
PUT task04
{
  "mappings": {
    "properties": {
      "A": {
        "type": "long"
      },
      "B": {
        "type": "long"
      }
    }
  }
}

PUT task04/_bulk
{"index":{"_id":1}}
{"A":100,"B":2}
{"index":{"_id":2}}
{"A":120,"B":2}
{"index":{"_id":3}}
{"A":120,"B":25}
{"index":{"_id":4}}
{"A":21,"B":25}

# Requirement: in the task04 index, create a runtime field named A_B whose value is A-B;
# create a range aggregation with three buckets (below 0, 0 to 100, 100 and above); return the document counts
# Topics used:
# 1. Defining a runtime field at search time: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/runtime-search-request.html
# 2. Range aggregation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-bucket-range-aggregation.html

Solution

# Test the result
GET task04/_search
{
  "fields": [
    "*"
  ],
  "size": 0,
  "runtime_mappings": {
    "A_B": {
      "type": "long",
      "script": {
        "source": """
          emit(doc["A"].value - doc["B"].value);
          """
      }
    }
  },
  "aggs": {
    "price_ranges_A_B": {
      "range": {
        "field": "A_B",
        "ranges": [
          { "to": 0 },
          { "from": 0, "to": 100 },
          { "from": 100 }
        ]
      }
    }
  }
}
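The expected bucket counts can be checked offline with a short Python sketch over the same four documents. The bucket labels are made up for this example and are not the keys Elasticsearch returns; the only rule taken from the range aggregation is that `from` is inclusive and `to` is exclusive.

```python
docs = [{"A": 100, "B": 2}, {"A": 120, "B": 2}, {"A": 120, "B": 25}, {"A": 21, "B": 25}]

def bucket_of(value: int) -> str:
    """Range aggregation semantics: `from` is inclusive, `to` is exclusive."""
    if value < 0:
        return "below 0"
    if value < 100:
        return "0 to 100"
    return "100 and above"

counts = {"below 0": 0, "0 to 100": 0, "100 and above": 0}
for doc in docs:
    counts[bucket_of(doc["A"] - doc["B"])] += 1  # A_B = A - B, as in the runtime field
print(counts)
```

The differences are 98, 118, 95 and -4, so the three buckets should hold 1, 2 and 1 documents respectively.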

VIII. Search: highlighting

A first look at highlight syntax

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/highlighting.html

IX. Search: sorting

A first look at sort syntax

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/sort-search-results.html

# Note: text fields cannot be sorted or aggregated on by default; if you really need to, enable fielddata
GET /kibana_sample_data_ecommerce/_search
{
  "query": {
    "match": {
      "customer_last_name": "wood"
    }
  },
  "highlight": {
    "number_of_fragments": 3,
    "fragment_size": 150,
    "fields": {
      "customer_last_name": {
        "pre_tags": [
          ""
        ],
        "post_tags": [
          ""
        ]
      }
    }
  },
  "sort": [
    {
      "currency": {
        "order": "desc"
      }
    },
    {
      "_score": {
        "order": "asc"
      }
    }
  ]
}

X. Search: pagination

A first look at pagination syntax

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/paginate-search-results.html

# Note: from starts at 0, not 1
GET kibana_sample_data_ecommerce/_search
{
  "from": 5,
  "size": 20,
  "query": {
    "match": {
      "customer_last_name": "wood"
    }
  }
}
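Since from is a 0-based offset rather than a page number, client code usually converts between the two. A minimal Python sketch follows, assuming 1-based page numbers on the client side; `page_to_from` is a hypothetical helper, and the 10,000 limit is the default value of index.max_result_window.

```python
MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window

def page_to_from(page: int, size: int) -> dict:
    """Translate a 1-based page number into the 0-based `from` offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {"from": offset, "size": size}

print(page_to_from(1, 20))  # first page starts at offset 0
print(page_to_from(3, 20))  # third page starts at offset 40
```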

Practice exam question 1

# Question
# In the spoken lines of the play, highlight the word Hamlet (in the text_entry field),
# starting the highlight with "#aaa#" and ending it with "#bbb#";
# return all of the speech_number field lines in reverse order;
# "20" speech lines per page, starting from line "40"

# highlight: applied to the text_entry field; the keyword Hamlet is highlighted
# pagination: from: 40; size: 20
# speech_number: descending order
POST test09/_search
{
  "from": 40,
  "size": 20,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "text_entry": "Hamlet"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "text_entry": {
        "pre_tags": [
          "#aaa#"
        ],
        "post_tags": [
          "#bbb#"
        ]
      }
    }
  },
  "sort": [
    {
      "speech_number.keyword": {
        "order": "desc"
      }
    }
  ]
}

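XI. Search: async search

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/async-search.html

Introduced in

7.7.0

When to use it

Async search lets users retrieve results while the search is still running, so they no longer have to wait for the query to finish before seeing any response.

Common commands

Run an async search
POST /sales*/_async_search?size=0

Retrieve the async search results
GET /_async_search/<id>

Check the async search status
GET /_async_search/status/<id>

Delete or cancel an async search
DELETE /_async_search/<id>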

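Async search response fields

Field | Meaning
id | Unique identifier returned for the async search
is_partial | Once the query is no longer running, indicates whether the search failed or completed successfully on all shards. While the query is running, is_partial is true
is_running | Whether the search is still running
total | Number of shards the search will run on
successful | Number of shards that have completed the search successfully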

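XII. Aliases: index aliases

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/aliases.html

What aliases are for

In Elasticsearch an index alias works like a shortcut or symbolic link that can point to one or more indices. Aliases give us a great deal of flexibility; with them we can:

- Switch seamlessly from one index to another on a running cluster (no downtime)
- Group multiple indices: with monthly indices, for example, an alias can present the last 3 months as a single index
- Expose a subset of the data in an index, similar to a database view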
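The zero-downtime switch in the first point is usually done by removing the alias from the old index and adding it to the new one in a single _aliases request. The Python sketch below only builds that request body; `swap_alias_actions` and the index names are hypothetical, and sending the request is left to whatever client is in use.

```python
import json

def swap_alias_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST _aliases that moves `alias` from old_index to new_index.
    Both actions travel in one request, so readers never see the alias missing."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

body = swap_alias_actions("logs", "logs-2023-01", "logs-2023-02")
print(json.dumps(body, indent=2))
```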

Without aliases, how would you search multiple indices?

Option 1: POST index_01,index_02,index_03/_search
Option 2: POST index*/_search

Three ways to create an alias

Specify the alias when creating the index
# Give test05 the alias test05_aliases
PUT test05
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      }
    }
  },
  "aliases": {
    "test05_aliases": {}
  }
}

Specify the alias through an index template
PUT _index_template/template_1
{
  "index_patterns": ["te*", "bar*"],
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "mappings": {
      "_source": {
        "enabled": true
      },
      "properties": {
        "host_name": {
          "type": "keyword"
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z yyyy"
        }
      }
    },
    "aliases": {
      "mydata": { }
    }
  },
  "priority": 500,
  "composed_of": ["component_template1", "runtime_component_template"],
  "version": 3,
  "_meta": {
    "description": "my custom"
  }
}

Create an alias for an existing index
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "logs-nginx.access-prod",
        "alias": "logs"
      }
    }
  ]
}

Removing an alias

POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "logs-nginx.access-prod",
        "alias": "logs"
      }
    }
  ]
}

Practice exam question 1

# Define an index alias for "accounts-row" called "accounts-male": apply a filter to only show the male account owners
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "accounts-row",
        "alias": "accounts-male",
        "filter": {
          "bool": {
            "filter": [
              {
                "term": {
                  "gender.keyword": "male"
                }
              }
            ]
          }
        }
      }
    }
  ]
}

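XIII. Search templates

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-template.html

Features

Templates accept parameters at run time. Search templates are stored on the server, so they can be changed without touching client code.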
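What "accepting parameters at run time" means can be shown with a toy renderer in Python. This is a deliberately simplified stand-in for Mustache that only handles plain {{name}} placeholders and does no escaping; `render` and the template string are assumptions made for the example.

```python
import json
import re

def render(template: str, params: dict) -> dict:
    """Replace each {{name}} placeholder with params[name], then parse the JSON."""
    rendered = re.sub(r"\{\{(\w+)\}\}", lambda m: str(params[m.group(1)]), template)
    return json.loads(rendered)

template = '{"query": {"match": {"message": "{{query_string}}"}}, "from": {{from}}, "size": {{size}}}'
print(render(template, {"query_string": "hello world", "from": 20, "size": 10}))
```

The template text stays fixed while each call supplies its own values, which is what allows the stored template to change independently of the callers.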

A first look at search templates

# Create a search template
PUT _scripts/my-search-template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "{{query_key}}": "{{query_value}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}

# Search using the template
GET my-index/_search/template
{
  "id": "my-search-template",
  "params": {
    "query_key": "your field",
    "query_value": "your field value",
    "from": 0,
    "size": 10
  }
}

Search template operations

Create a search template

PUT _scripts/my-search-template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    },
    "params": {
      "query_string": "My query string"
    }
  }
}

Validate a search template

POST _render/template
{
  "id": "my-search-template",
  "params": {
    "query_string": "hello world",
    "from": 20,
    "size": 10
  }
}

Run a search template

GET my-index/_search/template
{
  "id": "my-search-template",
  "params": {
    "query_string": "hello world",
    "from": 0,
    "size": 10
  }
}

List all search templates

GET _cluster/state/metadata?pretty&filter_path=metadata.stored_scripts

Delete a search template

DELETE _scripts/my-search-template

XIV. Search DSL: simple queries

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/query-dsl.html

Choosing a query type

Query categories

Custom scoring

How to customize scoring

1. Index boost: adjusting relevance at the index level

# A batch of data carries different labels but shares one structure. Each label is stored in its own index (A, B, C).
# If results must be shown strictly grouped by label, which query works best?
# Requirement: show class A first, then class B, then class C

# Test data
PUT /index_a_123/_doc/1
{
  "title": "this is index_a..."
}

PUT /index_b_123/_doc/1
{
  "title": "this is index_b..."
}

PUT /index_c_123/_doc/1
{
  "title": "this is index_c..."
}

# A plain query without boosting: all three hits come back with the same score
POST index_*_123/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "this"
          }
        }
      ]
    }
  }
}

# Official documentation (indices_boost): https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-search.html
# indices_boost raises the weight at the index level
POST index_*_123/_search
{
  "indices_boost": [
    {
      "index_a_123": 10
    },
    {
      "index_b_123": 5
    },
    {
      "index_c_123": 1
    }
  ],
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "this"
          }
        }
      ]
    }
  }
}

2. boosting: adjusting document relevance

# An index index_a has several fields. Implement the following query:
# 1) the title field must match "ssas" or "sasa";
# 2) if the tags field (an array field) contains "pingpang", raise the score.
# Task: write the DSL.

# Test data
PUT index_a/_bulk
{"index":{"_id":1}}
{"title":"ssas","tags":"basketball"}
{"index":{"_id":2}}
{"title":"sasa","tags":"pingpang; football"}

# Solution 1
POST index_a/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "title": "ssas"
                }
              },
              {
                "match": {
                  "title": "sasa"
                }
              }
            ]
          }
        }
      ],
      "should": [
        {
          "match": {
            "tags": {
              "query": "pingpang",
              "boost": 1
            }
          }
        }
      ]
    }
  }
}

# Solution 2
# https://www.elastic.co/guide/en/elasticsearch/reference/8.1/query-dsl-function-score-query.html
POST index_a/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "function_score": {
            "query": {
              "match": {
                "tags": {
                  "query": "pingpang"
                }
              }
            },
            "boost": 1
          }
        }
      ],
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "title": "ssas"
                }
              },
              {
                "match": {
                  "title": "sasa"
                }
              }
            ]
          }
        }
      ]
    }
  }
}

3. negative_boost: lowering relevance

When some results are unsatisfactory but you do not want to exclude them with must_not, consider the negative_boost of the boosting query, which lowers their score instead.

negative_boost (Required, float): Floating point number between 0 and 1.0 used to decrease the relevance scores of documents matching the negative query.

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/query-dsl-boosting-query.html

POST index_a/_search
{
  "query": {
    "boosting": {
      "positive": {
        "term": {
          "tags": "football"
        }
      },
      "negative": {
        "term": {
          "tags": "pingpang"
        }
      },
      "negative_boost": 0.5
    }
  }
}
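The scoring rule of the boosting query can be written down as a small Python function. It is a sketch of the documented behaviour (the score from the positive query is multiplied by negative_boost when the document also matches the negative query); `boosting_score` is a hypothetical name and the scores are made up.

```python
def boosting_score(positive_score: float, matches_negative: bool, negative_boost: float) -> float:
    """Documents matching the negative query keep their positive score
    multiplied by negative_boost; all others keep it unchanged."""
    if not 0 <= negative_boost <= 1:
        raise ValueError("negative_boost must be between 0 and 1.0")
    return positive_score * negative_boost if matches_negative else positive_score

print(boosting_score(2.0, matches_negative=True, negative_boost=0.5))   # demoted
print(boosting_score(2.0, matches_negative=False, negative_boost=0.5))  # unchanged
```

Unlike must_not, the demoted document still appears in the results; it simply ranks lower.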

4. function_score: custom scoring

How can relevance be boosted by both sales and view counts at the same time?
Problem: for products, we want a relevance boost computed from both sales and views, for example oldScore * (sales + views).

Product | Sales | Views
A | 10 | 10
B | 20 | 20
C | 30 | 30

# Sample data
PUT goods_index/_bulk
{"index":{"_id":1}}
{"name":"A","sales_count":10,"view_count":10}
{"index":{"_id":2}}
{"name":"B","sales_count":20,"view_count":20}
{"index":{"_id":3}}
{"name":"C","sales_count":30,"view_count":30}

# Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/query-dsl-function-score-query.html
# Topic: script_score
POST goods_index/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "script_score": {
        "script": {
          "source": "_score * (doc['sales_count'].value + doc['view_count'].value)"
        }
      }
    }
  }
}
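The ranking this formula should produce for the sample data can be checked in Python. The sketch assumes every document starts with an old score of 1.0, which is what match_all assigns, and `new_score` is a hypothetical helper that mirrors oldScore * (sales + views).

```python
goods = [
    {"name": "A", "sales_count": 10, "view_count": 10},
    {"name": "B", "sales_count": 20, "view_count": 20},
    {"name": "C", "sales_count": 30, "view_count": 30},
]

def new_score(old_score: float, doc: dict) -> float:
    """oldScore * (sales + views), the formula stated in the problem."""
    return old_score * (doc["sales_count"] + doc["view_count"])

ranked = sorted(goods, key=lambda d: new_score(1.0, d), reverse=True)
print([(d["name"], new_score(1.0, d)) for d in ranked])
```

Product C should come first, followed by B and then A.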

XV. Search DSL: complex bool queries

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/query-dsl-bool-query.html

Basic syntax

Practice exam question

Write a query that requires a keyword to appear in at least two of a document's four fields.

Key points: bool query, should / minimum_should_match
1. The bool query
2. The minimum_should_match detail

Note: minimum_should_match defaults to 1 when the bool query has at least one should clause and no must or filter clauses; otherwise it defaults to 0.

POST test_index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "filed1": "kr"
          }
        },
        {
          "match": {
            "filed2": "kr"
          }
        },
        {
          "match": {
            "filed3": "kr"
          }
        },
        {
          "match": {
            "filed4": "kr"
          }
        }
      ],
      "minimum_should_match": 2
    }
  }
}
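The effect of minimum_should_match can be modelled in Python by counting how many should clauses a document satisfies. This is a toy model: `bool_should_match` is a hypothetical helper, the sample documents are made up, and a lowercase whitespace split stands in for real text analysis.

```python
def bool_should_match(doc: dict, keyword: str, fields: list, minimum_should_match: int) -> bool:
    """True when the keyword occurs in at least `minimum_should_match` of the fields."""
    hits = sum(1 for f in fields if keyword in str(doc.get(f, "")).lower().split())
    return hits >= minimum_should_match

fields = ["filed1", "filed2", "filed3", "filed4"]
doc_a = {"filed1": "kr test", "filed3": "hello kr"}  # keyword in two fields
doc_b = {"filed2": "kr"}                             # keyword in one field
print(bool_should_match(doc_a, "kr", fields, 2))
print(bool_should_match(doc_b, "kr", fields, 2))
```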

XVI. Search: aggregations

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations.html

Aggregation categories

Bucket aggregations

terms

# Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-bucket-terms-aggregation.html
# Count documents per author
POST bilili_elasticsearch/_search
{
  "size": 0,
  "aggs": {
    "agg_user": {
      "terms": {
        "field": "user",
        "size": 1
      }
    }
  }
}

date_histogram

# Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-bucket-datehistogram-aggregation.html
# Count documents per month of up_time
POST bilili_elasticsearch/_search
{
  "size": 0,
  "aggs": {
    "agg_up_time": {
      "date_histogram": {
        "field": "up_time",
        "calendar_interval": "month"
      }
    }
  }
}

Metrics aggregations

Max

# Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-metrics-max-aggregation.html
# Get the largest up_time
POST bilili_elasticsearch/_search
{
  "size": 0,
  "aggs": {
    "agg_max_up_time": {
      "max": {
        "field": "up_time"
      }
    }
  }
}

Top_hits

# Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-metrics-top-hits-aggregation.html
# Aggregate by user keeping only one bucket, and return the top 3 hits of that bucket sorted by the given field
POST bilili_elasticsearch/_search
{
  "size": 0,
  "aggs": {
    "terms_agg_user": {
      "terms": {
        "field": "user",
        "size": 1
      },
      "aggs": {
        "top_user_hits": {
          "top_hits": {
            "_source": {
              "includes": [
                "video_time",
                "title",
                "see",
                "user",
                "up_time"
              ]
            },
            "sort": [
              {
                "see": {
                  "order": "desc"
                }
              }
            ],
            "size": 3
          }
        }
      }
    }
  }
}

# Response
{
  "took" : 91,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1000, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "terms_agg_user" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 975,
      "buckets" : [
        {
          "key" : "Elastic搜索",
          "doc_count" : 25,
          "top_user_hits" : {
            "hits" : {
              "total" : { "value" : 25, "relation" : "eq" },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "bilili_elasticsearch",
                  "_id" : "5ccCVoQBUyqsIDX6wIcm",
                  "_score" : null,
                  "_source" : {
                    "video_time" : "03:45",
                    "see" : "92",
                    "up_time" : "2021-03-19",
                    "title" : "Elastic 社區大會2021: 用加 Gatling 進行Elasticsearch的負載測試,寓教于樂。",
                    "user" : "Elastic搜索"
                  },
                  "sort" : [ "92" ]
                },
                {
                  "_index" : "bilili_elasticsearch",
                  "_id" : "8scCVoQBUyqsIDX6wIgn",
                  "_score" : null,
                  "_source" : {
                    "video_time" : "10:18",
                    "see" : "79",
                    "up_time" : "2020-10-20",
                    "title" : "為Elasticsearch啟動htpps訪問",
                    "user" : "Elastic搜索"
                  },
                  "sort" : [ "79" ]
                },
                {
                  "_index" : "bilili_elasticsearch",
                  "_id" : "7scCVoQBUyqsIDX6wIcm",
                  "_score" : null,
                  "_source" : {
                    "video_time" : "04:41",
                    "see" : "71",
                    "up_time" : "2021-03-19",
                    "title" : "Elastic 社區大會2021: Elasticsearch作為一個地理空間的數據庫",
                    "user" : "Elastic搜索"
                  },
                  "sort" : [ "71" ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}

Pipeline aggregations

Pipeline: aggregations computed on the output of other aggregations.
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-pipeline.html

bucket_selector

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-pipeline-bucket-selector-aggregation.html

# Group by month of order_date and keep only the months whose total exceeds 1000
POST kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "aggs": {
    "date_his_aggs": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "month"
      },
      "aggs": {
        "sum_aggs": {
          "sum": {
            "field": "total_unique_products"
          }
        },
        "sales_bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "totalSales": "sum_aggs"
            },
            "script": "params.totalSales > 1000"
          }
        }
      }
    }
  }
}

Practice exam question

The earthquakes index contains earthquake data for the past 30 months. With a single query, obtain the following:
- the average mag for each of the past 30 months
- the month with the highest average mag in the past 30 months, together with that average
- the search must not return any documents

# max_bucket official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.1/search-aggregations-pipeline-max-bucket-aggregation.html
POST earthquakes/_search
{
  "size": 0,
  "query": {
    "range": {
      "time": {
        "gte": "now-30M/d",
        "lte": "now"
      }
    }
  },
  "aggs": {
    "agg_time_his": {
      "date_histogram": {
        "field": "time",
        "calendar_interval": "month"
      },
      "aggs": {
        "avg_aggs": {
          "avg": {
            "field": "mag"
          }
        }
      }
    },
    "max_mag_sales": {
      "max_bucket": {
        "buckets_path": "agg_time_his>avg_aggs"
      }
    }
  }
}
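What this query computes can be reproduced on a handful of made-up records in Python: a per-month average (the date_histogram with its avg sub-aggregation) followed by picking the month with the largest average (the max_bucket pipeline step). The data and variable names below are illustrative assumptions.

```python
from collections import defaultdict

# (month, mag) pairs standing in for documents of the earthquakes index
quakes = [("2022-11", 4.0), ("2022-11", 5.0), ("2022-12", 6.5), ("2023-01", 3.0), ("2023-01", 4.0)]

by_month = defaultdict(list)
for month, mag in quakes:
    by_month[month].append(mag)

# date_histogram + avg: one average per month
monthly_avg = {m: sum(v) / len(v) for m, v in by_month.items()}
# max_bucket: the bucket whose average is largest
best_month = max(monthly_avg, key=monthly_avg.get)

print(monthly_avg)
print(best_month, monthly_avg[best_month])
```

The pipeline step reads only the bucket values produced by the first aggregation, never the documents themselves, which is why it is declared as a sibling of agg_time_his in the request.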
