3 Answers
Answer 1
I don't think you need to change any mapping. Try query_string; it works perfectly. All of the scenarios below work with the default standard analyzer:
Given the data:
{"_id" : "1","name" : "John Doeman","function" : "Janitor"}
{"_id" : "2","name" : "Jane Doewoman","function" : "Teacher"}
Scenario 1:
{"query": {
"query_string" : {"default_field" : "name", "query" : "*Doe*"}
} }
Response:
{"_id" : "1","name" : "John Doeman","function" : "Janitor"}
{"_id" : "2","name" : "Jane Doewoman","function" : "Teacher"}
Scenario 2:
{"query": {
"query_string" : {"default_field" : "name", "query" : "*Jan*"}
} }
Response:
{"_id" : "1","name" : "John Doeman","function" : "Janitor"}
Scenario 3:
{"query": {
"query_string" : {"default_field" : "name", "query" : "*oh* *oe*"}
} }
Response:
{"_id" : "1","name" : "John Doeman","function" : "Janitor"}
{"_id" : "2","name" : "Jane Doewoman","function" : "Teacher"}
Answer 2
I am also using nGram, with the standard tokenizer and nGram as a token filter. These are my settings:
{
  "index": {
    "analysis": {
      "analyzer": {
        "my_index_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mynGram"
          ]
        },
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "mynGram"
          ]
        }
      },
      "filter": {
        "mynGram": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 50
        }
      }
    }
  }
}
This lets us match word fragments up to 50 characters long. Adjust max_gram to your needs; German words can get very long, so I set it to a high value.
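As a sketch, the analysis settings above go under "settings" when creating the index, and the analyzers are then attached to a field in the mapping. The index name my_idx and the modern single-type mapping syntax are assumptions here; older Elasticsearch versions used index_analyzer/search_analyzer inside the type mapping instead:

{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_index_analyzer",
        "search_analyzer": "my_search_analyzer"
      }
    }
  }
}

With this in place, an ordinary match query on name (no wildcards) will hit documents whose name contains the typed fragment anywhere. Note that applying the nGram filter at search time as well, as the search analyzer above does, broadens matches considerably; many setups use only standard plus lowercase at search time instead.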
Answer 3
Searching with both leading and trailing wildcards will be extremely slow against a large index. If you want to search by word prefix, drop the leading wildcard. If you really need to find a substring in the middle of a word, you are better off using an ngram tokenizer.
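A minimal sketch of that approach, using an ngram tokenizer rather than a token filter (the analyzer and tokenizer names here are assumptions, and min_gram/max_gram should be tuned to your data):

{
  "settings": {
    "analysis": {
      "analyzer": {
        "substring_analyzer": {
          "type": "custom",
          "tokenizer": "substring_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "substring_tokenizer": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 3,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}

A field indexed with substring_analyzer can then be searched with a plain match query, no wildcards needed:

{
  "query": {
    "match": { "name": "oe" }
  }
}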