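Rank field types
The following table lists all rank field types that UDB-SX supports.
| Field data type | Description |
|---|---|
| rank_feature | Boosts or decreases the relevance score of documents. |
| rank_features | Used when the list of features is sparse. |
Rank feature and rank features fields can be queried only with a rank feature query. They do not support aggregations or sorting.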
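Rank feature
A rank feature field type uses a positive float value to boost or decrease the relevance score of a document in a rank_feature query. By default, this value boosts the relevance score. To decrease the relevance score, set the optional positive_score_impact parameter to false.
Example
Create a mapping with rank feature fields: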
PUT chessplayers
{
"mappings": {
"properties": {
"name" : {
"type" : "text"
},
"rating": {
"type": "rank_feature"
},
"age": {
"type": "rank_feature",
"positive_score_impact": false
}
}
}
}
Index three documents, each with a rank_feature field that boosts the score (rating) and a rank_feature field that decreases the score (age):
PUT chessplayers/_doc/1
{
"name" : "John Doe",
"rating" : 2554,
"age" : 75
}
PUT chessplayers/_doc/2
{
"name" : "Kwaku Mensah",
"rating" : 1967,
"age": 10
}
PUT chessplayers/_doc/3
{
"name" : "Nikki Wolf",
"rating" : 1864,
"age" : 22
}
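Rank feature query
Using a rank feature query, you can rank players by rating, by age, or by both rating and age. If you rank players by rating, higher-rated players have higher relevance scores. If you rank players by age, younger players have higher relevance scores.
Use a rank feature query to search for players based on age and rating: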
GET chessplayers/_search
{
"query": {
"bool": {
"should": [
{
"rank_feature": {
"field": "rating"
}
},
{
"rank_feature": {
"field": "age"
}
}
]
}
}
}
When players are ranked by both age and rating, younger players and players with higher ratings score better:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.2093145,
"hits" : [
{
"_index" : "chessplayers",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.2093145,
"_source" : {
"name" : "Kwaku Mensah",
"rating" : 1967,
"age" : 10
}
},
{
"_index" : "chessplayers",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0150313,
"_source" : {
"name" : "Nikki Wolf",
"rating" : 1864,
"age" : 22
}
},
{
"_index" : "chessplayers",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.8098284,
"_source" : {
"name" : "John Doe",
"rating" : 2554,
"age" : 75
}
}
]
}
}
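The ordering in this response can be illustrated with a small sketch. It assumes the query uses a saturation function, value / (value + pivot), and the mirrored form pivot / (value + pivot) for fields with positive_score_impact set to false, as OpenSearch-compatible engines commonly do by default. Whether UDB-SX uses exactly this function is an assumption, and the pivot values below are hypothetical, so the sketch is meant to reproduce the ranking order only, not the exact scores.

```python
def saturation_score(value: float, pivot: float, positive_score_impact: bool = True) -> float:
    """Assumed saturation function; the result always lies between 0 and 1."""
    if value <= 0 or pivot <= 0:
        raise ValueError("value and pivot must be positive")
    if positive_score_impact:
        return value / (value + pivot)   # grows as the feature value grows
    return pivot / (value + pivot)       # shrinks as the feature value grows


# _source values taken from the response above.
players = [
    {"name": "John Doe", "rating": 2554, "age": 75},
    {"name": "Kwaku Mensah", "rating": 1967, "age": 10},
    {"name": "Nikki Wolf", "rating": 1864, "age": 22},
]

# Hypothetical pivots chosen for illustration only.
RATING_PIVOT = 2000.0
AGE_PIVOT = 25.0


def combined_score(player: dict) -> float:
    """Sum of both clauses, mirroring the two 'should' clauses of the bool query."""
    return (
        saturation_score(player["rating"], RATING_PIVOT)
        + saturation_score(player["age"], AGE_PIVOT, positive_score_impact=False)
    )


ranking = [p["name"] for p in sorted(players, key=combined_score, reverse=True)]
print(ranking)
```

Under these assumptions, the low age of the youngest player outweighs the rating advantage of the oldest player, which is consistent with the order of the hits above.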
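Rank features
A rank features field type is similar to the rank feature field type, but it is better suited to a sparse list of features. A rank features field can index numeric feature vectors, which are later used to boost or decrease the relevance score of documents in rank_feature queries.
Example
Create a mapping with a rank features field: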
PUT testindex1
{
"mappings": {
"properties": {
"correlations": {
"type": "rank_features"
}
}
}
}
To index a document with a rank features field, use a hashmap with string keys and positive float values:
PUT testindex1/_doc/1
{
"correlations": {
"young kids" : 1,
"older kids" : 15,
"teens" : 25.9
}
}
PUT testindex1/_doc/2
{
"correlations": {
"teens": 10,
"adults": 95.7
}
}
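Before sending such documents, a client could check that a feature map has the shape described above. The following is a minimal sketch; the helper name validate_rank_features is hypothetical and is not part of any UDB-SX client library.

```python
import math
from numbers import Real


def validate_rank_features(features: dict) -> None:
    """Raise if the map is not made of string keys and positive finite numbers."""
    for key, value in features.items():
        if not isinstance(key, str):
            raise TypeError(f"feature name must be a string: {key!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(f"feature value must be a number: {key!r}")
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"feature value must be positive: {key!r}")


# The two documents indexed above pass the check.
validate_rank_features({"young kids": 1, "older kids": 15, "teens": 25.9})
validate_rank_features({"teens": 10, "adults": 95.7})
```

A map such as {"teens": -3} or {"teens": "high"} would be rejected by this check before any request is sent.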
Use a rank feature query to retrieve documents:
GET testindex1/_search
{
"query": {
"rank_feature": {
"field": "correlations.teens"
}
}
}
The hits in the response are sorted by relevance score:
{
"took" : 123,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.6258503,
"hits" : [
{
"_index" : "testindex1",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.6258503,
"_source" : {
"correlations" : {
"young kids" : 1,
"older kids" : 15,
"teens" : 25.9
}
}
},
{
"_index" : "testindex1",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.39263803,
"_source" : {
"correlations" : {
"teens" : 10,
"adults" : 95.7
}
}
}
]
}
}
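Rank feature and rank features fields keep only the top nine significant bits of each value, which leads to a relative error of about 0.4%. Values are stored with a relative precision of 2^-8 = 0.00390625.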
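The effect of keeping nine significant bits can be sketched as follows. The sketch assumes that a value is held as an IEEE 754 single-precision float and that everything below the top 8 stored mantissa bits is dropped (the implicit leading bit supplies the ninth significant bit). This is an assumption about the storage layout for illustration, and truncate_to_nine_bits is a hypothetical helper name.

```python
import struct


def truncate_to_nine_bits(value: float) -> float:
    """Keep sign, exponent, and the top 8 mantissa bits of a float32."""
    if value <= 0:
        raise ValueError("rank feature values must be positive")
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    bits &= 0xFFFF8000  # clear the low 15 of the 23 stored mantissa bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]


for original in (2554.0, 1864.0, 25.9, 95.7):
    stored = truncate_to_nine_bits(original)
    relative_error = abs(original - stored) / original
    # The relative error stays below 2^-8 = 0.00390625.
    print(original, stored, relative_error)

# For example, 2554.0 is stored as 2552.0 and 25.9 as 25.875.
```

Because of this rounding, two feature values that differ by less than about 0.4% may be stored as the same number and then contribute identical scores.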