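Rank field types

The following table lists all rank field types that UDB-SX supports.

Field data type    Description
rank_feature       Boosts or decreases the relevance score of documents.
rank_features      Used when the list of features is sparse.

Rank feature and rank features fields can be queried with rank feature queries only. They do not support aggregating or sorting.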

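Rank feature

A rank feature field type uses a positive float value to boost or decrease the relevance score of a document in a rank_feature query. By default, this value boosts the relevance score. To decrease the relevance score, set the optional positive_score_impact parameter to false.

Example

Create a mapping with a rank feature field: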

PUT chessplayers
{
  "mappings": {
    "properties": {
      "name" : {
        "type" : "text"
      },
      "rating": {
        "type": "rank_feature" 
      },
      "age": {
        "type": "rank_feature",
        "positive_score_impact": false 
      }
    }
  }
}

Index three documents with a rank_feature field that boosts the score (rating) and a rank_feature field that decreases the score (age):

PUT chessplayers/_doc/1
{
  "name" : "John Doe",
  "rating" : 2554,
  "age" : 75
}
PUT chessplayers/_doc/2
{
  "name" : "Kwaku Mensah",
  "rating" : 2067,
  "age": 10
}
PUT chessplayers/_doc/3
{
  "name" : "Nikki Wolf",
  "rating" : 1864,
  "age" : 22
}

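Rank feature query

Using a rank feature query, you can rank players by rating, by age, or by both rating and age. If you rank players by rating, higher-rated players have higher relevance scores. If you rank players by age, younger players have higher relevance scores.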
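The direction of each feature's effect can be sketched in a few lines of Python. This is only an illustrative model, not necessarily the engine's exact formula: it assumes a saturation curve of the form value / (value + pivot) and models a field with positive_score_impact set to false by scoring its reciprocal. The names saturation and feature_score and both pivot constants are hypothetical.

```python
def saturation(value: float, pivot: float) -> float:
    """Saturation curve: grows from 0 toward 1 as value increases."""
    if value <= 0 or pivot <= 0:
        raise ValueError("value and pivot must be positive")
    return value / (value + pivot)


def feature_score(value: float, pivot: float, positive_score_impact: bool = True) -> float:
    """Score one feature; larger values help or hurt depending on the flag."""
    # Assumption: a negative-impact feature is modeled by scoring 1/value,
    # so the contribution shrinks as the raw value grows.
    scored = value if positive_score_impact else 1.0 / value
    return saturation(scored, pivot)


# Hypothetical pivots chosen only for illustration.
RATING_PIVOT = 2000.0
INVERSE_AGE_PIVOT = 0.05

players = {
    "John Doe": (2554, 75),
    "Kwaku Mensah": (2067, 10),
    "Nikki Wolf": (1864, 22),
}

for name, (rating, age) in players.items():
    total = feature_score(rating, RATING_PIVOT) + feature_score(
        age, INVERSE_AGE_PIVOT, positive_score_impact=False
    )
    print(f"{name}: {total:.4f}")
```

Under this model a higher rating always raises the score and a higher age always lowers it, which matches the behavior described above. The absolute numbers depend on the assumed pivots and are not meant to match a real response.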

Use a rank feature query to search for players based on age and rating:

GET chessplayers/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "rank_feature": {
            "field": "rating"
          }
        },
        {
          "rank_feature": {
            "field": "age"
          }
        }
      ]
    }
  }
}
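When the set of rank feature fields varies, the same request body can be assembled programmatically. The following is a minimal sketch that only builds the JSON shown above; the helper name build_rank_feature_query is hypothetical, and no client library is assumed.

```python
import json


def build_rank_feature_query(fields):
    """Build a bool/should search body with one rank_feature clause per field."""
    if not fields:
        raise ValueError("at least one field name is required")
    return {
        "query": {
            "bool": {
                "should": [{"rank_feature": {"field": name}} for name in fields]
            }
        }
    }


body = build_rank_feature_query(["rating", "age"])
print(json.dumps(body, indent=2))
```

The printed body can be sent with any HTTP client as the payload of GET chessplayers/_search.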

When ranked by both age and rating, younger players and players who are more highly rated score better:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.2093145,
    "hits" : [
      {
        "_index" : "chessplayers",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.2093145,
        "_source" : {
          "name" : "Kwaku Mensah",
          "rating" : 2067,
          "age" : 10
        }
      },
      {
        "_index" : "chessplayers",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0150313,
        "_source" : {
          "name" : "Nikki Wolf",
          "rating" : 1864,
          "age" : 22
        }
      },
      {
        "_index" : "chessplayers",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.8098284,
        "_source" : {
          "name" : "John Doe",
          "rating" : 2554,
          "age" : 75
        }
      }
    ]
  }
}

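Rank features

A rank features field type is similar to the rank feature field type, but it is more suitable for a sparse list of features. A rank features field can index numeric feature vectors that are later used to boost or decrease documents' relevance scores in rank_feature queries.

Example

Create a mapping with a rank features field: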

PUT testindex1
{
  "mappings": {
    "properties": {
      "correlations": {
        "type": "rank_features" 
      }
    }
  }
}

To index a document with a rank features field, use a hashmap with string keys and positive float values:

PUT testindex1/_doc/1
{
  "correlations": { 
    "young kids" : 1,
    "older kids" : 15,
    "teens" : 25.9
  }
}
PUT testindex1/_doc/2
{
  "correlations": {
    "teens": 10,
    "adults": 95.7
  }
}
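Because every value in a rank features field must be a positive number keyed by a string, it can help to check documents on the client before indexing them. The following is a minimal sketch of such a check; the helper name find_invalid_features is hypothetical and is not part of any UDB-SX API.

```python
import math
from numbers import Real


def find_invalid_features(features):
    """Return a list of problems found in a rank_features mapping (empty if valid)."""
    problems = []
    for key, value in features.items():
        if not isinstance(key, str):
            problems.append(f"key {key!r} is not a string")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            problems.append(f"value for {key!r} is not a number")
        elif not math.isfinite(value) or value <= 0:
            problems.append(f"value for {key!r} must be a positive finite number")
    return problems


doc = {"correlations": {"teens": 10, "adults": 95.7}}
print(find_invalid_features(doc["correlations"]))
```

An empty list means the mapping satisfies the constraints described above. A value such as 0 or -3 is reported instead of being sent to the index.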

Query the documents using a rank feature query:

GET testindex1/_search
{
  "query": {
    "rank_feature": {
      "field": "correlations.teens"
    }
  }
}

The response is ranked by relevance score:

{
  "took" : 123,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.6258503,
    "hits" : [
      {
        "_index" : "testindex1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.6258503,
        "_source" : {
          "correlations" : {
            "young kids" : 1,
            "older kids" : 15,
            "teens" : 25.9
          }
        }
      },
      {
        "_index" : "testindex1",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.39263803,
        "_source" : {
          "correlations" : {
            "teens" : 10,
            "adults" : 95.7
          }
        }
      }
    ]
  }
}

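Rank feature and rank features fields use the top nine significant bits for precision, leading to about 0.4% relative error. Values are stored with a relative precision of 2^-8 = 0.00390625.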
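The effect of keeping nine significant bits can be illustrated with a short Python sketch. It assumes that values are handled as 32-bit floats and that the low-order bits are simply dropped (truncation rather than rounding); both points are assumptions made for illustration, and the helper name keep_top_9_significant_bits is hypothetical.

```python
import struct


def keep_top_9_significant_bits(value: float) -> float:
    """Approximate a positive value using only nine significant bits."""
    if value <= 0:
        raise ValueError("rank feature values must be positive")
    # Reinterpret the value as the 32 bits of an IEEE 754 single-precision float.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    # Keep the sign bit, the 8 exponent bits, and the top 8 fraction bits.
    # Together with the implicit leading 1, that is nine significant bits.
    bits &= 0xFFFF8000
    (stored,) = struct.unpack(">f", struct.pack(">I", bits))
    return stored


for original in (25.9, 95.7, 10.0):
    stored = keep_top_9_significant_bits(original)
    relative_error = (original - stored) / original
    print(f"{original} -> {stored} (relative error {relative_error:.5f})")
```

Under these assumptions the relative error never exceeds 2^-8, which is where the figure of about 0.4% comes from. Values that already fit in nine significant bits, such as 10, are unchanged.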