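Supported field types

When you create a mapping, you can specify a data type for each field. The following table lists all data field types that UDB-SX supports.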

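Category	Field types and descriptions
Alias	`alias`: An additional name for an existing field.
Binary	`binary`: A binary value in Base64 encoding.
Numeric	A numeric value (`byte`, `double`, `float`, `half_float`, `integer`, `long`, `unsigned_long`, `scaled_float`, `short`).
Boolean	`boolean`: A Boolean value.
Date	`date`: A date stored in milliseconds. `date_nanos`: A date stored in nanoseconds.
IP	`ip`: An IP address in IPv4 or IPv6 format.
Range	A range of values (`integer_range`, `long_range`, `double_range`, `float_range`, `date_range`, `ip_range`).
Object	`object`: A JSON object. `nested`: Used when objects in an array need to be indexed as independent documents. `flat_object`: A JSON object treated as a string. `join`: Establishes a parent/child relationship between documents in the same index.
String	`keyword`: Contains a string that is not analyzed. `text`: Contains a string that is analyzed. `match_only_text`: A space-optimized version of a `text` field. `token_count`: Stores the number of analyzed tokens in a string. `wildcard`: A variation of `keyword` that supports efficient substring and regular expression matching.
Autocomplete	`completion`: Provides autocomplete functionality through a completion suggester. `search_as_you_type`: Provides search-as-you-type functionality using both prefix and infix completion.
Geographic	`geo_point`: A geographic point. `geo_shape`: A geographic shape.
Rank	Boosts or decreases the relevance score of documents (`rank_feature`, `rank_features`).
k-NN vector	`knn_vector`: Allows indexing a k-NN vector into UDB-SX and performing different kinds of k-NN search.
Percolator	`percolator`: Specifies that this field is to be treated as a query.
Derived	`derived`: Creates new fields dynamically by executing scripts on existing fields.
Star-tree	`star_tree`: Precomputes aggregations and stores them in a star-tree index to speed up aggregation queries.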

Arrays

UDB-SX has no dedicated array field type. Instead, you can pass an array of values into any field. All values in the array must have the same field type.

PUT testindex1/_doc/1
{
  "number": 1
}

PUT testindex1/_doc/2
{
  "number": [1, 2, 3]
}
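
As a minimal sketch of how such a field behaves at search time, a `term` query on `number` matches a document when any value in its array equals the queried value. The query value `2` is chosen only for illustration, and the sketch assumes `number` was dynamically mapped as a numeric field by the two requests above:

```json
# Illustrative query: the value 2 appears only in the array of document 2
GET testindex1/_search
{
  "query": {
    "term": {
      "number": 2
    }
  }
}
```

Under that assumption, the query would be expected to return document 2 and not document 1.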

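Multifields

Multifields are used to index the same field in different ways. Strings are often mapped as `text` for full-text queries and as `keyword` for exact-value queries.

You can create multifields using the `fields` parameter. For example, you can map a book `title` as type `text` and keep a `title.raw` subfield of type `keyword`.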

PUT books
{
  "mappings" : {
    "properties" : {
      "title" : {
        "type" : "text",
        "fields" : {
          "raw" : {
            "type" : "keyword"
          }
        }
      }
    }
  }
}
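
With a mapping like this, the two views of the field can be queried separately. The following is a sketch rather than part of the original example: the title "Moby Dick" is a hypothetical sample value. The first request runs an analyzed full-text match against `title`, and the second runs an exact-value lookup against `title.raw` and sorts on that subfield:

```json
# Full-text query on the analyzed text field (hypothetical search terms)
GET books/_search
{
  "query": {
    "match": {
      "title": "moby dick"
    }
  }
}

# Exact-value query and sort on the keyword subfield (hypothetical title)
GET books/_search
{
  "query": {
    "term": {
      "title.raw": "Moby Dick"
    }
  },
  "sort": [
    { "title.raw": "asc" }
  ]
}
```

Sorting and exact matching use the `keyword` subfield because its value is stored as a single unanalyzed string, whereas the `text` field is split into tokens.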

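Null value

Setting a field's value to `null`, an empty array, or an array of `null` values makes the field equivalent to an empty field. Therefore, you cannot search for documents that have `null` in this field.

To make a field searchable for `null` values, you can specify its `null_value` parameter in the index's mappings. All `null` values passed to this field are then replaced with the specified `null_value`.

The `null_value` parameter must be of the same type as the field. For example, if your field is a string, the `null_value` for this field must also be a string.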
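
By the same rule, a numeric field takes a numeric `null_value`. The following is a minimal sketch in which the index name `testindex2`, the field name `age`, and the sentinel value `-1` are all hypothetical and chosen only to illustrate the type match:

```json
# Hypothetical index and field: an integer field with an integer null_value
PUT testindex2
{
  "mappings": {
    "properties": {
      "age": {
        "type": "integer",
        "null_value": -1
      }
    }
  }
}
```

A sentinel should be a value that cannot occur as real data, so that a search for it finds only the documents whose field was explicitly `null`.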

Example

Create a mapping that replaces `null` values in the `emergency_phone` field with the string "NONE":

PUT testindex
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "emergency_phone": {
        "type": "keyword",
        "null_value": "NONE" 
      }
    }
  }
}

Index three documents into `testindex`. The `emergency_phone` fields of documents 1 and 3 contain `null`, while the `emergency_phone` field of document 2 is an empty array:

PUT testindex/_doc/1
{
  "name": "Akua Mansa",
  "emergency_phone": null
}

PUT testindex/_doc/2
{
  "name": "Diego Ramirez",
  "emergency_phone" : []
}

PUT testindex/_doc/3 
{
  "name": "Jane Doe",
  "emergency_phone": [null, null]
}

Search for people who do not have an emergency phone:

GET testindex/_search
{
  "query": {
    "term": {
      "emergency_phone": "NONE"
    }
  }
}

The response contains documents 1 and 3 but not document 2, because only explicit `null` values are replaced with the string "NONE":

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_index" : "testindex",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "Akua Mansa",
          "emergency_phone" : null
        }
      },
      {
        "_index" : "testindex",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "Jane Doe",
          "emergency_phone" : [
            null,
            null
          ]
        }
      }
    ]
  }
}

The `_source` field still contains the explicit `null` values because it is not affected by `null_value`.