Split processor

The split processor splits a string field into an array of substrings based on a specified delimiter.

Request body fields

The following table lists all available request fields.

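| Field | Data type | Description |
| field | String | The field containing the string to be split. Required. |
| separator | String | The delimiter used to split the string. Specify either a single separator character or a regular expression pattern. Required. |
| preserve_trailing | Boolean | If set to true, empty trailing fields (for example, '') are preserved in the resulting array. If set to false, empty trailing fields are removed from the resulting array. Default is false. |
| target_field | String | The field in which the array of substrings is stored. If not specified, the field is updated in place. |
| tag | String | The processor's identifier. |
| description | String | A description of the processor. |
| ignore_failure | Boolean | If true, UDB-SX ignores any failure of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is false. |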
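
The interaction between separator and preserve_trailing is easiest to see on a small input. The following Python sketch is only a rough model of the behavior described in the table, not the processor's implementation. The function name split_value is hypothetical, and Python's regular expression dialect may differ from the one the processor uses, so edge cases (such as an empty input string) may not match.

```python
import re


def split_value(value: str, separator: str, preserve_trailing: bool = False) -> list[str]:
    """Rough model of the split behavior described above (hypothetical helper)."""
    # The separator is treated as a regular expression. A single literal
    # character such as "," is also a valid pattern.
    parts = re.split(separator, value)
    if not preserve_trailing:
        # Remove empty strings from the end of the result only.
        # Empty strings in the middle of the array are kept.
        while parts and parts[-1] == "":
            parts.pop()
    return parts


print(split_value("a,b,,", ","))                          # ['a', 'b']
print(split_value("a,b,,", ",", preserve_trailing=True))  # ['a', 'b', '', '']
print(split_value("a1b22c", r"\d+"))                      # ['a', 'b', 'c']
```

Because the separator is interpreted as a pattern, characters with a special meaning in regular expressions (for example, . or |) presumably need to be escaped when they are meant literally.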

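Example

The following example demonstrates using a search pipeline with a split processor.

Setup

Create an index named my_index and index a document that contains a message field: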

POST /my_index/_doc/1
{
  "message": "ingest, search, visualize, and analyze data",
  "visibility": "public"
}

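Creating a search pipeline

The following request creates a search pipeline with a split response processor that splits the message field and stores the result in the split_message field: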

PUT /_search/pipeline/my_pipeline
{
  "response_processors": [
    {
      "split": {
        "field": "message",
        "separator": ", ",
        "target_field": "split_message"
      }
    }
  ]
}
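
If the pipeline definition is generated from a script rather than written by hand, the request body can be assembled programmatically. The following Python sketch shows one way to do this, assuming only the fields from the table above. The helper name build_split_pipeline is hypothetical, and the sketch only builds and prints the JSON body; sending it to the cluster is left to whichever HTTP client you use.

```python
import json


def build_split_pipeline(field, separator, target_field=None, preserve_trailing=False):
    """Build the body of a search pipeline with one split response processor.

    Hypothetical helper for illustration; it is not part of any client library.
    """
    # field and separator are marked as required in the field table.
    if not field or not separator:
        raise ValueError("field and separator are required")
    split = {"field": field, "separator": separator}
    # Optional settings are included only when they differ from the defaults.
    if target_field is not None:
        split["target_field"] = target_field
    if preserve_trailing:
        split["preserve_trailing"] = True
    return {"response_processors": [{"split": split}]}


body = build_split_pipeline("message", ", ", target_field="split_message")
# Prints the same JSON body as the PUT /_search/pipeline/my_pipeline request.
print(json.dumps(body, indent=2))
```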

Using a search pipeline

Search for documents in my_index without a search pipeline:

GET /my_index/_search

The response contains the unmodified message field:

Response
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "message": "ingest, search, visualize, and analyze data",
          "visibility": "public"
        }
      }
    ]
  }
}

To search with a pipeline, specify the pipeline name in the search_pipeline query parameter:

GET /my_index/_search?search_pipeline=my_pipeline

The message field is split, and the result is stored in the split_message field:

Response
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "visibility": "public",
          "message": "ingest, search, visualize, and analyze data",
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        }
      }
    ]
  }
}
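
To make the transformation concrete, the following Python sketch applies the same split to a trimmed copy of the search response shown earlier. It is a simplified model under stated assumptions, not the processor's code: the function name apply_split is hypothetical, hits that lack the field are skipped (the real processor may instead report a failure unless ignore_failure is set), and Python regular expressions stand in for the processor's own pattern matching.

```python
import copy
import re


def apply_split(response, field, separator, target_field=None):
    """Return a copy of a search response with `field` split in each hit's _source."""
    result = copy.deepcopy(response)  # leave the original response untouched
    for hit in result["hits"]["hits"]:
        source = hit.get("_source", {})
        if field not in source:
            continue  # simplification: skip hits that do not contain the field
        parts = re.split(separator, source[field])
        # Model the default preserve_trailing=false: drop trailing empty strings.
        while parts and parts[-1] == "":
            parts.pop()
        # Without a target field, the original field is updated in place.
        source[target_field or field] = parts
    return result


response = {
    "hits": {
        "hits": [
            {
                "_index": "my_index",
                "_id": "1",
                "_source": {
                    "message": "ingest, search, visualize, and analyze data",
                    "visibility": "public",
                },
            }
        ]
    }
}

out = apply_split(response, "message", ", ", target_field="split_message")
print(out["hits"]["hits"][0]["_source"]["split_message"])
# ['ingest', 'search', 'visualize', 'and analyze data']
```

Note that the last element remains "and analyze data" because the separator is a comma followed by a space, and that sequence does not occur inside the final phrase.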

You can also use the fields option to request specific fields from the document:

POST /my_index/_search?pretty&search_pipeline=my_pipeline
{
    "fields": ["visibility", "message"]
}

In the response, the message field is split and the result is stored in the split_message field, both in _source and in fields:

Response
{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "visibility": "public",
          "message": "ingest, search, visualize, and analyze data",
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        },
        "fields": {
          "visibility": [
            "public"
          ],
          "message": [
            "ingest, search, visualize, and analyze data"
          ],
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        }
      }
    ]
  }
}