Split processor

The split processor splits a string field into an array of substrings based on a specified separator.

Request body fields

The following table lists all available request fields.

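Field	Data type	Description
`field`	String	The field containing the string to be split. Required.
`separator`	String	The separator used to split the string. Specify either a single separator character or a regular expression pattern. Required.
`preserve_trailing`	Boolean	If set to `true`, empty trailing fields (for example, `''`) are preserved in the resulting array. If set to `false`, empty trailing fields are removed from the resulting array. Optional. Default is `false`.
`target_field`	String	The field in which the array of substrings is stored. If not specified, the field is updated in place. Optional.
`tag`	String	The processor's identifier. Optional.
`description`	String	A description of the processor. Optional.
`ignore_failure`	Boolean	If `true`, UDB-SX ignores any failure of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.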
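
The effect of `separator` and `preserve_trailing` can be approximated with a short Python sketch. This is an illustration of the behavior described in the table, not the processor's implementation: the function name `split_value` is hypothetical, and Python's `re` module is assumed to be a close enough stand-in for the processor's regular expression handling for simple patterns.

```python
import re

def split_value(value, separator, preserve_trailing=False):
    """Split `value` on `separator`, treating the separator as a regex pattern."""
    parts = re.split(separator, value)
    if not preserve_trailing:
        # Remove empty strings from the end of the result only;
        # empty strings in the middle of the array are kept.
        while parts and parts[-1] == "":
            parts.pop()
    return parts

# Trailing empty fields are dropped by default.
print(split_value("a,b,,", ","))
# Trailing empty fields are kept when preserve_trailing is true.
print(split_value("a,b,,", ",", preserve_trailing=True))
# The separator can be a regular expression pattern.
print(split_value("a1b22c", r"\d+"))
```

Because the separator is interpreted as a pattern in this sketch, a separator that is a regex metacharacter (for example, `.` or `|`) would need to be escaped.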

Example

The following example demonstrates using a search pipeline with a split processor.

Setup

Create an index named my_index and index a document containing a message field:

POST /my_index/_doc/1
{
  "message": "ingest, search, visualize, and analyze data",
  "visibility": "public"
}

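Creating a search pipeline

The following request creates a search pipeline with a split response processor that splits the message field and stores the result in the split_message field: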

PUT /_search/pipeline/my_pipeline
{
  "response_processors": [
    {
      "split": {
        "field": "message",
        "separator": ", ",
        "target_field": "split_message"
      }
    }
  ]
}

Using a search pipeline

Search for documents in my_index without a search pipeline:

GET /my_index/_search

The response contains the message field unchanged:

Response

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "message": "ingest, search, visualize, and analyze data",
          "visibility": "public"
        }
      }
    ]
  }
}

To search with a pipeline, specify the pipeline name in the search_pipeline query parameter:

GET /my_index/_search?search_pipeline=my_pipeline

The message field is split and the result is stored in the split_message field:

Response

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "visibility": "public",
          "message": "ingest, search, visualize, and analyze data",
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        }
      }
    ]
  }
}
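
The transformation shown in this response can be reproduced locally with a minimal Python sketch. It assumes the processor behaves like a regex split applied to each hit's `_source`; the helper name `apply_split` is hypothetical and is not part of any UDB-SX API.

```python
import re

def apply_split(source, field, separator, target_field=None):
    """Return a copy of `source` with `field` split into an array of substrings."""
    result = dict(source)  # shallow copy so the original document is untouched
    parts = re.split(separator, source[field])
    # Mirror the default preserve_trailing=false: drop empty trailing entries.
    while parts and parts[-1] == "":
        parts.pop()
    # Without a target field, the original field is overwritten in place.
    result[target_field or field] = parts
    return result

source = {
    "message": "ingest, search, visualize, and analyze data",
    "visibility": "public",
}

# Same settings as my_pipeline: the result goes to split_message.
print(apply_split(source, "message", ", ", target_field="split_message"))
# With no target field, message itself becomes the array.
print(apply_split(source, "message", ", "))
```

With `target_field` set, the original message string is left intact alongside the new array, which matches the `_source` shown in the response above.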

You can also use the fields option to retrieve specific fields from the document:

POST /my_index/_search?pretty&search_pipeline=my_pipeline
{
    "fields": ["visibility", "message"]
}

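In the response, the message field is split and the result is stored in the split_message field, which appears in both _source and fields:

Response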

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "my_index",
        "_id": "1",
        "_score": 1,
        "_source": {
          "visibility": "public",
          "message": "ingest, search, visualize, and analyze data",
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        },
        "fields": {
          "visibility": [
            "public"
          ],
          "message": [
            "ingest, search, visualize, and analyze data"
          ],
          "split_message": [
            "ingest",
            "search",
            "visualize",
            "and analyze data"
          ]
        }
      }
    ]
  }
}