复杂查询
除了简单的 SFW (SELECT-FROM-WHERE) 查询,SQL 插件还支持复杂查询,例如子查询、连接、并集和差集。这些查询操作多个 UDB-SX 索引。要检查这些查询在后台如何执行,请使用 explain 操作。
连接
UDB-SX SQL 支持内连接、交叉连接和左外连接。
约束
连接有一些约束:
只能连接两个索引。
必须为索引使用别名(例如
people p)。在 ON 子句中,只能使用 AND 条件。
在 WHERE 语句中,不要组合包含多个索引的树。例如,以下语句有效:
WHERE (a.type1 > 3 OR a.type1 < 0) AND (b.type2 > 4 OR b.type2 < -1)
以下语句无效:
WHERE (a.type1 > 3 OR b.type2 < 0) AND (a.type1 > 4 OR b.type2 < -1)
不能对结果使用 GROUP BY 或 ORDER BY。
不支持带有 OFFSET 的 LIMIT(例如
LIMIT 25 OFFSET 25)。
描述
JOIN 子句使用每个索引共有的值来组合一个或多个索引的列。
语法
规则 tableSource:

规则 joinPart:

示例 1:内连接
内连接基于您的连接谓词,通过组合两个索引的列来创建新的结果集。它遍历两个索引并比较每个文档以找到满足连接谓词的文档。您可以选择在 JOIN 子句前加上 INNER 关键字。
连接谓词由 ON 子句指定。
SQL 查询:
SELECT
a.account_number, a.firstname, a.lastname,
e.id, e.name
FROM accounts a
JOIN employees_nested e
ON a.account_number = e.id
employees_nested 索引创建可参考SQL 和 PPL API
Explain:
explain 输出很复杂,因为 JOIN 子句与两个在独立的查询规划器框架中执行的 UDB-SX DSL 查询相关联。您可以通过检查 Physical Plan 和 Logical Plan 对象来解释它。
{
"Physical Plan" : {
"Project [ columns=[a.account_number, a.firstname, a.lastname, e.name, e.id] ]" : {
"Top [ count=200 ]" : {
"BlockHashJoin[ conditions=( a.account_number = e.id ), type=JOIN, blockSize=[FixedBlockSize with size=10000] ]" : {
"Scroll [ employees_nested as e, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"from" : 0,
"_source" : {
"excludes" : [ ],
"includes" : [
"id",
"name"
]
}
}
},
"Scroll [ accounts as a, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"from" : 0,
"_source" : {
"excludes" : [ ],
"includes" : [
"account_number",
"firstname",
"lastname"
]
}
}
},
"useTermsFilterOptimization" : false
}
}
}
},
"description" : "Hash Join algorithm builds hash table based on result of first query, and then probes hash table to find matched rows for each row returned by second query",
"Logical Plan" : {
"Project [ columns=[a.account_number, a.firstname, a.lastname, e.name, e.id] ]" : {
"Top [ count=200 ]" : {
"Join [ conditions=( a.account_number = e.id ) type=JOIN ]" : {
"Group" : [
{
"Project [ columns=[a.account_number, a.firstname, a.lastname] ]" : {
"TableScan" : {
"tableAlias" : "a",
"tableName" : "accounts"
}
}
},
{
"Project [ columns=[e.name, e.id] ]" : {
"TableScan" : {
"tableAlias" : "e",
"tableName" : "employees_nested"
}
}
}
]
}
}
}
}
}
结果集:
| a.account_number | a.firstname | a.lastname | e.id | e.name |
|---|---|---|---|---|
| 6 | Hattie | Bond | 6 | Jane Smith |
示例 2:交叉连接
交叉连接,也称为笛卡尔积连接,将第一个索引中的每个文档与第二个索引中的每个文档组合。
结果集是两个索引文档的笛卡尔积。
此操作类似于没有指定连接条件的 ON 子句的内连接。
请注意:在两个大型甚至中型大小的索引上执行交叉连接存在风险。它可能触发断路器以终止查询,从而避免内存耗尽。
SQL 查询:
SELECT
a.account_number, a.firstname, a.lastname,
e.id, e.name
FROM accounts a
JOIN employees_nested e
结果集:
| a.account_number | a.firstname | a.lastname | e.id | e.name |
|---|---|---|---|---|
| 1 | Amber | Duke | 3 | Bob Smith |
| 1 | Amber | Duke | 4 | Susan Smith |
| 1 | Amber | Duke | 6 | Jane Smith |
| 6 | Hattie | Bond | 3 | Bob Smith |
| 6 | Hattie | Bond | 4 | Susan Smith |
| 6 | Hattie | Bond | 6 | Jane Smith |
| 13 | Nanette | Bates | 3 | Bob Smith |
| 13 | Nanette | Bates | 4 | Susan Smith |
| 13 | Nanette | Bates | 6 | Jane Smith |
| 18 | Dale | Adams | 3 | Bob Smith |
| 18 | Dale | Adams | 4 | Susan Smith |
| 18 | Dale | Adams | 6 | Jane Smith |
示例 3:左外连接
使用左外连接来保留第一个索引中即使不满足连接谓词的行。关键字 OUTER 是可选的。
SQL 查询:
SELECT
a.account_number, a.firstname, a.lastname,
e.id, e.name
FROM accounts a
LEFT JOIN employees_nested e
ON a.account_number = e.id
结果集:
| a.account_number | a.firstname | a.lastname | e.id | e.name |
|---|---|---|---|---|
| 1 | Amber | Duke | null | null |
| 6 | Hattie | Bond | 6 | Jane Smith |
| 13 | Nanette | Bates | null | null |
| 18 | Dale | Adams | null | null |
子查询
子查询是在另一个语句中使用并用括号括起来的完整 SELECT 语句。
从 explain 输出中,您可以看到一些子查询实际上被转换为等效的连接查询来执行。
示例 1:表子查询
SQL 查询:
SELECT a1.firstname, a1.lastname, a1.balance
FROM accounts a1
WHERE a1.account_number IN (
SELECT a2.account_number
FROM accounts a2
WHERE a2.balance > 10000
)
Explain:
{
"Physical Plan" : {
"Project [ columns=[a1.balance, a1.firstname, a1.lastname] ]" : {
"Top [ count=200 ]" : {
"BlockHashJoin[ conditions=( a1.account_number = a2.account_number ), type=JOIN, blockSize=[FixedBlockSize with size=10000] ]" : {
"Scroll [ accounts as a2, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"query" : {
"bool" : {
"filter" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must_not" : [
{
"bool" : {
"adjust_pure_negative" : true,
"must_not" : [
{
"exists" : {
"field" : "account_number",
"boost" : 1
}
}
],
"boost" : 1
}
}
],
"boost" : 1
}
},
{
"range" : {
"balance" : {
"include_lower" : false,
"include_upper" : true,
"from" : 10000,
"boost" : 1,
"to" : null
}
}
}
],
"boost" : 1
}
}
],
"boost" : 1
}
}
],
"adjust_pure_negative" : true,
"boost" : 1
}
},
"from" : 0
}
},
"Scroll [ accounts as a1, pageSize=10000 ]" : {
"request" : {
"size" : 200,
"from" : 0,
"_source" : {
"excludes" : [ ],
"includes" : [
"firstname",
"lastname",
"balance",
"account_number"
]
}
}
},
"useTermsFilterOptimization" : false
}
}
}
},
"description" : "Hash Join algorithm builds hash table based on result of first query, and then probes hash table to find matched rows for each row returned by second query",
"Logical Plan" : {
"Project [ columns=[a1.balance, a1.firstname, a1.lastname] ]" : {
"Top [ count=200 ]" : {
"Join [ conditions=( a1.account_number = a2.account_number ) type=JOIN ]" : {
"Group" : [
{
"Project [ columns=[a1.balance, a1.firstname, a1.lastname, a1.account_number] ]" : {
"TableScan" : {
"tableAlias" : "a1",
"tableName" : "accounts"
}
}
},
{
"Project [ columns=[a2.account_number] ]" : {
"Filter [ conditions=[AND ( AND account_number ISN null, AND balance GT 10000 ) ] ]" : {
"TableScan" : {
"tableAlias" : "a2",
"tableName" : "accounts"
}
}
}
}
]
}
}
}
}
}
结果集:
| a1.firstname | a1.lastname | a1.balance |
|---|---|---|
| Amber | Duke | 39225 |
| Nanette | Bates | 32838 |
示例 2:From 子查询
SQL 查询:
SELECT f,l,a
FROM (
SELECT firstname AS f,lastname AS l,age AS a
FROM accounts
WHERE age>30
) AS a
Explain:
{
"from" : 0,
"size" : 200,
"query" : {
"bool" : {
"filter" : [
{
"bool" : {
"must" : [
{
"range" : {
"age" : {
"from" : 30,
"to" : null,
"include_lower" : false,
"include_upper" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : {
"includes" : [
"firstname",
"lastname",
"age"
],
"excludes" : [ ]
}
}
结果集:
| f | l | a |
|---|---|---|
| Amber | Duke | 32 |
| Dale | Adams | 33 |
| Hattie | Bond | 36 |