为了账号安全,请及时绑定邮箱和手机立即绑定

TiDB 关联子查询及半连接的优化实践(下)

标签:
开源 数据库


https://img1.sycdn.imooc.com/674838e3000148b010800200.jpg

关联子查询定义

   


关联子查询是指和外部查询有关联的子查询,即在子查询中使用了外部查询包含的列。在关联子查询中,对于外部查询返回的每一行数据,内部查询都要执行一次。关联子查询具有以下几方面的特点:


  1. 信息流双向:关联子查询中的信息流是双向的。外部查询的每行数据传递一个值给子查询,子查询为每一行数据执行一次并返回记录,然后外部查询根据返回的记录做出决策。

  2. 灵活性:关联子查询可以使用关联列的灵活性,将 SQL 查询写成子查询的形式,这往往可以极大地简化 SQL 查询,并使 SQL 查询的语义更加方便理解。

  3. 执行挑战:为了计算关联结果的值(子查询的输出),关联子查询需要采用迭代(iterative)的执行方式。


以下是一个典型的关联子查询示例:

select * from t1 where t1.a < (select sum(t2.a) from t2 where t2.b = t1.b);
TiDB 关联子查询优化-“子查询去关联”


   


关联子查询每次子查询执行时都是要和它的外部查询结果绑定的,因此,如果上述语句中 t1.a 有一千万个值,那这个子查询就要被重复执行一千万次,因为 t2.b=t1.b 这个条件会随着 t1.a 值的不同而发生变化。如果能将关联依赖解除的话,这个子查询就只需要被执行一次。默认情况下,TiDB 会尝试进行子查询去关联,以达到更高的执行效率。以下是 TiDB 中针对这条语句生成的执行计划,可以看出 TiDB 的优化器将语句的执行计划生成了一个 Hash Join 关联方式。模拟 2 个表的数据量为 1 千万行进行测试时,语句的实际执行耗时约为 14 秒

mysql> explain select * from t1 where t1.a < (select sum(t2.a) from t2 where t2.b = t1.b);
+--------------------------------+-----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------+
| id                             | estRows   | task      | access object | operator info                                                                                                |
+--------------------------------+-----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------+
| HashJoin_12                    | 452412.14 | root      |               | inner join, equal:[eq(test.t1.b, test.t2.b)], other cond:lt(cast(test.t1.a, decimal(10,0) BINARY), Column#7) |
| ├─TableReader_15(Build)        | 452412.14 | root      |               | data:Selection_14                                                                                            |
| │ └─Selection_14               | 452412.14 | cop[tikv] |               | not(isnull(test.t1.b))                                                                                       |
| │   └─TableFullScan_13         | 452865.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                                                                               |
| └─HashAgg_21(Probe)            | 472411.12 | root      |               | group by:test.t2.b, funcs:sum(Column#8)->Column#7, funcs:firstrow(test.t2.b)->test.t2.b                      |
|   └─TableReader_22             | 472411.12 | root      |               | data:HashAgg_16                                                                                              |
|     └─HashAgg_16               | 472411.12 | cop[tikv] |               | group by:test.t2.b, funcs:sum(test.t2.a)->Column#8                                                           |
|       └─Selection_20           | 590513.90 | cop[tikv] |               | not(isnull(test.t2.b))                                                                                       |
|         └─TableFullScan_19     | 591105.00 | cop[tikv] | table:t2      | keep order:false, stats:pseudo                                                                               |
+--------------------------------+-----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)


假如 TiDB 没有做子查询去关联的优化,实际执行情况又是怎么样呢?我们可以用 HINT /*+ NO_DECORRELATE() */ 来关闭子查询去关联,模拟未优化前的情况。此时得到如下的执行计划,其中 operator info 为 range: decided by [eq(test.t2.b, test.t1.b)]IndexRangeScan_42 算子表示 TiDB 使用关联条件进行索引范围查询。同样的语句执行耗时超过 1 小时

mysql> explain select * from t1 where t1.a < (select /*+ NO_DECORRELATE() */ sum(t2.a) from t2 where t2.b = t1.b);
+------------------------------------------+-----------+-----------+---------------------------+--------------------------------------------------------------------------------------+
| id                                       | estRows   | task      | access object             | operator info                                                                        |
+------------------------------------------+-----------+-----------+---------------------------+--------------------------------------------------------------------------------------+
| Projection_10                            | 452865.00 | root      |                           | test.t1.a, test.t1.b                                                                 |
| └─Apply_12                               | 452865.00 | root      |                           | CARTESIAN inner join, other cond:lt(cast(test.t1.a, decimal(10,0) BINARY), Column#7) |
|   ├─TableReader_14(Build)                | 452865.00 | root      |                           | data:TableFullScan_13                                                                |
|   │ └─TableFullScan_13                   | 452865.00 | cop[tikv] | table:t1                  | keep order:false, stats:pseudo                                                       |
|   └─MaxOneRow_15(Probe)                  | 452865.00 | root      |                           |                                                                                      |
|     └─StreamAgg_20                       | 452865.00 | root      |                           | funcs:sum(Column#14)->Column#7                                                       |
|       └─Projection_45                    | 857574.13 | root      |                           | cast(test.t2.a, decimal(10,0) BINARY)->Column#14                                     |
|         └─IndexLookUp_44                 | 857574.13 | root      |                           |                                                                                      |
|           ├─IndexRangeScan_42(Build)     | 857574.13 | cop[tikv] | table:t2, index:idx_t2(b) | range: decided by [eq(test.t2.b, test.t1.b)], keep order:false                       |
|           └─TableRowIDScan_43(Probe)     | 857574.13 | cop[tikv] | table:t2                  | keep order:false                                                                     |
+------------------------------------------+-----------+-----------+---------------------------+--------------------------------------------------------------------------------------+
10 rows in set (0.01 sec)


何时需要关闭 “子查询去关联”


   


那么,是不是所有的关联子查询语句在关联依赖解除优化后性能都能有较大的提升呢?并不是这样。仍然用上面的示例来说,在外部的值比较少的情况下,不解除关联依赖反而可能对执行性能更有帮助。例如,如果 t3 表只有 100 条记录时,以下语句执行耗时仅 0.28 秒

mysql> explain analyze select * from t3 where t3.a = (select /*+ NO_DECORRELATE() */ sum(t2.a) from t2 where t2.b = t3.b);
+------------------------------------------+---------+---------+-----------+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+-----------+------+
| id                                       | estRows | actRows | task      | access object             | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | operator info                                                         | memory    | disk |
+------------------------------------------+---------+---------+-----------+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+-----------+------+
| Projection_11                            | 100.00  | 0       | root      |                           | time:277.9ms, loops:1, RU:117.173904, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | test.t3.a, test.t3.b                                                  | 3.55 KB   | N/A  |
| └─Apply_13                               | 100.00  | 0       | root      |                           | time:277.9ms, loops:1, concurrency:OFF, cache:ON, cacheHitRatio:0.000%                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | inner join, equal:[eq(Column#8, Column#7)]                            | 900 Bytes | N/A  |
|   ├─Projection_14(Build)                 | 100.00  | 100     | root      |                           | time:1.61ms, loops:2, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | test.t3.a, test.t3.b, cast(test.t3.a, decimal(10,0BINARY)->Column#8 | 1.82 KB   | N/A  |
|   │ └─TableReader_16                     | 100.00  | 100     | root      |                           | time:1.56ms, loops:2, cop_task: {num1max1.36ms, proc_keys: 100, tot_proc: 333.9µs, tot_wait: 146.6µs, copr_cache_hit_ratio: 0.00, build_task_duration: 18.3µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:1, total_time:1.32ms}}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | data:TableFullScan_15                                                 | 1.86 KB   | N/A  |
|   │   └─TableFullScan_15                 | 100.00  | 100     | cop[tikv] | table:t3                  | tikv_task:{time:0s, loops:3}, scan_detail: {total_process_keys: 100, total_process_keys_size: 4400, total_keys: 101, get_snapshot_time: 46.4µs, rocksdb: {delete_skipped_count: 100, key_skipped_count: 200block: {cache_hit_count: 4}}}, time_detail: {total_process_time: 333.9µs, total_wait_time: 146.6µs, tikv_wall_time: 730.8µs}                                                                                                                                                                                                                                                                                                                                                                                                 | keep order:false, stats:pseudo                                        | N/A       | N/A  |
|   └─MaxOneRow_17(Probe)                  | 100.00  | 100     | root      |                           | time:265.2ms, loops:200                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |                                                                       | N/A       | N/A  |
|     └─StreamAgg_22                       | 100.00  | 100     | root      |                           | time:264.7ms, loops:200                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | funcs:sum(Column#16)->Column#7                                        | 8.75 KB   | N/A  |
|       └─Projection_47                    | 189.37  | 178     | root      |                           | time:264ms, loops:176, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | cast(test.t2.a, decimal(10,0BINARY)->Column#16                      | 8.61 KB   | N/A  |
|         └─IndexLookUp_46                 | 189.37  | 178     | root      |                           | time:263.1ms, loops:176, index_task: {total_time: 78.3ms, fetch_handle: 77.5ms, build124.7µs, wait629µs}, table_task: {total_time: 144.4ms, num76, concurrency: 5}, next: {wait_index: 90ms, wait_table_lookup_build: 7.42ms, wait_table_lookup_resp: 136.1ms}                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |                                                                       | 26.4 KB   | N/A  |
|           ├─IndexRangeScan_44(Build)     | 189.37  | 178     | cop[tikv] | table:t2, index:idx_t2(b) | time:100ms, loops:252, cop_task: {num100max1.38ms, min596.8µs, avg880.7µs, p95: 1.2ms, max_proc_keys: 8, p95_proc_keys: 5, tot_proc: 13.5ms, tot_wait: 9.62ms, copr_cache_hit_ratio: 0.00, build_task_duration: 2.07ms, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:100, total_time:85.1ms}}, tikv_task:{proc max:10ms, min:0s, avg300µs, p80:0s, p95:0s, iters:100, tasks:100}, scan_detail: {total_process_keys: 178, total_process_keys_size: 8188, total_keys: 278, get_snapshot_time: 3.17ms, rocksdb: {key_skipped_count: 178block: {cache_hit_count: 465}}}, time_detail: {total_process_time: 13.5ms, total_wait_time: 9.62ms, total_kv_read_wall_time: 30ms, tikv_wall_time: 41.8ms}                       | range: decided by [eq(test.t2.b, test.t3.b)], keep order:false        | N/A       | N/A  |
|           └─TableRowIDScan_45(Probe)     | 189.37  | 178     | cop[tikv] | table:t2                  | time:133.3ms, loops:152, cop_task: {num159max9.38ms, min0s, avg753.7µs, p95: 1.13ms, max_proc_keys: 2, p95_proc_keys: 2, tot_proc: 18.6ms, tot_wait: 24.3ms, copr_cache_hit_ratio: 0.00, build_task_duration: 3.5ms, max_distsql_concurrency: 1, max_extra_concurrency: 1, store_batch_num: 34}, rpc_info:{Cop:{num_rpc:125, total_time:117ms}}, tikv_task:{proc max:10ms, min:0s, avg62.9µs, p80:0s, p95:0s, iters:159, tasks:159}, scan_detail: {total_process_keys: 178, total_process_keys_size: 8366, total_keys: 178, get_snapshot_time: 5.14ms, rocksdb: {block: {cache_hit_count: 798}}}, time_detail: {total_process_time: 18.6ms, total_wait_time: 24.3ms, total_kv_read_wall_time: 10ms, tikv_wall_time: 63.3ms}   | keep order:false                                                      | N/A       | N/A  |
+------------------------------------------+---------+---------+-----------+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------+-----------+------+
11 rows in set (0.28 sec)


而如果走 TiDB 默认的解除关联依赖时,耗时约 5 秒,且存在 OOM 的风险。
mysql> explain analyze select * from t3 where t3.a = (select sum(t2.a) from t2 where t2.b = t3.b);
ERROR 8175 (HY000): Your query has been cancelled due to exceeding the allowed memory limit for a single SQL query. Please try narrowing your query scope or increase the tidb_mem_quota_query limit and try again.[conn=480325804]
mysql> set tidb_mem_quota_query=10737418240;
Query OK, 0 rows affected (0.00 sec)

mysql> explain analyze select * from t3 where t3.a = (select sum(t2.a) from t2 where t2.b = t3.b);
+--------------------------------+-------------+----------+-----------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+---------+---------+
| id                             | estRows     | actRows  | task      | access object | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | operator info                                                                            | memory  | disk    |
+--------------------------------+-------------+----------+-----------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+---------+---------+
| HashJoin_13                    | 99.90       | 0        | root      |               | time:5.11s, loops:1, RU:11889.809066, build_hash_table:{total:1.48ms, fetch:1.36ms, build:122.2µs}, probe:{concurrency:5, total:25.5s, max:5.11s, probe:1.44s, fetch and wait:24.1s}                                                                                                                                                                                                                                                                                             | inner join, equal:[eq(test.t3.b, test.t2.b) eq(Column#8, Column#7)]                      | 46.0 KB | 0 Bytes |
| ├─Projection_14(Build)         | 99.90       | 100      | root      |               | time:1.26ms, loops:2, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                            | test.t3.a, test.t3.b, cast(test.t3.a, decimal(10,0) BINARY)->Column#8                    | 2.52 KB | N/A     |
| │ └─TableReader_17             | 99.90       | 100      | root      |               | time:1.22ms, loops:2, cop_task: {num: 1, max: 1.38ms, proc_keys: 100, tot_proc: 353µs, tot_wait: 129.8µs, copr_cache_hit_ratio: 0.00, build_task_duration: 24.3µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:1, total_time:1.33ms}}                                                                                                                                                                                                                                    | data:Selection_16                                                                        | 1.87 KB | N/A     |
| │   └─Selection_16             | 99.90       | 100      | cop[tikv] |               | tikv_task:{time:0s, loops:3}, scan_detail: {total_process_keys: 100, total_process_keys_size: 4400, total_keys: 101, get_snapshot_time: 49.9µs, rocksdb: {delete_skipped_count: 100, key_skipped_count: 200, block: {cache_hit_count: 4}}}, time_detail: {total_process_time: 353µs, total_wait_time: 129.8µs, tikv_wall_time: 716.5µs}                                                                                                                                          | not(isnull(test.t3.b))                                                                   | N/A     | N/A     |
| │     └─TableFullScan_15       | 100.00      | 100      | cop[tikv] | table:t3      | tikv_task:{time:0s, loops:3}                                                                                                                                                                                                                                                                                                                                                                                                                                                     | keep order:false, stats:pseudo                                                           | N/A     | N/A     |
| └─HashAgg_28(Probe)            | 5280768.00  | 5343857  | root      |               | time:5.09s, loops:5222, partial_worker:{wall_time:2.494534938s, concurrency:5, task_num:285, tot_wait:215.654904ms, tot_exec:12.25579073s, tot_time:12.47193031s, max:2.494394166s, p95:2.494394166s}, final_worker:{wall_time:5.107395873s, concurrency:5, task_num:25, tot_wait:70.452µs, tot_exec:8.250691502s, tot_time:24.968242542s, max:5.107349993s, p95:5.107349993s}                                                                                                   | group by:test.t2.b, funcs:sum(Column#11)->Column#7, funcs:firstrow(test.t2.b)->test.t2.b | 1.95 GB | 0 Bytes |
|   └─TableReader_29             | 5280768.00  | 9972967  | root      |               | time:225.2ms, loops:286, cop_task: {num: 285, max: 157.9ms, min: 470.7µs, avg: 55.1ms, p95: 83.7ms, max_proc_keys: 50176, p95_proc_keys: 50176, tot_proc: 13.8s, tot_wait: 51ms, copr_cache_hit_ratio: 0.04, build_task_duration: 41.8µs, max_distsql_concurrency: 11}, rpc_info:{Cop:{num_rpc:285, total_time:15.7s}}                                                                                                                                                           | data:HashAgg_21                                                                          | 25.2 MB | N/A     |
|     └─HashAgg_21               | 5280768.00  | 9972967  | cop[tikv] |               | tikv_task:{proc max:90ms, min:0s, avg: 46.7ms, p80:70ms, p95:70ms, iters:9769, tasks:285}, scan_detail: {total_process_keys: 9982592, total_process_keys_size: 469050751, total_keys: 9982867, get_snapshot_time: 19.6ms, rocksdb: {delete_skipped_count: 874, key_skipped_count: 9983466, block: {cache_hit_count: 18229}}}, time_detail: {total_process_time: 13.8s, total_suspend_time: 37.8ms, total_wait_time: 51ms, total_kv_read_wall_time: 9.13s, tikv_wall_time: 14.3s} | group by:test.t2.b, funcs:sum(test.t2.a)->Column#11                                      | N/A     | N/A     |
|       └─Selection_27           | 10000000.00 | 10000000 | cop[tikv] |               | tikv_task:{proc max:70ms, min:0s, avg: 33.2ms, p80:50ms, p95:60ms, iters:9769, tasks:285}                                                                                                                                                                                                                                                                                                                                                                                        | not(isnull(test.t2.b))                                                                   | N/A     | N/A     |
|         └─TableFullScan_26     | 10000000.00 | 10000000 | cop[tikv] | table:t2      | tikv_task:{proc max:70ms, min:0s, avg: 32ms, p80:50ms, p95:60ms, iters:9769, tasks:285}                                                                                                                                                                                                                                                                                                                                                                                          | keep order:false                                                                         | N/A     | N/A     |
+--------------------------------+-------------+----------+-----------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+---------+---------+
10 rows in set (5.12 sec)


全局关闭 “子查询去关联”


   


/*+ NO_DECORRELATE() */ HINT 是一种针对具体 SQL 级别来关闭子查询去关联的方式,TiDB 也提供另外一种可以全局关闭的方式,它是直接将子查询去关联的规则加入到黑名单中。关于优化规则黑名单的使用,可以参考 TiDB 官网文档 https://docs.pingcap.com/zh/tidb/v7.5/blocklist-control-plan。如果希望全局关闭子查询去关联的功能,需要执行以下命令,此后不需要使用 /*+ NO_DECORRELATE() */ 这个 HINT 我们也同样可以获得具有子查询关联的执行计划。

insert into mysql.opt_rule_blacklist values("decorrelate");
admin reload opt_rule_blacklist;


https://img1.sycdn.imooc.com/67483902000103d810800200.jpg

TiDB 优化器针对 semi join 半连接默认会转换为 HashJoin 或 IndexJoin 以获得更好的性能,然而在某些特殊的场景下默认选择的 HashJoin 或 IndexJoin 可能仍然达不到最佳的性能。TiDB 提供的 SEMI_JOIN_REWRITE() 可以扩大选择范围,让优化器选择更好的执行方式,从而在性能上达到几倍的性能。


TiDB 默认针对关联子查询有自动解除关联依赖的优化操作,当涉及查询条件的数据量较大时,将重复执行无数次的子查询转化为只需执行一次,从而大大缩减了执行耗时。然而,如果满足查询条件的数据量很小的话,这种优化方式可能会造成相反的效果,此时我们则需要借助 NO_DECORRELATE() 来关闭 “子查询去关联” 的优化,也可以通过全局添加规则黑名单的方式实现全局控制。




点击查看更多内容
TA 点赞

若觉得本文不错,就分享一下吧!

评论

作者其他优质文章

正在加载中
数据库工程师
手记
粉丝
61
获赞与收藏
84

关注作者,订阅最新文章

阅读免费教程

  • 推荐
  • 评论
  • 收藏
  • 共同学习,写下你的评论
感谢您的支持,我会继续努力的~
扫码打赏,你说多少就多少
赞赏金额会直接到老师账户
支付方式
打开微信扫一扫,即可进行扫码打赏哦
今天注册有机会得

100积分直接送

付费专栏免费学

大额优惠券免费领

立即参与 放弃机会
意见反馈 帮助中心 APP下载
官方微信

举报

0/150
提交
取消