Resource control: SQL-level QUERY_LIMIT configuration does not take effect

艾贝克斯 · May 22, 2025, 03:13

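[TiDB environment] Test environment
[TiDB version] v8.1.2
[Operating system] Ubuntu 20.04 LTS
[Deployment method] Deployed directly on machines
[Cluster data size] 12 GB
[Number of cluster nodes] Deployed on local servers: 1 monitor node + 3 nodes each running tiflash/tikv/pd/tidb; each node has 8 cores and 32 GB of memory, with data disks data1 and data2 of 50 GB each; TiFlash with 2 replicas
[Steps to reproduce]
Resource control is enabled.
1. Create the resource group rg_service_user and bind the user service_user to it. The user already has read and write privileges on the relevant databases.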

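2. Configure a QUERY_LIMIT rule on the resource group and run a concurrent SQL stress test. No handling records appear in the mysql.tidb_runaway_queries table, and every statement keeps executing normally. Adding WATCH=SIMILAR DURATION='1m' makes no difference; it still has no effect.
ALTER RESOURCE GROUP rg_service_user RU_PER_SEC=10000 BURSTABLE QUERY_LIMIT=(EXEC_ELAPSED='2s', ACTION=KILL);
Or, to identify the query and also add it to the watch list:
ALTER RESOURCE GROUP rg_service_user RU_PER_SEC=10000 BURSTABLE QUERY_LIMIT=(EXEC_ELAPSED='2s', ACTION=KILL, WATCH=SIMILAR DURATION='1m');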

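3. Remove WATCH=SIMILAR DURATION='1m' from QUERY_LIMIT and add a QUERY WATCH entry instead. In this case the matched SQL is killed immediately. In other words, only a manually configured QUERY WATCH achieves an immediate kill (the statement fails with "Quarantined and interrupted because of being in runaway watch list"). What I actually want to find out is whether QUERY_LIMIT alone can kill a statement once it exceeds the configured execution-time threshold:
QUERY WATCH ADD RESOURCE GROUP rg_service_user ACTION KILL SQL TEXT SIMILAR TO "select rel.id, …omitted…";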
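
For anyone reproducing step 1, the setup could look roughly like the sketch below. The post does not show the exact statements that were used, so treat these as assumptions: the group and user names come from the post, the RU_PER_SEC value is copied from the ALTER statements in step 2, and the user is assumed to exist already with the default host part.

```sql
-- Confirm that resource control is switched on for the cluster.
SHOW VARIABLES LIKE 'tidb_enable_resource_control';

-- Create the resource group (RU_PER_SEC value assumed from step 2).
CREATE RESOURCE GROUP IF NOT EXISTS rg_service_user RU_PER_SEC = 10000;

-- Bind the existing user to the group (host part assumed to be the default).
ALTER USER 'service_user' RESOURCE GROUP rg_service_user;

-- Run this in a session logged in as service_user to see which group is in effect.
SELECT CURRENT_RESOURCE_GROUP();
```

If the last statement does not return rg_service_user in the session that runs the stress test, the QUERY_LIMIT rule of that group would not apply to it, so this is worth ruling out before suspecting a bug.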

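[Resource configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of that page

[Other attachments: screenshots/logs/monitoring]

In the screenshot, sql18 is the statement covered by the QUERY_LIMIT configuration. With no load on the cluster, a single run normally takes 1.7s. Once the stress test ramps up, its execution time exceeds the 60s configured in QUERY_LIMIT, yet it keeps running.
Checking the runaway watch (quarantine) table shows no records either:

The resource group information does contain the QUERY_LIMIT configuration: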
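
Checks like the two above can be run with statements along the following lines. This is a sketch only: the exact queries behind the screenshots are not shown in the post, and the filter on the group name is added here for illustration.

```sql
-- Resource group definition, including its QUERY_LIMIT setting.
SELECT * FROM information_schema.resource_groups WHERE name = 'rg_service_user';

-- History of queries that were identified as runaway.
SELECT * FROM mysql.tidb_runaway_queries;

-- Current watch list entries, whether created by a WATCH clause or by QUERY WATCH ADD.
SELECT * FROM mysql.tidb_runaway_watch;
```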

Miracle · May 22, 2025, 05:32

Try removing BURSTABLE.

艾贝克斯 · May 22, 2025, 06:10

ALTER RESOURCE GROUP rg_service_user burstable=false; I removed it this way, but the stress test still shows no effect and the queries are not killed.

Miracle · May 22, 2025, 11:42

This looks like the following issue:
https://github.com/pingcap/tidb/issues/51325

艾贝克斯 · May 22, 2025, 11:55

It does look like it. So this would be a bug in the 8.1 release.

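有猫万事足 · May 22, 2025, 14:44

Indeed, the fix was never merged into the 8.1 branch. So even though your 8.1.2 release came out after this issue was fixed, it still does not include the fix.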

github.com/pingcap/tidb

runaway: add runaway tidb-side time checker

master ← HuSharp:add_tidb_runaway_check

opened 03:46AM - 29 Jul 24 UTC

HuSharp

+110 -24

### What problem does this PR solve?

mark runaway query when processing time …on tidb side exceed executed time

Issue Number: close #51325

Problem Summary:

### What changed and how does it work?

```bash
mysql> ALTER RESOURCE GROUP default QUERY_LIMIT=(EXEC_ELAPSED='2s', ACTION=KILL, WATCH=EXACT DURATION='10m');
Query OK, 0 rows affected (0.07 sec)

mysql> select sleep(3) from t;
ERROR 8253 (HY000): Query execution was interrupted, identified as runaway query
```

can find log

```bash
[2024/08/07 16:54:49.501 +08:00] [WARN] [expensivequery.go:109] ["runaway query timeout"] [costTime=2.06010825s] [groupName=default] [rule=execElapsedTimeMs:2s] [processInfo="{id:295698438, user:root, host:127.0.0.1:52409, db:test, command:Query, time:2, state:autocommit, info:select sleep(3) from t}"]
[2024/08/07 16:54:49.502 +08:00] [INFO] [server.go:896] [kill] [conn=295698438] [query=true] [maxExecutionTime=false] [runawayExceed=true]
[2024/08/07 16:54:49.502 +08:00] [WARN] [sqlkiller.go:61] ["kill initiated"] ["connection ID"=295698438] [reason="[executor:8253]Query execution was interrupted, identified as runaway query"]
[2024/08/07 16:54:49.503 +08:00] [WARN] [sqlkiller.go:137] ["kill finished"] [conn=295698438]
```

### Check List

Tests

- [ ] Unit test
- [x] Integration test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test
  > - [ ] I checked and no code files have been changed.

### Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

```release-note
Fix the bug that runaway only counts the time consumed by coprocessor, and add the time checker on the tidb side.
```

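Looking at the files changed in this PR, you can see that it also fixes some spelling mistakes in the documentation.

When I checked, those same spelling mistakes are still present in the corresponding documentation on the 8.1.2 branch.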

艾贝克斯 · May 23, 2025, 01:27

That confirms it. As far as I can see, the fix for this issue was finally merged into v8.3.0:
https://github.com/pingcap/tidb/commits/v8.3.0
I'll upgrade to 8.3.0 and try again.
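
Before and after an upgrade like this, the version actually served by each TiDB instance can be checked from a SQL session. A small sketch, not part of the original post:

```sql
-- Release version, git commit hash, and build details of the connected TiDB server.
SELECT tidb_version();

-- MySQL-compatible version string, which also embeds the TiDB release version.
SELECT VERSION();
```

Running this against every TiDB node helps rule out a partially upgraded cluster when re-testing QUERY_LIMIT.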

清风明月 · May 23, 2025, 02:28

Try upgrading to a newer version and see.

Billmay表妹 · May 23, 2025, 03:21

Upgrade to v8.5.1.

艾贝克斯 · May 23, 2025, 05:09

Upgraded to v8.5.1 and tested again. It works now and takes effect. Thanks!

system · May 30, 2025, 05:10

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.