有条SQL执行了2个多小时,一直没有释放。直到手动KILL掉

【TiDB 使用环境】生产环境 /测试/ Poc

WITH   cfg_must as (
                select  cfg.id,cfg.sales_price_adj_sku_id,tpvsc.model_cfg_name_alias,cfg.model_cfg_opt_id as model_cfg_opt_id ,tmco.name as model_cfg_opt_name,cfg.model_cfg_id ,tmc.name as model_cfg_name,cfg.price_amount  from tm_sales_price_adj_sku_cfg  cfg left join tm_model_cfg tmc on cfg.model_cfg_id=tmc.id left join tm_model_cfg_opt tmco on cfg.model_cfg_opt_id =tmco.id
                left join tr_prod_vehicle_sku_cfg tpvsc  on tpvsc.id=cfg.prod_vehicle_sku_cfg_id 
)
select
 sku.id,
 sku.prod_vehicle_sku_id,
 sku.bare_amount,
 sku.bare_tax_amount,
 sku.sales_price_adj_id,
 sku.total_amount ,
 sku.vehicle_model_id,
 tspa.approve_time,
 tspa.effective_start_date,
 tspa.effective_end_date,
 tspa.tax_id,
.........
from tm_sales_price_adj_sku  sku left join tm_sales_price_adj tspa on sku.sales_price_adj_id=tspa.id 
left join  cfg_must cfg1 on sku.id=cfg1.sales_price_adj_sku_id  and cfg1.model_cfg_name ="内饰"
left join  cfg_must cfg2 on sku.id=cfg2.sales_price_adj_sku_id  and cfg2.model_cfg_name ="外饰"
left join  cfg_must cfg3 on sku.id=cfg3.sales_price_adj_sku_id  and cfg3.model_cfg_name ="轮毂"
left join  cfg_must cfg4 on sku.id=cfg4.sales_price_adj_sku_id  and cfg4.model_cfg_name ="轮胎"
left join  cfg_must cfg5 on sku.id=cfg5.sales_price_adj_sku_id  and cfg5.model_cfg_name ="座位数"
left join  cfg_must cfg6 on sku.id=cfg6.sales_price_adj_sku_id  and cfg6.model_cfg_name ="卡钳"
left join  cfg_must cfg7 on sku.id=cfg7.sales_price_adj_sku_id  and cfg7.model_cfg_name ="选装" and cfg7.model_cfg_name_alias  ="选装一"
left join  cfg_must cfg8 on sku.id=cfg8.sales_price_adj_sku_id  and cfg8.model_cfg_name ="选装" and cfg8.model_cfg_name_alias  ="选装二"
left join  cfg_must cfg9 on sku.id=cfg9.sales_price_adj_sku_id  and cfg9.model_cfg_name ="选装" and cfg9.model_cfg_name_alias  ="选装三"
left join  cfg_must cfg10 on sku.id=cfg10.sales_price_adj_sku_id  and cfg10.model_cfg_name ="选装" and cfg10.model_cfg_name_alias  ="选装四"
left join  cfg_must cfg11 on sku.id=cfg11.sales_price_adj_sku_id  and cfg11.model_cfg_name ="选装" and cfg11.model_cfg_name_alias  ="选装五"
left join  cfg_must cfg12 on sku.id=cfg12.sales_price_adj_sku_id  and cfg12.model_cfg_name ="选装" and cfg12.model_cfg_name_alias  ="选装六"
left join  cfg_must cfg13 on sku.id=cfg13.sales_price_adj_sku_id  and cfg13.model_cfg_name ="选装" and cfg13.model_cfg_name_alias  ="选装七"
where tspa.adjust_type =0  group by sku.prod_vehicle_sku_id  order by approve_time  desc

tidb节点的cpu都被打满了

查看日志看到执行超时,但是tidb节点的cpu资源一直没有释放

[txn_mode=PESSIMISTIC] [timestamp=461708067108814870] [err="[executor:1317]Query execution was interrupted\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\t/root/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20241219054535-6b8c588c3122/normalize.go:175\ngithub.com/pingcap/tidb/pkg/util/sqlkiller.(*SQLKiller).getKillError\n\t/workspace/source/tidb/pkg/util/sqlkiller/sqlkiller.go:74\ngithub.com/pingcap/tidb/pkg/util/sqlkiller.(*SQLKiller).HandleSignal\n\t/workspace/source/tidb/pkg/util/s
1 个赞

用执行计划看看,是不是哪里数据发散了?

看看锁信息试试?

1 个赞

这个第一时间,看等待事件

第二步看有没有阻塞

第三步,看锁情况

1 个赞

kill session

query the block session.

看了你的日志,有一个重复13次join,因为缺乏有效索引,导致tidb全表扫描、cpu打满。建议改用条件聚合合并join,并添加关键字段索引。

临时表中的SQL执行的就是全表扫描,就是要拉全部数据

kill了释放 资源

可以杀session

explain auto

看执行计划

大量join必然慢

1 个赞

是啊,超过3个join就应该考虑下表结构是否合适了

表结构,索引都需要重新设计

看执行计划

先看执行计划

收集统计信息