Left Join的ON中带between的条件时报 Out Of Memeory

【 TiDB 使用环境】生产环境
【 TiDB 版本】6.5.1
【表结构】

CREATE TABLE `result_product_brand_dimension` (
    `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增id',
    `shop_id` int(10) unsigned NOT NULL DEFAULT '0' COMMENT '店铺id',
    `num_iid` varchar(100) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '' COMMENT '平台商品id',
    `product_no` varchar(100) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '' COMMENT '货号',
    `start_date` date NOT NULL COMMENT '开始日期',
    `end_date` date NOT NULL DEFAULT '3000-01-01' COMMENT '结束日期',
    `add_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '系统新增时间',
    `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '系统更新时间',
    PRIMARY KEY (`shop_id`,`num_iid`,`start_date`) /*T![clustered_index] CLUSTERED */,
    UNIQUE KEY `idx_uniq_id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci AUTO_INCREMENT=1 COMMENT='num_iid品牌维度表'
CREATE TABLE `result_shop_item_day` (
    `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增id',
    `shop_id` int(11) NOT NULL DEFAULT '0' COMMENT '店铺id',
    `day` date NOT NULL DEFAULT '1970-01-01' COMMENT 'day',
    `num_iid` varchar(100) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '' COMMENT '平台商品id',
    PRIMARY KEY (`shop_id`,`day`,`num_iid`) /*T![clustered_index] CLUSTERED */,
    UNIQUE KEY `uniq__id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci AUTO_INCREMENT=1 COMMENT='本店商品日宽表'

【查询sql】

EXPLAIN ANALYZE SELECT a.shop_id
FROM result_shop_item_day a
    LEFT JOIN result_product_brand_dimension c ON a.shop_id=c.shop_id AND a.num_iid=c.num_iid AND a.day BETWEEN c.start_date AND c.end_date
WHERE a.shop_id IN (12,16) AND a.day BETWEEN '2024-07-03' AND '2024-08-01' LIMIT 4000

【问题】当limit 3000的时候,sql能正常执行,但是当是5000的时候,就会报异常
【附件:截图/日志/监控】
正常执行时的执行计划截图(LIMIT 3000):

执行报错时的截图:

【详细的执行计划】
*************************** 1. row ***************************
id: Limit_15
estRows: 3000.00
actRows: 3000
task: root
access object:
execution info: time:100.9ms, loops:4
operator info: offset:0, count:3000
memory: N/A
disk: N/A
*************************** 2. row ***************************
id: └─IndexJoin_22
estRows: 3000.00
actRows: 3000
task: root
access object:
execution info: time:100.9ms, loops:3, inner:{total:124.3ms, concurrency:5, task:3, construct:3.47ms, fetch:107ms, build:13.8ms}, probe:9ms
operator info: left outer join, inner:TableReader_18, outer key:datacenter.result_shop_item_day.shop_id, datacenter.result_shop_item_day.num_iid, inner key:datacenter.result_product_brand_dimension.shop_id, datacenter.result_product_brand_dimension.num_iid, equal cond:eq(datacenter.result_shop_item_day.num_iid, datacenter.result_product_brand_dimension.num_iid), eq(datacenter.result_shop_item_day.shop_id, datacenter.result_product_brand_dimension.shop_id), other cond:ge(datacenter.result_shop_item_day.day, datacenter.result_product_brand_dimension.start_date), le(datacenter.result_shop_item_day.day, datacenter.result_product_brand_dimension.end_date)
memory: 1.28 GB
disk: N/A
*************************** 3. row ***************************
id: ├─Limit_29(Build)
estRows: 3000.00
actRows: 3000
task: root
access object:
execution info: time:24.6ms, loops:6
operator info: offset:0, count:3000
memory: N/A
disk: N/A
*************************** 4. row ***************************
id: │ └─TableReader_33
estRows: 3000.00
actRows: 3712
task: root
access object:
execution info: time:24.6ms, loops:4, cop_task: {num: 4, max: 20.2ms, min: 922µs, avg: 6.18ms, p95: 20.2ms, max_proc_keys: 2016, p95_proc_keys: 2016, tot_proc: 4ms, tot_wait: 16ms, rpc_num: 4, rpc_time: 24.7ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 1}
operator info: data:Limit_32
memory: 177.1 KB
disk: N/A
*************************** 5. row ***************************
id: │ └─Limit_32
estRows: 3000.00
actRows: 3712
task: cop[tikv]
access object:
execution info: tikv_task:{proc max:3ms, min:0s, avg: 1ms, p80:3ms, p95:3ms, iters:18, tasks:4}, scan_detail: {total_process_keys: 3712, total_process_keys_size: 1134088, total_keys: 3716, get_snapshot_time: 16.7ms, rocksdb: {key_skipped_count: 3712, block: {cache_hit_count: 73, read_count: 4, read_byte: 12.0 KB, read_time: 2.28ms}}}
operator info: offset:0, count:3000
memory: N/A
disk: N/A
*************************** 6. row ***************************
id: │ └─TableRangeScan_31
estRows: 16531.54
actRows: 3712
task: cop[tikv]
access object: table:a
execution info: tikv_task:{proc max:3ms, min:0s, avg: 1ms, p80:3ms, p95:3ms, iters:18, tasks:4}
operator info: range:[12 2024-07-03,12 2024-08-01], [16 2024-07-03,16 2024-08-01], keep order:false
memory: N/A
disk: N/A
*************************** 7. row ***************************
id: └─TableReader_18(Probe)
estRows: 1072.85
actRows: 32620
task: root
access object:
execution info: time:83.3ms, loops:37, cop_task: {num: 75, max: 51.8ms, min: 368.6µs, avg: 3.25ms, p95: 9.1ms, max_proc_keys: 480, p95_proc_keys: 381, tot_proc: 82ms, tot_wait: 77ms, rpc_num: 75, rpc_time: 243.3ms, copr_cache_hit_ratio: 0.80, distsql_concurrency: 15}
operator info: data:Selection_17
memory: N/A
disk: N/A
*************************** 8. row ***************************
id: └─Selection_17
estRows: 1072.85
actRows: 32620
task: cop[tikv]
access object:
execution info: tikv_task:{proc max:100ms, min:1ms, avg: 24.1ms, p80:39ms, p95:66ms, iters:284, tasks:75}, scan_detail: {total_process_keys: 4360, total_process_keys_size: 1209086, total_keys: 22009, get_snapshot_time: 91.7ms, rocksdb: {delete_skipped_count: 157, key_skipped_count: 20499, block: {cache_hit_count: 6368, read_count: 120, read_byte: 947.4 KB, read_time: 45.3ms}}}
operator info: in(datacenter.result_product_brand_dimension.shop_id, 12, 16)
memory: N/A
disk: N/A
*************************** 9. row ***************************
id: └─TableRangeScan_16
estRows: 3000.00
actRows: 32620
task: cop[tikv]
access object: table:c
execution info: tikv_task:{proc max:100ms, min:1ms, avg: 24.1ms, p80:39ms, p95:65ms, iters:284, tasks:75}
operator info: range: decided by [eq(datacenter.result_product_brand_dimension.shop_id, datacenter.result_shop_item_day.shop_id) eq(datacenter.result_product_brand_dimension.num_iid, datacenter.result_shop_item_day.num_iid) ge(datacenter.result_shop_item_day.day, datacenter.result_product_brand_dimension.start_date)], keep order:false
memory: N/A
disk: N/A
9 rows in set (0.10 sec)

【初步排查】
基本上问题出在关联管子中带了 BETWEEN条件,后来查看了代码,发现代码逻辑中有这一行,感觉是有问题的

【诉求】
需要帮忙确认下最终原因,以及在后续版本问题是否有修正

1 个赞

执行计划结果没贴啊

调了下代码结构,贴了执行计划

https://github.com/pingcap/tidb/pull/47795 是个已知问题,后续版本已经修复了

1 个赞

报错内存不足

issue一直打不开,所以想问下,这个问题修复是在哪个版本?
然后我这边要解决这个问题的话,需要做什么处理?升级版本?

升级吧,v6.5.6 开始应该就修了,推荐直接升级到最新的 v6.5.x

此话题已在最后回复的 60 天后被自动关闭。不再允许新回复。