【 TiDB 使用环境】生产环境 /测试/ Poc
【 TiDB 版本】7.5.1
【复现路径】做过哪些操作出现的问题
【遇到的问题:问题现象及影响】
【资源配置】
SELECT * FROM TABLE_STORAGE_STATS
WHERE table_schema = ‘dataset_task’ AND table_name = ‘ptask’
我理解TABLE_STORAGE_STATS这里的数据大小是压缩前的数据量大小;
我理解的对么?每张表压缩后的大小在哪里可以取到
https://docs.pingcap.com/zh/tidb/stable/information-schema-table-storage-stats#table_storage_stats
TABLE_STORAGE_STATS表提供有关由存储引擎 (TiKV) 存储的表大小的信息。
这里说是从tikv获得的存储大小。而tikv的数据已经是压缩后的大小了。
参考这个sql ,用压缩比估算,一般来说压缩比3左右
SELECT
db_name,
table_name,
ROUND(SUM(total_size / cnt), 2) Approximate_Size,
ROUND(SUM(total_size / cnt / (SELECT
ROUND(AVG(VALUE), 2)
FROM
METRICS_SCHEMA.store_size_amplification
WHERE
VALUE > 0)),
2) Disk_Size
FROM
(SELECT
db_name,
table_name,
region_id,
SUM(Approximate_Size) total_size,
COUNT(*) cnt
FROM
information_schema.TIKV_REGION_STATUS
WHERE
db_name = ‘sbtest’
AND table_name IN (‘sbtest1’)
GROUP BY db_name , table_name , region_id) tabinfo
GROUP BY db_name , table_name;
这个应该是压缩前的数据:
mysql> SELECT
→ db_name,
→ table_name,
→ ROUND(SUM(total_size / cnt), 2) Approximate_Size,
→ ROUND(SUM(total_size / cnt / (SELECT
→ ROUND(AVG(VALUE), 2)
→ FROM
→ METRICS_SCHEMA.store_size_amplification
→ WHERE
→ VALUE > 0)),
→ 2) Disk_Size
→ FROM
→ (SELECT
→ db_name,
→ table_name,
→ region_id,
→ SUM(Approximate_Size) total_size,
→ COUNT(*) cnt
→ FROM
→ information_schema.TIKV_REGION_STATUS
→ WHERE
→ db_name = ‘insurance_test’
→ AND table_name IN (‘sbtest3’)
→ GROUP BY db_name , table_name , region_id) tabinfo
→ GROUP BY db_name , table_name;
±---------------±-----------±-----------------±----------+
| db_name | table_name | Approximate_Size | Disk_Size |
±---------------±-----------±-----------------±----------+
| insurance_test | sbtest3 | 297.00 | 191.61 |
±---------------±-----------±-----------------±----------+
1 row in set (0.03 sec)
mysql> select sum(table_size) from table_storage_stats where table_schema=‘insurance_test’ and table_name=‘sbtest3’;
±----------------+
| sum(table_size) |
±----------------+
| 297 |
±----------------+
1 row in set (0.00 sec)
这个语句可以查询压缩后的大小 按照索引区分的 用ai稍微改改 可以查出来你想要的所有数据 压缩比是按照每个region是所在的tikv节点的压缩率计算的 虽然不是非常精准 但是准确度应该已经挺高的了
SELECT
r.db_name,
r.table_name,
r.is_index,
r.index_name,
ROUND(SUM(r.approximate_size / NULLIF(COALESCE(m.value, 1), 0)), 2) AS total_real_size_MB
FROM
information_schema.tikv_region_status r
JOIN information_schema.tikv_region_peers p ON r.region_id = p.region_id
LEFT JOIN (
SELECT *
FROM METRICS_SCHEMA.store_size_amplification a
WHERE value > 0
AND time = (
SELECT MAX(time)
FROM METRICS_SCHEMA.store_size_amplification b
WHERE b.store = a.store AND b.value > 0
)
) m ON p.store_id = m.store
WHERE
r.db_name like 'xxx_%'
GROUP BY
r.db_name, r.table_name, r.is_index, r.index_name
ORDER BY
total_real_size_MB DESC;
dashboard有没有这方面的视图显示
– 逻辑大小(压缩前)
SELECT
total_size AS logical_size_bytes
FROM information_schema.table_storage_stats
WHERE table_schema = ‘dataset_task’ AND table_name = ‘ptask’;
– TiKV 物理大小(压缩后)
SELECT
SUM(approximate_size) * 1024 * 1024 AS tikv_physical_size_bytes
FROM information_schema.tikv_region_status
WHERE db_name = ‘dataset_task’ AND table_name = ‘ptask’;
– TiFlash 物理大小(压缩后)
SELECT
SUM(total_bytes) AS tiflash_physical_size_bytes
FROM information_schema.tiflash_table_size
WHERE database_name = ‘dataset_task’ AND table_name = ‘ptask’;
INFORMATION_SCHEMA.TABLE_STORAGE_STATS吗? 这个是不是要权限