TiKV nodes cannot all come up


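[Overview]
The cluster uses the minimal configuration. I imported a CSV of about 600 million rows using MySQL's native load csv method. After about 500 million rows had been imported, it reported "Transaction too large". Following related material, I set txn-size to 10g and restarted the cluster, after which some of the TiKV nodes went down.
I tried restarting the TiKV nodes manually. The command reported "successfully", but the nodes were in fact still down.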
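
For reference, a common alternative to raising the transaction size limit is to load the data in smaller pieces, so that each load is its own, smaller transaction. The following is only a generic sketch of that idea and is not what was done in this thread: the function name `split_csv`, the chunk file naming, and the assumption that the CSV has no header row are all hypothetical.

```python
import csv
import itertools


def split_csv(src_path, rows_per_chunk, dst_pattern="chunk_{:05d}.csv"):
    """Split a headerless CSV into files of at most rows_per_chunk rows.

    Returns the list of chunk file paths, in order.
    dst_pattern is a hypothetical naming scheme; adjust it as needed.
    """
    paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        for index in itertools.count():
            # Read the next batch lazily so the whole file is never in memory.
            batch = list(itertools.islice(reader, rows_per_chunk))
            if not batch:
                break
            path = dst_pattern.format(index)
            with open(path, "w", newline="") as dst:
                csv.writer(dst).writerows(batch)
            paths.append(path)
    return paths
```

Each resulting chunk could then be imported separately with the same load statement that was used for the full file.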


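Cluster status (see screenshot)

Corresponding Grafana panels (see screenshot)

I have restarted the cluster several times. The TiKV node states differ after every restart, but there is a pattern: nodes with 0 leaders can come up, while the others either go down immediately or go down after being disconnected for 30 minutes.

I skimmed the logs and did not find any ERROR entries.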

[Background] Operations performed

[Symptoms] Application and database symptoms

[Business impact]

[TiDB version]
v4.0.6

[Attachments]
Logs are at the link below
Link: 百度网盘 (Baidu Netdisk) - link does not exist  Password: fbp9

Store information:
store
{
  "count": 3,
  "stores": [
    {
      "store": {
        "id": 1,
        "address": "10.12.5.107:20160",
        "version": "4.0.6",
        "status_address": "10.12.5.107:20180",
        "git_hash": "ca2475bfbcb49a7c34cf783596acb3edd05fc88f",
        "start_timestamp": 1623638194,
        "deploy_path": "/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1623637902297399861,
        "state_name": "Down"
      },
      "status": {
        "capacity": "320TiB",
        "available": "289.8TiB",
        "used_size": "673.5GiB",
        "leader_count": 0,
        "leader_weight": 1,
        "leader_score": 0,
        "leader_size": 0,
        "region_count": 14688,
        "region_weight": 1,
        "region_score": 268944,
        "region_size": 268944,
        "start_ts": "2021-06-13T22:36:34-04:00",
        "last_heartbeat_ts": "2021-06-13T22:31:42.297399861-04:00"
      }
    },
    {
      "store": {
        "id": 4,
        "address": "10.12.5.105:20160",
        "version": "4.0.6",
        "status_address": "10.12.5.105:20180",
        "git_hash": "ca2475bfbcb49a7c34cf783596acb3edd05fc88f",
        "start_timestamp": 1623638193,
        "deploy_path": "/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1623634031896099478,
        "state_name": "Down"
      },
      "status": {
        "capacity": "320TiB",
        "available": "289.8TiB",
        "used_size": "673.8GiB",
        "leader_count": 1439,
        "leader_weight": 1,
        "leader_score": 1439,
        "leader_size": 133695,
        "region_count": 14688,
        "region_weight": 1,
        "region_score": 268944,
        "region_size": 268944,
        "start_ts": "2021-06-13T22:36:33-04:00",
        "last_heartbeat_ts": "2021-06-13T21:27:11.896099478-04:00"
      }
    },
    {
      "store": {
        "id": 5,
        "address": "10.12.5.106:20160",
        "version": "4.0.6",
        "status_address": "10.12.5.106:20180",
        "git_hash": "ca2475bfbcb49a7c34cf783596acb3edd05fc88f",
        "start_timestamp": 1623638213,
        "deploy_path": "/tidb-deploy/tikv-20160/bin",
        "last_heartbeat": 1623642338014175666,
        "state_name": "Up"
      },
      "status": {
        "capacity": "320TiB",
        "available": "289.8TiB",
        "used_size": "673.8GiB",
        "leader_count": 1448,
        "leader_weight": 1,
        "leader_score": 1448,
        "leader_size": 135249,
        "region_count": 14688,
        "region_weight": 1,
        "region_score": 268944,
        "region_size": 268944,
        "start_ts": "2021-06-13T22:36:53-04:00",
        "last_heartbeat_ts": "2021-06-13T23:45:38.014175666-04:00",
        "uptime": "1h8m45.014175666s"
      }
    }
  ]
}
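
The store output above is easier to compare when reduced to a few fields per store. The sketch below is only an illustration: it embeds a trimmed copy of the three stores (only the fields it uses), and the helper name `summarize_stores` is hypothetical. It relies on the observation, visible in the output itself, that `last_heartbeat` is in nanoseconds while `start_timestamp` is in seconds, and checks whether each store has sent a heartbeat since its latest start.

```python
import json

# Trimmed copy of the store output above (only the fields used below).
STORES_JSON = """
{
  "count": 3,
  "stores": [
    {"store": {"id": 1, "start_timestamp": 1623638194,
               "last_heartbeat": 1623637902297399861, "state_name": "Down"},
     "status": {"leader_count": 0, "region_count": 14688}},
    {"store": {"id": 4, "start_timestamp": 1623638193,
               "last_heartbeat": 1623634031896099478, "state_name": "Down"},
     "status": {"leader_count": 1439, "region_count": 14688}},
    {"store": {"id": 5, "start_timestamp": 1623638213,
               "last_heartbeat": 1623642338014175666, "state_name": "Up"},
     "status": {"leader_count": 1448, "region_count": 14688}}
  ]
}
"""


def summarize_stores(doc):
    """Return one small dict per store: id, state, leaders, heartbeat check."""
    rows = []
    for item in doc["stores"]:
        store, status = item["store"], item["status"]
        # last_heartbeat is in nanoseconds; start_timestamp is in seconds.
        last_heartbeat_s = store["last_heartbeat"] // 1_000_000_000
        rows.append({
            "id": store["id"],
            "state": store["state_name"],
            "leaders": status["leader_count"],
            "heartbeat_after_start": last_heartbeat_s >= store["start_timestamp"],
        })
    return rows


rows = summarize_stores(json.loads(STORES_JSON))
for row in rows:
    print(row)
```

In this snapshot, the two Down stores (1 and 4) have a last heartbeat older than their latest start time, while the Up store (5) has reported since starting.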


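Could you try restarting the cluster once more? The logs show that the nodes cannot connect to each other, so I would like to start it again and see what happens.

I did not do anything. I just checked and it has suddenly recovered. Thanks.

It was probably brought back up automatically by the system.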
