[Operation Guide] AWS TiDB deployment: scaling in tidb-server

[TiDB environment] Production / Test / PoC
[TiDB version] 8.5.4
[Operating system] Amazon Linux 2023
[Deployment method] AWS
[Cluster data volume] 30 GB
[Number of cluster nodes] 8

This time I plan to reduce the two tidb-server instances to one, so that a CDC component can be installed.

  1. Check the node IDs
    [ec2-user@tidb-adm ~]$ tiup cluster display tidb-test
    Cluster type: tidb
    Cluster name: tidb-test
    Cluster version: v8.5.4
    Deploy user: tidb
    SSH type: builtin
    Dashboard URL: http://172.244.0.4:2379/dashboard
    Dashboard URLs: http://172.244.0.4:2379/dashboard
    Grafana URL: http://172.244.0.76:3000
    ID Role Host Ports OS/Arch Status Data Dir Deploy Dir

172.244.0.76:9093 alertmanager 172.244.0.76 9093/9094 linux/x86_64 Up /tidb-data/alertmanager-9093 /tidb-deploy/alertmanager-9093
172.244.0.83:8300 cdc 172.244.0.83 8300 linux/x86_64 Up /tidb-data/cdc-8300 /tidb-deploy/cdc-8300
172.244.0.76:3000 grafana 172.244.0.76 3000 linux/x86_64 Up - /tidb-deploy/grafana-3000
172.244.0.109:2379 pd 172.244.0.109 2379/2380 linux/x86_64 Up|L /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.244.0.4:2379 pd 172.244.0.4 2379/2380 linux/x86_64 Up|UI /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.244.0.62:2379 pd 172.244.0.62 2379/2380 linux/x86_64 Up /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.244.0.76:9090 prometheus 172.244.0.76 9090/9115/9100/12020 linux/x86_64 Up /tidb-data/prometheus-9090 /tidb-deploy/prometheus-9090
172.244.0.33:4000 tidb 172.244.0.33 4000/10080 linux/x86_64 Up - /tidb-deploy/tidb-4000
172.244.0.83:4000 tidb 172.244.0.83 4000/10080 linux/x86_64 Up - /tidb-deploy/tidb-4000
172.244.0.99:9000 tiflash 172.244.0.99 9000/3930/20170/20292/8234/8123 linux/x86_64 Up /tidb-data/tiflash-9000 /tidb-deploy/tiflash-9000
172.244.0.122:20160 tikv 172.244.0.122 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
172.244.0.72:20160 tikv 172.244.0.72 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
172.244.0.95:20160 tikv 172.244.0.95 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
Total nodes: 13
[ec2-user@tidb-adm ~]$
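
With a longer node list it can help to pull out just the IDs for one role instead of reading the whole table. Below is a minimal sketch, assuming the display output is whitespace-separated with the ID in the first column and the Role in the second, as in the listing above. The function name `list_role_nodes` is hypothetical, not part of TiUP.

```shell
# list_role_nodes ROLE
# Reads `tiup cluster display` output on stdin and prints the ID column of
# every row whose Role column equals ROLE. Header lines such as
# "Cluster type: tidb" are skipped because their second field never equals
# a role name.
list_role_nodes() {
  awk -v role="$1" '$2 == role { print $1 }'
}

# Usage (cluster name taken from this post):
#   tiup cluster display tidb-test | list_role_nodes tidb
```

For the cluster above this should print the two tidb-server IDs, one of which is then passed to `--node` in the next step.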

  2. Run the scale-in operation
    tiup cluster scale-in tidb-test --node 172.244.0.83:4000
    The --node parameter is the ID of the node to take offline; 4000 is the tidb port.
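
Before running the command above, it may be worth checking that the target ID really is a tidb node and that at least one other tidb-server will remain afterwards. The following is only a sketch under the same column-layout assumption as the display output in step 1. `safe_to_remove` is a hypothetical helper, not a TiUP feature.

```shell
# safe_to_remove ROLE NODE_ID
# Reads `tiup cluster display` output on stdin. Exits 0 only if NODE_ID is
# listed with role ROLE and at least two instances of ROLE exist, i.e. one
# would still remain after the scale-in. Exits 1 otherwise.
safe_to_remove() {
  awk -v role="$1" -v node="$2" '
    $2 == role { total++; if ($1 == node) found = 1 }
    END { exit !(found && total >= 2) }
  '
}

# Usage (names taken from this post):
#   if tiup cluster display tidb-test | safe_to_remove tidb 172.244.0.83:4000; then
#     tiup cluster scale-in tidb-test --node 172.244.0.83:4000
#   fi
```

The check counts instances only; it does not look at their Status, so an instance that is listed but down would still be counted.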

The expected output contains the message Scaled cluster tidb-test in successfully, which indicates that the scale-in succeeded.
[ec2-user@tidb-adm ~]$ tiup cluster scale-in tidb-test --node 172.244.0.83:4000
This operation will delete the 172.244.0.83:4000 nodes in tidb-test and all their data.
Do you want to continue? [y/N]:(default=N) y
Scale-in nodes…

  • [ Serial ] - SSHKeySet: privateKey=/home/ec2-user/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa, publicKey=/home/ec2-user/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa.pub
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.72
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.95
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.4
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.62
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.33
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.99
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.122
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.76
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.76
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.76
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.83
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.83
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.109
  • [ Serial ] - ClusterOperate: operation=DestroyOperation, options={Roles: Nodes:[172.244.0.83:4000] Force:false SSHTimeout:5 OptTimeout:120 APITimeout:600 IgnoreConfigCheck:false NativeSSH:false SSHType: Concurrency:5 SSHProxyHost: SSHProxyPort:22 SSHProxyUser:ec2-user SSHProxyIdentity:/home/ec2-user/.ssh/id_rsa SSHProxyUsePassword:false SSHProxyTimeout:5 SSHCustomScripts:{BeforeRestartInstance:{Raw:} AfterRestartInstance:{Raw:}} CleanupData:false CleanupLog:false CleanupAuditLog:false RetainDataRoles: RetainDataNodes: DisplayMode:default Operation:StartOperation}
    Stopping component tidb
    Stopping instance 172.244.0.83
    Stop tidb 172.244.0.83:4000 success
    Destroying component tidb
    Destroying instance 172.244.0.83
    Destroy 172.244.0.83 finished
  • Destroy tidb paths: [/tidb-deploy/tidb-4000/log /tidb-deploy/tidb-4000 /etc/systemd/system/tidb-4000.service]
  • [ Serial ] - UpdateMeta: cluster=tidb-test, deleted='172.244.0.83:4000'
  • [ Serial ] - UpdateTopology: cluster=tidb-test
  • [ Serial ] - SSHKeySet: privateKey=/home/ec2-user/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa, publicKey=/home/ec2-user/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa.pub
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.72
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.95
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.33
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.99
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.83
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.76
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.76
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.76
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.4
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.109
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.62
  • [Parallel] - UserSSH: user=tidb, host=172.244.0.122
  • Refresh instance configs
    • Generate config pd → 172.244.0.4:2379 … Done
    • Generate config pd → 172.244.0.109:2379 … Done
    • Generate config pd → 172.244.0.62:2379 … Done
    • Generate config tikv → 172.244.0.122:20160 … Done
    • Generate config tikv → 172.244.0.72:20160 … Done
    • Generate config tikv → 172.244.0.95:20160 … Done
    • Generate config tidb → 172.244.0.33:4000 … Done
    • Generate config tiflash → 172.244.0.99:9000 … Done
    • Generate config cdc → 172.244.0.83:8300 … Done
    • Generate config prometheus → 172.244.0.76:9090 … Done
    • Generate config grafana → 172.244.0.76:3000 … Done
    • Generate config alertmanager → 172.244.0.76:9093 … Done
  • Reload prometheus and grafana
    • Reload prometheus → 172.244.0.76:9090 … Done
    • Reload grafana → 172.244.0.76:3000 … Done
Scaled cluster tidb-test in successfully
[ec2-user@tidb-adm ~]$
[ec2-user@tidb-adm ~]$ tiup cluster display tidb-test
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v8.5.4
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://172.244.0.4:2379/dashboard
Dashboard URLs: http://172.244.0.4:2379/dashboard
Grafana URL: http://172.244.0.76:3000
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir

172.244.0.76:9093 alertmanager 172.244.0.76 9093/9094 linux/x86_64 Up /tidb-data/alertmanager-9093 /tidb-deploy/alertmanager-9093
172.244.0.83:8300 cdc 172.244.0.83 8300 linux/x86_64 Up /tidb-data/cdc-8300 /tidb-deploy/cdc-8300
172.244.0.76:3000 grafana 172.244.0.76 3000 linux/x86_64 Up - /tidb-deploy/grafana-3000
172.244.0.109:2379 pd 172.244.0.109 2379/2380 linux/x86_64 Up|L /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.244.0.4:2379 pd 172.244.0.4 2379/2380 linux/x86_64 Up|UI /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.244.0.62:2379 pd 172.244.0.62 2379/2380 linux/x86_64 Up /tidb-data/pd-2379 /tidb-deploy/pd-2379
172.244.0.76:9090 prometheus 172.244.0.76 9090/9115/9100/12020 linux/x86_64 Up /tidb-data/prometheus-9090 /tidb-deploy/prometheus-9090
172.244.0.33:4000 tidb 172.244.0.33 4000/10080 linux/x86_64 Up - /tidb-deploy/tidb-4000
172.244.0.99:9000 tiflash 172.244.0.99 9000/3930/20170/20292/8234/8123 linux/x86_64 Up /tidb-data/tiflash-9000 /tidb-deploy/tiflash-9000
172.244.0.122:20160 tikv 172.244.0.122 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
172.244.0.72:20160 tikv 172.244.0.72 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
172.244.0.95:20160 tikv 172.244.0.95 20160/20180 linux/x86_64 Up /tidb-data/tikv-20160 /tidb-deploy/tikv-20160
Total nodes: 12
[ec2-user@tidb-adm ~]$
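
The before/after comparison above (13 nodes down to 12, with 172.244.0.83:4000 gone) can also be checked mechanically. This is a rough sketch, assuming the display output keeps the `Total nodes: N` summary line and the ID in the first column. `node_listed` and `total_nodes` are hypothetical helper names.

```shell
# node_listed NODE_ID
# Reads `tiup cluster display` output on stdin; exits 0 if NODE_ID appears in
# the ID column, 1 otherwise.
node_listed() {
  awk -v node="$1" '$1 == node { found = 1 } END { exit !found }'
}

# total_nodes
# Reads the same output and prints N from the "Total nodes: N" summary line.
total_nodes() {
  awk '$1 == "Total" && $2 == "nodes:" { print $3 }'
}

# Usage (names taken from this post):
#   tiup cluster display tidb-test | total_nodes
#   tiup cluster display tidb-test | node_listed 172.244.0.83:4000 \
#     || echo "172.244.0.83:4000 is no longer in the topology"
```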

Is this an operation walkthrough shared for others to reference? You are welcome to post it in the 互助交流区 (mutual-help discussion) category.

OK, I'll know for next time!

Will scaling in affect the business workload?

The original poster's approach is neat; it cleared up a question I had.

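In newer versions, if an existing connection is running a query while a production cluster is being scaled in, can it be migrated automatically to another node so the query continues?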

No impact; there is still one node left.

Giving this a manual thumbs-up.

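Scaling in TiDB on AWS (taking TiKV as an example): run the scale-in command with TiUP and wait for the node to become Tombstone, at which point the scale-in is complete. Avoid peak hours, keep the replica count ≥ 3, and keep the deployment balanced across AZs.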
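
The "wait for Tombstone" step in the reply above can be sketched as a small filter over the display output. This is only an illustration, assuming the Status value is the sixth whitespace-separated column as in the listings in the original post. `tombstone_nodes` is a hypothetical helper name.

```shell
# tombstone_nodes
# Reads `tiup cluster display` output on stdin and prints the ID of every row
# whose Status column (assumed to be field 6) is exactly "Tombstone".
tombstone_nodes() {
  awk '$6 == "Tombstone" { print $1 }'
}

# Usage (cluster name taken from the original post):
#   tiup cluster display tidb-test | tombstone_nodes
#   tiup cluster prune tidb-test   # cleans up nodes that are in Tombstone state
```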

It took me a while to realize this is an operation manual.

Was this done in a test or a production environment? Congratulations on getting through it smoothly!