Background
As business requirements change, you may frequently need to scale the TiKV or TiDB Server nodes of a TiDB cluster in or out. This post records, in a virtual machine environment, the complete process of installing, scaling, and upgrading a TiDB v5.3.0 database, as well as deploying TiSpark, as a reference for anyone who needs it.

Installing TiDB v5.3.0
To make the later upgrade test easier, a relatively old version, TiDB v5.3.0, was installed.
Installation environment
OS version: CentOS 7.9
TiDB version: TiDB v5.3.0
TiDB installation topology:
PD: 3 nodes
TiDB: 1 node (the minimum node count was chosen here so that a TiDB node scale-out can be performed later)
TiKV: 3 nodes
PS: The minimum number of nodes for each of PD, TiDB, and TiKV is 1.
System configuration
Some preparation is required before installing TiDB: creating the installation directories, creating the installation user, and tuning OS parameters.
Mount the data disk
Check the data disk (/dev/sdb).
fdisk -l
Create the partition.
parted -s -a optimal /dev/sdb mklabel gpt -- mkpart primary ext4 1 -1
Format the file system.
mkfs.ext4 /dev/sdb1
Check the UUID of the data disk partition.
# lsblk -f
NAME FSTYPE LABEL UUID MOUNTPOINT
sda
├─sda1 xfs 278ae69e-30a6-40ca-a764-cd8862ef1527 /boot
└─sda2 LVM2_member tIRIrI-soRh-5TBf-q6tn-cMTd-DMdZ-FucCk7
├─centos-root xfs 13d08fa9-88bc-4b0a-bcab-4e1c1072afd2 /
└─centos-swap swap 488f568c-b72f-414e-9394-0765cbb9e5a2 [SWAP]
sdb
└─sdb1 ext4 87d0467c-d3a1-4916-8112-aed259bf8c8c
In this example, the UUID of sdb1 is 87d0467c-d3a1-4916-8112-aed259bf8c8c.
Edit the /etc/fstab file and add the nodelalloc mount option.
# /etc/fstab
# Created by anaconda on Wed Apr 7 15:27:42 2021
#
# Accessible filesystems, by reference, are maintained under /dev/disk
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=278ae69e-30a6-40ca-a764-cd8862ef1527 /boot xfs defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
UUID=87d0467c-d3a1-4916-8112-aed259bf8c8c /u02 ext4 defaults,nodelalloc,noatime 0 2
Mount the data disk.
mkdir /u02 && mount -a
Run the following command; if the file system is ext4 and the mount options include nodelalloc, the option has taken effect.
# mount -t ext4
/dev/sdb1 on /u02 type ext4 (rw,noatime,nodelalloc,data=ordered)
Disable the firewall
Check the firewall status (using CentOS Linux release 7.7.1908 (Core) as an example)
sudo firewall-cmd --state
sudo systemctl status firewalld.service
Stop the firewall service
sudo systemctl stop firewalld.service
Disable the firewall from starting automatically
sudo systemctl disable firewalld.service
Check the firewall status
sudo systemctl status firewalld.service
Enable clock synchronization
Check the chronyd service status
[root@cen7-pg-01 ~]# systemctl status chronyd.service
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:chronyd(8)
man:chrony.conf(5)
[root@cen7-pg-01 ~]# chronyc tracking
506 Cannot talk to daemon
Configure the chronyd time source and enable clock synchronization on all nodes
# Edit the time-sync configuration file:
# vi /etc/chrony.conf
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.56.10 iburst
# On the node acting as the NTP source (192.168.56.10 here), also edit /etc/chrony.conf to allow NTP client access from the other nodes (chrony's allow directive). Then start chronyd on every node:
# systemctl start chronyd.service
# systemctl status chronyd.service
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; disabled; vendor preset: enabled)
Active: active (running) since Fri 2022-07-22 14:39:39 CST; 4s ago
Docs: man:chronyd(8)
man:chrony.conf(5)
Process: 5505 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
Process: 5501 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 5504 (chronyd)
Tasks: 1
CGroup: /system.slice/chronyd.service
└─5504 /usr/sbin/chronyd
Jul 22 14:39:39 cen7-mysql-01 systemd[1]: Stopped NTP client/server.
Jul 22 14:39:39 cen7-mysql-01 systemd[1]: Starting NTP client/server...
Jul 22 14:39:39 cen7-mysql-01 chronyd[5504]: chronyd version 3.4 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +SECHASH +IPV6 +DEBUG)
Jul 22 14:39:39 cen7-mysql-01 chronyd[5504]: Frequency 0.156 +/- 235.611 ppm read from /var/lib/chrony/drift
Jul 22 14:39:39 cen7-mysql-01 systemd[1]: Started NTP client/server.
Jul 22 14:39:44 cen7-mysql-01 chronyd[5504]: Selected source 192.168.56.10
-- Check the system clock synchronization status; Leap status = Normal means the clock is synchronized properly
[root@cen7-mysql-01 ~]# chronyc tracking
Reference ID : C0A8380A (cen7-mysql-01)
Stratum : 11
Ref time (UTC) : Fri Jul 22 06:39:43 2022
System time : 0.000000000 seconds fast of NTP time
Last offset : -0.000003156 seconds
RMS offset : 0.000003156 seconds
Frequency : 0.107 ppm fast
Residual freq : -0.019 ppm
Skew : 250.864 ppm
Root delay : 0.000026199 seconds
Root dispersion : 0.001826834 seconds
Update interval : 0.0 seconds
Leap status : Normal
-- Enable chronyd to start automatically at boot
# systemctl enable chronyd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/chronyd.service to /usr/lib/systemd/system/chronyd.service.
Optimize system parameters
For a production TiDB deployment, it is recommended to tune the operating system by following the official documentation: TiDB Environment and System Configuration Check | PingCAP Docs.
Since this installation is a test in a VM environment, only transparent huge pages (THP) were disabled:
-- Check whether THP is disabled on the OS:
# cat /sys/kernel/mm/transparent_hugepage/enabled
# cat /sys/kernel/mm/transparent_hugepage/defrag
If the output is:
[always] madvise never
then THP is enabled and needs to be disabled.
-- Disable THP at runtime:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
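After running the two echo commands, re-checking the same files should show never selected. A sketch of the expected output (exact formatting may vary slightly by kernel):
# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
# cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]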
-- Disable THP permanently:
# vi /etc/rc.d/rc.local
-- Append the following at the end of the file
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
# chmod +x /etc/rc.d/rc.local
Create the tidb user
# groupadd tidb
# useradd -g tidb tidb
# passwd tidb
Changing password for user tidb.
New password: -- enter the password for the tidb user
BAD PASSWORD: The password is shorter than 8 characters
Retype new password: -- enter the password again
passwd: all authentication tokens updated successfully.
-- Create the database installation directories and change their ownership
# mkdir /u02/tidb-data
# mkdir /u02/tidb-deploy
# chown -R tidb: /u02
Modify sysctl parameters
-- Edit /etc/sysctl.conf to set the OS kernel parameters
# echo "fs.file-max = 1000000">> /etc/sysctl.conf
# echo "net.core.somaxconn = 32768">> /etc/sysctl.conf
# echo "net.ipv4.tcp_tw_recycle = 0">> /etc/sysctl.conf
# echo "net.ipv4.tcp_syncookies = 0">> /etc/sysctl.conf
# echo "vm.overcommit_memory = 1">> /etc/sysctl.conf
-- Apply the changes
# sysctl -p
Configure the limits.conf file for the tidb user
# cat << EOF >>/etc/security/limits.conf
tidb soft nofile 1000000
tidb hard nofile 1000000
tidb soft stack 32768
tidb hard stack 32768
EOF
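To confirm the new limits apply to the tidb user, a quick check from a fresh tidb session (a minimal sketch; the expected values match the settings above):
# su - tidb -c 'ulimit -n; ulimit -s'
1000000
32768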
Install TiDB
Deploy the TiUP component for the offline environment
-- Extract the TiDB installation packages
# tar -zxf tidb-community-server-v5.3.0-linux-amd64.tar.gz
# tar -zxf tidb-community-toolkit-v5.3.0-linux-amd64.tar.gz
# chown -R tidb: /u01/soft/ti*
# su - tidb
-- Configure the TiDB-related environment variables
$ cd /u01/soft/tidb-community-server-v5.3.0-linux-amd64/
$ sh ./local_install.sh && source /home/tidb/.bash_profile
Disable telemetry success
Successfully set mirror to /u01/soft/tidb-community-server-v5.3.0-linux-amd64
Detected shell: bash
Shell profile: /home/tidb/.bash_profile
/home/tidb/.bash_profile has been modified to to add tiup to PATH
open a new terminal or source /home/tidb/.bash_profile to use it
Installed path: /home/tidb/.tiup/bin/tiup
===============================================
1. source /home/tidb/.bash_profile
2. Have a try: tiup playground
===============================================
$ which tiup
~/.tiup/bin/tiup
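Optionally, you can verify that the local offline mirror is in effect by listing what it provides (a quick sketch; with the offline mirror configured, only the components and versions packaged in the v5.3.0 offline image should appear):
$ tiup list
$ tiup list tidb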
Create the installation topology file
$ vi topology.yaml
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/u02/tidb-deploy"
  data_dir: "/u02/tidb-data"
#server_configs: {}
pd_servers:
  - host: 192.168.56.10
  - host: 192.168.56.11
  - host: 192.168.56.12
tidb_servers:
  - host: 192.168.56.10
tikv_servers:
  - host: 192.168.56.10
  - host: 192.168.56.11
  - host: 192.168.56.12
monitoring_servers:
  - host: 192.168.56.10
grafana_servers:
  - host: 192.168.56.10
alertmanager_servers:
  - host: 192.168.56.10
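The commented-out server_configs block can be used to set component-level parameters at deploy time. A hedged example is shown below; the keys come from the official topology examples, and the values are purely illustrative and were not used in this installation:
server_configs:
  tidb:
    log.slow-threshold: 300
  tikv:
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    replication.location-labels: ["host"]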
Install the TiDB cluster
TiDB is installed here using the TiUP offline deployment method:
-- Check the installation environment
$ tiup cluster check ./topology.yaml --user root -p
Detailed check results omitted
The end of the log shows which check items did not pass
-- Fix the issues found in the system environment
$ tiup cluster check ./topology.yaml --apply --user root -p
Detailed repair output omitted
If every "- Apply change on <IP>" entry at the end of the log shows Done, the repair succeeded
-- Deploy the TiDB cluster
$ tiup cluster deploy tidb-test v5.3.0 ./topology.yaml --user root -p
+ Detect CPU Arch
- Detecting node 192.168.56.10 ... Done
- Detecting node 192.168.56.11 ... Done
- Detecting node 192.168.56.12 ... Done
Please confirm your topology:
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v5.3.0
Role Host Ports OS/Arch Directories
---- ---- ----- ------- -----------
pd 192.168.56.10 2379/2380 linux/x86_64 /u02/tidb-deploy/pd-2379,/u02/tidb-data/pd-2379
pd 192.168.56.11 2379/2380 linux/x86_64 /u02/tidb-deploy/pd-2379,/u02/tidb-data/pd-2379
pd 192.168.56.12 2379/2380 linux/x86_64 /u02/tidb-deploy/pd-2379,/u02/tidb-data/pd-2379
tikv 192.168.56.10 20160/20180 linux/x86_64 /u02/tidb-deploy/tikv-20160,/u02/tidb-data/tikv-20160
tikv 192.168.56.11 20160/20180 linux/x86_64 /u02/tidb-deploy/tikv-20160,/u02/tidb-data/tikv-20160
tikv 192.168.56.12 20160/20180 linux/x86_64 /u02/tidb-deploy/tikv-20160,/u02/tidb-data/tikv-20160
tidb 192.168.56.10 4000/10080 linux/x86_64 /u02/tidb-deploy/tidb-4000
prometheus 192.168.56.10 9090 linux/x86_64 /u02/tidb-deploy/prometheus-9090,/u02/tidb-data/prometheus-9090
grafana 192.168.56.10 3000 linux/x86_64 /u02/tidb-deploy/grafana-3000
alertmanager 192.168.56.10 9093/9094 linux/x86_64 /u02/tidb-deploy/alertmanager-9093,/u02/tidb-data/alertmanager-9093
Attention:
1. If the topology is not what you expected, check your yaml file.
...... intermediate deployment logs omitted
Cluster `tidb-test` deployed successfully, you can start it with command: `tiup cluster start tidb-test`
Seeing "deployed successfully" on the last line of the installation log indicates the installation succeeded
Start the TiDB cluster
-- List the TiDB clusters currently managed on this host
$ tiup cluster list
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster list
Name User Version Path PrivateKey
---- ---- ------- ---- ----------
tidb-test tidb v5.3.0 /home/tidb/.tiup/storage/cluster/clusters/tidb-test /home/tidb/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa
-- Start the tidb-test cluster
$ tiup cluster start tidb-test
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster start tidb-test
Starting cluster tidb-test...
...... intermediate logs omitted
+ [ Serial ] - UpdateTopology: cluster=tidb-test
Started cluster `tidb-test` successfully
-- Check the cluster status
$ tiup cluster display tidb-test
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster display tidb-test
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v5.3.0
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://192.168.56.11:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.56.10:9093 alertmanager 192.168.56.10 9093/9094 linux/x86_64 Up /u02/tidb-data/alertmanager-9093 /u02/tidb-deploy/alertmanager-9093
192.168.56.10:3000 grafana 192.168.56.10 3000 linux/x86_64 Up - /u02/tidb-deploy/grafana-3000
192.168.56.10:2379 pd 192.168.56.10 2379/2380 linux/x86_64 Up|L /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.11:2379 pd 192.168.56.11 2379/2380 linux/x86_64 Up|UI /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.12:2379 pd 192.168.56.12 2379/2380 linux/x86_64 Up /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.10:9090 prometheus 192.168.56.10 9090 linux/x86_64 Up /u02/tidb-data/prometheus-9090 /u02/tidb-deploy/prometheus-9090
192.168.56.10:4000 tidb 192.168.56.10 4000/10080 linux/x86_64 Up - /u02/tidb-deploy/tidb-4000
192.168.56.10:20160 tikv 192.168.56.10 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
192.168.56.11:20160 tikv 192.168.56.11 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
192.168.56.12:20160 tikv 192.168.56.12 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
Total nodes: 10
The cluster status above is normal.
At this point, the TiDB cluster installation is complete.
We can now use the mysql client tool to connect to the TiDB database and run database operations.
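A minimal connection sketch follows (it assumes the TiDB server at 192.168.56.10:4000 from the topology above and the default root user, whose password is empty right after deployment; adjust host and credentials for your environment):
$ mysql -h 192.168.56.10 -P 4000 -u root -p
mysql> SELECT tidb_version();
mysql> SHOW DATABASES;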
Summary of the TiDB cluster deployment
Installing TiDB v5.3.0 on CentOS 7.9 can be completed smoothly by following the steps in the official documentation:
Software and Hardware Recommendations | PingCAP Docs
Note that the environment and system configuration check steps need to be adapted to the actual installation environment.
Appendix: an issue encountered while installing TiDB v5.3.0
Merging the server and toolkit offline mirrors as described in the official documentation failed with an error, but the installation completed successfully even without merging them.
The error message was as follows:
$ cd /u01/soft/tidb-community-server-v5.3.0-linux-amd64/
$ cp -rp keys ~/.tiup/
$ tiup mirror merge ../tidb-community-toolkit-v5.3.0-linux-amd64
Error: resource snapshot.json: not found
Scaling the TiDB cluster
A TiDB cluster can be scaled out and scaled in without interrupting online services.
The scaling operations follow the official documentation: Scale a TiDB Cluster Using TiUP | PingCAP Docs
Scale out:
Edit the scale-out configuration file to add one TiDB server node and one TiKV node.
Note: if the scale-out topology contains a host that is new to the TiDB cluster, the new node must be configured before scaling out, following the system configuration steps in the installation section above; a condensed recap for the new host is sketched below.
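A minimal prep sketch for the new TiKV host (assumed here to be 192.168.56.150, with its data disk already formatted and mounted at /u02 as shown earlier; the firewall, THP, chronyd, sysctl, and limits.conf steps from above apply to it as well):
# groupadd tidb && useradd -g tidb tidb && passwd tidb
# mkdir -p /u02/tidb-data /u02/tidb-deploy
# chown -R tidb: /u02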
vi scale-out.yaml
tidb_servers:
  - host: 192.168.56.11
tikv_servers:
  - host: 192.168.56.150
Perform the scale-out
-- Use the tiup cluster scale-out command to scale out the TiDB cluster
$ tiup cluster scale-out tidb-test ./scale-out.yaml -p -u root
Starting component `cluster`. Done
... intermediate logs omitted
+ [ Serial ] - UpdateTopology: cluster=tidb-test
Scaled cluster `tidb-test` out successfully
-- Seeing "Scaled cluster `<cluster-name>` out successfully" on the last line indicates the scale-out succeeded
-- After scaling out, check the cluster status again to confirm the new nodes have joined the cluster
$ tiup cluster display tidb-test
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster display tidb-test
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v5.3.0
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://192.168.56.11:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.56.10:9093 alertmanager 192.168.56.10 9093/9094 linux/x86_64 Up /u02/tidb-data/alertmanager-9093 /u02/tidb-deploy/alertmanager-9093
192.168.56.10:3000 grafana 192.168.56.10 3000 linux/x86_64 Up - /u02/tidb-deploy/grafana-3000
192.168.56.10:2379 pd 192.168.56.10 2379/2380 linux/x86_64 Up /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.11:2379 pd 192.168.56.11 2379/2380 linux/x86_64 Up|UI /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.12:2379 pd 192.168.56.12 2379/2380 linux/x86_64 Up|L /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.10:9090 prometheus 192.168.56.10 9090 linux/x86_64 Up /u02/tidb-data/prometheus-9090 /u02/tidb-deploy/prometheus-9090
192.168.56.10:4000 tidb 192.168.56.10 4000/10080 linux/x86_64 Up - /u02/tidb-deploy/tidb-4000
192.168.56.11:4000 tidb 192.168.56.11 4000/10080 linux/x86_64 Up - /u02/tidb-deploy/tidb-4000
192.168.56.10:20160 tikv 192.168.56.10 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
192.168.56.11:20160 tikv 192.168.56.11 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
192.168.56.12:20160 tikv 192.168.56.12 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
192.168.56.150:20160 tikv 192.168.56.150 20160/20180 linux/x86_64 Up /u02/tidb-data/tikv-20160 /u02/tidb-deploy/tikv-20160
Total nodes: 12
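Besides tiup cluster display, TiKV store status can also be checked from SQL in any mysql client session; a quick sketch using the built-in INFORMATION_SCHEMA.TIKV_STORE_STATUS table (the new store 192.168.56.150:20160 should show as Up, and after a scale-in the removed store moves through Offline to Tombstone):
mysql> SELECT STORE_ID, ADDRESS, STORE_STATE_NAME FROM INFORMATION_SCHEMA.TIKV_STORE_STATUS;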
Scale in:
-- Use the tiup cluster scale-in command to scale in the TiDB cluster; the --node parameter specifies the IP and port of the node to remove
$ tiup cluster scale-in tidb-test --node 192.168.56.150:20160
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster scale-in tidb-test --node 192.168.56.150:20160
... intermediate logs omitted
Scaled cluster `tidb-test` in successfully
-- Seeing "Scaled cluster `<cluster-name>` in successfully" on the last line indicates the scale-in succeeded
-- After scaling in, check the cluster status again and confirm the removed node's status has changed to Tombstone
$ tiup cluster display tidb-test
Starting component `cluster`: /home/tidb/.tiup/components/cluster/v1.7.0/tiup-cluster display tidb-test
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v5.3.0
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://192.168.56.11:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.56.10:9093 alertmanager 192.168.56.10 9093/9094 linux/x86_64 Up /u02/tidb-data/alertmanager-9093 /u02/tidb-deploy/alertmanager-9093
192.168.56.10:3000 grafana 192.168.56.10 3000 linux/x86_64 Up - /u02/tidb-deploy/grafana-3000
192.168.56.10:2379 pd 192.168.56.10 2379/2380 linux/x86_64 Up /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.11:2379 pd 192.168.56.11 2379/2380 linux/x86_64 Up|UI /u02/tidb-data/pd-2379 /u02/tidb-deploy/pd-2379
192.168.56.12:2379 pd 192.168.56.12 2379/2380 linux/x86_64 Up|L /u02/tidb-data/pd-2379 /u02/t