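Check whether the GC time on the old cluster has passed the time of your backup. If it has, adjust the GC time on the source cluster and run it again.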
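One way to read the advice above is as a retention check: if the backup point is older than the GC retention window, the versions needed after that point may already be gone. The sketch below assumes the window is the value of the TiDB system variable `tidb_gc_life_time`, written as a Go-style duration such as `10m0s`. The function names and sample values are hypothetical, and only the `h`, `m`, `s`, and `ms` units are handled.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per supported Go-style duration unit.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_go_duration(text: str) -> timedelta:
    """Parse a duration such as '10m0s' or '72h' into a timedelta."""
    parts = _PART.findall(text)
    # Reject input that contains anything besides number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    seconds = sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)
    return timedelta(seconds=seconds)


def backup_point_still_retained(gc_life_time: str,
                                backup_time: datetime,
                                now: datetime) -> bool:
    """Return True if backup_time lies inside the GC retention window."""
    return now - backup_time < parse_go_duration(gc_life_time)


if __name__ == "__main__":
    now = datetime(2024, 12, 10, 18, 0, tzinfo=timezone.utc)  # sample value
    backup_time = now - timedelta(hours=2)                    # sample value
    for life_time in ("10m0s", "72h"):
        ok = backup_point_still_retained(life_time, backup_time, now)
        print(life_time, "covers the backup point" if ok else "is too short")
```

If the check fails, the remedy suggested in this thread is to lengthen the GC time on the source cluster before taking the backup again.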
The log shown above says the restore succeeded, but when I query the data, nothing has been synchronized.
https://docs.pingcap.com/zh/tidb/stable/backup-and-restore-use-cases
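I tested with this use case, specifying the full backup path, the incremental backup path, and the target time to restore to.
Yes, it can be used that way. PITR can apply just the incremental part on its own.
For example, when TiDB-to-TiDB primary/secondary replication lags too far behind, you can use the incremental logs from a PITR backup to bring the data up to date.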
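The two usages discussed above (full plus incremental, and incremental only) can be sketched as a small helper that assembles a `br restore point` command line. The flags `--pd`, `--storage`, `--full-backup-storage`, `--start-ts`, and `--restored-ts` are the ones described in the BR PITR documentation. The helper name, the PD address, the storage URLs, and the timestamps are placeholders, and requiring exactly one of the full backup path or the start timestamp is a simplification made for this sketch.

```python
import shlex
from typing import List, Optional


def build_restore_point_cmd(pd_addr: str,
                            log_storage: str,
                            restored_ts: str,
                            full_backup_storage: Optional[str] = None,
                            start_ts: Optional[str] = None) -> List[str]:
    """Assemble an argv list for `tiup br restore point`.

    Pass full_backup_storage to restore the full backup and then replay logs,
    or pass start_ts to replay only the log (incremental) part.
    """
    if (full_backup_storage is None) == (start_ts is None):
        raise ValueError("give exactly one of full_backup_storage or start_ts")
    cmd = ["tiup", "br", "restore", "point",
           f"--pd={pd_addr}",
           f"--storage={log_storage}"]
    if full_backup_storage is not None:
        cmd.append(f"--full-backup-storage={full_backup_storage}")
    else:
        cmd.append(f"--start-ts={start_ts}")
    cmd.append(f"--restored-ts={restored_ts}")
    return cmd


if __name__ == "__main__":
    # All values below are placeholders, not taken from this thread.
    full_then_log = build_restore_point_cmd(
        "10.0.0.1:2379", "s3://bucket/log", "2024-12-10 18:00:00+0800",
        full_backup_storage="s3://bucket/full")
    log_only = build_restore_point_cmd(
        "10.0.0.1:2379", "s3://bucket/log", "2024-12-10 18:00:00+0800",
        start_ts="2024-12-10 12:00:00+0800")
    print(shlex.join(full_then_log))
    print(shlex.join(log_only))
```

The helper only prints the command so that it can be reviewed before anyone runs it against a cluster.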
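Now both TiFlash nodes simply fail to start. The hardware is 64 cores (128 vCore), 128G memory, SSD, with 2 TiFlash nodes running on it.
Do not deploy multiple TiFlash instances on a single machine.
The standby cluster has been changed to a single TiFlash instance.
However, while restoring data from the incremental backup, this TiFlash node gets killed outright and then never comes back up.
/var/log/messages shows: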
Jan 13 13:30:53 localhost abrt-hook-ccpp: Process 39377 (tiflash) of user 1001 killed by SIGSEGV - dumping core
Jan 13 13:31:20 localhost abrt-server: Executable ‘/data/tidb-deploy/tiflash-9000/bin/tiflash/tiflash’ doesn’t belong to any package and ProcessUnpackaged is set to ‘no’
Jan 13 13:31:20 localhost abrt-server: ‘post-create’ on ‘/var/spool/abrt/ccpp-2018-01-13-13:30:53-39377’ exited with 1
Jan 13 13:31:20 localhost abrt-server: Deleting problem directory ‘/var/spool/abrt/ccpp-2018-01-13-13:30:53-39377’
Jan 13 13:31:21 localhost systemd: tiflash-9000.service: main process exited, code=killed, status=11/SEGV
Jan 13 13:31:21 localhost systemd: Unit tiflash-9000.service entered failed state.
Jan 13 13:31:21 localhost systemd: tiflash-9000.service failed.
Jan 13 13:31:36 localhost systemd: tiflash-9000.service holdoff time over, scheduling restart.
Jan 13 13:31:36 localhost systemd: Stopped tiflash service.
Jan 13 13:31:36 localhost systemd: Started tiflash service.
Jan 13 13:31:36 localhost bash: sync …
Jan 13 13:31:36 localhost bash: real#0110m0.095s
Jan 13 13:31:36 localhost bash: user#0110m0.000s
Jan 13 13:31:36 localhost bash: sys#0110m0.029s
Jan 13 13:31:36 localhost bash: ok
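When a process is suspected of being OOM-killed, the signal recorded by systemd is a quick first check: the kernel OOM killer terminates a process with SIGKILL (signal 9) and leaves an "Out of memory" message in the kernel log, whereas the excerpt above reports `status=11/SEGV`, which is SIGSEGV. A minimal sketch of that check follows, assuming Linux signal numbers and the systemd message format seen above. The function name and the small signal table are illustrative.

```python
import re
from typing import Optional, Tuple

# Linux signal numbers relevant to this kind of triage.
_SIGNALS = {6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}
_EXIT = re.compile(r"code=killed, status=(\d+)/\w+")


def classify_exit(line: str) -> Optional[Tuple[str, bool]]:
    """Return (signal_name, could_be_oom_kill) for a systemd exit line.

    Returns None when the line does not describe a signal-terminated process.
    """
    match = _EXIT.search(line)
    if match is None:
        return None
    signum = int(match.group(1))
    name = _SIGNALS.get(signum, f"signal {signum}")
    # Only SIGKILL is consistent with the kernel OOM killer.
    return name, signum == 9


if __name__ == "__main__":
    line = ("Jan 13 13:31:21 localhost systemd: tiflash-9000.service: "
            "main process exited, code=killed, status=11/SEGV")
    print(classify_exit(line))
```

A SIGSEGV result points at a crash inside the process rather than memory pressure, so the process's own error log is the next place to look.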
tiflash_error.log:
[2018/01/13 12:50:42.004 +08:00] [WARN] [SchemaGetter.cpp:208] [“The schema diff for version 1297, key Diff:1297 is empty.”] [source=SchemaGetter] [thread_id=186]
[2018/01/13 12:50:42.098 +08:00] [WARN] [SchemaGetter.cpp:208] [“The schema diff for version 1297, key Diff:1297 is empty.”] [source=SchemaGetter] [thread_id=187]
[2018/01/13 12:50:42.225 +08:00] [WARN] [SchemaGetter.cpp:208] [“The schema diff for version 1297, key Diff:1297 is empty.”] [source=SchemaGetter] [thread_id=188]
[2018/01/13 12:50:42.242 +08:00] [WARN] [DMFile.cpp:732] [“Existing temporary or dropped dmfile, removed: /data/tidb-data/tiflash-9000/data/t_4330/stable/.tmp.dmf_937”] [source=DMFile] [thread_id=182]
[2018/01/13 12:50:44.849 +08:00] [WARN] [SchemaGetter.cpp:208] [“The schema diff for version 1297, key Diff:1297 is empty.”] [source=SchemaGetter] [thread_id=189]
[2018/01/13 12:50:45.174 +08:00] [ERROR] [Region.cpp:660] [“[region 21236, applied: term 10 index 52] catch exception: Found existing key in hex: 748000000000002AFF305F728000000000FFBA928B0000000000FAF9B1428A41CFFA81, while applying CmdType::Put on [term 10, index 53], CF default”] [thread_id=193]
[2018/01/13 12:50:45.174 +08:00] [ERROR] [Region.cpp:660] [“[region 100528, applied: term 11 index 119] catch exception: Found existing key in hex: 7480000000000015FF425F728000000000FF1E03BA0000000000FAF9B142879187FD69, while applying CmdType::Put on [term 11, index 120], CF default”] [thread_id=194]
[2018/01/13 12:50:45.176 +08:00] [ERROR] [Region.cpp:660] [“[region 100524, applied: term 11 index 64] catch exception: Found existing key in hex: 7480000000000015FF425F728000000000FF02A9630000000000FAF9B14285C627FF05, while applying CmdType::Put on [term 11, index 65], CF default”] [thread_id=195]
[2018/01/13 12:50:47.224 +08:00] [ERROR] [Exception.cpp:89] [“Code: 49, e.displayText() = DB::Exception: Found existing key in hex: 7480000000000015FF425F728000000000FF1E03BA0000000000FAF9B142879187FD69, e.what() = DB::Exception, Stack trace:\n\n\n 0x1718afe\tDB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits, std::_1::allocator > const&, int) [tiflash+24218366]\n \tdbms/src/Common/Exception.h:46\n 0x6ae505c\tDB::RegionCFDataBaseDB::RegionDefaultCFDataTrait::insert(DB::StringObject&&, DB::StringObject&&) [tiflash+112087132]\n \tdbms/src/Storages/Transaction/RegionCFDataBase.cpp:50\n 0x6add031\tauto DB::Region::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, DB::TMTContext&):
12::operator()(unsigned long) const [tiflash+112054321]\n \tdbms/src/Storages/Transaction/Region.cpp:650\n 0x6adc062\tDB::Region::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, DB::TMTContext&) [tiflash+112050274]\n \tdbms/src/Storages/Transaction/Region.cpp:716\n 0x6aa181e\tDB::KVStore::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, unsigned long, DB::TMTContext&) [tiflash+111810590]\n \tdbms/src/Storages/Transaction/KVStore.cpp:293\n 0x6ac0ee5\tHandleWriteRaftCmd [tiflash+111939301]\n \tdbms/src/Storages/Transaction/ProxyFFI.cpp:95\n 0x7f318958867d\t$LT$engine_store_ffi…observer…TiFlashObserver$u20$as$u20$raftstore…coprocessor…QueryObserver$GT$::post_exec_query::hee3ae1e50b0903bb [libtiflash_proxy.so+17598077]\n 0x7f318a49734d\traftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::apply_raft_cmd::hc04a972cfbc95174 [libtiflash_proxy.so+33387341]\n 0x7f318a4ae885\traftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::process_raft_cmd::ha5d3beac455ec8f3 [libtiflash_proxy.so+33482885]\n 0x7f318a4b3d54\traftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::handle_raft_committed_entries::hadfb35f9bbe11628 [libtiflash_proxy.so+33504596]\n 0x7f318a48933c\traftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_apply::hf9768fdc5778f678 [libtiflash_proxy.so+33329980]\n 0x7f318a48c968\traftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_tasks::h63b5f68ed022c4e3 [libtiflash_proxy.so+33343848]\n 0x7f3189b34774\t$LT$raftstore…store…fsm…apply…ApplyPoller$LT$EK$GT$$u20$as$u20$batch_system…batch…PollHandler$LT$raftstore…store…fsm…apply…ApplyFsm$LT$EK$GT$$C$raftstore…store…fsm…apply…ControlFsm$GT$$GT$::handle_normal::h9a48416ad5231861 [libtiflash_proxy.so+23545716]\n 0x7f3189ad8567\tbatch_system::batch::Poller$LT$N$C$C$C$Handler$GT$::poll::ha5a88b90d486e4f4 [libtiflash_proxy.so+23168359]\n 0x7f3189b88a12\tstd::sys_common::backtrace::__rust_begin_short_backtrace::h537f24f7d970de01 [libtiflash_proxy.so+23890450]\n 
0x7f3189bc1f0e\tcore::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h135efe92fe35fce3 [libtiflash_proxy.so+24125198]\n 0x7f318ab6b6a5\tstd::sys::unix::thread::Thread::new::thread_start::hd2791a9cabec1fda [libtiflash_proxy.so+40548005]\n \t/rustc/96ddd32c4bfb1d78f0cd03eb068b1710a8cebeef/library/std/src/sys/unix/thread.rs:108\n 0x7f31882abea5\tstart_thread [libpthread.so.0+32421]\n 0x7f31876b096d\t__clone [libc.so.6+1042797]”] [source=“DB::EngineStoreApplyRes DB::HandleWriteRaftCmd(const DB::EngineStoreServerWrap *, DB::WriteCmdsView, DB::RaftCmdHeader)”] [thread_id=194]
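The hex strings in the "Found existing key" errors above can be decoded to find out which tables are affected. The sketch below assumes the standard TiKV layout for a row key with an integer handle: the raw key `t{table_id}_r{handle}` is wrapped in memcomparable encoding (8-byte groups, each followed by a marker byte equal to `0xFF` minus the padding count), and an 8-byte bitwise-inverted timestamp is appended. Tables with a clustered non-integer primary key use a different handle encoding and are not covered. All names in the code are illustrative.

```python
from typing import NamedTuple, Optional, Tuple

SIGN_MASK = 1 << 63
U64_MAX = (1 << 64) - 1


class RowKey(NamedTuple):
    table_id: int
    handle: int
    ts: Optional[int]  # timestamp suffix, if present


def decode_memcomparable(data: bytes) -> Tuple[bytes, bytes]:
    """Undo memcomparable byte encoding; return (decoded, remaining bytes)."""
    out = bytearray()
    pos = 0
    while pos + 9 <= len(data):
        group, marker = data[pos:pos + 8], data[pos + 8]
        pos += 9
        pad = 0xFF - marker
        if pad > 8:
            raise ValueError(f"bad marker byte 0x{marker:02X}")
        out += group[:8 - pad]
        if pad:  # a padded group terminates the encoded value
            return bytes(out), data[pos:]
    raise ValueError("unterminated memcomparable value")


def _to_signed(encoded: bytes) -> int:
    """Decode a comparable int64: flip the sign bit, then apply two's complement."""
    value = int.from_bytes(encoded, "big") ^ SIGN_MASK
    return value - (1 << 64) if value >= SIGN_MASK else value


def decode_row_key(hex_key: str) -> RowKey:
    raw, rest = decode_memcomparable(bytes.fromhex(hex_key))
    if len(raw) != 19 or raw[:1] != b"t" or raw[9:11] != b"_r":
        raise ValueError("not a row key with an integer handle")
    ts = U64_MAX - int.from_bytes(rest, "big") if len(rest) == 8 else None
    return RowKey(_to_signed(raw[1:9]), _to_signed(raw[11:19]), ts)


if __name__ == "__main__":
    for key in (
        "748000000000002AFF305F728000000000FFBA928B0000000000FAF9B1428A41CFFA81",
        "7480000000000015FF425F728000000000FF1E03BA0000000000FAF9B142879187FD69",
    ):
        print(decode_row_key(key))
```

Under these assumptions the two keys decode to table IDs 10800 and 5442. A table ID can then be matched to a table name through the `TIDB_TABLE_ID` column of `information_schema.tables`.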
tiflash.log
[2024/12/10 18:32:08.419 +08:00] [DEBUG] [DMVersionFilterBlockInputStream.h:80] [“Total rows: 137425, pass: 100.00%, complete pass: 100.00%, complete not pass: 0.00%, not clean: 0.00%, is deleted: 0.00%, effective: 100.00%, read tso: 454512067753017344”] [source=“mode=COMPACT”] [thread_id=197]
[2024/12/10 18:32:08.423 +08:00] [INFO] [SSTFilesToDTFilesOutputStream.cpp:184] [“Finished writing DTFile from snapshot data, region=[region 52076, applied: term 10 index 19] file_idx=0 file_rows=292705 file_bytes=117824825 data_range=[202581747235,202582338382) file_bytes_on_disk=14590740 file=/data/tidb-data/tiflash-9000/data/t_5474/stable/dmf_1303”] [source=“table_id=5474”] [thread_id=206]
[2024/12/10 18:32:08.423 +08:00] [INFO] [SSTFilesToDTFilesOutputStream.cpp:111] [“Transformed snapshot in SSTFile to DTFiles, region=[region 52076, applied: term 10 index 19] job_type=ApplySnapshot cost_ms=1981 rows=292705 bytes=117824825 write_cf_keys=292705 default_cf_keys=236694 lock_cf_keys=0 dt_files=[files_num=1 dmf_1303]”] [source=“table_id=5474”] [thread_id=206]
[2024/12/10 18:32:08.423 +08:00] [INFO] [SSTFilesToDTFilesOutputStream.cpp:184] [“Finished writing DTFile from snapshot data, region=[region 51060, applied: term 11 index 18] file_idx=0 file_rows=293849 file_bytes=118627928 data_range=[202429043430,202429632804) file_bytes_on_disk=15295938 file=/data/tidb-data/tiflash-9000/data/t_5474/stable/dmf_1302”] [source=“table_id=5474”] [thread_id=204]
[2024/12/10 18:32:08.423 +08:00] [DEBUG] [DMVersionFilterBlockInputStream.h:80] [“Total rows: 292705, pass: 100.00%, complete pass: 100.00%, complete not pass: 0.00%, not clean: 0.00%, is deleted: 0.00%, effective: 100.00%, read tso: 454512067753017344”] [source=“mode=COMPACT”] [thread_id=206]
[2024/12/10 18:32:08.423 +08:00] [INFO] [SSTFilesToDTFilesOutputStream.cpp:111] [“Transformed snapshot in SSTFile to DTFiles, region=[region 51060, applied: term 11 index 18] job_type=ApplySnapshot cost_ms=2002 rows=293849 bytes=118627928 write_cf_keys=293849 default_cf_keys=222821 lock_cf_keys=0 dt_files=[files_num=1 dmf_1302]”] [source=“table_id=5474”] [thread_id=204]
[2024/12/10 18:32:08.423 +08:00] [DEBUG] [DMVersionFilterBlockInputStream.h:80] [“Total rows: 293849, pass: 100.00%, complete pass: 100.00%, complete not pass: 0.00%, not clean: 0.00%, is deleted: 0.00%, effective: 100.00%, read tso: 454512067753017344”] [source=“mode=COMPACT”] [thread_id=204]
[2024/12/10 18:32:08.425 +08:00] [DEBUG] [SSTFilesToBlockInputStream.cpp:218] [“Done loading from [CF=default] [offset=172032] [write_cf_offset=163840] [last_loaded_rowkey=7480000000000010EA5F728000000008BCAB03] [rowkey_to_be_included=7480000000000010EA5F728000000008BC88BE]”] [source=“table_id=4330”] [thread_id=201]
[2024/12/10 18:32:08.449 +08:00] [ERROR] [Exception.cpp:89] [“Code: 49, e.displayText() = DB::Exception: Found existing key in hex: 748000000000002AFF305F728000000000FFBA928B0000000000FAF9B1428A41CFFA81, e.what() = DB::Exception, Stack trace:\n\n\n 0x1718afe\tDB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits, std::_1::allocator > const&, int) [tiflash+24218366]\n \tdbms/src/Common/Exception.h:46\n 0x6ae505c\tDB::RegionCFDataBaseDB::RegionDefaultCFDataTrait::insert(DB::StringObject&&, DB::StringObject&&) [tiflash+112087132]\n \tdbms/src/Storages/Transaction/RegionCFDataBase.cpp:50\n 0x6add031\tauto DB::Region::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, DB::TMTContext&):
12::operator()(unsigned long) const [tiflash+112054321]\n \tdbms/src/Storages/Transaction/Region.cpp:650\n 0x6adc062\tDB::Region::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, DB::TMTContext&) [tiflash+112050274]\n \tdbms/src/Storages/Transaction/Region.cpp:716\n 0x6aa181e\tDB::KVStore::handleWriteRaftCmd(DB::WriteCmdsView const&, unsigned long, unsigned long, unsigned long, DB::TMTContext&) [tiflash+111810590]\n \tdbms/src/Storages/Transaction/KVStore.cpp:293\n 0x6ac0ee5\tHandleWriteRaftCmd [tiflash+111939301]\n \tdbms/src/Storages/Transaction/ProxyFFI.cpp:95\n 0x7f2adf2d367d\t$LT$engine_store_ffi…observer…TiFlashObserver$u20$as$u20$raftstore…coprocessor…QueryObserver$GT$::post_exec_query::hee3ae1e50b0903bb [libtiflash_proxy.so+17598077]\n 0x7f2ae01e234d\traftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::apply_raft_cmd::hc04a972cfbc95174 [libtiflash_proxy.so+33387341]\n 0x7f2ae01f9885\traftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::process_raft_cmd::ha5d3beac455ec8f3 [libtiflash_proxy.so+33482885]\n 0x7f2ae01fed54\traftstore::store::fsm::apply::ApplyDelegate$LT$EK$GT$::handle_raft_committed_entries::hadfb35f9bbe11628 [libtiflash_proxy.so+33504596]\n 0x7f2ae01d433c\traftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_apply::hf9768fdc5778f678 [libtiflash_proxy.so+33329980]\n 0x7f2ae01d7968\traftstore::store::fsm::apply::ApplyFsm$LT$EK$GT$::handle_tasks::h63b5f68ed022c4e3 [libtiflash_proxy.so+33343848]\n 0x7f2adf87f774\t$LT$raftstore…store…fsm…apply…ApplyPoller$LT$EK$GT$$u20$as$u20$batch_system…batch…PollHandler$LT$raftstore…store…fsm…apply…ApplyFsm$LT$EK$GT$$C$raftstore…store…fsm…apply…ControlFsm$GT$$GT$::handle_normal::h9a48416ad5231861 [libtiflash_proxy.so+23545716]\n 0x7f2adf823567\tbatch_system::batch::Poller$LT$N$C$C$C$Handler$GT$::poll::ha5a88b90d486e4f4 [libtiflash_proxy.so+23168359]\n 0x7f2adf8d3a12\tstd::sys_common::backtrace::__rust_begin_short_backtrace::h537f24f7d970de01 [libtiflash_proxy.so+23890450]\n 
0x7f2adf90cf0e\tcore::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h135efe92fe35fce3 [libtiflash_proxy.so+24125198]\n 0x7f2ae08b66a5\tstd::sys::unix::thread::Thread::new::thread_start::hd2791a9cabec1fda [libtiflash_proxy.so+40548005]\n \t/rustc/96ddd32c4bfb1d78f0cd03eb068b1710a8cebeef/library/std/src/sys/unix/thread.rs:108\n 0x7f2addff6ea5\tstart_thread [libpthread.so.0+32421]\n 0x7f2add3fb96d\t__clone [libc.so.6+1042797]”] [source=“DB::EngineStoreApplyRes DB::HandleWriteRaftCmd(const DB::EngineStoreServerWrap *, DB::WriteCmdsView, DB::RaftCmdHeader)”] [thread_id=191]
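The 46 tables on the production TiFlash node total 268 GB. Could it be that TiFlash is syncing the replicas of all 46 tables from TiKV at the same time, and the volume is so large that TiFlash runs out of memory (OOM)? Is there any way to solve this?
Please open a separate thread. Is this a case of TiFlash failing to sync data? Describe the background of what you were doing. This thread has grown too long.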
