mirror of
https://github.com/scylladb/scylladb.git
synced 2026-06-03 21:47:10 +00:00
The following backtrace was reported by user when running repair and keeping restarting the node at the same time. #0 0x00007eff077281d7 in raise () from /lib64/libc.so.6 #1 0x00007eff07729a08 in abort () from /lib64/libc.so.6 #2 0x00007eff07721146 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007eff077211f2 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000010ef2c2 in locator::token_metadata::first_token_index (this=0x641000214e98, start=...) at locator/token_metadata.cc:133 #5 0x00000000010ef2d9 in locator::token_metadata::first_token (this=0x641000214e98, start=...) at locator/token_metadata.cc:143 #6 0x00000000010e329d in locator::abstract_replication_strategy::get_natural_endpoints (this=0x641000494000, search_token=...) at locator/abstract_replication_strategy.cc:66 #7 0x0000000001481186 in get_neighbors (hosts=std::vector of length 0, capacity 0, data_centers=std::vector of length 0, capacity 0, range=<error reading variable: access outside bounds of object referenced via synthetic pointer>, ksname=..., db=...) at repair/repair.cc:196 #8 repair_range<nonwrapping_range<dht::token> > (range=..., ri=...) at repair/repair.cc:781 #9 <lambda(auto:99&)>::<lambda(auto:100&&)>::<lambda(auto:101&)>::<lambda()>::operator() (__closure=0x7efec07f7460) at repair/repair.cc:1005 #10 futurize<future<bool_class<stop_iteration_tag> > >::apply<repair_ranges(repair_info)::<lambda(auto:99&)>:: It is reproduced with 1) while true; do curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/repair_async/ks3"; done 2) start node 127.0.0.1, stop node 127.0.0.1 in a loop The problem is, during boot up, the token_metadata is not replicated to all shards until the node goes into NORMAL status. To fix, check until node is in NORMAL status before allowing repair. Fixes #2723