From 4d3c463536bf64a65c756f05f36bf9fb69f81eed Mon Sep 17 00:00:00 2001
From: Asias He
Date: Mon, 24 Dec 2018 15:02:42 +0800
Subject: [PATCH] storage_service: Stop cql server before gossip

We saw a failure in the dtest
concurrent_schema_changes_test.py:TestConcurrentSchemaChanges.changes_while_node_down_test.

======================================================================
ERROR: changes_while_node_down_test (concurrent_schema_changes_test.TestConcurrentSchemaChanges)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/asias/src/cloudius-systems/scylla-dtest/concurrent_schema_changes_test.py", line 432, in changes_while_node_down_test
    self.make_schema_changes(session, namespace='ns2')
  File "/home/asias/src/cloudius-systems/scylla-dtest/concurrent_schema_changes_test.py", line 86, in make_schema_changes
    session.execute('USE ks_%s' % namespace)
  File "cassandra/cluster.py", line 2141, in cassandra.cluster.Session.execute
    return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state).result()
  File "cassandra/cluster.py", line 4033, in cassandra.cluster.ResponseFuture.result
    raise self._final_exception
ConnectionShutdown: Connection to 127.0.0.1 is closed

The test:

    session = self.patient_cql_connection(node2)
    self.prepare_for_changes(session, namespace='ns2')
    node1.stop()
    self.make_schema_changes(session, namespace='ns2')  --> throws ConnectionShutdown

The problem is that, after receiving the DOWN event, the Python
Cassandra driver calls Cluster:on_down, which checks whether this client
has any connections to the node being shut down. If there are any
connections, the Cluster:on_down handler exits early, so the session to
the node being shut down is not removed. With gossip stopped before the
cql server, the DOWN event can arrive while those connections are still
open.

If we shut down the cql server first, the connection count will be zero
and the session will be removed.
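
To make the ordering issue concrete, here is a toy Python model of the
behavior described above. It is only a sketch: ToyCluster, its methods,
and simulate() are hypothetical names invented for illustration and do
not reproduce the real driver code. It assumes only what this message
states, namely that the on_down handler exits early while the client
still holds connections to the node.

```python
# Toy model only: ToyCluster and simulate() are hypothetical and are
# NOT the real cassandra-driver implementation.

class ToyCluster:
    def __init__(self):
        # host -> number of open connections this client holds to it
        self.open_connections = {}
        # hosts the session still considers usable for queries
        self.session_hosts = set()

    def connect(self, host):
        self.open_connections[host] = self.open_connections.get(host, 0) + 1
        self.session_hosts.add(host)

    def close_connections(self, host):
        # What the client observes once the node stops its cql server.
        self.open_connections[host] = 0

    def on_down(self, host):
        # Early exit described in the commit message: connections still
        # look alive, so the host is kept in the session.
        if self.open_connections.get(host, 0) > 0:
            return False
        self.session_hosts.discard(host)
        return True


def simulate(stop_cql_first):
    """Return (host_removed, host_still_in_session) for one shutdown order."""
    cluster = ToyCluster()
    cluster.connect("node1")
    if stop_cql_first:
        cluster.close_connections("node1")  # cql server stopped first
        removed = cluster.on_down("node1")  # then gossip announces DOWN
    else:
        removed = cluster.on_down("node1")  # gossip announces DOWN first
        cluster.close_connections("node1")  # cql server stopped afterwards
    return removed, "node1" in cluster.session_hosts
```

In this model, simulate(False) leaves node1 in the session after its
connections are closed, which corresponds to the stale session that
later raises ConnectionShutdown, while simulate(True) removes it.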

Fixes: #4013
Message-Id: <7388f679a7b09ada10afe7e783d7868a58aac6ec.1545634941.git.asias@scylladb.com>
---
 service/storage_service.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/service/storage_service.cc b/service/storage_service.cc
index 45f067f3e0..077b08eb09 100644
--- a/service/storage_service.cc
+++ b/service/storage_service.cc
@@ -1298,12 +1298,12 @@ future<> storage_service::stop_transport() {
         return seastar::async([&ss] {
             slogger.info("Stop transport: starts");
 
-            gms::stop_gossiping().get();
-            slogger.info("Stop transport: stop_gossiping done");
-
             ss.shutdown_client_servers().get();
             slogger.info("Stop transport: shutdown rpc and cql server done");
 
+            gms::stop_gossiping().get();
+            slogger.info("Stop transport: stop_gossiping done");
+
             ss.do_stop_ms().get();
             slogger.info("Stop transport: shutdown messaging_service done");