scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-25 09:11:10 +00:00

Author	SHA1	Message	Date
Aleksandra Martyniuk	e90e10112f	node_ops: repair: move node_ops_id to node_ops directory	2023-09-13 10:40:04 +02:00
Avi Kivity	2c810e221a	Merge 'Gossiper: replace seastar threads with coroutines' from Benny Halevy Many of the gossiper internal functions currently use seastar threads for historical reasons, but since they are short living, the cost of spawning a seastar thread for them is excessive and they can be simplified and made more efficient using coroutines. Closes #15364 * github.com:scylladb/scylladb: gossiper: reindent do_stop_gossiping gossiper: coroutinize do_stop_gossiping gossiper: reindent assassinate_endpoint gossiper: coroutinize assassinate_endpoint gossiper: coroutinize handle_ack2_msg gossiper: handle_ack_msg: always log warning on exception gossiper: reindent handle_ack_msg gossiper: coroutinize handle_ack_msg gossiper: reindent handle_syn_msg gossiper: coroutinize handle_syn_msg gossiper: message handlers: no need to capture shared_from_this gossiper: add_local_application_state: throw internal error if endpoint state is not found gossiper: coroutinize add_local_application_state	2023-09-12 21:50:52 +03:00
Benny Halevy	47dc287efd	gossiper: reindent do_stop_gossiping Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:33:09 +03:00
Benny Halevy	8fa65ed016	gossiper: coroutinize do_stop_gossiping Simplify the function. It does not need to spawn a seastar thread. While at it, declare it as private since it's called only internally by the gossiper (and on shard 0). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:33:09 +03:00
Benny Halevy	a792babbda	gossiper: reindent assassinate_endpoint Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:33:09 +03:00
Benny Halevy	5dbc168c03	gossiper: coroutinize assassinate_endpoint It has no need to spawn a seastar thread. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:33:09 +03:00
Benny Halevy	29b9596050	gossiper: coroutinize handle_ack2_msg Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:33:09 +03:00
Benny Halevy	cc030a5040	gossiper: handle_ack_msg: always log warning on exception Unlike handle_syn_msg, the warning is currently printed only `if (_ack_handlers.contains(from.addr))`. Unclear why. It is interesting in any case. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:32:40 +03:00
Benny Halevy	990ac23d19	gossiper: reindent handle_ack_msg Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:27:08 +03:00
Benny Halevy	2ca2118130	gossiper: coroutinize handle_ack_msg Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:26:03 +03:00
Benny Halevy	8c065bf023	gossiper: reindent handle_syn_msg Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:24:14 +03:00
Benny Halevy	264f4daded	gossiper: coroutinize handle_syn_msg Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:23:09 +03:00
Benny Halevy	63ab5f1ab3	gossiper: message handlers: no need to capture shared_from_this The handlers future is waited on under `background_msg` which is closed in gossiper::stop so the instance is already guranteed to be kept valid. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:21:07 +03:00
Benny Halevy	8bfec81985	gossiper: add_local_application_state: throw internal error if endpoint state is not found If the function is called too early, the first get_endpoint_state_ptr would throw an exception that is later caught and degraded into a warning. But that endpoint_state should never disappear after yielding, so call on_internal_error in that case. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:21:07 +03:00
Benny Halevy	d1c67300d4	gossiper: coroutinize add_local_application_state There is no need for it to spawn a seastar thread. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-12 19:20:41 +03:00
Avi Kivity	a1b2ca6184	Merge 'build: cmake: package cqlsh and fix the noarch postfix of python3 package' from Kefu Chai in this series, the packaging of tools modules are improved: - package cqlsh also. as cqlsh should be redistributed as a part of the unified package - use ${arch} in the postfix of the python3 package. the python3 package is not architecture independent. - set the version with tide for `Scylla_VERSION`, so it can be reused elsewhere. Refs #15241 Closes #15369 * github.com:scylladb/scylladb: build: cmake: build cqlsh as a submodule build: cmake: always use the version with tilde build: cmake: build python3 dist tarball with arch postfix build: cmake: use the default comment message	2023-09-12 16:27:03 +03:00
David Garcia	5177ddac17	Support advanced db config scenarios docs: skip html tags from description Closes #15338	2023-09-12 15:29:16 +03:00
Tomasz Grabiec	6e83e54b0d	Merge 'gossiper: get rid of uses_host_id' from Benny Halevy This function practically returned true from inception. In `d38deef499` it started using messaging_service().knows_version(endpoint) that also returns `true` unconditionally, to this day So there's no point calling it since we can assume that `uses_host_id` is true for all versions. Closes #15343 * github.com:scylladb/scylladb: storage_service: fixup indentation after last patch gossiper: get rid of uses_host_id	2023-09-12 12:44:56 +02:00
Kefu Chai	571fab4179	build: cmake: build cqlsh as a submodule since we also redistribute cqlsh, let's package it as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-12 18:18:31 +08:00
Kefu Chai	4ff5ce9933	build: cmake: always use the version with tilde since we always use tilde ("~") in the verson number, let's just cache it as an internal variable in CMake. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-12 18:18:31 +08:00
Kefu Chai	111d20958e	build: cmake: build python3 dist tarball with arch postfix now that `configure.py` always generate python3 dist tarball with ${arch} postfix, let's mirror this behavior. as `build_unified.sh` uses this naming convention. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-12 18:18:31 +08:00
Kefu Chai	760b7c8772	build: cmake: use the default comment message it turns out "Generating submodule python3 in python3" is not as informative as default one: "/home/kefu/dev/scylladb/tools/python3/build/scylla-python3-5.4.0~dev-0.20230908.1668d434e458.noarch.tar.gz" so let's drop the "COMMENT" argument. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-12 18:18:31 +08:00
Botond Dénes	bc4b3e4fa3	Merge 'build: cmake: add packaging support ' from Kefu Chai this change allows CMake to build the dist tarball for a certain build. Refs https://github.com/scylladb/scylladb/issues/15241 Closes #15352 * github.com:scylladb/scylladb: build: cmake: add packaging support build: cmake: enable build of seastar/apps/iotune	2023-09-12 09:59:53 +03:00
Avi Kivity	89ba4e4a5e	Merge 'Stop using anonymous minio bucket for tests' from Pavel Emelyanov Currently minio starts with a bucket that has public anonymous access. Respectively, all tests use unsigned S3 requests. That was done for simplicity, and its better to apply some policy to the bucket and, consequentially, make tests sign their requests. Other than the obvious benefit that we test requests signing in unit tests, another goal of this PR is to make it possible to simulate and test various error paths locally, e.g. #13745 and #13022 Closes #14525 * github.com:scylladb/scylladb: test/s3: Remove AWS_S3_EXTRA usage test/s3: Run tests over non-anonymous bucket test/minio: Create random temp user on start code: Rename S3_PUBLIC_BUCKET_FOR_TEST	2023-09-11 23:12:56 +03:00
Tomasz Grabiec	f77e90a0f0	tests: test_tablets: Reconnect the driver after server restart This is a workaround for the flakiness of the test where INSERT statements following the rolling restart fail with "No host available" exception. The hypothesis is that those INSERTS race with driver reconnecting to the cluster and if INSERTs are attempted before reconnection is finished, the driver will refuse to execute the statements. The real fix should be in the driver to join with reconnections but before that is ready we want to fix CI flakiness. Refs #14746 Closes #15355	2023-09-11 21:58:46 +03:00
Kefu Chai	34e3302c01	dbuild: use --userns option when using podman instead of fabricating a `/etc/password` manually, we can just leave it to podman to add an entry in `/etc/password` in container. as podman allows us to map user's account to the same UID in the container. see https://docs.podman.io/en/stable/markdown/options/userns.container.html. this is not only a cosmetic change, it also avoid the permission denied failure when accessing `/etc/passwd` in the container when selinux is enabled. without this change, we would otherwise need to either add the selinux lable to the bind volume with ':Z' option address the failure like: ``` type=AVC msg=audit(1693449115.261:2599): avc: denied { open } for pid=2298247 comm="bash" path="/etc/passwd" dev="tmpfs" ino=5931 scontext=system_u:system_r:container_t:s0:c252,c259 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=file permissive=0 type=AVC msg=audit(1693449115.263:2600): avc: denied { open } for pid=2298249 comm="id" path="/etc/passwd" dev="tmpfs" ino=5931 scontext=system_u:system_r:container_t:s0:c252,c259 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=file permissive=0 ``` found in `/var/log/audit/audit.log`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15230	2023-09-11 21:41:48 +03:00
Avi Kivity	b8a655f55e	Update tools/python3 submodule * tools/python3 45fbd05...3e833f1 (1): > install.sh: replace <tab> with spaces	2023-09-11 21:38:02 +03:00
Avi Kivity	628e6ffd33	Merge 'database, storage_proxy: Reconcile pages with dead rows and partitions incrementally' from Botond Dénes Currently, mutation query on replica side will not respond with a result which doesn't have at least one live row. This causes problems if there is a lot of dead rows or partitions before we reach a live row, which stem from the fact that resulting reconcilable_result will be large: 1. Large allocations. Serialization of reconcilable_result causes large allocations for storing result rows in std::deque 2. Reactor stalls. Serialization of reconcilable_result on the replica side and on the coordinator side causes reactor stalls. This impacts not only the query at hand. For 1M dead rows, freezing takes 130ms, unfreezing takes 500ms. Coordinator does multiple freezes and unfreezes. The reactor stall on the coordinator side is >5s 3. Too large repair mutations. If reconciliation works on large pages, repair may fail due to too large mutation size. 1M dead rows is already too much: Refs https://github.com/scylladb/scylladb/issues/9111. This patch fixes all of the above by making mutation reads respect the memory accounter's limit for the page size, even for dead rows. This patch also addresses the problem of client-side timeouts during paging. Reconciling queries processing long strings of tombstones will now properly page tombstones,like regular queries do. My testing shows that this solution even increases efficiency. I tested with a cluster of 2 nodes, and a table of RF=2. The data layout was as follows (1 partition): * Node1: 1 live row, 1M dead rows * Node2: 1M dead rows, 1 live row This was designed to trigger reconciliation right from the very start of the query. Before: ``` Running query (node2, CL=ONE, cold cache) Query done, duration: 140.0633503ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (node2, CL=ONE, hot cache) Query done, duration: 66.7195275ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (all-nodes, CL=ALL, reconcile, cold-cache) Query done, duration: 873.5400742ms, pages: 2, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)] ``` After: ``` Running query (node2, CL=ONE, cold cache) Query done, duration: 136.9035122ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (node2, CL=ONE, hot cache) Query done, duration: 69.5286021ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)] Running query (all-nodes, CL=ALL, reconcile, cold-cache) Query done, duration: 162.6239498ms, pages: 100, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)] ``` Non-reconciling queries have almost identical duration (1 few ms changes can be observed between runs). Note how in the after case, the reconciling read also produces 100 pages, vs. just 2 pages in the before case, leading to a much lower duration (less than 1/4 of the before). Refs https://github.com/scylladb/scylladb/issues/7929 Refs https://github.com/scylladb/scylladb/issues/3672 Refs https://github.com/scylladb/scylladb/issues/7933 Fixes https://github.com/scylladb/scylladb/issues/9111 Closes #14923 * github.com:scylladb/scylladb: test/topology_custom: add test_read_repair.py replica/mutation_dump: detect end-of-page in range-scans tools/scylla-sstable: write: abort parser thread if writing fails test/pylib: add REST methods to get node exe and workdir paths test/pylib/rest_client: add load_new_sstables, keyspace_{flush,compaction} service/storage_proxy: add trace points for the actual read executor type service/storage_proxy: add trace points for read-repair storage_proxy: Add more trace-level logging to read-repair database: Fix accounting of small partitions in mutation query database, storage_proxy: Reconcile pages with no live rows incrementally	2023-09-11 19:20:19 +03:00
Kefu Chai	a0dcbb09c3	build: cmake: add packaging support this change allows CMake to build the dist tarball for a certain build. Refs #15241 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-11 23:05:30 +08:00
Kefu Chai	649c8f248d	build: cmake: enable build of seastar/apps/iotune scylla redistribute iotune, so let's enable the related building options, so that we can built iotune on demand. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-11 23:02:52 +08:00
Nadav Har'El	45ec76cfbf	Merge 'Enlighten native-transport shutdown' from Pavel Emelyanov When `nodetool disablebinary` command executes its handler aborts listening sockets, shuts down all client connections _and_ (!) then waits for the connections to stop existing. Effectively the command tries to make sure that no activity initiated by a CQL query continues, even though client would never see its result (client sockets are closed) This makes the disablebinary command hang for long sometimes, which is not really nice. The proposal is to wait for the connections to terminate in the background. So once disablebinary command exists what's guaranteed is that all client connections are aborted and new connections are not admitted, but some activity started by them may still be running (e.g. up until `nodetool drain` is issued). Driver-side sockets won't get the queries' results anyway. The behavior of `disablebinary` is not documented wrt whether it should wait for CQL processing to stop or not, so technically we're not breaking anything. However, it can happen that it's a disruptive change and some setups may behave differently after it. refs: #14031 refs: #14711 Closes #14743 * github.com:scylladb/scylladb: test/cql-pytest: Add enable\|disable-binary test case test.py: Add suite option to auto-dirty cluster after test test/pylib: Add nodetool enable\|disable-binary commands transport: Shutdown server on disablebinary generic_server: Introduce shutdown() generic_server: Decouple server stopped from connection stopped transport/controller: Coroutinize do_stop_server() transport/controller: Coroutinize stop_server()	2023-09-11 17:54:52 +03:00
Pavel Emelyanov	821a9c1fd4	test/cql-pytest: Add enable\|disable-binary test case The test checks that `nodetool disablebinary` makes subsequent queries fail and `nodetool enablebinary` lets client to establish new connections. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:38:49 +03:00
Pavel Emelyanov	375b8c6213	test.py: Add suite option to auto-dirty cluster after test ScyllaCluster can be marked as 'dirty' which means that the cluster is in unusable state (after test) and shouldn't be re-used by other tests launched by test.py. For now this is only implemented via the cluster manager class which is only available for topology tests. Add a less flexible short-cut for cql-pytest-s via suite.yaml marking. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:37:48 +03:00
Pavel Emelyanov	2c3b30b395	test/pylib: Add nodetool enable\|disable-binary commands Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:37:48 +03:00
Pavel Emelyanov	b42391bfbe	transport: Shutdown server on disablebinary ... and do the real "sharded::stop" in the background. On node shutdown it needs to pick up all dangling background stopping. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:37:48 +03:00
Pavel Emelyanov	4682c7f9a5	generic_server: Introduce shutdown() The method waits for listening sockets to stop listening and aborts the connected sockets, but doesn't wait for the established connections to finish processing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:37:48 +03:00
Pavel Emelyanov	6dcf653995	generic_server: Decouple server stopped from connection stopped The _stopped future resolves when all "sockets" stop -- listening and connected ones. Furure patching will need to wait for listening sockets to stop separately from connected ones. Rename the `_stopped` to reflect what it is now while at it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:32:07 +03:00
Pavel Emelyanov	bc2d44994a	transport/controller: Coroutinize do_stop_server() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:32:07 +03:00
Pavel Emelyanov	7701aa0789	transport/controller: Coroutinize stop_server() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-11 17:32:07 +03:00
Benny Halevy	7119c1d8cc	token_metadata: update_topology: make endpoint_dc_rack arg optional It's better to pass a disengaged optional when the caller doesn't have the information rather than passing the default dc_rack location so the latter will never implicitly override a known endpoint dc/rack location. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #15300	2023-09-11 16:16:19 +02:00
Benny Halevy	08f8fd30ea	gossiper: get rid of comment about advertise_removing It was deleted in `66ff072540`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20230911140349.1809014-1-bhalevy@scylladb.com>	2023-09-11 16:14:26 +02:00
Benny Halevy	ed32ba7431	storage_service: fixup indentation after last patch Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-11 16:51:28 +03:00
Benny Halevy	f855479c9d	gossiper: get rid of uses_host_id This function practically returned true from inception. In `d38deef499` It started using messaging_service().knows_version(endpoint) that also returns `true` unconditionally, to this day So there's no point calling it since we can assume that `uses_host_id` is true for all versions. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-11 16:48:07 +03:00
Kefu Chai	87088b65b6	util: replace <tab> with spaces to be aligned with seastar's coding-style.md: scylladb uses seastar's coding-style.md. so let's adhere to it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15345	2023-09-11 14:38:46 +03:00
Botond Dénes	e5f724d5bd	Merge 'create-relocatable-package.py: add --node-exporter-dir and --build-dir options' from Kefu Chai this series adds `--node-exporter-dir` and `--build-dir` options to `create-relocatable-package.py`. this enables us to use create relocatable package from arbitrary build directories. Refs #15241 Closes #15299 * github.com:scylladb/scylladb: create-relocatable-package.py: add --node-exporter-dir option build: specify the build dir instead mode	2023-09-11 14:32:53 +03:00
Botond Dénes	0c107c2076	Merge 'dist/debian: add command line option for builddir ' from Kefu Chai so we can point `debian_files_gen.py` to builddir other than 'build', and can optionally use other output directory. this would help to reduce the number of "magic numbers" in our building system. Refs https://github.com/scylladb/scylladb/issues/15241 Closes #15282 * github.com:scylladb/scylladb: dist/debian: specify debian/* file encodings dist/debian: wrap lines whose length exceeds 100 chars dist/debian: add command line option for builddir dist/debian: modularize debian_files_gen.py	2023-09-11 14:31:33 +03:00
Botond Dénes	f770ff7a2b	test/topology_custom: add test_read_repair.py	2023-09-11 07:07:12 -04:00
Botond Dénes	b55cead5cd	replica/mutation_dump: detect end-of-page in range-scans The current read-loop fails to detect end-of-page and if the query result buider cuts the page, it will just proceed to the next partition. This will result in distorted query results, as the result builder will request for the consumption to stop after each clustering row. To fix, check if the page was cut before moving on to the next partition. A unit test reproducing the bug was also added.	2023-09-11 07:02:14 -04:00
Botond Dénes	82f4563757	tools/scylla-sstable: write: abort parser thread if writing fails Currently if writing the sstable fails, e.g. because the input data is out-of-order, the json parser thread hangs because its output is no longer consumed. This results in the entire application just freezing. Fix this by aborting the parsing thread explicitely in the json_mutation_stream_parser destructor. If the parser thread existed successfully, this will be a no-op, but on the error-path, this will ensure that the parser thread doesn't hang.	2023-09-11 07:02:14 -04:00
Botond Dénes	46e37436d0	test/pylib: add REST methods to get node exe and workdir paths	2023-09-11 07:02:14 -04:00

1 2 3 4 5 ...

38768 Commits