Currently drain() is called twice -- first time from
storage_service::drain() (on shutdown), second via
batchlog_manager::stop(). The routine is unintentionally re-entrant,
because:
- there is an explicit check against aborting the abort source twice
- breaking the semaphore can be done multiple times
- co-await-ing the _started future works because the future is shared
That's not particularly elegant; it's better to make drain() bail out
early if it was already called.
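A minimal sketch of the early bail-out (seastar-style; the member names below, such as `_drained`, are illustrative rather than the actual batchlog_manager fields):
```
#include <seastar/core/future.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/core/abort_source.hh>
#include <seastar/core/semaphore.hh>
#include <seastar/core/shared_future.hh>
#include <utility>

class manager_sketch {
    bool _drained = false;            // hypothetical guard flag
    seastar::abort_source _as;
    seastar::semaphore _sem{1};
    seastar::shared_future<> _started = seastar::make_ready_future<>();

public:
    seastar::future<> drain() {
        // bail out early on the second call instead of relying on every
        // step below being individually re-entrant
        if (std::exchange(_drained, true)) {
            co_return;
        }
        if (!_as.abort_requested()) {   // abort may also be requested elsewhere
            _as.request_abort();
        }
        _sem.broken();
        co_await _started.get_future(); // shared, so safe to wait on
    }
};
```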
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
this change allows CMake to build the dist tarball for a given build.
Refs https://github.com/scylladb/scylladb/issues/15241
Closes #15352
* github.com:scylladb/scylladb:
build: cmake: add packaging support
build: cmake: enable build of seastar/apps/iotune
Currently minio starts with a bucket that has public anonymous access. Accordingly, all tests use unsigned S3 requests. That was done for simplicity, and it's better to apply some policy to the bucket and, consequently, make tests sign their requests.
Besides the obvious benefit of exercising request signing in unit tests, another goal of this PR is to make it possible to simulate and test various error paths locally, e.g. #13745 and #13022.
Closes #14525
* github.com:scylladb/scylladb:
test/s3: Remove AWS_S3_EXTRA usage
test/s3: Run tests over non-anonymous bucket
test/minio: Create random temp user on start
code: Rename S3_PUBLIC_BUCKET_FOR_TEST
This is a workaround for the flakiness of the test where INSERT
statements following the rolling restart fail with a "No host available"
exception. The hypothesis is that those INSERTs race with the driver
reconnecting to the cluster, and if INSERTs are attempted before the
reconnection is finished, the driver will refuse to execute the
statements.
The real fix should be in the driver -- joining with in-flight
reconnections -- but until that is ready we want to fix the CI flakiness.
Refs #14746
Closes #15355
instead of fabricating an `/etc/passwd` manually, we can just
leave it to podman to add an entry to `/etc/passwd` in the container,
as podman allows us to map the user's account to the same UID in the
container (the `keep-id` user namespace mode). see
https://docs.podman.io/en/stable/markdown/options/userns.container.html.
this is not only a cosmetic change, it also avoids the permission-denied
failure when accessing `/etc/passwd` in the container when SELinux is
enabled. without this change, we would otherwise need to add the
SELinux label to the bind volume with the ':Z' option to address
failures like:
```
type=AVC msg=audit(1693449115.261:2599): avc: denied { open } for pid=2298247 comm="bash" path="/etc/passwd" dev="tmpfs" ino=5931 scontext=system_u:system_r:container_t:s0:c252,c259 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=file permissive=0
type=AVC msg=audit(1693449115.263:2600): avc: denied { open } for pid=2298249 comm="id" path="/etc/passwd" dev="tmpfs" ino=5931 scontext=system_u:system_r:container_t:s0:c252,c259 tcontext=unconfined_u:object_r:user_tmp_t:s0 tclass=file permissive=0
```
found in `/var/log/audit/audit.log`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15230
Currently, a mutation query on the replica side will not respond with a result unless it has at least one live row. This causes problems if there are a lot of dead rows or partitions before we reach a live row, which stem from the fact that the resulting reconcilable_result will be large:
1. Large allocations. Serialization of reconcilable_result causes large allocations for storing result rows in std::deque
2. Reactor stalls. Serialization of reconcilable_result on the replica side and on the coordinator side causes reactor stalls. This impacts not only the query at hand. For 1M dead rows, freezing takes 130ms, unfreezing takes 500ms. The coordinator does multiple freezes and unfreezes. The reactor stall on the coordinator side is >5s
3. Overly large repair mutations. If reconciliation works on large pages, repair may fail due to too large a mutation size. 1M dead rows is already too much: Refs https://github.com/scylladb/scylladb/issues/9111.
This patch fixes all of the above by making mutation reads respect the memory accounter's limit for the page size, even for dead rows.
This patch also addresses the problem of client-side timeouts during paging. Reconciling queries processing long strings of tombstones will now properly page tombstones, like regular queries do.
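A conceptual sketch of the fix, in plain C++ with illustrative names rather than the actual ScyllaDB types: every consumed row, live or dead, is charged to the memory accounter, and the page is cut once the limit is reached:
```
#include <cstddef>

struct memory_accounter_sketch {
    std::size_t used = 0;
    std::size_t limit;
};

// returns true if the page should be cut after this row
bool account_row(memory_accounter_sketch& acc, std::size_t row_bytes) {
    // before the fix, only live rows effectively mattered, so a run of 1M
    // dead rows accumulated into one huge reconcilable_result; now dead
    // rows are charged too and cut the page like any other row
    acc.used += row_bytes;
    return acc.used >= acc.limit;
}
```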
My testing shows that this solution even increases efficiency. I tested with a cluster of 2 nodes and a table with RF=2. The data layout was as follows (1 partition):
* Node1: 1 live row, 1M dead rows
* Node2: 1M dead rows, 1 live row
This was designed to trigger reconciliation right from the very start of the query.
Before:
```
Running query (node2, CL=ONE, cold cache)
Query done, duration: 140.0633503ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (node2, CL=ONE, hot cache)
Query done, duration: 66.7195275ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (all-nodes, CL=ALL, reconcile, cold-cache)
Query done, duration: 873.5400742ms, pages: 2, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)]
```
After:
```
Running query (node2, CL=ONE, cold cache)
Query done, duration: 136.9035122ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (node2, CL=ONE, hot cache)
Query done, duration: 69.5286021ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (all-nodes, CL=ALL, reconcile, cold-cache)
Query done, duration: 162.6239498ms, pages: 100, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)]
```
Non-reconciling queries have almost identical duration (a few ms of variation can be observed between runs). Note how in the after case, the reconciling read also produces 100 pages, vs. just 2 pages in the before case, leading to a much lower duration (less than 1/4 of the before).
Refs https://github.com/scylladb/scylladb/issues/7929
Refs https://github.com/scylladb/scylladb/issues/3672
Refs https://github.com/scylladb/scylladb/issues/7933
Fixes https://github.com/scylladb/scylladb/issues/9111
Closes #14923
* github.com:scylladb/scylladb:
test/topology_custom: add test_read_repair.py
replica/mutation_dump: detect end-of-page in range-scans
tools/scylla-sstable: write: abort parser thread if writing fails
test/pylib: add REST methods to get node exe and workdir paths
test/pylib/rest_client: add load_new_sstables, keyspace_{flush,compaction}
service/storage_proxy: add trace points for the actual read executor type
service/storage_proxy: add trace points for read-repair
storage_proxy: Add more trace-level logging to read-repair
database: Fix accounting of small partitions in mutation query
database, storage_proxy: Reconcile pages with no live rows incrementally
scylla redistributes iotune, so let's enable the related build
options, so that we can build iotune on demand.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
When the `nodetool disablebinary` command executes, its handler aborts listening sockets, shuts down all client connections _and_ (!) then waits for the connections to stop existing. Effectively the command tries to make sure that no activity initiated by a CQL query continues, even though the client would never see its result (client sockets are closed).
This sometimes makes the disablebinary command hang for a long time, which is not really nice. The proposal is to wait for the connections to terminate in the background. So once the disablebinary command exits, what's guaranteed is that all client connections are aborted and new connections are not admitted, but some activity started by them may still be running (e.g. up until `nodetool drain` is issued). Driver-side sockets won't get the queries' results anyway.
The behavior of `disablebinary` is not documented with respect to whether it should wait for CQL processing to stop or not, so technically we're not breaking anything. However, it may turn out to be a disruptive change, and some setups may behave differently after it.
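A sketch of the proposed shape (seastar-style, illustrative names; not the exact controller code): abort listeners and connections synchronously, but only remember the "connections finished" future and join with it later, on node shutdown:
```
#include <seastar/core/future.hh>
#include <seastar/core/coroutine.hh>
#include <utility>

struct server_iface {
    virtual seastar::future<> shutdown() = 0;          // stop listening, abort connections
    virtual seastar::future<> connections_done() = 0;  // resolves when they all exit
    virtual ~server_iface() = default;
};

struct controller_sketch {
    server_iface& _server;
    seastar::future<> _bg_stopped = seastar::make_ready_future<>();

    seastar::future<> disable_binary() {
        co_await _server.shutdown();
        // don't block the nodetool command on connections draining; keep the
        // future and pick it up on node shutdown instead
        _bg_stopped = _server.connections_done();
    }

    seastar::future<> drain() { // node shutdown path
        co_await std::move(_bg_stopped);
    }
};
```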
refs: #14031
refs: #14711
Closes #14743
* github.com:scylladb/scylladb:
test/cql-pytest: Add enable|disable-binary test case
test.py: Add suite option to auto-dirty cluster after test
test/pylib: Add nodetool enable|disable-binary commands
transport: Shutdown server on disablebinary
generic_server: Introduce shutdown()
generic_server: Decouple server stopped from connection stopped
transport/controller: Coroutinize do_stop_server()
transport/controller: Coroutinize stop_server()
The test checks that `nodetool disablebinary` makes subsequent queries
fail and that `nodetool enablebinary` lets clients establish new
connections.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
ScyllaCluster can be marked as 'dirty', which means that the cluster is
in an unusable state (after a test) and shouldn't be re-used by other tests
launched by test.py. For now this is only implemented via the cluster
manager class, which is only available for topology tests.
Add a less flexible shortcut for cql-pytest suites via a suite.yaml marking.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
... and do the real "sharded::stop" in the background. On node shutdown
we need to pick up all dangling background stops.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method waits for listening sockets to stop listening and aborts the
connected sockets, but doesn't wait for the established connections to
finish processing.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The _stopped future resolves when all "sockets" stop -- listening and
connected ones. Future patches will need to wait for listening sockets
to stop separately from connected ones.
Rename `_stopped` to reflect what it is now, while at it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It's better to pass a disengaged optional when
the caller doesn't have the information, rather than
passing the default dc_rack location, so that the latter
will never implicitly override a known endpoint dc/rack location.
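For illustration, a minimal plain-C++ sketch of the idea (types simplified, not the actual locator types):
```
#include <optional>
#include <string>

struct dc_rack { std::string dc, rack; };

// pass std::nullopt when the caller doesn't know the location, so a
// default location can never silently overwrite a known one
void save_local_info(std::optional<dc_rack> loc) {
    if (!loc) {
        return; // keep the previously recorded dc/rack
    }
    // ... persist loc->dc and loc->rack ...
}
```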
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes #15300
to be aligned with seastar's coding-style.md: scylladb uses seastar's
coding-style.md, so let's adhere to it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15345
this series adds `--node-exporter-dir` and `--build-dir` options to `create-relocatable-package.py`. this enables us to create relocatable packages from arbitrary build directories.
Refs #15241
Closes #15299
* github.com:scylladb/scylladb:
create-relocatable-package.py: add --node-exporter-dir option
build: specify the build dir instead mode
so we can point `debian_files_gen.py` at a builddir other than
'build', and optionally use a different output directory. this would
help reduce the number of "magic numbers" in our build system.
Refs https://github.com/scylladb/scylladb/issues/15241
Closes #15282
* github.com:scylladb/scylladb:
dist/debian: specify debian/* file encodings
dist/debian: wrap lines whose length exceeds 100 chars
dist/debian: add command line option for builddir
dist/debian: modularize debian_files_gen.py
The current read loop fails to detect end-of-page: if the query
result builder cuts the page, the loop just proceeds to the next
partition. This results in distorted query results, as the result
builder will request that consumption stop after each clustering
row.
To fix, check whether the page was cut before moving on to the next
partition.
A unit test reproducing the bug is also added.
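A conceptual sketch of the fixed loop (illustrative types, not the actual mutation_dump code):
```
#include <cstddef>
#include <vector>

struct result_builder_sketch {
    std::size_t rows = 0;
    std::size_t page_limit = 1000;
    bool page_cut = false;
    void consume_partition(std::size_t partition_rows) {
        rows += partition_rows;
        page_cut = rows >= page_limit;
    }
};

void read_page(const std::vector<std::size_t>& partition_sizes,
               result_builder_sketch& b) {
    for (std::size_t n : partition_sizes) {
        b.consume_partition(n);
        if (b.page_cut) {
            break; // the fix: previously the loop proceeded to the next partition
        }
    }
}
```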
Currently, if writing the sstable fails, e.g. because the input data is
out-of-order, the json parser thread hangs because its output is no
longer consumed. This results in the entire application just freezing.
Fix this by aborting the parsing thread explicitly in the
json_mutation_stream_parser destructor. If the parser thread exited
successfully, this will be a no-op, but on the error path, this will
ensure that the parser thread doesn't hang.
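An illustrative plain-C++ analogue of the fix (the real code is seastar-based; this only shows the shape, using std::jthread):
```
#include <stop_token>
#include <thread>

// the destructor requests a stop and joins, so on the error path the
// parser thread is told to exit instead of blocking forever on a
// consumer that no longer exists
struct stream_parser_sketch {
    std::jthread _parser{[](std::stop_token st) {
        while (!st.stop_requested()) {
            // parse input and push chunks to the consumer; checking the
            // stop token between chunks makes the abort effective
        }
    }};
    // ~stream_parser_sketch() == _parser.request_stop() + join():
    // a no-op if parsing already finished successfully
};
```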
There is currently a trace point for when the read executor is created,
but it only contains the initial replica set and doesn't mention which
read executor is created in the end. This patch adds trace points for
each different return path, so it is clear from the trace whether a
speculative read can happen or not.
Currently the fact that read-repair was triggered can only be inferred
from seeing mutation reads in the trace. This patch adds an explicit
trace point for when read repair is triggered and also when it is
finished or retried.
The partition key size was ignored by the accounter, as was the
partition tombstone. As a result, a sequence of partitions with just
tombstones would be accounted as taking no memory, and the page-size
limiter would not kick in.
Fix by accounting the real size of the accumulated frozen_mutation.
Also, break pages across partitions even if there are no live rows.
The coordinator can handle that now.
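A conceptual sketch of the accounting change (plain C++, illustrative names rather than the actual accounter API):
```
#include <cstddef>

struct page_accounter_sketch {
    std::size_t used = 0;
    std::size_t limit;
};

// account the real serialized size of the accumulated frozen_mutation --
// partition key and partition tombstone included -- so that tombstone-only
// partitions still count towards the page limit
bool account_partition(page_accounter_sketch& acc,
                       std::size_t frozen_mutation_bytes) {
    acc.used += frozen_mutation_bytes;
    return acc.used >= acc.limit; // true => cut the page, even with no live rows
}
```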
Refs #7933
Currently, a mutation query on the replica side will not respond with a
result unless it has at least one live row. This causes problems if there
are a lot of dead rows or partitions before we reach a live row, which
stems from the fact that the resulting reconcilable_result will be large:
* Large allocations. Serialization of reconcilable_result causes large
allocations for storing result rows in std::deque
* Reactor stalls. Serialization of reconcilable_result on the replica
side and on the coordinator side causes reactor stalls. This impacts
not only the query at hand. For 1M dead rows, freezing takes 130ms,
unfreezing takes 500ms. Coordinator does multiple freezes and
unfreezes. The reactor stall on the coordinator side is >5s.
* Large repair mutations. If reconciliation works on large pages, repair
may fail due to too large mutation size. 1M dead rows is already too
much: Refs #9111.
This patch fixes all of the above by making mutation reads respect the
memory accounter's limit for the page size, even for dead rows.
This patch also addresses the problem of client-side timeouts during
paging. Reconciling queries processing long strings of tombstones will
now properly page tombstones, like regular queries do.
My testing shows that this solution even increases efficiency. I tested
with a cluster of 2 nodes and a table with RF=2. The data layout was as
follows (1 partition):
Node1: 1 live row, 1M dead rows
Node2: 1M dead rows, 1 live row
This was designed to trigger reconciliation right from the very start of
the query.
Before:
Running query (node2, CL=ONE, cold cache)
Query done, duration: 140.0633503ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (node2, CL=ONE, hot cache)
Query done, duration: 66.7195275ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (all-nodes, CL=ALL, reconcile, cold-cache)
Query done, duration: 873.5400742ms, pages: 2, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)]
After:
Running query (node2, CL=ONE, cold cache)
Query done, duration: 136.9035122ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (node2, CL=ONE, hot cache)
Query done, duration: 69.5286021ms, pages: 101, result: [Row(pk=0, ck=3000000, v=0)]
Running query (all-nodes, CL=ALL, reconcile, cold-cache)
Query done, duration: 162.6239498ms, pages: 100, result: [Row(pk=0, ck=0, v=0), Row(pk=0, ck=3000000, v=0)]
Non-reconciling queries have almost identical duration (a few ms of
variation can be observed between runs). Note how in the after case, the
reconciling read also produces 100 pages, vs. just 2 pages in the before
case, leading to a much lower duration (less than 1/4 of the before).
Refs #7929
Refs #3672
Refs #7933
Fixes #9111
actually, we never use its output in our workflow, and the
output is distracting when building the package. so, in this
change, let's print it only on demand. this feature is preserved
just in case some of us want to use this script for getting
the version number string.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15327
if the user fails to set "CMAKE_BUILD_TYPE", it would be empty, and
CMake would fail with confusing error messages like
```
CMake Error at CMakeLists.txt:21 (list):
list sub-command FIND requires three arguments.
CMake Error at CMakeLists.txt:27 (include):
include could not find requested file:
mode.
```
so, in this change:
* set the default CMAKE_BUILD_TYPE to "Release"
* quote ${CMAKE_BUILD_TYPE} when searching for it
in the allowed build types list.
this should address the issues above.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15326
The local node's dc:rack pair is cached in the system keyspace on start. However, most other code doesn't need it, as it gets dc:rack from topology or directly from the snitch. There are a few places left that still mess with the system keyspace cache, but they are easy to patch. So after this patch all the core code uses two sources of dc:rack -- topology / snitch -- instead of three.
Closes #15280
* github.com:scylladb/scylladb:
system_keyspace: Don't require snitch argument on start
system_keyspace: Don't cache local dc:rack pair
system_keyspace: Save local info with explicit location
storage_service: Get endpoint location from snitch, not system keyspace
snitch: Introduce and use get_location() method
repair: Local location variables instead of system keyspace's one
repair: Use full endpoint location instead of datacenter part
A reviewer noted that test_update_expression_list_append_non_list_arguments
has too much code duplication - the same long API call to run
"SET a = list_append(...)" was repeated many times.
So in this patch we add a short inner function "try_list_append" to
avoid this duplication.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes: #15298
- Adds type for each option.
- Filters out unused / invalid values, moves them to a separate section.
- Adds the term "liveness" to the glossary.
- Removes unused and invalid properties from the docs.
- Updates to the latest version of pyaml.
docs: rename config template directive
Closes #15164
in this series, we try to improve `unified-installer.rst`:
- encourage the user to install a smaller package
- run `./install.sh` directly instead of relying on `sh` pointing to `bash`
Closes #15325
* github.com:scylladb/scylladb:
doc: run install.sh directly
doc: install headless jdk in sample command line
Find progress of repair tasks based on the number of ranges
that have been repaired.
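An illustrative sketch of the idea (names are hypothetical, not the task-manager API):
```
#include <cstddef>

// report repair task progress as the fraction of token ranges already repaired
struct repair_progress_sketch {
    std::size_t ranges_total = 0;
    std::size_t ranges_complete = 0;

    double fraction_done() const {
        return ranges_total == 0 ? 0.0
             : static_cast<double>(ranges_complete) / ranges_total;
    }
};
```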
Fixes: [#1156](https://github.com/scylladb/scylla-enterprise/issues/1156).
Closes #14698
* github.com:scylladb/scylladb:
test: repair tasks test
repair: add methods making repair progress more precise
tasks: make progress related methods virtual
repair: add get_progress method to shard_repair_task_impl
repair: add const noexcept qualifiers to shard_repair_task_impl::ranges_size()
repair: log a name of a particular table repair is working on
tasks: delete move and copy constructors from task_manager::task::impl
SIGSEGV was caught during tablet streaming, and the reason was
that storage_service::_group0 (set via set_group0()) is only set on
shard 0; therefore, when streaming ran on any other shard,
it tried to dereference garbage, which resulted in the crash.
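An illustrative sketch of the pattern that avoids this class of bug (the types are simplified; the sharded/invoke_on usage is the point):
```
#include <seastar/core/future.hh>
#include <seastar/core/sharded.hh>

// _group0 is only set on shard 0, so other shards must hop there
// instead of dereferencing their own (uninitialized) copy
struct service_sketch {
    int* _group0 = nullptr; // set via set_group0(), but only on shard 0

    static seastar::future<int> read_group0(seastar::sharded<service_sketch>& svc) {
        return svc.invoke_on(0, [] (service_sketch& s0) {
            return *s0._group0; // safe: we are on shard 0 here
        });
    }
};
```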
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #15307
before this change, filesystem_storage::open() reused
`sstable::make_component_file_writer()` to create the
temporary toc; the temporary toc is then renamed to the
real TOC when sealing the sstable.
but this prevents us from reusing filesystem_storage in
yet another storage backend, as the
1. create temporary
2. rename temporary to toc
dance only applies to filesystem_storage. when
filesystem_storage calls into sstable, it calls `sst.make_component_file_writer()`,
which in turn calls `_storage->make_component_sink()`.
but at this moment, `_storage` is not necessarily `filesystem_storage`
anymore. it could be a wrapper around `filesystem_storage`
which is not aware of the create-rename dance, and could do
a lot more than create a temporary file when asked to
"make_component_sink()".
if we really wanted to go this way, reusing sstable's API
in `filesystem_storage` to create a temporary toc, we would
have to rename whatever temporary toc component was created
by the wrapper backend to the real toc in the seal() func. but,
again, this rename op is only implemented in the
filesystem_storage backend, and mirroring this operation in
the wrapper backend does not make sense at all -- it
should not have to be aware of filesystem_storage's internals.
so in this change, instead of reusing
`sstable::make_component_file_writer()`, we just inline
its implementation in filesystem_storage to avoid this
problem. this is also an improvement from the design
perspective, as the storage should not call into the
abstraction above it -- the sstable.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14443
seastar has deprecated the overload which accepts `server_name`,
let's use the one which accepts `tls::tls_options`.
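A sketch of the migration, assuming the seastar API as described above (the wrapper function here is hypothetical):
```
#include <seastar/net/tls.hh>
#include <utility>

// pass the server name via tls::tls_options instead of the
// deprecated server_name overload
seastar::future<seastar::connected_socket>
connect_tls(seastar::shared_ptr<seastar::tls::certificate_credentials> creds,
            seastar::socket_address addr, seastar::sstring host) {
    seastar::tls::tls_options options;
    options.server_name = std::move(host);
    return seastar::tls::connect(std::move(creds), addr, std::move(options));
}
```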
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15324
Currently, the topology coordinator has the
`topology::transition_state::publish_cdc_generation` state responsible
for publishing the already created CDC generations to the user-facing
description tables. This process must not fail, as a failure would cause
some CDC updates to be missed. On the other hand, we would like to abort the
`publish_cdc_generation` state when bootstrap aborts. Of course, we
could also wait until handling this state finishes, even in the case of
the bootstrap abort, but that would be inefficient. We don't want to
unnecessarily block topology operations by publishing CDC generations.
The solution proposed by this PR is to remove the
`publish_cdc_generation` state completely and introduce a new background
fiber of the topology coordinator -- `cdc_generation_publisher` -- that
continually publishes committed CDC generations.
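A sketch of the fiber's shape (seastar-style, illustrative names; not the actual topology coordinator code):
```
#include <seastar/core/abort_source.hh>
#include <seastar/core/condition-variable.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/core/future.hh>

// a background loop that publishes committed-but-unpublished generations,
// instead of a dedicated topology transition state
struct publisher_sketch {
    seastar::condition_variable _cond;  // signalled when a generation is committed
    seastar::abort_source _as;
    bool _have_unpublished = false;

    seastar::future<> run() {
        while (!_as.abort_requested()) {
            co_await _cond.wait([this] {
                return _have_unpublished || _as.abort_requested();
            });
            if (_have_unpublished) {
                _have_unpublished = false;
                // publish the generation to the CDC description tables;
                // a failure can simply be retried on the next iteration
            }
        }
    }
};
```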
Apart from introducing the CDC generation publisher, we add
`test_cdc_generation_publishing.py` that verifies its correctness and we
adapt other CDC tests to the new changes.
Fixes #15194
Closes #15281
* github.com:scylladb/scylladb:
test: test_cdc: introduce wait_for_first_cdc_generation
test: move cdc_streams_check_and_repair check
test: add test_cdc_generation_publishing
docs: remove information about publish_cdc_generation
raft topology: introduce the CDC generation publisher
system_keyspace: load unpublished_cdc_generations to topology
raft topology: mark committed CDC generations as unpublished
raft topology: add unpublished_cdc_generations to system.topology
Add tests for gossiper/endpoint/live and gossiper/endpoint/down
which run only in release mode.
Enable test_remove_node_with_concurrent_ddl and fix the types and
variable names used by it, so that they can be reused in the gossiper
test.
Fixes: #15223.
Closes #15244
* github.com:scylladb/scylladb:
test: topology: add gossiper test
test: fix types and variable names in wait_for_host_down
ClangBuildAnalyzer reports cql3/cql_statement.hh as being one of the
most expensive header files in the project - being included (mostly
indirectly) in 129 source files, and costing a total of 844 CPU seconds
of compilation.
This patch is an attempt, only *partially* successful, to reduce the
number of times that cql_statement.hh is included. It succeeds in
lowering the number from 129 to 99, but no further :-( One of the biggest
difficulties in reducing it further is that query_processor.hh includes
a lot of templated code, which needs stuff from cql_statement.hh.
The solution should be to un-template the functions in
query_processor.hh and move them from the header to a source file, but
this is beyond the scope of this patch and query_processor.hh appears
problematic in other respects as well.
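An illustrative example of the technique (names are hypothetical):
```
// headers that only pass cql_statement around by pointer or reference
// can forward-declare it instead of including the expensive header...
namespace cql3 { class cql_statement; }

struct statement_holder_sketch {
    cql3::cql_statement* stmt = nullptr; // fine with a forward declaration
};

// ...and only the .cc files that actually call cql_statement methods
// include cql3/cql_statement.hh
```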
Unfortunately the compilation speedup by this patch is negligible
(the `du -bc build/dev/**/*.o` metric shows less than 0.01% reduction).
Beyond the fact that this patch only removes 30% of the inclusions of
this header, it appears that most of the source files that no longer
include cql_statement.hh after this patch anyway include many of the
other headers that cql_statement.hh included, so the saving is minimal.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #15212
strictly speaking, `sh` is not necessarily bash, while `install.sh`
is written in the bash dialect and errors out if it is not executed
with bash. also, we don't need to add "-x" when running the script; if
we have to, we should add it in `install.sh`, not ask the user to add
this option. besides, `install.sh` is executable with a shebang line
using bash, so we can just execute it.
so, in this change, we just execute the script directly in the
command-line sample.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in comparison with java-11-openjdk, java-11-openjdk-headless does not
offer audio and video support, and has fewer dependencies. for instance,
java-11-openjdk depends on the X11 libraries, and it also provides
icons representing the JDK. but since scylla is a server-side application,
we don't expect users to run a desktop on it, so there is no need to
support audio and video.
in this change, we just suggest the "smaller" package, which is
actually also a dependency of java-11-openjdk.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>