scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 23:43:31 +00:00

Author	SHA1	Message	Date
Avi Kivity	c11f2c9bcd	Merge 'scylla-housekeeping: fix exception on parsing version string v2' from Takuya ASADA This reverts `65fbf72ed0` and introduces a new version of the patch, which fixes the SCT breakage that occurred after that commit was merged. ---- Since Python 3.12, version parsing is strict: parse_version() does not accept a version string like '6.1.0~dev'. To fix this, we need to pass an acceptable version string to parse_version(), like '6.1.0.dev0', which is allowed by the Python version scheme. reference: https://packaging.python.org/en/latest/specifications/version-specifiers/ Fixes https://github.com/scylladb/scylladb/issues/19564 Closes https://github.com/scylladb/scylladb/pull/19572 Closes scylladb/scylladb#19670 * github.com:scylladb/scylladb: scylla-housekeeping: fix exception on parsing version string Revert "scylla-housekeeping: fix exception on parsing version string"	2024-07-14 16:24:41 +03:00
Botond Dénes	53a6ec05ed	Merge 'replica: remove rwlock for protecting iteration over storage group map' from Raphael "Raph" Carvalho rwlock was added to protect iterations against concurrent updates to the map. the updates can happen when allocating a new tablet replica or removing an old one (tablet cleanup). the rwlock is very problematic because it can result in topology changes blocked, as updating token metadata takes the exclusive lock, which is serialized with table wide ops like split / major / explicit flush (and those can take a long time). to get rid of the lock, we can copy the storage group map and guard individual groups with a gate (not a problem since map is expected to have a maximum of ~100 elements). so cleanup can close that gate (carefully closed after stopping individual groups such that migrations aren't blocked by long-running ops like major), and ongoing iterations (e.g. triggered by nodetool flush) can skip a group that was closed, as such a group is being migrated out. Fixes #18821. 
``` WRITE ===== ./build/release/scylla perf-simple-query --smp 1 --memory 2G --initial-tablets 10 --tablets --write - BEFORE 65559.52 tps ( 59.6 allocs/op, 16.4 logallocs/op, 14.3 tasks/op, 52841 insns/op, 30946 cycles/op, 0 errors) 67408.05 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53018 insns/op, 30874 cycles/op, 0 errors) 67714.72 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53026 insns/op, 30881 cycles/op, 0 errors) 67825.57 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53015 insns/op, 30821 cycles/op, 0 errors) 67810.74 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53009 insns/op, 30828 cycles/op, 0 errors) throughput: mean=67263.72 standard-deviation=967.40 median=67714.72 median-absolute-deviation=547.02 maximum=67825.57 minimum=65559.52 instructions_per_op: mean=52981.61 standard-deviation=79.09 median=53014.96 median-absolute-deviation=36.54 maximum=53025.79 minimum=52840.56 cpu_cycles_per_op: mean=30869.90 standard-deviation=50.23 median=30874.06 median-absolute-deviation=42.11 maximum=30945.94 minimum=30820.89 - AFTER 65448.76 tps ( 59.5 allocs/op, 16.4 logallocs/op, 14.3 tasks/op, 52788 insns/op, 31013 cycles/op, 0 errors) 67290.83 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53025 insns/op, 30950 cycles/op, 0 errors) 67646.81 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53025 insns/op, 30909 cycles/op, 0 errors) 67565.90 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 53058 insns/op, 30951 cycles/op, 0 errors) 67537.32 tps ( 59.3 allocs/op, 16.0 logallocs/op, 14.3 tasks/op, 52983 insns/op, 30963 cycles/op, 0 errors) throughput: mean=67097.93 standard-deviation=931.44 median=67537.32 median-absolute-deviation=467.97 maximum=67646.81 minimum=65448.76 instructions_per_op: mean=52975.85 standard-deviation=108.07 median=53024.55 median-absolute-deviation=49.45 maximum=53057.99 minimum=52788.49 cpu_cycles_per_op: mean=30957.17 standard-deviation=37.43 median=30951.31 
median-absolute-deviation=7.51 maximum=31013.01 minimum=30908.62 READ ===== ./build/release/scylla perf-simple-query --smp 1 --memory 2G --initial-tablets 10 --tablets - BEFORE 79423.36 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41840 insns/op, 26820 cycles/op, 0 errors) 81076.70 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41837 insns/op, 26583 cycles/op, 0 errors) 80927.36 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41829 insns/op, 26629 cycles/op, 0 errors) 80539.44 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41841 insns/op, 26735 cycles/op, 0 errors) 80793.10 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41864 insns/op, 26662 cycles/op, 0 errors) throughput: mean=80551.99 standard-deviation=661.12 median=80793.10 median-absolute-deviation=375.37 maximum=81076.70 minimum=79423.36 instructions_per_op: mean=41842.20 standard-deviation=13.26 median=41840.14 median-absolute-deviation=5.68 maximum=41864.50 minimum=41829.29 cpu_cycles_per_op: mean=26685.88 standard-deviation=93.31 median=26662.18 median-absolute-deviation=56.47 maximum=26820.08 minimum=26582.68 - AFTER 79464.70 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41799 insns/op, 26761 cycles/op, 0 errors) 80954.58 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41803 insns/op, 26605 cycles/op, 0 errors) 81160.90 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41811 insns/op, 26555 cycles/op, 0 errors) 81263.10 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41814 insns/op, 26527 cycles/op, 0 errors) 81162.97 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 41806 insns/op, 26549 cycles/op, 0 errors) throughput: mean=80801.25 standard-deviation=755.54 median=81160.90 median-absolute-deviation=361.72 maximum=81263.10 minimum=79464.70 instructions_per_op: mean=41806.47 standard-deviation=5.85 median=41806.05 median-absolute-deviation=4.05 maximum=41813.86 minimum=41799.36 cpu_cycles_per_op: mean=26599.22 
standard-deviation=94.84 median=26554.54 median-absolute-deviation=50.51 maximum=26761.06 minimum=26527.05 ``` Closes scylladb/scylladb#19469 * github.com:scylladb/scylladb: replica: remove rwlock for protecting iteration over storage group map replica: get rid of fragile compaction group intrusive list	2024-07-12 15:45:36 +03:00
Piotr Dulikowski	3cdf549da2	Merge 'remove utils::in' from Avi Kivity utils::in uses std::aligned_storage, which is deprecated. Rather than fixing it, replace its only user with simpler code and remove it. No backport needed as this isn't fixing a bug. Closes scylladb/scylladb#19683 * github.com:scylladb/scylladb: utils: remove utils/in.hh gossiper: remove initializer-list overload of add_local_application_state()	2024-07-12 12:06:09 +02:00
Takuya ASADA	373a7825b5	scylla-housekeeping: fix exception on parsing version string Since Python 3.12, version parsing is strict: parse_version() does not accept a version string like '6.1.0~dev'. To fix this, we need to pass an acceptable version string to parse_version(), like '6.1.0.dev0', which is allowed by the Python version scheme. Also, a release candidate version like '6.0.0~rc3' has the same issue; it should be replaced with '6.0.0rc3' for comparison in parse_version(). reference: https://packaging.python.org/en/latest/specifications/version-specifiers/ Fixes #19564 Closes scylladb/scylladb#19572	2024-07-12 03:23:34 +09:00
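The string mapping described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper name `to_pep440` and the exact rules are assumptions derived only from the two examples given above ('6.1.0~dev' and '6.0.0~rc3').

```python
import re


def to_pep440(version: str) -> str:
    """Return a PEP 440 style string for a Scylla-style version.

    Hypothetical helper: only the two forms mentioned in the commit
    message are handled; any other input is returned unchanged.
    """
    # '6.0.0~rc3' -> '6.0.0rc3' (pre-release segment, no separator)
    version = re.sub(r"~rc(\d+)", r"rc\1", version)
    # '6.1.0~dev' -> '6.1.0.dev0' (development release segment)
    version = re.sub(r"~dev$", ".dev0", version)
    return version


if __name__ == "__main__":
    for raw in ("6.1.0~dev", "6.0.0~rc3", "6.0.1"):
        print(raw, "->", to_pep440(raw))
```

The original string is left untouched and only the converted copy would be handed to the parser, which matches the reasoning in the revert commit: the version string itself should not be modified.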
Takuya ASADA	db04f8b16e	Revert "scylla-housekeeping: fix exception on parsing version string" This reverts commit `65fbf72ed0`, since it breaks scylla-housekeeping and SCT because the patch modified the version string. We shouldn't modify the version string directly; we need to pass a modified string just for parse_version() instead.	2024-07-12 03:23:34 +09:00
Emil Maskovsky	b9abad0515	test: raft: fix the topology failure recovery test flakiness Setting the error condition for all nodes in the cluster to avoid having to check which one is the coordinator. This should make the test more stable and avoid the flakiness observed when the coordinator node is the one that got the error condition injected. Randomizing the retrieved running servers to reproduce the issue more frequently and to avoid making any assumptions about the order of the servers. Note that only the "raft_topology_barrier_fail" needs to run on a non-coordinator node, the other error "stream_ranges_fail" can be injected on any node (including the coordinator). Fixes: scylladb/scylladb#18614 Closes scylladb/scylladb#19663	2024-07-11 16:23:26 +02:00
Piotr Dulikowski	188b4ac0fc	Merge 'service_level_controller: update configuration on raft change' from Michał Jadwiszczak This patch is a follow-up to scylladb/scylladb#16585. Once we have service levels on raft, we can get rid of the update loop, which updates the configuration at a configured interval (default is 10s). Instead, this PR introduces methods to `group0_state_machine` which look through table ids in mutations in `write_mutation` and update submodules based on those ids. Fixes: scylladb/scylladb#18060 Closes scylladb/scylladb#18758 * github.com:scylladb/scylladb: test: remove `sleep()`s which were required to reload service levels configuration test/cql_test_env: remove unit test service levels data accessors service/storage_service: reload SL cache on topology_state_load() service/qos/service_level_controller: move semaphore breaking to stop service/qos/service_level_controller: maybe start and stop legacy update loop service/qos/service_level_controller: make update loop legacy raft/group0_state_machine: update submodules based on table_id service/storage_service: add a proxy method to reload sl cache	2024-07-11 16:18:48 +02:00
Kefu Chai	2a1c9ed7cb	github: use needs.read-toolchain.outputs.image for iwyu's container in `9a71543fd2`, we introduced a regression, which failed to use the proper value for the container image in which the iwyu workflow is run. in this change, we pass the correct value, as we do in clang-tidy.yaml workflow. Refs `9a71543fd2` Fixes scylladb/scylladb#19704 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19697	2024-07-11 17:17:37 +03:00
Tomas Nozicka	26466a3043	Allow configuring default loglevel with args for container images Closes scylladb/scylladb#19671	2024-07-11 12:37:53 +03:00
Piotr Dulikowski	19c5e1807c	Merge 'schema: fix describe of indexes on collections' from Michał Jadwiszczak If an index was created on a collection (frozen or not), its description wasn't a correct create statement. This patch fixes the bug and includes functions like `full()`, `keys()`, `values()`, ... used to create an index on collections. Fixes scylladb/scylladb#19278 Closes scylladb/scylladb#19381 * github.com:scylladb/scylladb: cql-pytest/test_describe: add a test for describe indexes schema/schema: fix column names in index description	2024-07-11 09:11:01 +02:00
Kefu Chai	9a71543fd2	github: always use the tools/toolchain/image for lint workflows instead of hardwiring the toolchain image in github workflows, read it from `tools/toolchain/image`. a dedicated reusable workflow is added to read from this file, and expose its content with an output parameter. also, switch iwyu.yaml workflow to this image, more maintainable this way. please note, before this change, we were also using the latest stable build of clang, and since fedora 40 also uses clang 18, the behavior does not change. but with this change, we lose the flexibility of using other clang versions provided by https://apt.llvm.org in the future. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19655	2024-07-10 23:45:35 +03:00
Avi Kivity	65a7fc9902	Merge 'transport, service: move definition of destructors into .cc' from Kefu Chai this changeset includes two changes: - service: move storage_service::~storage_service() into .cc - transport: move the cql_server::~cql_server() into .cc they intend to address the compile failures when building scylladb with clang-19. clang-19 is more picky when generating the defaulted destructors with incomplete types. but its behavior makes sense with regard to standard compliance. so let's update accordingly. --- it's a cleanup, hence no need to backport. Closes scylladb/scylladb#19668 * github.com:scylladb/scylladb: transport: move the cql_server::~cql_server() into .cc service: move storage_service::~storage_service() into .cc	2024-07-10 23:43:16 +03:00
Kefu Chai	06ba523818	sstable: extract file_writer out `sstables::write()` has multiple overloads, which are defined in `sstables/writer.hh`. two of these overloads are template functions, which have a template parameter named `W`, which has a type constraint requiring it to fulfill the `Writer` concept. but in `types.hh`, when the compiler tries to instantiate the template function with signature of `write(sstable_version_types v, W& out, const T& t)` with `file_writer` as the template parameter `W`, `file_writer` is only forward-declared using `class file_writer` in the same header file, so this type is still an incomplete type at that moment. that's why the compiler is not able to determine if `file_writer` fulfills the constraint or not. actually, the declaration of `file_writer` is located in `sstables/writer.hh`, which in turn includes `types.hh`. so they form a cyclic dependency. in this change, in order to break this cycle, we extract file_writer out into a separate header file, so that both `sstables/writer.hh` and `sstables/types.hh` can include it. this addresses the build failure. Fixes scylladb/scylladb#19667 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19669	2024-07-10 23:32:47 +03:00
Michał Chojnowski	fdd8b03d4b	scylla-gdb.py: add $coro_frame() Adds a convenience function for inspecting the coroutine frame of a given seastar task. Short example of extracting a coroutine argument: ``` (gdb) p $coro_frame(seastar::local_engine->_current_task) $1 = { __resume_fn = 0x2485f80 <sstables::parse(schema const&, sstables::sstable_version_types, sstables::random_access_reader&, sstables::statistics&)>, ... PointerType_7 = 0x601008e67880, ... __coro_index = 0 '\000' ... (gdb) p $downcast_vptr($->PointerType_7) $2 = (schema *) 0x601008e67880 ``` Closes scylladb/scylladb#19479	2024-07-10 21:46:27 +03:00
Avi Kivity	45e27c0da2	config, enum_option: allow round-trip string conversion The default configuration for replication_strategy_warn_list is ["SimpleStrategy"], but one cannot set this via CQL: cqlsh> select * from system.config where name = 'replication_strategy_warn_list'; name \| source \| type \| value --------------------------------+---------+---------------------------+-------------------- replication_strategy_warn_list \| default \| replication strategy list \| ["SimpleStrategy"] (1 rows) cqlsh> update system.config set value = '[NetworkTopologyStrategy]' where name = 'replication_strategy_warn_list'; cqlsh> select * from system.config where name = 'replication_strategy_warn_list'; name \| source \| type \| value --------------------------------+--------+---------------------------+----------------------------- replication_strategy_warn_list \| cql \| replication strategy list \| ["NetworkTopologyStrategy"] (1 rows) cqlsh> update system.config set value = '["NetworkTopologyStrategy"]' where name = 'replication_strategy_warn_list'; WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed for system.config - received 0 responses and 1 failures from 1 CL=ONE." info={'consistency': 'ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 1} Fix by allowing quotes in enum_set parsing. Bug present since `8c464b2ddb` ("guardrails: restrict replication strategy (RS)", 6.0). Fixes #19604. Closes scylladb/scylladb#19605	2024-07-10 20:39:01 +03:00
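The round-trip problem described above (the value is printed with quotes, but only the unquoted form was accepted on input) can be illustrated with a small sketch. It is written in Python purely for illustration; the actual fix lives in Scylla's C++ enum_set parsing, and the function names here are hypothetical.

```python
def parse_enum_list(text: str) -> list[str]:
    """Parse '[A, B]' or '["A", "B"]' into a list of names.

    Hypothetical parser: surrounding double quotes on each element are
    optional, so the printed form can be fed back in unchanged.
    """
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError(f"expected a bracketed list, got {text!r}")
    inner = inner[1:-1].strip()
    if not inner:
        return []
    # Strip whitespace first, then any quotes around each element.
    return [item.strip().strip('"') for item in inner.split(",")]


def format_enum_list(names: list[str]) -> str:
    """Format names the way the cqlsh output above shows them."""
    return "[" + ", ".join(f'"{name}"' for name in names) + "]"


if __name__ == "__main__":
    shown = format_enum_list(["NetworkTopologyStrategy"])
    print(shown, "->", parse_enum_list(shown))
```

With quotes treated as optional, both `[NetworkTopologyStrategy]` and `["NetworkTopologyStrategy"]` parse to the same list, so a value read from the config table can be written back as-is.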
Yaron Kaikov	e33126fc3e	.github/script/label_promoted_commit.py: add label only if ref is PR we got a failure during check-commit action: ``` Run python .github/scripts/label_promoted_commits.py --commit_before_merge `30e82a81e8` --commit_after_merge `f31d5e3204` --repository scylladb/scylladb --ref refs/heads/master Commit sha is: `d5a149fc01` Commit sha is: `415457be2b` Commit sha is: `d3b1ccd03a` Commit sha is: `1fca341514` Commit sha is: `f784be6a7e` Commit sha is: `80986c17c3` Commit sha is: `492d0a5c86` Commit sha is: `7b3f55a65f` Commit sha is: `78d6471ce4` Commit sha is: `7a69d9070f` Commit sha is: `a9e985fcc9` master branch, pr number is: 19213 Traceback (most recent call last): File "/home/runner/work/scylladb/scylladb/.github/scripts/label_promoted_commits.py", line 87, in <module> main() File "/home/runner/work/scylladb/scylladb/.github/scripts/label_promoted_commits.py", line 81, in main pr = repo.get_pull(pr_number) File "/usr/lib/python3/dist-packages/github/Repository.py", line 2746, in get_pull headers, data = self._requester.requestJsonAndCheck( File "/usr/lib/python3/dist-packages/github/Requester.py", line 353, in requestJsonAndCheck return self.__check( File "/usr/lib/python3/dist-packages/github/Requester.py", line 378, in __check raise self.__createException(status, responseHeaders, output) github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/pulls/pulls#get-a-pull-request", "status": "404"} Error: Process completed with exit code 1. ``` The failure occurred because one of the promoted commits (`a9e985fcc9`) had a `Closes` reference to an issue, not a pull request. Fixes: https://github.com/scylladb/scylladb/issues/19677 Closes scylladb/scylladb#19678	2024-07-10 15:27:12 +03:00
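One way to avoid the failure above is to treat a `Closes` reference that does not resolve to a pull request as something to skip rather than as a fatal error. The sketch below is an assumption about the general approach, not the code of the actual fix: `label_pull_requests`, `fetch_pull` and the label name are hypothetical, and plain dicts stand in for pull request objects so the example runs without network access.

```python
import re

# Matches references such as 'Closes scylladb/scylladb#19678' or 'Closes #19678'.
CLOSES_RE = re.compile(r"Closes\s+\S*#(\d+)")


def referenced_numbers(message: str) -> list[int]:
    """Return every number that follows a 'Closes ...#' reference."""
    return [int(number) for number in CLOSES_RE.findall(message)]


def label_pull_requests(message: str, fetch_pull, label: str) -> list[int]:
    """Add `label` to each referenced pull request and return their numbers.

    `fetch_pull(number)` is a hypothetical lookup that raises LookupError
    when the number is not a pull request (for example, when it is an issue).
    """
    labeled = []
    for number in referenced_numbers(message):
        try:
            pull = fetch_pull(number)
        except LookupError:
            # The reference points at an issue: nothing to label.
            continue
        pull.setdefault("labels", []).append(label)
        labeled.append(number)
    return labeled


if __name__ == "__main__":
    pulls = {19678: {}}  # 19677 is an issue, so it is absent here
    message = "Closes scylladb/scylladb#19677\nCloses scylladb/scylladb#19678"
    print(label_pull_requests(message, pulls.__getitem__, "promoted-to-master"))
```

In the traceback above, the corresponding error surfaced by the GitHub client is `UnknownObjectException` with status 404; a real script would catch that (or check the reference type up front) where this sketch catches `LookupError`.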
Botond Dénes	9bdcba7a46	Merge 'conf: scylla.yaml: update documentation for tablets' from Benny Halevy Tablets are no longer in experimental_features since `83d491a`, so remove them from the experimental_features section documentation. Also, expand the documentation for the `enable_tablets` option. Fixes #19456 Needs backport to 6.0 Closes scylladb/scylladb#19516 * github.com:scylladb/scylladb: conf: scylla.yaml: enable_tablets: expand documentation conf: scylla.yaml: remove tablets from experimental_features doc comment	2024-07-10 14:32:40 +03:00
Avi Kivity	8b7a2661c1	utils: remove utils/in.hh It uses the deprecated std::aligned_storage and had only one user (now removed). Rather than maintain it, remove it.	2024-07-10 14:11:27 +03:00
Avi Kivity	d50ba03965	gossiper: remove initializer-list overload of add_local_application_state() The initializer_list overload uses a too-clever technique to avoid copies. While copies here are unlikely to pose any real problem (we're allocating map nodes anyway), it's simple enough to provide a copy-less replacement that doesn't require questionable tricks. We replace the initializer_list<..., in<>> overload with a variadic template that constructs a temporary map.	2024-07-10 14:11:27 +03:00
Michał Jadwiszczak	375499b727	test: remove `sleep()`s which were required to reload service levels configuration Previously, some service level tests had to sleep to ensure the in-memory configuration of service levels was updated. Now that we update the configuration as the raft log is applied, doing a read barrier (for instance by executing `DROP TABLE IF EXISTS non_existing_table`) is enough and the sleeps are not needed.	2024-07-10 10:42:21 +02:00
Michał Jadwiszczak	23bebb8037	test/cql_test_env: remove unit test service levels data accessors Unit test data accessors were created to avoid starting the update loop in unit tests and to update the controller's configuration directly. With the raft data accessor and configuration updates on applying the raft log, we can get rid of the unit test data accessors and use the raft one. This also makes the unit test env a bit more like a real Scylla environment.	2024-07-10 10:42:21 +02:00
Michał Jadwiszczak	de857d9ce3	service/storage_service: reload SL cache on topology_state_load() Since SL cache is no longer updated in a loop, it needs to be initialized on startup and because we are updating the cache while applying raft commands, we can initialize it on topology_state_load().	2024-07-10 10:42:20 +02:00
Jadw1	cf29242962	service/qos/service_level_controller: move semaphore breaking to stop Before this, the notification semaphore was broken() in do_abort(), which was triggered by early abort source. However we are going to reload sl cache on topology state reload and it can happen after the early abort source is triggered, so it may throw broken_semaphore exception. We can move semaphore breaking to stop() method. Legacy update loop is still stopped in do_abort(), so it doesn't change the order of service level controller shutdown.	2024-07-10 10:33:24 +02:00
Michał Jadwiszczak	85119b90df	service/qos/service_level_controller: maybe start and stop legacy update loop In the previous commit, we marked the update loop as legacy. For compatibility reasons, we need to start the legacy update loop when the cluster is in recovery mode or hasn't been upgraded to raft topology. Then, in the update loop we check if all conditions are met and stop the loop. This commit also moves the start of the update loop later (after topology state is loaded) in main.cc. There is no risk in doing it later.	2024-07-10 10:23:04 +02:00
Michał Jadwiszczak	b0f76db9f2	service/qos/service_level_controller: make update loop legacy Rename the method which started the update loop to better reflect what it does. Previously the method was named `update_from_distributed_data`, however it doesn't update anything but only starts the update loop, which we are making legacy.	2024-07-10 10:23:04 +02:00
Michał Jadwiszczak	5ddf5e3d7d	raft/group0_state_machine: update submodules based on table_id We want to update the service levels cache when any new mutations are applied to the service levels table. To avoid creating a new raft command type, this commit changes the design of `write_mutations` to update in-memory structures based on the mutations' table_id.	2024-07-10 10:23:04 +02:00
Michał Jadwiszczak	b61047a3f8	service/storage_service: add a proxy method to reload sl cache In this series of patches, we want to reload service levels cache when any changes to SL table are applied. Firstly we need to have a way to trigger reload of the cache from `group0_state_machines`. To not introduce another dependency, we can use `storage_service` (which has access to SL controller) and add a proxy method to it.	2024-07-10 10:23:04 +02:00
Nadav Har'El	c6cffe36dd	Merge 'cql: forbid having counter columns in tablets tables' from Piotr Smaron Counter updates break under tablet migration (#18180), and for this reason counters need to be disabled until the problem is fixed. It's enough to forbid creating a table with counters, as altering a table without counters already cannot result in the table having counters: 1) Adding a counter column to a table without counters: ``` cqlsh> ALTER TABLE temp.cf ADD (col_name counter); ConfigurationException: Cannot add a counter column (col_name) in a non counter column family ``` 2) Altering a column to be of the counter type: ``` cqlsh> ALTER TABLE temp.cf ALTER col_name TYPE counter; ConfigurationException: Cannot change col_name from type int to type counter: types are incompatible. ``` Fixes: #19449 Fixes: https://github.com/scylladb/scylladb/issues/18876 Need to backport to 6.0, as this is broken there. Closes scylladb/scylladb#19518 * github.com:scylladb/scylladb: doc: add notes to feature pages which don't support tablets cql: adjust warning about tablets cql: forbid having counter columns in tablets tables	2024-07-10 10:18:30 +03:00
Michał Jadwiszczak	b65a4c66f0	cql-pytest/test_describe: add a test for describe indexes	2024-07-10 07:14:46 +02:00
Kefu Chai	7e4e685964	transport: move the cql_server::~cql_server() into .cc because transport/server.cc has the complete definition of event_notifier, the compiler can default-generate the destructor of `cql_server` with the necessary information. otherwise, clang-19 would fail to build, like: ``` FAILED: CMakeFiles/scylla.dir/Dev/main.cc.o /home/kefu/.local/bin/clang++ -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_PROGRAM_OPTIONS_NO_LIB -DDEVEL -DFMT_SHARED -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Dev\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -I/home/kefu/dev/scylladb/build -isystem /home/kefu/dev/scylladb/build/rust -isystem /home/kefu/dev/scylladb/abseil -O2 -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. 
-march=westmere -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -MD -MT CMakeFiles/scylla.dir/Dev/main.cc.o -MF CMakeFiles/scylla.dir/Dev/main.cc.o.d -o CMakeFiles/scylla.dir/Dev/main.cc.o -c /home/kefu/dev/scylladb/main.cc In file included from /home/kefu/dev/scylladb/main.cc:11: In file included from /usr/include/yaml-cpp/yaml.h:10: In file included from /usr/include/yaml-cpp/parser.h:11: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/memory:78: /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:91:16: error: invalid application of 'sizeof' to an incomplete type 'cql_transport::cql_server::event_notifier' 91 \| static_assert(sizeof(_Tp)>0, \| ^~~~~~~~~~~ /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unique_ptr.h:398:4: note: in instantiation of member function 'std::default_delete<cql_transport::cql_server::event_notifier>::operator()' requested here 398 \| get_deleter()(std::move(__ptr)); \| ^ /home/kefu/dev/scylladb/transport/server.hh:135:7: note: in instantiation of member function 'std::unique_ptr<cql_transport::cql_server::event_notifier>::~unique_ptr' requested here 135 \| class cql_server : public seastar::peering_sharded_service<cql_server>, public generic_server::server { \| ^ /home/kefu/dev/scylladb/transport/server.hh:135:7: note: in implicit destructor for 'cql_transport::cql_server' first required here /home/kefu/dev/scylladb/transport/server.hh:149:11: note: forward declaration of 'cql_transport::cql_server::event_notifier' 149 \| class event_notifier; \| ^ 1 error generated. ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-07-10 12:52:51 +08:00
Kefu Chai	79ffde063a	service: move storage_service::~storage_service() into .cc as repair/repair.cc has the complete definition of node_ops_meta_data, the compiler can default-generate the destructor of `storage_service` with the necessary information. otherwise, clang-19 would fail to build, like: ``` FAILED: repair/CMakeFiles/repair.dir/Dev/repair.cc.o /home/kefu/.local/bin/clang++ -DDEVEL -DFMT_SHARED -DSCYLLA_BUILD_MODE=dev -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Dev\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -O2 -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. 
-march=westmere -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -MD -MT repair/CMakeFiles/repair.dir/Dev/repair.cc.o -MF repair/CMakeFiles/repair.dir/Dev/repair.cc.o.d -o repair/CMakeFiles/repair.dir/Dev/repair.cc.o -c /home/kefu/dev/scylladb/repair/repair.cc In file included from /home/kefu/dev/scylladb/repair/repair.cc:9: In file included from /home/kefu/dev/scylladb/repair/repair.hh:11: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/unordered_map:41: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unordered_map.h:33: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable.h:35: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:34: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/tuple:38: /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:291:11: error: field has incomplete type 'service::node_ops_meta_data' 291 \| _T2 second; ///< The second member \| ^ /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/ext/aligned_buffer.h:93:28: note: in instantiation of template class 'std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>' requested here 93 \| : std::aligned_storage<sizeof(_Tp), __alignof__(_Tp)> \| ^ /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:334:43: note: in instantiation of template class '__gnu_cxx::__aligned_buffer<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>>' requested here 334 \| __gnu_cxx::__aligned_buffer<_Value> _M_storage; \| ^ /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:373:7: note: in instantiation of template class 'std::__detail::_Hash_node_value_base<std::pair<const utils::tagged_uuid<node_ops_id_tag>, 
service::node_ops_meta_data>>' requested here 373 \| : _Hash_node_value_base<_Value> \| ^ /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable.h:1662:21: note: in instantiation of template class 'std::__detail::_Hash_node_value<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>, false>' requested here 1662 \| ._M_bucket_index(declval<const __node_value_type&>(), \| ^ /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unordered_map.h:109:11: note: in instantiation of member function 'std::_Hashtable<utils::tagged_uuid<node_ops_id_tag>, std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>, std::allocator<std::pair<const utils::tagged_uuid<node_ops_id_tag>, service::node_ops_meta_data>>, std::__detail::_Select1st, std::equal_to<utils::tagged_uuid<node_ops_id_tag>>, std::hash<utils::tagged_uuid<node_ops_id_tag>>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true>>::~_Hashtable' requested here 109 \| class unordered_map \| ^ /home/kefu/dev/scylladb/service/storage_service.hh:109:7: note: forward declaration of 'service::node_ops_meta_data' 109 \| class node_ops_meta_data; \| ^ In file included from /home/kefu/dev/scylladb/repair/repair.cc:9: In file included from /home/kefu/dev/scylladb/repair/repair.hh:11: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/unordered_map:41: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/unordered_map.h:33: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable.h:35: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/hashtable_policy.h:34: In file included from /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/tuple:38: In file included from 
/usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/stl_pair.h:60: ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-07-10 12:52:51 +08:00
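The diagnostic above belongs to a familiar class of errors: a class holds a standard container of a type that is only forward-declared, and the container's destructor gets instantiated in a translation unit where that type is still incomplete. The following is a minimal sketch of one common remedy: declare the owner's special members in the header and define them where the element type is complete. It is not the change made in this commit. `meta_data` and `owner` are hypothetical names, and `std::vector` stands in for `std::unordered_map` because the standard explicitly permits a `std::vector` member of an incomplete type since C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- what the header would contain ---
struct meta_data;                 // forward declaration only (incomplete type)

class owner {
    std::vector<meta_data> _ops;  // allowed since C++17 while meta_data is incomplete
public:
    owner();                      // declared here, defined where meta_data is complete
    ~owner();                     // an implicit inline destructor would need the full type
    void add(int id);
    std::size_t size() const;
};

// --- what the .cc file would contain ---
struct meta_data {                // the type is complete from here on
    int id;
};

owner::owner() = default;
owner::~owner() = default;        // destroys the elements; fine, meta_data is complete

void owner::add(int id) {
    assert(id >= 0);              // ids are non-negative in this sketch
    _ops.push_back(meta_data{id});
}

std::size_t owner::size() const {
    return _ops.size();
}
```

With this layout, a translation unit that includes only the header never instantiates the container's destructor, so it does not need the full definition of the element type.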
Michał Jadwiszczak	253feb6811	schema/schema: fix column names in index description Previously, the description of an index didn't include functions for indexes on collections, such as full(), keys(), values(), etc.	2024-07-09 22:37:05 +02:00
Raphael S. Carvalho	c539b7c861	replica: remove rwlock for protecting iteration over storage group map rwlock was added to protect iterations against concurrent updates to the map. the updates can happen when allocating a new tablet replica or removing an old one (tablet cleanup). the rwlock is very problematic because it can result in topology changes blocked, as updating token metadata takes the exclusive lock, which is serialized with table wide ops like split / major / explicit flush (and those can take a long time). to get rid of the lock, we can copy the storage group map and guard individual groups with a gate (not a problem since map is expected to have a maximum of ~100 elements). so cleanup can close that gate (carefully closed after stopping individual groups such that migrations aren't blocked by long-running ops like major), and ongoing iterations (e.g. triggered by nodetool flush) can skip a group that was closed, as such a group is being migrated out. Check documentation added to compaction_group.hh to understand how concurrent iterations and updates to the map work without the rwlock. Yielding variants that iterate over groups are no longer returning group id since id stability can no longer be guaranteed without serializing split finalization and iteration. Fixes #18821. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-07-09 16:59:24 -03:00
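The idea described in this commit message (iterate over a copy of the group map, and let a per-group gate decide whether a group may still be used) can be sketched as follows. This is a single-threaded toy under stated assumptions, not the code in `compaction_group.hh`: `toy_gate` is a hypothetical stand-in for a real gate (it does not wait for holders on close), and `storage_group`, `flush_all`, and `cleanup_group` are illustrative names.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Toy stand-in for a gate: entry is refused once the gate is closed.
// Unlike a real gate, close() does not wait for current holders.
class toy_gate {
    bool _closed = false;
    int _holders = 0;
public:
    bool try_enter() {
        if (_closed) {
            return false;
        }
        ++_holders;
        return true;
    }
    void leave() {
        assert(_holders > 0);
        --_holders;
    }
    void close() { _closed = true; }
};

struct storage_group {
    toy_gate gate;
    int flushes = 0;
};

using group_map = std::map<int, std::shared_ptr<storage_group>>;

// Flushes every group that is still open; returns how many were flushed.
int flush_all(const group_map& groups) {
    // Iterate over a copy so that updates to the real map cannot
    // invalidate this loop. Cheap for a map of ~100 entries.
    group_map snapshot = groups;
    int flushed = 0;
    for (auto& entry : snapshot) {
        storage_group& group = *entry.second;
        if (!group.gate.try_enter()) {
            continue;            // gate closed: group is being migrated out, skip it
        }
        ++group.flushes;         // placeholder for the real per-group work
        group.gate.leave();
        ++flushed;
    }
    return flushed;
}

// Cleanup closes the group's gate, then drops the group from the map.
void cleanup_group(group_map& groups, int id) {
    auto it = groups.find(id);
    if (it != groups.end()) {
        it->second->gate.close();
        groups.erase(it);
    }
}
```

Because the snapshot holds shared pointers, a group removed from the map during an iteration stays alive until the iteration drops its copy; the closed gate is what tells the iteration to leave it alone.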
Raphael S. Carvalho	ad5c5bca5f	replica: get rid of fragile compaction group intrusive list It was added to make integration of storage groups easier, but it's complicated since it's another source of truth and we could have problems if it becomes inconsistent with the group map. Fixes #18506. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-07-09 16:53:35 -03:00
Piotr Smaron	531659f8dc	doc: add notes to feature pages which don't support tablets There's already a page which lists which features are not working with tablets: architecture/tablets.html#limitations-and-unsupported-features, but it's also helpful for users to be warned about this when visiting a specific feature doc page.	2024-07-09 18:18:05 +02:00
Avi Kivity	f31d5e3204	Merge 'repair/streaming: enable toggling tombstone gc with a config item' from Botond Dénes We currently disable tombstone GC for compaction done on the read path of streaming and repair, because those expired tombstones can still prevent data resurrection. With time-based tombstone GC, missing a repair for long enough can cause data resurrection because a tombstone is potentially GC'd before it could be spread to every node by repair. So repair disseminating these expired tombstones helps clusters which missed repair for long enough. It is not a guarantee because compaction could have done the GC itself, but it is better than nothing. This last resort is getting less important with repair-based tombstone GC. Furthermore, we have seen this cause huge repair amplification in a cluster, where expired tombstones triggered repair replicating otherwise identical rows. This series makes tombstone GC on the streaming/repair compaction path configurable with a config item. This new config item defaults to `false` (current behaviour); setting it to `true` will enable tombstone GC. Fixes: https://github.com/scylladb/scylladb/issues/19015 Not a regression, no backport needed Closes scylladb/scylladb#19016 * github.com:scylladb/scylladb: test/topology_custom/test_repair: add test for enable_tombstone_gc_for_streaming_and_repair replica/table: maybe_compact_for_streaming(): toggle tombstone GC based on the control flag replica: propagate enable_tombstone_gc_for_streaming_and_repair to maybe_compact_for_streaming() db/config: introduce enable_tombstone_gc_for_streaming_and_repair	2024-07-09 19:04:11 +03:00
Piotr Smaron	5bfabff9a0	cql: adjust warning about tablets Made it shorter, simpler and mentioned also that counters aren't supported with tablets. Fixes: #18876	2024-07-09 18:01:37 +02:00
Piotr Smaron	c70f321c6f	cql: forbid having counter columns in tablets tables Counter updates break under tablet migration (#18180), and for this reason they need to be disabled until the problem is fixed. It's enough to forbid creating a table with counters, as altering a table without counters already cannot result in the table having counters: 1) Adding a counter column to a table without counters: ``` cqlsh> ALTER TABLE temp.cf ADD (col_name counter); ConfigurationException: Cannot add a counter column (col_name) in a non counter column family ``` 2) Altering a column to be of the counter type: ``` cqlsh> ALTER TABLE temp.cf ALTER col_name TYPE counter; ConfigurationException: Cannot change col_name from type int to type counter: types are incompatible. ``` Fixes: #19449	2024-07-09 18:01:31 +02:00
Patryk Wrobel	a89e3d10af	code-cleanup: add missing header guards The following command had been executed to get the list of headers that did not contain '#pragma once': 'grep -rnw . -e "#pragma once" --include *.hh -L' This change adds missing include guard to headers that did not contain any guard. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#19626	2024-07-09 18:31:35 +03:00
Takuya ASADA	cae999c094	toolchain: change optimized clang install method to standard one Previously, the optimized clang installation did not use the standard build script; it overwrote the preinstalled Fedora clang binaries instead. However, this breaks on clang-18.1.8 because of the libLTO versioning convention. To avoid this problem, let's switch to the standard installation method and switch the install prefix to /usr/local. Fixes #19203 Closes scylladb/scylladb#19505	2024-07-09 14:22:42 +03:00
Tomasz Grabiec	252110bc54	Merge 'mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion' from Michał Chojnowski apply_monotonically() is run with reclaim disabled. So with some bad luck, sentinel insertion might fail with bad_alloc even on a perfectly healthy node. We can't deal with the failure of sentinel insertion, so this will result in a crash. This patch prevents the spurious OOM by reserving some memory (1 LSA segment) and only making it available right before the critical allocations. Fixes https://github.com/scylladb/scylladb/issues/19552 Closes scylladb/scylladb#19617 * github.com:scylladb/scylladb: mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion logalloc: add hold_reserve logalloc: generalize refill_emergency_reserve()	2024-07-09 13:09:01 +02:00
Anna Stuchlik	948459b1ac	doc: replace a link on the CDC+Kafka page This commit replaces a link to the installation section with a link to the getting started section. Closes scylladb/scylladb#19658	2024-07-09 12:35:43 +03:00
Michael Litvak	ed33e59714	storage_proxy: remove response handler if no targets When writing a mutation, it might happen that there are no live targets to send the mutation to, yet the request can be satisfied. For example, when writing with CL=ANY to a dead node, the request is completed by storing a local hint. Currently, in that case, a write response handler is created for the request and it remains active until it times out because it is not removed anywhere, even though the write is completed successfully after storing the hint. The response handler is usually removed when responses have been received from all targets, but in this case there are no targets to trigger the removal. In this commit we check whether there are no live targets to send the mutation to. If so, we remove the response handler immediately. Fixes scylladb/scylladb#19529 Closes scylladb/scylladb#19586	2024-07-09 12:11:05 +03:00
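The lifecycle problem described in this commit can be illustrated with a small sketch. It is a toy model under stated assumptions, not the `storage_proxy` code: `response_handlers`, `register_write`, and `on_response` are hypothetical names. A handler is normally erased when the last target responds, so a handler registered with no targets would linger unless it is removed explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Hypothetical registry of in-flight write response handlers.
class response_handlers {
public:
    using id_type = int;

    // Registers a handler for a write and returns its id. With no live
    // targets, nothing will ever call on_response() for it, so the handler
    // is removed right away instead of lingering until a timeout.
    id_type register_write(std::unordered_set<std::string> live_targets) {
        id_type id = _next_id++;
        bool no_targets = live_targets.empty();
        _pending.emplace(id, std::move(live_targets));
        if (no_targets) {
            _pending.erase(id);
        }
        return id;
    }

    // Records a response; the handler goes away once every target answered.
    void on_response(id_type id, const std::string& from) {
        auto it = _pending.find(id);
        if (it == _pending.end()) {
            return;              // already removed (or never had targets)
        }
        it->second.erase(from);
        if (it->second.empty()) {
            _pending.erase(it);
        }
    }

    std::size_t active() const { return _pending.size(); }

private:
    std::unordered_map<id_type, std::unordered_set<std::string>> _pending;
    id_type _next_id = 0;
};
```

Without the `no_targets` branch, a write registered with an empty target set would stay in `_pending` indefinitely in this model, which mirrors the handler that stayed active until its timeout.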
Kamil Braun	98c18d8904	Merge 'Add API for read barrier' from Emil Maskovsky Introduce REST API for triggering a read barrier. This is to make sure the database schema is up to date on the node where the read barrier is triggered. One of the use cases is the database backup via the Scylla Manager, which requires that the schema backed up is matching the data or newer (data can be migrated, but an older schema would cause issues). Fixes scylladb/scylladb#19213 Closes scylladb/scylladb#19597 * github.com:scylladb/scylladb: raft: add the read barrier REST API raft: use `raft_timeout` in trigger_snapshot raft: use bad_param_exception for consistency test: raft: verify schema updated after read barrier	2024-07-09 10:58:21 +02:00
Kefu Chai	6af989782c	test: sstable_directory_test: use THREADSAFE_BOOST_REQUIRE_EQUAL when appropriate for better debugging experience. before this change, we have ``` fatal error: in "sstable_directory_test_generation_sanity": critical check sst->generation() == sst1->generation() has failed ``` after this change, we have ``` fatal error: in "sstable_directory_test_generation_sanity": critical check sst->generation() == sst1->generation() has failed [3ghm_0ntw_29vj625yegw7jodysc != 3ghm_0ntw_29vj625yegw7jodysd] ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19639	2024-07-09 10:54:23 +03:00
Kefu Chai	30e82a81e8	test: do not define boost_test_print_type() for types with operator<< before this change, we provide `boost_test_print_type()` for all types which can be formatted using {fmt}. these types include those that fulfill the range concept and whose elements can be formatted using {fmt}, if the compilation unit happens to include `fmt/ranges.h`. such ranges are formatted with `boost_test_print_type()` as well. this is what we expect. in other words, we use {fmt} to format types which do not natively support {fmt}, but which fulfill the range concept. but `boost::unit_test::basic_cstring` is one of them: - it can be formatted using operator<<, but it does not provide an fmt::formatter specialization - it fulfills the range concept - its element type is `char const`, which can be formatted using {fmt} that's why it's formatted like: ``` test/boost/sstable_directory_test.cc(317): fatal error: in "sstable_directory_test_generation_sanity": critical check ['s', 's', 't', '-', '>', 'g', 'e', 'n', 'e', 'r', 'a', 't', 'i', 'o', 'n', '(', ')', ' ', '=', '=', ' ', 's', 's', 't', '1', '-', '>', 'g', 'e', 'n', 'e', 'r', 'a', 't', 'i', 'o', 'n', '(', ')'] has failed ``` where the string is formatted as a sequence-like container. this is far from readable. so, in this change, we no longer define `boost_test_print_type()` for types which natively support `operator<<`, so they are printed with `operator<<` when boost::test prints them. Fixes scylladb/scylladb#19637 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#19638	2024-07-09 10:34:37 +03:00
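The technique this commit relies on, offering a fallback printer only to types that lack a native `operator<<`, can be sketched with a detection trait and `std::enable_if`. This is an illustration only, written to C++17 and independent of Boost.Test and {fmt}: `has_ostream_operator`, `print_for_test`, `render`, and the `generation` type with its `to_string()` member are all hypothetical names, not the helpers in the test tree.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Detects whether `os << value` compiles for a given T.
template <typename T, typename = void>
struct has_ostream_operator : std::false_type {};

template <typename T>
struct has_ostream_operator<T,
        std::void_t<decltype(std::declval<std::ostream&>() << std::declval<const T&>())>>
    : std::true_type {};

// Hypothetical type that has no operator<<, only a to_string() member.
struct generation {
    int value;
    std::string to_string() const { return "gen-" + std::to_string(value); }
};

// Fallback printer: takes part in overload resolution only for types
// that do NOT have a native operator<<.
template <typename T, std::enable_if_t<!has_ostream_operator<T>::value, int> = 0>
void print_for_test(std::ostream& os, const T& value) {
    os << value.to_string();
}

// Types with a native operator<< are streamed through it unchanged.
template <typename T, std::enable_if_t<has_ostream_operator<T>::value, int> = 0>
void print_for_test(std::ostream& os, const T& value) {
    os << value;
}

// Renders a value to a string through whichever printer applies.
template <typename T>
std::string render(const T& value) {
    std::ostringstream os;
    print_for_test(os, value);
    return os.str();
}
```

A string-like type therefore keeps its natural rendering, because the fallback overload is never a candidate for it, while a type without `operator<<` still gets printed.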
Botond Dénes	9544c364be	scylla-gdb.py: introduce scylla large-objects The equivalent of small-objects, but for large objects (spans). Allows listing the objects of a large class, and therefore investigating a runaway class, by attempting to identify the owners of the objects in it. Written to investigate #16493 Closes scylladb/scylladb#16711	2024-07-09 10:21:09 +03:00
Emil Maskovsky	a9e985fcc9	raft: add the read barrier REST API This allows triggering the read barrier directly via the API, instead of using workarounds (like dropping a non-existent table). The intended use case is in the Scylla Manager, to make sure that the database schema is up to date after the data has been backed up and before attempting to back up the database schema. The database schema in particular is backed up on just a single node, which might not yet have a schema at least as new as the data (data can be migrated to a newer schema, but not vice versa). The read barrier issued on the node should ensure that the node has a schema at least as new as the data. Closes #19213	2024-07-08 18:16:27 +02:00
Emil Maskovsky	7a69d9070f	raft: use `raft_timeout` in trigger_snapshot Migrate the "trigger_snapshot" to use the standardized `raft_timeout` approach.	2024-07-08 18:13:31 +02:00
Michał Chojnowski	78d6471ce4	mutation_partition_v2: in apply_monotonically(), avoid bad_alloc on sentinel insertion apply_monotonically() is run with reclaim disabled. So with some bad luck, sentinel insertion might fail with bad_alloc even on a perfectly healthy node. We can't deal with the failure of sentinel insertion, so this will result in a crash. This patch prevents the spurious OOM by reserving some memory (1 LSA segment) and only making it available right before the critical allocations. Fixes scylladb/scylladb#19552	2024-07-08 16:08:27 +02:00
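The approach described here (set some memory aside up front, and hand it back only right before the allocation that must not fail) can be sketched with a toy pool. This is an assumption-laden illustration, not the logalloc implementation: `toy_pool`, `reserve_holder`, and `apply_with_reserve` are hypothetical names, and the pool merely counts segments.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Toy memory pool that counts free segments (stand-in for a real allocator).
class toy_pool {
    std::size_t _free;
public:
    explicit toy_pool(std::size_t segments) : _free(segments) {}
    void take(std::size_t n) {
        if (n > _free) {
            throw std::bad_alloc();
        }
        _free -= n;
    }
    void give(std::size_t n) { _free += n; }
    std::size_t free_segments() const { return _free; }
};

// Sets segments aside on construction and returns them on release().
class reserve_holder {
    toy_pool& _pool;
    std::size_t _held;
public:
    reserve_holder(toy_pool& pool, std::size_t n) : _pool(pool), _held(n) {
        _pool.take(n);   // may throw here, before any critical work has started
    }
    void release() {
        _pool.give(_held);
        _held = 0;
    }
    ~reserve_holder() { release(); }
    reserve_holder(const reserve_holder&) = delete;
    reserve_holder& operator=(const reserve_holder&) = delete;
};

// Performs an optional allocation that may fail, then a critical one that
// must not. Returns true if the optional allocation succeeded.
bool apply_with_reserve(toy_pool& pool, std::size_t optional_segments) {
    reserve_holder reserve(pool, 1);
    bool optional_ok = true;
    try {
        pool.take(optional_segments);    // allowed to fail
    } catch (const std::bad_alloc&) {
        optional_ok = false;
    }
    reserve.release();                   // make the reserve available...
    assert(pool.free_segments() >= 1);
    pool.take(1);                        // ...right before the critical allocation
    return optional_ok;
}
```

The point of the design is where the failure can happen: if memory is short, the exception is thrown while taking the reserve or during the optional work, both of which can be handled, rather than at the critical step.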


43518 Commits