scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-21 00:50:35 +00:00

Author	SHA1	Message	Date
Kefu Chai	da7ffd4e73	tools/scylla-types: print using managed_bytes instead of materializing the `managed_bytes_view` to a string and printing it, print it directly to stdout. this change helps to deprecate `to_hex()` helpers, we should materialize string only when necessary. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17463	2024-02-22 09:00:38 +02:00
Avi Kivity	1df5697bd7	Merge 'Refine some api/column_family endpoints' from Pavel Emelyanov Those that collect vectors with ks/cf names can reserve the vectors in advance. Also one of those can use range loop for shorter code Closes scylladb/scylladb#17433 * github.com:scylladb/scylladb: api: Reserve vectors in advance api: Use range-loop to iterate keyspaces	2024-02-21 19:19:28 +02:00
Tomasz Grabiec	ef9e5e64a3	locator: token_metadata: Introduce topology barrier stall detector When topology barrier is blocked for longer than the configured threshold (2s), stale versions are marked as stalled and when they get released they report a backtrace to the logs. This should help to identify what was holding the token metadata pointer for too long. Example log: token_metadata - topology version 30 held for 299.159 [s] past expiry, released at: 0x2397ae1 0x23a36b6 ... Closes scylladb/scylladb#17427	2024-02-21 15:05:34 +02:00
Nadav Har'El	e02cfd0035	Merge 'query.h: add fmt::formatter for types' from Kefu Chai query::specific_ranges * query::partition_slice * query::read_command * query::forward_request * query::forward_request::reduction_type * query::forward_request::aggregation_info * query::forward_result::printer * query::result_set * query::result_set_row * query::result::printer Refs #13245 Closes scylladb/scylladb#17440 * github.com:scylladb/scylladb: query-result.hh: add formatter for query::result::printer query-result-set: add formatter for query-result-set.hh types query-request: add formatter for query-request.hh types	2024-02-21 14:46:36 +02:00
Avi Kivity	4be70bfc2b	Merge 'multishard_mutation_query: add tablets support' from Botond Dénes When reading a list of ranges with tablets, we don't need a multishard reader. Instead, we intersect the range list with the local node's tablet ranges, then read each range from the respective shard. The individual ranges are read sequentially, with database::query[_mutations](), merging the results into a single instance. This makes the code simple. For tablets multishard_mutation_query.cc is no longer on the hot paths, range scans on tables with tablets fork off to a different code-path in the coordinator. The only code using multishard_mutation_query.cc are forced, replica-local scans, like those used by SELECT * FROM MUTATION_FRAGMENTS(). These are mainly used for diagnostics and tests, so we optimize for simplicity, not performance. Fixes: #16484 Closes scylladb/scylladb#16802 * github.com:scylladb/scylladb: test/cql-pytest: remove skip_with_tablets fixture test/cql-pytest: test_select_from_mutation_fragments.py parameterize tests test/cql-pytest: test_select_from_mutation_fragments.py: remove skip_with_tablets multishard_mutation_query: add tablets support multishard_mutation_query: remove compaction-state from result-builder factory multishard_mutation_query: do_query(): return foreign_ptr<lw_shared_ptr<result>> mutation_query: reconcilable_result: add merge_disjoint() locator: introduce tablet_range_spliter dht/i_partitioner: to_partition_range(): don't assume input is fully inclusive interval: add before() overload which takes another interval	2024-02-21 13:40:55 +02:00
Botond Dénes	94dac43b2f	tools/utils: configure tools to use the epoll reactor backend The default AIO backend requires AIO blocks. On production systems, all available AIO blocks could have been already taken by ScyllaDB. Even though the tools only require a single unit, we have seen cases where not even that is available, ScyllaDB having siphoned all of the available blocks. We could try to ensure all deployments have some spare blocks, but it is just less friction to not have to deal with this problem at all, by just using the epoll backend. We don't care about performance in the case of the tools anyway, so long as they are not unreasonably slow. And since these tools are replacing legacy tools written in Java, the bar is low. Closes scylladb/scylladb#17438	2024-02-21 11:58:09 +02:00
Kefu Chai	1263494dd1	query-result.hh: add formatter for query::result::printer before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for following types * query::result::printer Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-21 17:57:18 +08:00
Kefu Chai	e5a930e7c6	query-result-set: add formatter for query-result-set.hh types before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for following types * query::result_set * query::result_set_row Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-21 17:54:48 +08:00
Kefu Chai	4383ca431c	query-request: add formatter for query-request.hh types before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for following types * query::specific_ranges * query::partition_slice * query::read_command * query::forward_request * query::forward_request::reduction_type * query::forward_request::aggregation_info * query::forward_result::printer Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-21 17:54:41 +08:00
Botond Dénes	ca585903b7	test/cql-pytest: remove skip_with_tablets fixture All tests that used it are fixed, and we should not add any new tests failing with tablets from now on, so remove.	2024-02-21 02:08:49 -05:00
Botond Dénes	8df82d4781	test/cql-pytest: test_select_from_mutation_fragments.py parameterize tests To run with both vnodes and tablets. For this functionality, both replication methods should be covered with tests, because it uses different ways to produce partition lists, depending on the replication method. Also add scylla_only to those tests that were missing this fixture before. All tests in this suite are scylla-only and with the parameterization, this is even more apparent.	2024-02-21 02:08:49 -05:00
Botond Dénes	b09b949159	test/cql-pytest: test_select_from_mutation_fragments.py: remove skip_with_tablets The underlying functionality was fixed, the tests should now pass with tablets.	2024-02-21 02:08:49 -05:00
Botond Dénes	ce472b33b8	multishard_mutation_query: add tablets support When reading a list of ranges with tablets, we don't need a multishard reader. Instead, we intersect the range list with the local node's tablet ranges, then read each range from the respective shard. The individual ranges are read sequentially, with database::query[_mutations](), merging the results into a single instance. This makes the code simple. For tablets, multishard_mutation_query.cc is no longer on the hot paths, range scans on tables with tablets fork off to a different code-path in the coordinator. The only code using multishard_mutation_query.cc are forced, replica-local scans, like those used by SELECT * FROM MUTATION_FRAGMENTS(). These are mainly used for diagnostics and tests, so we optimize for simplicity, not performance.	2024-02-21 02:08:48 -05:00
Botond Dénes	d160a179ee	multishard_mutation_query: remove compaction-state from result-builder factory This param was used by the query-result builder, to set the last-position on end-of-stream. Instead, do this via a new ResultBuilder method, maybe_set_last_position(), which is called from read_page(), which has access to the compaction-state. With this, the ResultBuilder can be created without a compaction-state at hand. This will be important in the next patch.	2024-02-21 02:08:48 -05:00
Botond Dénes	95bc0cb1c0	multishard_mutation_query: do_query(): return foreign_ptr<lw_shared_ptr<result>> Makes future patching easier.	2024-02-21 02:08:48 -05:00
Botond Dénes	35e6cbf42e	mutation_query: reconcilable_result: add merge_disjoint() Merging two disjoint reconcilable_result instances.	2024-02-21 02:08:48 -05:00
Botond Dénes	7bdd0c2cae	locator: introduce tablet_range_spliter Given a list of partition-ranges, yields the intersection of this range-list with the tablet-ranges, for tablets located on the given host. This will be used in multishard_mutation_query.cc, to obtain the ranges to read from the local node: given the read ranges, obtain the ranges belonging to tablets that have replicas on the local node.	2024-02-21 02:08:48 -05:00
Botond Dénes	4993d0e30a	dht/i_partitioner: to_partition_range(): don't assume input is fully inclusive Consider the inclusiveness of the token-range's start and end bounds and copy the flag to the output bounds, instead of assuming they are always inclusive.	2024-02-21 02:08:48 -05:00
Botond Dénes	239484f259	interval: add before() overload which takes another interval The current point variant cannot take inclusiveness into account, when said point comes from another interval bound. This method had no tests at all, so add tests covering both overloads.	2024-02-21 02:08:48 -05:00
Avi Kivity	605bf6e221	range.hh: retire range.hh was deprecated in `bd794629f9` (2020) since its names conflict with the C++ library concept of an iterator range. The name ::range also mapped to the dangerous wrapping_interval rather than nonwrapping_interval. Complete the deprecation by removing range.hh and replacing all the aliases by the names they point to from the interval library. Note this now exposes uses of wrapping intervals as they are now explicit. The unit tests are renamed and range.hh is deleted. Closes scylladb/scylladb#17428	2024-02-21 00:24:25 +02:00
Wojciech Mitros	4c767c379c	mv: adjust the overhead estimation for view updates In order to avoid running out of memory, we can't underestimate the memory used when processing a view update. Particularly, we need to handle the remote view updates well, because we may create many of them at the same time in contrast to local updates which are processed synchronously. After investigating a coredump generated in a crash caused by running out of memory due to these remote view updates, we found that the current estimation is much lower than what we observed in practice; we identified overhead of up to 2288 bytes for each remote view update. The overhead consists of: - 512 bytes - a write_response_handler - less than 512 bytes - excessive memory allocation for the mutation in bytes_ostream - 448 bytes - the apply_to_remote_endpoints coroutine started in mutate_MV() - 192 bytes - a continuation to the coroutine above - 320 bytes - the coroutine in result_parallel_for_each started in mutate_begin() - 112 bytes - a continuation to the coroutine above - 192 bytes - 5 unspecified allocations of 32, 32, 32, 48 and 48 bytes This patch changes the previous overhead estimate of 256 bytes to 2288 bytes, which should take into account all allocations in the current version of the code. It's worth noting that changes in the related pieces of code may result in a different overhead. The allocations seem to be mostly captures for the background tasks. Coroutines seem to allocate extra, however testing shows that replacing a coroutine with continuations may result in generating a few smaller futures/continuations with a larger total size. Besides that, considering that we're waiting for a response for each remote view update, we need the relatively large write_response_handler, which also includes the mutation in case we needed to reuse it. 
The change should not majorly affect workloads with many local updates because we don't keep many of them at the same time anyway, and an added benefit of correct memory utilization estimation is avoiding evictions of other memory that would be otherwise necessary to handle the excessive memory used by view updates. Fixes #17364 Closes scylladb/scylladb#17420	2024-02-21 00:05:49 +02:00
Tomasz Grabiec	e63d8ae272	Merge 'Handle tablet migration failure while streaming' from Pavel Emelyanov It can happen that a node is lost during tablet migration involving that node. Migration will be stuck, blocking topology state machine. To recover from this, the current procedure is for the admin to execute nodetool removenode or replacing the node. This marks the node as "ignored" and tablet state machine can pick this up and abort the migration. This PR implements the handling for streaming stage only and adds a test for it. Checking other stages needs more work with failure injection to inject failures into specific barrier. To handle streaming failure two new stages are introduced -- cleanup_target and revert_migration. The former is to clean the pending replica that could receive some data by the time streaming stopped working, the latter is like end_migration, but doesn't commit the new_replicas into replicas field. refs: #16527 Closes scylladb/scylladb#17360 * github.com:scylladb/scylladb: test/topology: Add checking error paths for failed migration topology.tablets_migration: Handle failed streaming topology.tablets_migration: Add cleanup_target transition stage topology.tablets_migration: Add revert_migration transition stage storage_service: Rewrap cleanup stage checking in cleanup_tablet() test/topology: Move helpers to get tablet replicas to pylib	2024-02-20 18:50:55 +01:00
Anna Stuchlik	37237407f6	doc: remove info about outdated versions This PR removes information about outdated versions, including disclaimers and information when a given feature was added. Now that the documentation is versioned, information about outdated versions is unnecessary (and makes the docs harder to read). Fixes https://github.com/scylladb/scylladb/issues/12110 Closes scylladb/scylladb#17430	2024-02-20 19:32:13 +02:00
Pavel Emelyanov	ceac65be1e	api: Reserve vectors in advance Some endpoints in api/column_family fill vectors with data obtained from database and return them back. Since the amount of data is known in advance, it's good to reserve the vector. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 19:13:05 +03:00
Pavel Emelyanov	f3e58cb806	api: Use range-loop to iterate keyspaces The code uses standard for (;;) loop, but range version is nicer Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 19:12:12 +03:00
Avi Kivity	93af3dd69b	Merge 'Maintenance socket: set filesystem permissions to 660' from Mikołaj Grzebieluch Set filesystem permissions for the maintenance socket to 660 (previously it was 755) to allow a scyllaadm's group to connect. Split the logic of creating sockets into two separate functions, one for each case: when it is a regular cql controller or used by maintenance_socket. Fixes https://github.com/scylladb/scylladb/issues/16487. Closes scylladb/scylladb#17113 * github.com:scylladb/scylladb: maintenance_socket: add option to set owning group transport/controller: get rid of magic number for socket path's maximal length transport/controller: set unix_domain_socket_permissions for maintenance_socket transport/controller: pass unix_domain_socket_permissions to generic_server::listen transport/controller: split configuring sockets into separate functions	2024-02-20 15:09:54 +02:00
Botond Dénes	73a3a3faf3	Merge 'tools/scylla-nodetool: implement tablestats' from Kefu Chai Refs #15588 Closes scylladb/scylladb#17387 * github.com:scylladb/scylladb: tools/scylla-nodetool: implement tablestats utils/rjson: add templated streaming_writer::Write()	2024-02-20 14:46:07 +02:00
Botond Dénes	8c228bffc8	Merge 'repair: accelerate repair load_history time' from Xu Chang Using `parallel_for_each_table` instead of `for_each_table_gently` on `repair_service::load_history`, to reduce bootstrap time. Using uuid_xor_to_uint32 on repair load_history dispatch to shard. Ref: https://github.com/scylladb/scylladb/issues/16774 Closes scylladb/scylladb#16927 * github.com:scylladb/scylladb: repair: resolve load_history shard load skew repair: accelerate repair load_history time	2024-02-20 13:45:26 +02:00
Kefu Chai	b0bb3ab5b0	topology: print `node` with node_printer in `da53854b66`, we added formatter for printing a `node`, and switched to this formatter when printing `node*`. but we failed to update some caller sites when migrating to the new formatter, where a `unique_ptr<node>` is printed instead. this is not the behavior before the change, and is not expected. so, in this change, we explicitly instantiate `node_printer` instances with the pointer held by `unique_ptr<node>`, to restore the behavior before `da53854b66`. this issue was identified when compiling the tree using {fmt} v10 and compile-time format-string check enabled, which is not yet upstreamed to Seastar. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17418	2024-02-20 14:35:56 +03:00
Kefu Chai	c627d9134e	tools/scylla-nodetool: implement tablestats Refs #15588 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-20 18:12:35 +08:00
Kefu Chai	a7a2cf64cc	utils/rjson: add templated streaming_writer::Write() so we can use it in a templated context. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-20 18:12:35 +08:00
Botond Dénes	050c6dcad7	api: storage_service/keyspaces: add replication filter To allow filtering the returned keyspaces based on the replication they use: tablets or vnodes. The filter can be disabled by omitting the parameter or passing "all". The default is "all". Fixes: #16509 Closes scylladb/scylladb#17319	2024-02-20 09:04:41 +01:00
Kefu Chai	57ede58a64	raft: add fmt::formatter for raft::fsm before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `raft::fsm`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17414	2024-02-20 09:02:02 +02:00
Kefu Chai	acefde0735	mutation: add fmt::formatter for mutation_partition::printer before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `mutation_partition::printer`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17419	2024-02-20 09:01:22 +02:00
Kefu Chai	0b13de52de	sstable/mx: add fmt::formatter for cached_promoted_index::promoted_index_block before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `cached_promoted_index::promoted_index_block`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17415	2024-02-20 09:00:32 +02:00
Botond Dénes	2a494b6c47	Merge 'test/nodetool: parameterize test_ring' from Kefu Chai so we exercise the cases where state and status are not "normal" and "up". turns out the MBean is able to cache some objects. so the requests retrieving datacenter and rack are now marked `ANY`. * filter out the requests whose `multiple` is `ANY` * include the unconsumed requests in the raised `AssertionError`. this should help with debugging. Fixes #17401 Closes scylladb/scylladb#17417 * github.com:scylladb/scylladb: test/nodetool: parameterize test_ring test/nodetool: fail a test only with leftover expected requests	2024-02-20 08:48:11 +02:00
Anna Stuchlik	69ead0142d	doc: remove outdated/invalid entries from FAQ This commit removes outdated or invalid FAQ entries specified in https://github.com/scylladb/scylladb/issues/16631 In addition, the questions about Cassandra compatibility are removed as they are already answered on the forum: https://forum.scylladb.com/t/which-cassandra-version-is-scylladb-it-compatible-with/84 Also, the incorrect entry about the cache has been removed and the correct answer is added to the forum. Fixes https://github.com/scylladb/scylladb/issues/17003 The question about troubleshooting performance issues has also been removed, as it's already covered on the Forum. Also, it removes the Apache copyright entry, which should not be added to the FAQ page. Closes scylladb/scylladb#17200	2024-02-20 08:43:58 +02:00
Anna Stuchlik	4f8f183736	doc: remove SSTable2json from the docs This commit removes the SSTable2json documentation, as well as the links to the removed page. In addition, it adds a redirection for that page to prevent 404. Fixes https://github.com/scylladb/scylladb/issues/17204 Closes scylladb/scylladb#17340	2024-02-20 08:43:27 +02:00
Kefu Chai	64f9d90f7b	tools/scylla-nodetool: implement toppartitions Refs #15588 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17357	2024-02-20 08:16:43 +02:00
Pavel Emelyanov	1440eddc58	test/topology: Add checking error paths for failed migration For now only fail streaming stage and check that migration doesn't get stuck and doesn't make tablet appear on dead node. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 08:59:06 +03:00
Pavel Emelyanov	cb02297642	topology.tablets_migration: Handle failed streaming In case pending or leaving replica is marked as ignored by operator, streaming cannot be retried and should jump to "cleanup_target" stage after a barrier. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 08:59:06 +03:00
Pavel Emelyanov	72f3b1d5fe	topology.tablets_migration: Add cleanup_target transition stage The new stage will be used to revert migration that fails at some stages. The goal is to clean up the pending replica, which may have already received some writes, by doing the cleanup RPC to the pending replica, then jumping to "revert_migration" stage introduced earlier. If pending node is dead, the call to cleanup RPC is skipped. Coordinators use old replicas. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 08:59:06 +03:00
Pavel Emelyanov	ced5bf56eb	topology.tablets_migration: Add revert_migration transition stage It's like end_migration, but keeps old replicas intact, just removing the transition (including new replicas). Coordinators use old replicas. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 08:53:36 +03:00
Pavel Emelyanov	a0a33e8be1	storage_service: Rewrap cleanup stage checking in cleanup_tablet() Next patch will need to teach this code to handle new cleanup_target stage, this change prepares this place for smoother patching Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 08:53:36 +03:00
Pavel Emelyanov	c06cbc391f	test/topology: Move helpers to get tablet replicas to pylib These are very useful and will be used across different test files soon Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-02-20 08:53:36 +03:00
Kefu Chai	3a94a7c1ff	test/nodetool: parameterize test_ring so we exercise the cases where state and status are not "normal" and "up". turns out the MBean is able to cache some objects. so the requests retrieving datacenter and rack are now marked `ANY`. Fixes #17401 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-20 12:59:59 +08:00
Kefu Chai	3d8a6956fc	test/nodetool: fail a test only with leftover expected requests if there are unconsumed requests whose `multiple` is -1, we should not consider them required: the test can consume them or not. but if it does not, we should not consider the test a failure just because these requests are sitting at the end of queue. so, in this change, we * filter out the requests whose `multiple` is `ANY` * include the unconsumed requests in the raised `AssertionError`. this should help with debugging. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-20 12:59:59 +08:00
Patryk Wrobel	82104b6f50	test_tablets: tablet count metric - remove assumption about tablets existence The mentioned test failed on CI. It sets up two nodes and performs operations related to creation and dropping of tables as well as moving tablets. Locally, the issue was not visible - also, the test was passing on CI in majority of cases. One of steps in the test case is intended to select the shard that has some tablets on host_0 and then move them to (host_1, shard_3). It contains also a precondition that requires the tablets count to be greater than zero - to ensure, that move_tablets operation really moves tablets. The error message in the failed CI run comes from the precondition related to tablets count on (host0, src_shard) - it was zero. This indicated that there were no tablets on entire host_0. The following commit removes the assumption about the existence of tablets on host_0. In case when there are no tablets there, the procedure is rerun for host_1. Now the logic is as follows: - find shard that has some tablets on host_0 - if such shard does not exist, then find such shard on host_1 - depending on the result of search set src/dest nodes - verify that reported tablet count metric is changed when move_tablet operation finishes Refs: scylladb#17386 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#17398	2024-02-19 21:26:08 +01:00
Kefu Chai	3c84f08b93	alternator: add formatter for attribute_path_map_node<update_expression::action> before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define formatters for `attribute_path_map_node<update_expression::action>`, and drop its operator<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17270	2024-02-19 20:09:11 +02:00
Petr Gusev	f83df24108	test_decommission: fix log messages Closes scylladb/scylladb#17396	2024-02-19 12:09:43 +02:00

41381 Commits