scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-28 18:50:53 +00:00

Author	SHA1	Message	Date
Takuya ASADA	7e690bac62	install-dependencies.sh: update node_exporter to 1.5.0 Update node_exporter to 1.5.0. Closes scylladb/scylla-pkg#3190 Closes #12793 [avi: regenerate frozen toolchain] Closes #12813	2023-02-13 16:30:24 +02:00
Pavel Emelyanov	fa5f5a3299	sstable_test_env: Remove working_sst helper It's only used by the single test and apparently exists since the times seastar was missing the future::discard_result() sugar Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12803	2023-02-13 16:30:24 +02:00
Nadav Har'El	a24600a662	Merge 'test/pylib: split and refactor topology tests' from Alecco Move long running topology tests out of `test_topology.py` and into their own files, so they can be run in parallel. While there, merge simple schema tests. Closes #12804 * github.com:scylladb/scylladb: test/topology: rename topology test file test/topology: lint and type for topology tests test/topology: move topology ip tests to own file test/topology: move topology test remove garbaje... test/topology: move topology rejoin test to own file test/topology: merge topology schema tests and... test/topology: isolate topology smp params test test/topology: move topology helpers to common file	2023-02-12 17:53:48 +02:00
Alejo Sanchez	8bf2d515de	test/topology: rename topology test file Rename test_topology.py to reflect current tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:59:31 +01:00
Alejo Sanchez	11691ba7f5	test/topology: lint and type for topology tests Fix minor lint and type hints. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:59:31 +01:00
Alejo Sanchez	49baf6789c	test/topology: move topology ip tests to own file Move slow topology IP related tests to a separate file. Add docstrings. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:59:19 +01:00
Alejo Sanchez	3fcef63a0f	test/topology: move topology test remove garbaje... group0 members to own file Move slow test for removenode with nodes not present in group0 to a server after a sudden stop to a separate file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:48:39 +01:00
Nadav Har'El	10ca08e8ac	Merge 'Sequence CDC preimage select with Paxos learn write' from Kamil Braun `paxos_response_handler::learn_decision` was calling `cdc_service::augment_mutation_call` concurrently with `storage_proxy::mutate_internal`. `augment_mutation_call` was selecting rows from the base table in order to create the preimage, while `mutate_internal` was writing rows to the table. It was therefore possible for the preimage to observe the update that it accompanied, which doesn't make any sense, because the preimage is supposed to show the state before the update. Fix this by performing the operations sequentially. We can still perform the CDC mutation write concurrently with the base mutation write. `cdc_with_lwt_test` was sometimes failing in debug mode due to this bug and was marked flaky. Unmark it. Also fix a comment in `cdc_with_lwt_test`. Fixes #12098 Closes #12768 * github.com:scylladb/scylladb: test/cql-pytest: test_cdc: regression test for #12098 test/cql: cdc_with_lwt_test: fix comment service: storage_proxy: sequence CDC preimage select with Paxos learn	2023-02-12 13:28:34 +02:00
Alejo Sanchez	655e1587e3	test/topology: move topology rejoin test to own file Move slow test for rejoining a server after a sudden stop to a separate file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:47 +01:00
Alejo Sanchez	7cc669f5a5	test/topology: merge topology schema tests and... ... move them to their own file. Schema verification tests for restart, add, and hard stop of server can be done with the same cluster. Merge them in the same test case. While there, move them to a separate file to be run independently as this is a slow test. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:40 +01:00
Alejo Sanchez	93de79d214	test/topology: isolate topology smp params test Move slow test for different smp parameters to its own file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:32 +01:00
Alejo Sanchez	293550ca5c	test/topology: move topology helpers to common file Move helper functions to a common file ahead of splitting topology tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:16 +01:00
Nadav Har'El	2653865b34	Merge 'test.py: improve test failure handling' from Kamil Braun Improve logging by printing the cluster at the end of each test. Stop performing operations like attempting queries or dropping keyspaces on dirty clusters. Dirty clusters might be completely dead and these operations would only cause more "errors" to happen after a failed test, making it harder to find the real cause of failure. Mark cluster as dirty when a test that uses it fails - after a failed test, we shouldn't assume that the cluster is in a usable state, so we shouldn't reuse it for another test. Rely on the `is_dirty` flag in `PythonTest`s and `CQLApprovalTest`s, similarly to what `TopologyTest`s do. Closes #12652 * github.com:scylladb/scylladb: test.py: rely on ScyllaCluster.is_dirty flag for recycling clusters test/topology: don't drop random_tables keyspace after a failed test test/pylib: mark cluster as dirty after a failed test test: pylib, topology: don't perform operations after test on a dirty cluster test/pylib: print cluster at the end of test	2023-02-12 12:13:25 +02:00
Kamil Braun	ca4db9bb72	Merge 'test/raft: test snapshot threshold' from Alecco Force snapshot with schema changes while server down. Then verify schema when bringing back up the server. Closes #12726 * github.com:scylladb/scylladb: pytest/topology: check snapshot transfer raft conf error injection for snapshot test/pylib: one-shot error injection helper	2023-02-10 15:24:46 +01:00
Kamil Braun	540f6d9b78	test/cql-pytest: test_cdc: regression test for #12098 Perform multiple LWT inserts to different keys ensuring none of them observes a preimage. On my machine this test reproduces the problem more than 50% of the time in debug mode.	2023-02-10 14:35:49 +01:00
Botond Dénes	423df263f5	Merge 'Sanitize with_sstable_directory() helper in tests' from Pavel Emelyanov The helping wrapper facilitates the usage of sharded<sstable_directory> for several test cases and the helper and its callers had deserved some cleanup over time. Closes #12791 * github.com:scylladb/scylladb: sstable_directory_test: Reindent and de-multiline sstable_directory_test: Enlighten and rename sstable_from_existing_file sstable_directory_test: Remove constant parallelizm parameter	2023-02-10 07:11:38 +02:00
Tomasz Grabiec	402d5fd7e3	cache: Fix empty partition entries being left in cache in some cases Merging rows from different partition versions should preserve the LRU link of the entry from the newer version. We need this in case we're merging two last dummy entries where the older dummy is already unlinked from the LRU. The newer dummy could be the last entry which is still holding the partition entry linked in the LRU. The mutation_partition_v2 merging didn't take the LRU link from the newer entry, and we could end up with the partition entry not having any entries linked in the LRU. Introduced in `f73e2c992f`. Fixes #12778 Closes #12785	2023-02-09 23:03:23 +02:00
Kamil Braun	e2064f4762	Merge 'repair: finish repair immediately on local keyspaces' from Aleksandra Martyniuk System keyspace is a keyspace with local replication strategy and thus it does not need to be repaired. It is possible to invoke repair of this keyspace through the api, which leads to runtime error since peer_events and scylla_table_schema_history have different sharding logic. For keyspaces with local replication strategy repair_service::do_repair_start returns immediately. Closes #12459 * github.com:scylladb/scylladb: test: rest_api: check if repair of system keyspace returns before corresponding task is created repair: finish repair immediately on local keyspaces	2023-02-09 18:44:37 +01:00
Pavel Emelyanov	f0212c7b68	sstable_directory_test: Reindent and de-multiline Many tests using sstable directory wrapper have broken indentation with previous patching. Fix it. No functional changes. Also, while at it, convert multiline wrapper calls into one-line, after previous patch these are short enough for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 16:00:53 +03:00
Pavel Emelyanov	ec02b0f706	sstable_directory_test: Enlighten and rename sstable_from_existing_file It used to be the sstable maker for sstable::test_env / cql_test_env, now sstables for tests are made via sstables manager explicitly, so the guy can be renamed to something more relevant to its current status. Also, de-mark its constructors as explicit to make callers look shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 15:59:23 +03:00
Pavel Emelyanov	c843f7937b	sstable_directory_test: Remove constant parallelizm parameter It's 1 (one) all the time, just hard-code it internally Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 15:59:01 +03:00
Avi Kivity	fd4ee4878a	Revert "storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops" This reverts commit `e7d5e508bc`. It ends up failing continuous integration tests randomly. We don't know if it's uncovering an existing bug, or if RBNO itself is broken, but for now we need to revert it to unblock progress.	2023-02-09 10:30:26 +02:00
Botond Dénes	b62d84fdba	Merge 'Keep reshape and reshard logic in distributed loader' from Pavel Emelyanov Now it's scattered between dist. loader and sstable directory code making the latter quite bloated. Keeping everything in distributed loader makes the sstable_directory code compact and easier to patch to support object storage backend. Closes #12771 * github.com:scylladb/scylladb: sstable_directory: Rename remove_input_sstables_from_reshaping() sstable_directory: Make use of remove_sstables() helper sstable_directory: Merge output sstables collecting methods distributed_loader: Remove max_compaction_threshold argument from reshard() distributed_loader: Remove compaction_manager& argument from reshard() sstable_directory: Move the .reshard() to distributed_loader sstable_directory: Add helper to load foreign sstable sstable_directory: Add io-prio argument to .reshard() sstable_directory: Move reshard() to distributed_loader.cc distributed_loader: Remove compaction_manager& argument from reshape() sstable_directory: Move the .reshape() to distributed loader sstable_directory: Add helper to retrive local sstables sstable_directory: Add io-prio argument to .reshape() sstable_directory: Move reshape() to distributed_loader.cc	2023-02-09 10:01:44 +02:00
Botond Dénes	1c333e2102	Merge 'Transport server error handling fixes' from Gusev Petr CQL transport server error handling fixes and improvements: * log failed requests with `DEBUG` level for easier debugging; * in case of unhandled errors, deliver them to the client as `SERVER_ERROR`'s * fix for `protocol_error`'s in case of shedded big requests; * explicit tests have been written for the error handling problems above. Closes #11949 * github.com:scylladb/scylladb: transport server: fix "request size too large" handling transport server: log failed requests with debug level transport server: fix unexpected server errors handling transport server: log client errors with debug level	2023-02-09 09:02:22 +02:00
Anna Stuchlik	c7778dd30b	doc: related https://github.com/scylladb/scylladb/issues/12754 , add the requirement to upgrade Monitoring to version 4.3 Closes #12784	2023-02-09 07:10:34 +02:00
Botond Dénes	746b009db0	Merge 'dist/debian: bump up debhelper compatibility level to 10 and cleanups' from Kefu Chai - dist/debian: bump up debhelper compatibility level to 10 - dist/debian: drop unused Makefile variable Closes #12723 * github.com:scylladb/scylladb: dist/debian: drop unused Makefile variable dist/debian: bump up debhelper compatibility level to 10	2023-02-09 07:04:20 +02:00
Pavel Emelyanov	40de737b36	sstable_directory: Rename remove_input_sstables_from_reshaping() It unlinks unshared sstables filtering some of them out. Name it according to what it does without mentioning reshape/reshard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-08 15:00:44 +03:00
Pavel Emelyanov	a1dc251214	sstable_directory: Make use of remove_sstables() helper Currently it's called remove_input_sstables_from_resharding() but it just unlinks sstables in parallel from the given list. So rename it not to mention reshard and also make use of this "new" helper in the remove_input_sstables_from_reshaping(), it needs exactly the same functionality. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-08 15:00:44 +03:00
Pavel Emelyanov	cb36f5e581	sstable_directory: Merge output sstables collecting methods There are two of them collecting sstables from resharding and reshaping. Both doing the same job except for the latter doesn't expect the list to contain remote sstables. This patch merges them together with the help of an extra sanity boolean to check for the remote sstable not in the list. And renames the method not to mention reshape/reshard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-08 15:00:41 +03:00
Avi Kivity	0f15ff740d	cql3: expr: simplify user/debug formatting We have a cql3::expr::expression::printer wrapper that annotates an expression with a debug_mode boolean prior to formatting. The fmt library, however, provides a much simpler alternative: a custom format specifier. With this, we can write format("{:user}", expr) for user-oriented prints, or format("{:debug}", expr) for debug-oriented prints (if nothing is specified, the default remains debug). This is done by implementing fmt::formatter::parse() for the expression type, and using expression::printer internally. Since sometimes we pass expression element types rather than the expression variant, we also provide a custom formatter for all ExpressionElement Types. Uses for expression::printer are updated to use the nicer syntax. In one place we eliminate a temporary that is no longer needed since ExpressionElement:s can be formatted directly. Closes #12702	2023-02-08 12:24:58 +02:00
Petr Gusev	3263523b54	transport server: fix "request size too large" handling Calling _read_buf.close() doesn't imply eof(), some data may have already been read into kernel or client buffers and will be returned next time read() is called. When the _server._max_request_size limit was exceeded and the _read_buf was closed, the process_request method finished and we started processing the next request in connection::process. The unread data from _read_buf was treated as the header of the next request frame, resulting in "Invalid or unsupported protocol version" error. The existing test_shed_too_large_request was adjusted. It was originally written with the assumption that the data of a large query would simply be dropped from the socket and the connection could be used to handle the next requests. This behaviour was changed in scylladb#8800, now the connection is closed on the Scylla side and can no longer be used. To check there are no errors in this case, we use Scylla metrics, getting them from the Scylla Prometheus API.	2023-02-08 00:07:08 +04:00
Petr Gusev	0904f98ebf	transport server: log failed requests with debug level These logs can be helpful for debugging, e.g. if an error was not handled correctly by the client driver, or another error occurred while handling it.	2023-02-08 00:07:08 +04:00
Petr Gusev	a4cf509c3d	transport server: fix unexpected server errors handling If request processing ended with an error, it is worth sending the error to the client through make_error/write_response. Previously in this case we just wrote a message to the log and didn't handle the client connection in any way. As a result, the only thing the client got in this case was timeout error. A new test_batch_with_error is added. It is quite difficult to reproduce error condition in a test, so we use error injection instead. Passing injection_key in the body of the request ensures that the exception will be thrown only for this test request and will not affect other requests that the driver may send in the background. Closes: scylladb#12104	2023-02-08 00:07:02 +04:00
Pavel Emelyanov	73d458cf89	distributed_loader: Remove max_compaction_threshold argument from reshard() Since the whole reshard() is local to dist. loader code now, the caller of the reshard helper may let this method get the threshold itself Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	25aaa45256	distributed_loader: Remove compaction_manager& argument from reshard() It can be obtained from the table& Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	15547f1b5b	sstable_directory: Move the .reshard() to distributed_loader Now all the resharding logic is accumulated in distributed loader and the sstable_directory is just the place where sstables are collected. The changes summary is: - add sstable_directory as argument (used to be "this") - replace all "this" captures with &dir ones - remove temporary namespace gap and declaration from sst-dir class Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	ab5f48d496	sstable_directory: Add helper to load foreign sstable This is to generalize the code duplication between .reshard() and existing .load_foreign_sstables() (plural form). Make it coroutinized right at once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	e6e65c87d5	sstable_directory: Add io-prio argument to .reshard() Now it gets one from this-> but the method is becoming static one in distributed_loader which only has it as an argument. That's not big deal as the current IO class is going to be derived from current sched group, so this extra arg will go away at all some day. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:41 +03:00
Pavel Emelyanov	a32d2b6d6a	sstable_directory: Move reshard() to distributed_loader.cc Just move the code and create temporary namespace gap for that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:12 +03:00
Pavel Emelyanov	1de8c85acd	distributed_loader: Remove compaction_manager& argument from reshape() It can be obtained from the table& Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:12 +03:00
Pavel Emelyanov	d734b6b7c1	sstable_directory: Move the .reshape() to distributed loader Now all the reshaping logic is accumulated in distributed loader and the sstable_directory is just the place where sstables are collected. The changes summary is: - add sstable_directory as argument (used to be "this") - replace all "this" captures with &dir ones - remove temporary namespace gap and declaration from sst-dir class Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:30:55 +03:00
Pavel Emelyanov	b906d34807	sstable_directory: Add helper to retrive local sstables There are methods to retrive shared local sstables and foreign sstables, so here's one more to the family Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:23:40 +03:00
Pavel Emelyanov	420fc8d4df	sstable_directory: Add io-prio argument to .reshape() Now it gets one from this-> but the method is becoming static one in distributed_loader which only has it as an argument. That's not big deal as the current IO class is going to be derived from current sched group, so this extra arg will go away at all some day. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:22:27 +03:00
Pavel Emelyanov	a70d6017f8	sstable_directory: Move reshape() to distributed_loader.cc Just move the code and create temporary namespace gap for that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:21:54 +03:00
Kamil Braun	97b2971bf1	test/cql: cdc_with_lwt_test: fix comment The comment mentioned an entry that shouldn't be there (and it wasn't in the actual expected result).	2023-02-07 16:12:18 +01:00
Kamil Braun	1ef113691a	service: storage_proxy: sequence CDC preimage select with Paxos learn `paxos_response_handler::learn_decision` was calling `cdc_service::augment_mutation_call` concurrently with `storage_proxy::mutate_internal`. `augment_mutation_call` was selecting rows from the base table in order to create the preimage, while `mutate_internal` was writing rows to the table. It was therefore possible for the preimage to observe the update that it accompanied, which doesn't make any sense, because the preimage is supposed to show the state before the update. Fix this by performing the operations sequentially. We can still perform the CDC mutation write concurrently with the base mutation write. `cdc_with_lwt_test` was sometimes failing in debug mode due to this bug and was marked flaky. Unmark it. Fixes #12098	2023-02-07 16:12:18 +01:00
Alejo Sanchez	cf3b8d7edc	pytest/topology: check snapshot transfer Test snapshot transfer by reducing the snapshot threshold on initial servers (3 and 1 trailing). Then creates a table, and does 3 extra schema changes (add column), triggering at least 2 snapshots. Then brings a new server to the cluster, which will get the schema through a snapshot. Then the test stops the initial servers and verifies the table schema is up to date on the new server. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-07 16:09:07 +01:00
Petr Gusev	95bf8eebe0	query_ranges_to_vnodes_generator: fix for exclusive boundaries Let the initial range passed to query_partition_key_range be [1, 2) where 2 is the successor of 1 in terms of ring_position order and 1 is equal to vnode. Then query_ranges_to_vnodes_generator() -> [[1, 1], (1, 2)], so we get an empty range (1,2) and subsequently will make a data request with this empty range in storage_proxy::query_partition_key_range_concurrent, which will be redundant. The patch adds a check for this condition after making a split in the main loop in process_one_range. The patch does not attempt to handle cases where the original ranges were empty, since this check is the responsibility of the caller. We only take care not to add empty ranges to the result as an unintentional artifact of the algorithm in query_ranges_to_vnodes_generator. A test case is added in test_get_restricted_ranges. The helper lambda check is changed so that not to limit the number of ranges to the length of expected ranges, otherwise this check passes without the change in process_one_range. Fixes: #12566 Closes #12755	2023-02-07 16:02:31 +02:00
Kefu Chai	afd1221b53	commitlog: mark request_controller_timeout_exception_factory::timeout() noexcept request_controller_timeout_exception_factory::timeout() creates an instance of `request_controller_timed_out_error` whose ctor is default-created by compiler from that of timed_out_error, which is in turn default-created from the one of `std::exception`. and `std::exception::exception` does not throw. so it's safe to mark this factory method `noexcept`. with this specifier, we don't need to worry about the exception thrown by it, and don't need to handle them if any in `seastar::semaphore`, where `timeout()` is called for the customized exception. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12759	2023-02-07 14:38:54 +02:00
Botond Dénes	051da4e148	Merge 'Handle EDQUOT error just like ENOSPC' from Kefu Chai - main: consider EDQUOT as environmental failure also - main: use defer_verbose_shutdown() to shutdown compaction manager - replica/table: extract should_retry() int with_retry - replica/table: retry on EDQUOT when flushing memtable Fixes #12626 Closes #12653 * github.com:scylladb/scylladb: replica/table: retry on EDQUOT when flushing memtable replica/table: extract should_retry() int with_retry main: use defer_verbose_shutdown() to shutdown compaction manager main: consider EDQUOT as environmental failure also	2023-02-07 14:38:36 +02:00
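
Illustration for commit 95bf8eebe0 (query_ranges_to_vnodes_generator: fix for exclusive boundaries). The commit message describes a range [1, 2) being split at a vnode equal to 1 into [1, 1] and the empty range (1, 2). The following is a minimal Python sketch of that idea, not the actual C++ code: tokens are modelled as plain integers (so that 2 really is the successor of 1), and `Range`, `is_empty`, `contains` and `split_at_vnodes` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Range:
    """Hypothetical integer-token range with inclusive/exclusive bounds."""
    start: int
    start_inclusive: bool
    end: int
    end_inclusive: bool

    def _closed_bounds(self):
        # Over integers, any range can be normalized to a closed interval.
        lo = self.start if self.start_inclusive else self.start + 1
        hi = self.end if self.end_inclusive else self.end - 1
        return lo, hi

    def is_empty(self) -> bool:
        lo, hi = self._closed_bounds()
        return lo > hi

    def contains(self, token: int) -> bool:
        lo, hi = self._closed_bounds()
        return lo <= token <= hi


def split_at_vnodes(r: Range, vnode_tokens: List[int]) -> List[Range]:
    """Split r at each vnode token, never emitting an empty remainder."""
    result = []
    remaining = r
    for token in sorted(vnode_tokens):
        if remaining.is_empty():
            break
        if not remaining.contains(token):
            continue
        # Left part ends at the vnode token (inclusive); the rest starts after it.
        result.append(Range(remaining.start, remaining.start_inclusive, token, True))
        remaining = Range(token, False, remaining.end, remaining.end_inclusive)
    # The check the commit message describes: skip an empty range that is
    # only an artifact of the split.
    if not remaining.is_empty():
        result.append(remaining)
    return result
```

With this model, splitting [1, 2) at vnode 1 yields only [1, 1]; without the final emptiness check it would also yield (1, 2), which would cause a redundant data request.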


35011 Commits
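
Illustration for commit 1ef113691a (service: storage_proxy: sequence CDC preimage select with Paxos learn). The commit message says the preimage select must finish before the base write starts, while the CDC mutation write may still run concurrently with the base write. Below is a rough asyncio sketch of that ordering only. It is not the Scylla implementation: the dict-backed table, the list-backed log, and all function names are assumptions made for this example.

```python
import asyncio


async def select_preimage(table: dict, key: str):
    # Stand-in for reading the current row from the base table.
    await asyncio.sleep(0)
    return table.get(key)


async def write_base(table: dict, key: str, value: str) -> None:
    # Stand-in for the base table mutation write.
    await asyncio.sleep(0)
    table[key] = value


async def write_cdc_log(log: list, key: str, preimage, value: str) -> None:
    # Stand-in for the CDC log mutation write.
    await asyncio.sleep(0)
    log.append((key, preimage, value))


async def learn_sequenced(table: dict, log: list, key: str, value: str) -> None:
    # Step 1: read the preimage to completion before any write is started,
    # so it cannot observe the update it accompanies.
    preimage = await select_preimage(table, key)
    # Step 2: the base write and the CDC log write may run concurrently.
    await asyncio.gather(
        write_base(table, key, value),
        write_cdc_log(log, key, preimage, value),
    )


def run_example():
    table = {"k": "old"}
    log = []
    asyncio.run(learn_sequenced(table, log, "k", "new"))
    return table, log
```

Calling `run_example()` leaves the table holding the new value while the log entry records the old value as the preimage, which is the property the regression test in 540f6d9b78 is described as checking.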