scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 03:45:11 +00:00

Author	SHA1	Message	Date
Benny Halevy	97b002e13e	docs: debugging.md: add a sample gdbinit file This gdbinit contains recommended settings commonly useful for debugging scylla core dumps. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-11 10:23:08 +03:00
Nadav Har'El	2c39c4c284	Merge 'Handle errors during snapshot' from Benny Halevy This series refactors `table::snapshot` and moves the responsibility to flush the table before taking the snapshot to the caller. `flush_on_all` and `snapshot_on_all` helpers are added to replica::database (by making it a peering_sharded_service) and upper layers, including api and snapshot-ctl now call it instead of calling cf.snapshot directly. With that, errors are handled in table::snapshot and propagated back to the callers. Failure to allocate the `snapshot_manager` object is fatal, similar to failure to allocate a continuation, since we can't coordinate across the shards without it. Test: unit(dev), rest_api(debug) Fixes #10500 Closes #10513 * github.com:scylladb/scylla: table: snapshot: handle errors table: snapshot: get rid of skip_flush param database: truncate: skip flush when taking snapshot test: rest_api: storage_service: verify_snapshot_details: add truncate database: snapshot_on_all: flush before snapshot if needed table: make snapshot method private database: add snapshot_on_all snapshot-ctl: run_snapshot_modify_operation: reject views and secondary index using the schema snapshot-ctl: refactor and coroutinize take_snapshot / take_column_family_snapshot api: storage_service: increase visibility of snapshot ops in the log api: storage_service: coroutinize take_snapshot and del_snapshot api: storage_service: take_snapshot: improve api help messages test: rest_api: storage_service: add test_storage_service_snapshot database: add flush_on_all variants test: rest_api: add test_storage_service_flush	2022-05-10 10:52:10 +03:00
Benny Halevy	1d39d803af	table: snapshot: handle errors Turn table::snapshot into a coroutine, catch exceptions, and return them to the caller. Make sure that coordination across shards would not break even if any of the shards hits an error, by always signaling semaphores other shards wait on. All errors except for failing to allocate the snapshot_manager objects are caught and propagated back. Failing to allocate the snapshot_manager is fatal, similar to failing to allocate a continuation, since we can't coordinate across the shards without it, so abort if that fails. Fixes #10500 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	9e69089306	table: snapshot: get rid of skip_flush param Now that all callers flush on their own before calling table::snapshot. Refs #10500 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	31881273a1	database: truncate: skip flush when taking snapshot database::truncate already flushes the table on auto_snapshot so there is never a reason to flush it again in table::snapshot. Note that cf.can_flush() is false only if memtables are empty, so there is nothing to flush, or there is no seal_immediate_fn, and then table::snapshot wouldn't be able to flush either. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	fc79787863	test: rest_api: storage_service: verify_snapshot_details: add truncate Truncate the test table and verify that the 'live' snapshot size is now non-zero. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	46c950fb31	database: snapshot_on_all: flush before snapshot if needed flush_on_all shards before taking the snapshot if !skip_flush so we can get rid of flushing in table::snapshot. Refs #10500 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	33bd52921e	table: make snapshot method private Only callable by database. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	e1d58d4422	database: add snapshot_on_all And move the logic from snapshot-ctl down to the replica::database layer. A following patch will move the flush phase from the replica::table::snapshot layer out to the caller. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	aa127a2dbb	snapshot-ctl: run_snapshot_modify_operation: reject views and secondary index using the schema Detecting a secondary index by checking for a dot in the table name is wrong as tables generated by Alternator may contain a dot in their name. Instead detect both materialized views and secondary indexes using the schema()->is_view() method. Fixes #10526 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:44:52 +03:00
Benny Halevy	1fbcdbd2e8	snapshot-ctl: refactor and coroutinize take_snapshot / take_column_family_snapshot There is no functional change in this patch. Only refactoring of the code. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:16:39 +03:00
Benny Halevy	01b1e54e22	api: storage_service: increase visibility of snapshot ops in the log snapshot operations over the api are rare but they contain significant state on disk in the form of sstables hard-linked to the snapshot directories. Also, we've seen snapshot operations hang in the field, requiring a core dump to analyse the issue, while there were no records in the log indicating when previous snapshot operations were last executed. This change promotes logging to info level when take_snapshot and del_snapshot start, and logs errors in case they fail. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:15:46 +03:00
Benny Halevy	b9d972d029	api: storage_service: coroutinize take_snapshot and del_snapshot Before making any further changes in them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:02:52 +03:00
Benny Halevy	10b86ee5bd	api: storage_service: take_snapshot: improve api help messages Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:02:47 +03:00
Benny Halevy	e95ecbbea6	test: rest_api: storage_service: add test_storage_service_snapshot Test the snapshot operations via the rest api. Added test/rest_api/rest_util.py with new_test_snapshot that creates a new test snapshot and automagically deletes it when the `with` block is exited, similar to new_test_keyspace and new_test_table. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 09:56:44 +03:00
Benny Halevy	5b4eb44795	database: add flush_on_all variants Used by api layer. Will be used in a later patch to flush on all shards before taking a snapshot. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 09:56:44 +03:00
Benny Halevy	05c7f4b832	test: rest_api: add test_storage_service_flush Add a basic rest_api test for keyspace_flush. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 09:56:44 +03:00
Nadav Har'El	1c6163d51f	Merge 'cql3: expr: allow bind markers in collection literals' from Michał Sala Allowing bind markers in collection literals is a change which causes minor differences in behavior between Scylla and Cassandra. Despite such an undesirable effect, I think allowing them is a good idea because it makes [refactoring work made by cvybhu](https://github.com/scylladb/scylla/pull/10409) easier - `469d03f8c2`. Also, making Scylla accept a superset of valid Cassandra cql expressions does not make us less compatible (maybe apart from test suite compatibility). Closes #10457 * github.com:scylladb/scylla: test/boost: cql_query_test: allow bound variables in test_list_of_tuples_with_bound_var test/boost: cql_query_test: test bound variables in collection literals cql3: expr: do not allow unset values inside collections cql3: expr: prepare_expr: allow bind markers in collection literals	2022-05-09 19:15:22 +03:00
Botond Dénes	fd27fbfe64	Merge "Add user types carrier helper" from Pavel Emelyanov " There's a cql_type_parser::parse() method that needs to get user types for a keyspace by its name. For this it uses the global storage proxy instance as a place to get database from. This set introduces an abstract user_types_storage helper object that's responsible for providing the user types for the caller. This helper, in turn, is provided to the parse() method by the database itself or by the schema_ctxt object that needs parse() to unfreeze schemas and doesn't have database at those times. This removes one more get_storage_proxy() call. " * 'br-user-types-storage' of https://github.com/xemul/scylla: cql_type_parser: Require user_types_storage& in parse() schame_tables: Add db/ctxt args here and there user_types: Carry storage on database and schema_ctxt data_dictionary: Introduce user types storage	2022-05-09 17:38:52 +03:00
Nadav Har'El	ca700bf417	scripts/pull_github_pr.sh: clean up after failed cherry-pick When pull_github_pr.sh uses git cherry-pick to merge a single-patch pull request, this cherry-pick can fail. A typical example is trying to merge a patch that has actually already been merged in the past, so cherry-pick reports that the patch, after conflict resolution, is empty. When cherry-pick fails, it leaves the working directory in an annoying mid-cherry-pick state, and today the user needs to manually call "git cherry-pick --abort" to return to the normal state. The script should do it automatically - so this is what we do in this patch. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-05-09 17:23:34 +03:00
Pavel Emelyanov	598ce8111d	repair: Handle discarded stopping future When repair_meta stops it does so in the background and reports back a shared future into whose shared promise peer it resolves that background activity. There's a shorter way to forward a future result into another, even shared, promise. And this method doesn't need to discard a future. tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/253 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-09 17:23:12 +03:00
Pavel Emelyanov	3b4af86ad9	proxy (and suddenly redis): Don't check latency_counter.is_start() The lcs at those places are explicitly start()ed beforehand. The is_start() check is necessary when using the latency_counter with a histogram that may or may not start the counter (this is the case in several class table methods). tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-09 17:20:41 +03:00
Raphael S. Carvalho	48e3117ebc	compaction: move propagate_replacement() into private namespace propagate_replacement() is an internal function that shouldn't be in the public interface. No one besides a unit test for incremental compaction needs it. In the future, I want to revisit the incremental compaction unit test to stop using it and only rely on public interfaces. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220506171647.81063-1-raphaelsc@scylladb.com>	2022-05-09 16:49:50 +03:00
David Garcia	3e0f81180e	docs: disable link checker Closes #10434	2022-05-09 12:45:28 +02:00
Avi Kivity	81af9342f1	Merge "Simplify gossiper state map API" from Pavel E " There's an endpoint->state map member of the gossiper class. First ugly thing about it is that the member is public. Next, there's a whole bunch of helpers around that map that export various bits of information from it. All of those helpers reshard to shard-0 to read from the state map ignoring the fact that the map is replicated on all shards internally. Also, some of those helpers effectively duplicate each other for no real gain. Finally, most of them are specific to api/ code, and open-coding them often makes api/ handlers shorter and simpler. This set removes the unused, api-only or trivial state map accessors and marks the state map itself private (underscore prefix included). tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/233/ " * 'br-gossiper-sanitize-api-2' of https://github.com/xemul/scylla: gossiper: Add underscores to new private members code: Indentation fix after previous patch gossiper, code: Relax get_up/down/all_counters() helpers api: Fix indentation after previous patch gossiper, api: Remove get_arrival_samples() gossiper, api: Remove get/set phi convict threshold helpers gossiper, api: Move get_simple_states() into API code gossiper: In-line std::optional<> get_endpoint_state_for_endpoint() overload gossiper, api: Remove get_endpoint_state() helpers gossiper: Make state and locks maps private gossiper: Remove dead code	2022-05-08 22:56:23 +03:00
Avi Kivity	94f677b790	Merge 'sstables/index_reader: short-circuit fast-forward-to when at EOF' from Botond Dénes Attempting to call advance_to() on the index, after it is positioned at EOF, can result in an assert failure, because the operation results in an attempt to move backwards in the index-file (to read the last index page, which was already read). This only happens if the index cache entry belonging to the last index page is evicted, otherwise the advance operation just looks-up said entry and returns it. To prevent this, we add an early return conditioned on eof() to all the partition-level advance-to methods. A regression unit test reproducing the above described crash is also added. Fixes: #10403 Closes #10491 * github.com:scylladb/scylla: sstables/index_reader: short-circuit fast-forward-to when at EOF test/lib/random_schema: add a simpler overload for fixed partition count	2022-05-08 14:17:40 +03:00
Juliusz Stasiewicz	603dd72f9e	CQL: Replace assert by exception on invalid auth opcode One user observed this assertion fail, but it's an extremely rare event. The root cause - interlacing of processing STARTUP and OPTIONS messages - is still there, but now it's harmless enough to leave it as is. Fixes #10487 Closes #10503	2022-05-08 11:33:58 +03:00
Michał Chojnowski	fb1a9e97c9	cql3: restrictions: statement_restrictions: pass arguments to std::bind_front by reference Fix an accidental copy of query_options in range_or_slice_eq_null. Closes #10511	2022-05-08 11:32:53 +03:00
Avi Kivity	1ecb87b7a8	Merge 'Harden table truncate' from Benny Halevy This series fixes a few issues on the table truncate path: - "memtable_list: safely futurize clear_and_add" - reinstates an async version of table::clear_and_add, just safe against #10421 - a unit test reproducing #10421 was added to make sure the new version is indeed safe. - "table: clear: serialize with ongoing flush" fixes #10423 - a unit test reproducing #10423 was added Fixes #10281 Fixes #10423 Test: unit(dev), database_test. test_truncate_without_snapshot_during_{writes,flushes} (debug) Closes #10424 * github.com:scylladb/scylla: test: database_test: add test_truncate_without_snapshot_during_writes memtable_list: safely futurize clear_and_add table: clear: serialize with ongoing flush	2022-05-08 11:30:21 +03:00
Avi Kivity	287c01ab4d	Merge ' sstables: consumer: reuse the fragmented_temporary_buffer in read_bytes()' from Michał Chojnowski primitive_consumer::read_bytes() destroys and creates a vector for every value it reads. This happens for every cell. We can save a bit of work by reusing the vector. Closes #10512 * github.com:scylladb/scylla: sstables: consumer: reuse the fragmented_temporary_buffer in read_bytes() utils: fragmented_temporary_buffer: add release()	2022-05-08 11:26:31 +03:00
Raphael S. Carvalho	8e99d3912e	compaction: LCS: don't write to disengaged optional on compaction completion Dtest triggers the problem by: 1) creating table with LCS 2) disabling regular compaction 3) writing a few sstables 4) running maintenance compaction, e.g. cleanup Once the maintenance compaction completes, disengaged optional _last_compacted_keys triggers an exception in notify_completion(). _last_compacted_keys is used by regular compaction for its round-robin file picking policy. It stores the last compacted key for each level, meaning it's irrelevant for any other compaction type. Regular compaction is responsible for initializing it when it runs for the first time to pick files. But with it disabled, notify_completion() will find it uninitialized, therefore resulting in bad_optional_access. To fix this, the procedure is skipped if _last_compacted_keys is disengaged. Regular compaction, once re-enabled, will be able to fill _last_compacted_keys by looking at metadata of the files. compaction_test.py::TestCompaction::test_disable_autocompaction_doesnt_block_user_initiated_compactions[CLEANUP-LeveledCompactionStrategy] now passes. Fixes #10378. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #10508	2022-05-08 11:23:13 +03:00
Raphael S. Carvalho	5682393693	compaction: Fix use-after-move when retrying maintenance compaction SSTable was moved into descriptor, so on failure, it couldn't be used without resulting in a segfault. Fix it by not moving sst, and changing signature to make it explicit we don't want to move the content. Fixes #10505. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #10506	2022-05-08 11:16:55 +03:00
Michał Chojnowski	ddc535a4a2	sstables: consumer: reuse the fragmented_temporary_buffer in read_bytes() read_bytes destroys and creates a vector for every value it reads. This happens for every cell. We can save a bit of work by reusing the vector.	2022-05-07 13:04:16 +02:00
Michał Chojnowski	8cfbe9c9c1	utils: fragmented_temporary_buffer: add release() Add a release() method to fragmented_temporary_buffer. This method releases the underlying vector to allow for its reuse.	2022-05-07 13:04:16 +02:00
Pavel Emelyanov	9d364f19dc	gossiper: Add underscores to new private members The state map and guarding locks were moved to private and now should have a _ prefix Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 11:32:03 +03:00
Pavel Emelyanov	334d3434e7	code: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	5ac28a29d3	gossiper, code: Relax get_up/down/all_counters() helpers These helpers count elements in the endpoint state map. It makes sense to keep them in gossiper API, but it's worth removing the wrappers that do invoke_on(0). This makes code shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	5f53799ffb	api: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	0ef33b71ba	gossiper, api: Remove get_arrival_samples() It's empty too, but the API-side conversion probably has some value for the future, so keep it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	37d392c772	gossiper, api: Remove get/set phi convict threshold helpers These are empty anyway. API caller can place return stubs itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	ad786d6b4d	gossiper, api: Move get_simple_states() into API code The API method in question just tries to scan the state map. There's no need in doing invoke_on(0) and in a separate helper method in gossiper, the creation of the json return value can happen in the API handler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	49dd6b5371	gossiper: In-line std::optional<> get_endpoint_state_for_endpoint() overload The method helps updating endpoint state in handle_major_state_change by returning a copy of an endpoint state that's kept while the map's entry is being replaced with the new state. It can be replaced with a shorter code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	f278d84cfe	gossiper, api: Remove get_endpoint_state() helpers There are two of them -- one to do invoke_on(0) the other one to get the needed data. The former one is not needed -- the scanned endpoint state map is replicated across shards and is the same everywhere. The latter is not needed, because there's only one user of it -- the API -- which can work with the existing gossiper API. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	0aea43a245	gossiper: Make state and locks maps private Locks are not needed outside gossiper, state map is sometimes read from, but there is a const getter for such cases. Both methods now deserve the underbar prefix, but it doesn't come with this short patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Pavel Emelyanov	690b21aa4d	gossiper: Remove dead code Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Botond Dénes	9623589c77	Merge 'Futurize data_read_resolver::resolve and to_data_query_result' from Benny Halevy This series futurizes two synchronous functions used for data reconciliation: `data_read_resolver::resolve` and `to_data_query_result` and does so by introducing lower-level asynchronous infrastructure: `mutation_partition_view::accept_gently`, `frozen_mutation::unfreeze_gently` and `frozen_mutation::consume_gently`, and `mutation::consume_gently`. This trades some cycles on this cold path to prevent known reactor stalls. Fixes #2361 Fixes #10038 Closes #10482 * github.com:scylladb/scylla: mutation: add consume_gently frozen_mutation: add consume_gently query: coroutinize to_data_query_result frozen_mutation: add unfreeze_gently mutation_partition_view: add accept_gently methods storage_proxy: futurize data_read_resolver::resolve	2022-05-06 10:23:02 +03:00
Botond Dénes	e8f3d7dd13	sstables/index_reader: short-circuit fast-forward-to when at EOF Attempting to call advance_to() on the index, after it is positioned at EOF, can result in an assert failure, because the operation results in an attempt to move backwards in the index-file (to read the last index page, which was already read). This only happens if the index cache entry belonging to the last index page is evicted, otherwise the advance operation just looks-up said entry and returns it. To prevent this, we add an early return conditioned on eof() to all the partition-level advance-to methods. A regression unit test reproducing the above described crash is also added.	2022-05-05 14:42:37 +03:00
Botond Dénes	98f3d516a2	test/lib/random_schema: add a simpler overload for fixed partition count Some tests want to generate a fixed amount of random partitions, make their life easier.	2022-05-05 14:33:37 +03:00
Piotr Sarna	eeec502aee	Merge 'gms: feature_service: reduce boilerplate to add a cluster feature' from Avi Kivity Currently, adding a cluster feature requires editing several files and repeating the new feature name several times. This series reduces the boilerplate to a single line (for non-experimental features), and perhaps three for experimental features. Closes #10488 * github.com:scylladb/scylla: gms: feature_service: remove variable/helper function duplication gms: feature: make `operator bool` implicit gms: feature_service: remove feature variable duplication in enable() gms: feature_service: remove feature variable declaration/definition duplication gms: features: de-quadruplicate active feature names gms: features: de-quadruplicate deprecated feature names gms: feature_service: avoid duplicating feature names when listing known features	2022-05-05 12:43:15 +02:00
Benny Halevy	ca1b616092	mutation: add consume_gently Allow yielding when consuming a mutation, and use in to_data_query_result. Fixes #10038 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00


31127 Commits