scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-04 05:53:13 +00:00

Author	SHA1	Message	Date
Botond Dénes	4bd4aa2e88	Merge 'memtable, cache: Eagerly compact data with tombstones' from Tomasz Grabiec When memtable receives a tombstone it can happen under some workloads that it covers data which is still in the memtable. Some workloads may insert and delete data within a short time frame. We could reduce the rate of memtable flushes if we eagerly drop tombstoned data. One workload which benefits is the raft log. It stores a row for each uncommitted raft entry. When entries are committed they are deleted. So the live set is expected to be short under normal conditions. Fixes #652. Closes #10807 * github.com:scylladb/scylla: memtable: Add counters for tombstone compaction memtable, cache: Eagerly compact data with tombstones memtable: Subtract from flushed memory when cleaning mvcc: Introduce apply_resume to hold state for partition version merging test: mutation: Compare against compacted mutations compacting_reader: Drop irrelevant tombstones mutation_partition: Extract deletable_row::compact_and_expire() mvcc: Apply mutations in memtable with preemption enabled test: memtable: Make failed_flush_prevents_writes() immune to background merging	2022-06-15 18:12:42 +03:00
Tomasz Grabiec	169025d9b4	memtable: Add counters for tombstone compaction	2022-06-15 11:30:25 +02:00
Pavel Emelyanov	9a88bc260c	Merge 'various group0 start/stop issues' from Gleb The series fixes a couple of crashes that were found during starting and stopping Scylla with raft while doing ddl operations. Most of them related to shutdown order between different components. Also in scylla-dev gleb/group0-fixes-v1 CI https://jenkins.scylladb.com/job/releng/job/Scylla-CI/749/ * origin-dev/gleb/group0-fixes-v1: migration manager: remove unused code db/system_distributed_keyspace: do not announce empty schema main: stop raft before the migration manager storage_service: do not pass the raft group manager to storage_service constructor main: destroy the group0_client after stopping the group0	2022-06-15 11:44:03 +03:00
Michael Livshin	aab4cd850c	allow pre-scrub snapshots of materialized views and secondary indices Previously, any attempt to take a materialized view or secondary index snapshot was considered a mistake and caused the snapshot operation to abort, with a suggestion to snapshot the base table instead. But an automatic pre-scrub snapshot of a view cannot be attributed to user error, so the operation should not be aborted in that case. (It is an open question whether the more correct thing to do during pre-scrub snapshot would be to silently ignore views. Or perhaps they should be ignored in all cases except when the user explicitly asks to snapshot them, by name) Closes #10760. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-06-15 11:30:58 +03:00
Avi Kivity	5129280f45	Revert "Merge 'memtable, cache: Eagerly compact data with tombstones' from Tomasz Grabiec" This reverts commit `e0670f0bb5`, reversing changes made to `605ee74c39`. It causes failures in debug mode in database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain, though with low probability. Fixes #10780 Reopens #652.	2022-06-14 18:06:22 +03:00
Avi Kivity	c80999fab4	cql3: expr: push is_satisfied_by regular and static column extraction to callers is_satisfied_by() rearranges the static and regular columns from query::result_row_view form (which is a use-once iterator) to std::vector<managed_bytes_opt> (which uses the standard value representation, and allows random access which expression evaluation needs). Doing it in is_satisfied_by() means that it is done every time an expression is evaluated, which is wasteful. It's also done even if the expression doesn't need it at all. Push it out to callers, which already eliminates some calls. We still pass cql3::expr::selection, which is a layering violation, but that is left to another time. Note that in view.cc's check_if_matches(), we should have been able to move static_and_regular_columns calculation outside the loop. However, we get crashes if we do. This is likely due to a preexisting bug (which the zero iterations loop avoids). However, in selection.cc, we are able to avoid the computation when the code claims it is only handling partition keys or clustering keys.	2022-06-12 16:12:41 +03:00
Avi Kivity	4b715226fe	cql3: expr: convert is_satisfied_by() signature to evaluation_inputs Callers are converted, but the internals are kept using the old conventions until more APIs are converted. Although the new API allows passing no query_options, the view code keeps passing dummy query_options and improvement is left as a FIXME.	2022-06-12 12:53:44 +03:00
Gleb Natapov	727a9071d8	db/system_distributed_keyspace: do not announce empty schema	2022-06-09 09:40:55 +03:00
Tomasz Grabiec	0bc45f9666	memtable: Add counters for tombstone compaction	2022-06-06 19:25:41 +02:00
Michael Livshin	632b4e5a9a	fix "ninja dev-headers" Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-05-31 23:42:34 +03:00
Michael Livshin	029508b77c	flat_mutation_reader ist tot Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-05-31 23:42:34 +03:00
Avi Kivity	4b53af0bd5	treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime of the function object is less ambiguous, and so it is safer. Replace all eligible occurrences (i.e. caller is a coroutine). One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra attention since there was a handle_exception() continuation attached. It is converted to a try/catch. Closes #10699	2022-05-31 09:06:24 +03:00
Pavel Emelyanov	7f2837824e	system_keyspace: Save coroutine's captured variable on stack Currently it works, but the newer version of seastar's map_reduce() is compiled in a way to trigger use-after-free on accessing captured value. tests: unit(dev), unit.alternator(debug on v1) Fixes #10689 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220523095409.6078-1-xemul@scylladb.com>	2022-05-30 17:46:32 +03:00
Benny Halevy	6677028212	sstables: mx/writer: auto-scale promoted index Add column_index_auto_scale_threshold_in_kb to the configuration (defaults to 10MB). When the promoted index (serialized) size gets to this threshold, it's halved by merging each two adjacent blocks into one and doubling the desired_block_size. Fixes #4217 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-24 13:32:35 +03:00
Avi Kivity	5285ccbb12	Merge 'Add prune ghost rows statement' from Piotr Sarna This series is split from another, bigger RFC series which provides manual remedies to deal with inconsistencies between the base table and its views. This part deals with ghost rows by providing a statement which fetches view rows from a given range, then reads its corresponding rows from the base table (cl=ALL), and finally removes rows which were not present in the base table at all, qualifying them as ghost rows. Motivations for introducing such a statement: * in case of detected inconsistencies, it can be used to fix materialized views without recreating them from scratch, which can take days and generates lots of throughput * a tool which periodically scrubs a materialized view can be easily created on top of this statement, especially that it's possible to remove ghost rows from a user-defined view token range; This series comes with a unit test. The reason for digging up this series is because it's still possible to end up with ghost rows in certain rather improbable scenarios, and we lack a way of fixing them without rebuilding the whole view. For instance, in case of a failed synchronous update to a local view, the user will be notified that the query failed, but a ghost row can be created nonetheless. The pruning statement introduced in this series would allow healing the failure locally, without rebuilding the whole view. Tests: unit(dev) Closes #10426 * github.com:scylladb/scylla: docs: add a paragraph on PRUNE MATERIALIZED VIEW statement service,test: add a test case for error during pruning tests: add ghost row deletion test case cql3: enable ghost row deletion via CQL cql3: add a statement for deleting ghost rows cql3: convert is_json statement parameter to enum pager: add ghost row deleting pager db,view: add delete ghost rows visitor	2022-05-19 17:21:35 +03:00
Piotr Sarna	c3a9658535	db,view: add delete ghost rows visitor The visitor is used to traverse view rows, and if it detects a ghost row it qualifies it for deletion. Qualification is based on a base table read with cl=ALL: if the corresponding row is not present in the base table, it is considered a ghost.	2022-05-19 10:11:50 +02:00
Pavel Emelyanov	f81f1c7ef7	format-selector: Remove .sync() point The feature listener callbacks are waited upon to finish in the middle of the cluster joining process. In particular -- before actually joining the cluster the format should have been selected. For that there's a .sync() method that locks the semaphore thus making sure that any update is finished and it's called right after the wait_for_gossip_to_settle() finishes. However, features are enabled inside the wait_for_gossip_to_settle() in a seastar::async() context that's also waited upon to finish. This waiting makes it possible for any feature listener to .get() any of its futures that should be resolved until gossip is settled. That said, the format selection barrier can be moved -- instead of waiting on the semaphore, the respective part of the selection code can be .get()-ed (it all runs in async context). One thing to care about -- the remainder should continue running with the gate held. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-16 14:14:14 +03:00
Pavel Emelyanov	7fee50f1e3	format-selector: Coroutinize maybe_select_format() This method is run when a feature is enabled. It's a bit trickier than the others, also there are two methods actually, that are merged into one by this patch. By and large most of the care is about the _sel gate and _sem semaphore. The gate protects the whole selection code from the selector being freed from underneath it on stop. The semaphore is only needed to keep two different format selections from each other -- each update the system keyspace, local variable and replica::database instance on all shards. In the end there's a gossiper update, but it happens outside of the semaphore. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-16 14:13:59 +03:00
Pavel Emelyanov	93df88aac4	format-selector: Coroutinize simple methods These are all just straightforward uses of co_await around the code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-16 14:13:59 +03:00
Avi Kivity	528ab5a502	treewide: change metric calls from make_derive to make_counter make_derive was recently deprecated in favor of make_counter, so make the change throughout the codebase. Closes #10564	2022-05-14 12:53:55 +02:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Michael Livshin	00ed4ac74c	batchlog_manager: warn when a batch fails to replay Only for reasons other than "no such KS", i.e. when the failure is presumed transient and the batch in question is not deleted from batchlog and will be retried in the future. (Would info be more appropriate here than warning?) Signed-off-by: Michael Livshin <michael.livshin@scylladb.com> Closes #10556	2022-05-12 13:34:03 +03:00
Benny Halevy	e1d58d4422	database: add snapshot_on_all And move the logic from snapshot-ctl down to the replica::database layer. A following patch will move the flush phase from the replica::table::snapshot layer out to the caller. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	aa127a2dbb	snapshot-ctl: run_snapshot_modify_operation: reject views and secondary index using the schema Detecting a secondary index by checking for a dot in the table name is wrong as tables generated by Alternator may contain a dot in their name. Instead detect both materialized views and secondary indexes using the schema()->is_view() method. Fixes #10526 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:44:52 +03:00
Benny Halevy	1fbcdbd2e8	snapshot-ctl: refactor and coroutinize take_snapshot / take_column_family_snapshot There is no functional change in this patch. Only refactoring of the code. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:16:39 +03:00
Benny Halevy	5b4eb44795	database: add flush_on_all variants Use by api layer. Will be used in a later patch to flush on all shards before taking a snapshot. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 09:56:44 +03:00
Botond Dénes	fd27fbfe64	Merge "Add user types carrier helper" from Pavel Emelyanov " There's a cql_type_parser::parse() method that needs to get user types for a keyspace by its name. For this it uses the global storage proxy instance as a place to get database from. This set introduces an abstract user_types_storage helper object that's responsible in providing the user types for the caller. This helper, in turn, is provided to the parse() method by the database itself or by the schema_ctxt object that needs parse() to unfreeze schemas and doesn't have database at those times. This removes one more get_storage_proxy() call. " * 'br-user-types-storage' of https://github.com/xemul/scylla: cql_type_parser: Require user_types_storage& in parse() schame_tables: Add db/ctxt args here and there user_types: Carry storage on database and schema_ctxt data_dictionary: Introduce user types storage	2022-05-09 17:38:52 +03:00
Pavel Emelyanov	0aea43a245	gossiper: Make state and locks maps private Locks are not needed outside gossiper, state map is sometimes read from, but there is a const getter for such cases. Both methods now deserve the underbar prefix, but it doesn't come with this short patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-06 10:34:48 +03:00
Piotr Sarna	eeec502aee	Merge 'gms: feature_service: reduce boilerplate to add a cluster feature' from Avi Kivity Currently, adding a cluster feature requires editing several files and repeating the new feature name several times. This series reduces the boilerplate to a single line (for non-experimental features), and perhaps three for experimental features. Closes #10488 * github.com:scylladb/scylla: gms: feature_service: remove variable/helper function duplication gms: feature: make `operator bool` implicit gms: feature_service: remove feature variable duplication in enable() gms: feature_service: remove feature variable declaration/definition duplication gms: features: de-quadruplicate active feature names gms: features: de-quadruplicate deprecated feature names gms: feature_service: avoid duplicating feature names when listing known features	2022-05-05 12:43:15 +02:00
Pavel Emelyanov	0f698910e8	cql_type_parser: Require user_types_storage& in parse() Right now to get user types the method in question gets global proxy instance to get database from it and then peek a keyspace, its metadata and, finally, the user types. There's also a safety check for proxy not being initialized, which happens in tests. Instead of messing with the proxy, the parse() method now accepts the user_types_storage reference from which it gets the types. All the callers already have the needed storage at hand -- in most of the cases it's one shared between the database and schema_ctxt. In case of tests it's a dummy storage, in case of schema-loader it's its local one. The get_column_mapping() is special -- it doesn't expect any user-types to be parsed and passes "" keyspace into it, nor does it have db/ctxt to get types storage from, so it can safely use the dummy one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-05 13:11:18 +03:00
Pavel Emelyanov	44f38d4de2	schame_tables: Add db/ctxt args here and there This is to have them in places that call cql_type_parser::parse. Pure churn reduction for the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-05 13:11:18 +03:00
Pavel Emelyanov	2104d90dd0	user_types: Carry storage on database and schema_ctxt The user types storage is needed in cql_type_parser::parse which is in turn called with either replica::database or schema_ctxt at hand. To facilitate the former case replica::database has its own user types storage created in database constructor. The latter case is a bit trickier. In many cases the ctxt is created as a temporary object and the database is available at those places. Also the ctxt object lives on the schema_registry instance which doesn't have database nearby. However, that ctxt lifetime is the same as the registry instance one and when it's created there's a database at hand (it's the database constructor that calls schema_registry.init() passing "this" into it). Thus, the solution is to make database's user types storage be a shared pointer that's shared between database itself and all the ctxts out there including the one that lives on schema_registry instance. When database goes away it .deactivate()s its user types storage so that any ctxts that may share it stay on the safe side and don't use database after free. This part will go away when the schema_registry is deglobalized. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-05 13:06:04 +03:00
Avi Kivity	19ab3edd77	gms: feature_service: remove variable/helper function duplication Each feature has a private variable and a public accessor. Since the accessor effectively makes the variable public, avoid the intermediary and make the variable public directly. To ease mechanical translation, the variable name is chosen as the function name (without the cluster_supports_ prefix). References throughout the codebase are adjusted.	2022-05-04 18:59:56 +03:00
Michał Radwański	29e09a3292	db/config: command line arguments logger_stdout_timestamps and logger_ostream_type are no longer ignored Closes #10452	2022-05-04 14:40:52 +03:00
Pavel Emelyanov	063d26bc9e	system_keyspace/config: Swallow string->value cast exception When updating an updateable value via CQL the new value comes as a string that's then boost::lexical_cast-ed to the desired value. If the cast throws the respective exception is printed in logs which is very likely uncalled for. fixes: #10394 tests: manual Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220503142942.8145-1-xemul@scylladb.com>	2022-05-04 08:35:12 +03:00
Pavel Emelyanov	11c99fc41b	table: Don't use global gossiper The table::get_hit_rate needs gossiper to get hitrates state from. There's no way to carry gossiper reference on the table itself, so it's up to the callers of that method to provide it. Fortunately, there's only one caller -- the proxy -- but the call chain to carry the reference is not very short ... oh, well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:33:08 +03:00
Eliran Sinvani	a16b4e407d	internal queries: add caching to some queries Some of the internal queries didn't have caching enabled even though there are chances of the query executing in large bursts or relatively often. An example of the former is `default_authorized::authorize` and of the latter is `system_distributed_keyspace::get_service_levels`. Fixes #10335 Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2022-05-01 13:30:02 +03:00
Eliran Sinvani	e0c7178e75	query_processor: remove default internal query caching behavior When executing internal queries, it is important that the developer decides whether to cache the query internally or not, since internal queries are cached indefinitely. It is also important that the programmer is aware of whether caching is going to happen or not. The code contained two "groups" of `query_processor::execute_internal`, one group has caching by default and the other doesn't. Here we add overloads to eliminate default values for caching behaviour, forcing an explicit parameter for the caching values. All the call sites were changed to reflect the original caching default that was there. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2022-05-01 08:33:55 +03:00
Eliran Sinvani	38b7ebf526	query_processor: make execute_internal caching parameter more verbose `execute_internal` has a parameter to indicate if caching a prepared statement is needed for a specific call. However this parameter was a boolean so it was easy to miss its meaning in the various call sites. This replaces the parameter type with a more verbose one so it is clear from the call site what decision was made.	2022-05-01 08:33:55 +03:00
Avi Kivity	de0ee13f45	schema_tables: forward-declare user_function and user_aggerates These bring in wasm.hh (though they really shouldn't) and make everyone suffer. Forward declare instead and add missing includes where needed. Closes #10444	2022-04-28 07:22:02 +03:00
Benny Halevy	e88871f4ec	replica: database: move shard_of implementation to mutation layer We don't need the database to determine the shard of the mutation, only its schema. So move the implementation to the respective definitions of mutation and frozen_mutation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10430	2022-04-27 14:40:24 +03:00
Botond Dénes	3051fc3cbc	Merge 'Fix some errors and issues found by gcc 12' from Avi Kivity gcc 12 checks some things that clang doesn't, resulting in compile errors. This series fixes some of these issues, but still builds (and tests) with clang. Unfortunately, we still don't have a clean gcc build due to an outstanding bug [1]. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056 Closes #10386 * github.com:scylladb/scylla: build: disable warnings that cause false-positive errors with gcc 12 utils: result_loop: remove invalid and incorrect constraint service: forward_service: avoid using deprecated std::bind1st and std::not1 repair: explicityl ignore tombstone gc update response treewide: abort() after switch in formatters db: view: explicitly ignore unused result compaction: leveled_compaction_strategy: avoid compares between signed and unsigned compaction_manager: compaction_reenabler: disambiguate compaction_state api: avoid function specialization in req_param alternator: ttl: avoid specializing class templates in non-namespace scope alternator: executor: fix signed/unsigned comparison in is_big()	2022-04-19 10:25:38 +03:00
Avi Kivity	a1df583dea	db: view: explicitly ignore unused result Otherwise, gcc complains.	2022-04-18 12:27:18 +03:00
Piotr Sarna	fea18943cd	schema_tables: drop leftover change to system_schema.keyspaces Series `59d56a3fd7` introduced an accidental backward incompatible regression by adding a column to system_schema.keyspaces and then not even using it for anything. It's a leftover from the original hackathon implementation and should never reach master in the first place. Fortunately, the series isn't part of any stable release yet. Fixes #10376 Tests: manual, verifying that the system_schema.keyspaces table no longer contains the extraneous column. Closes #10377	2022-04-18 12:00:43 +03:00
Kamil Braun	41f5b7e69e	Merge branch 'raft_group0_early_startup_v3' of https://github.com/ManManson/scylla into next * 'raft_group0_early_startup_v3' of https://github.com/ManManson/scylla: main: allow joining raft group0 before waiting for gossiper to settle service: raft_group0: make `join_group0` re-entrant service: storage_service: add `join_group0` method raft_group_registry: update gossiper state only on shard 0 raft: don't update gossiper state if raft is enabled early or not enabled at all gms: feature_service: add `cluster_uses_raft_mgmt` accessor method db: system_keyspace: add `bootstrap_needed()` method db: system_keyspace: mark getter methods for bootstrap state as "const"	2022-04-14 16:42:20 +02:00
Avi Kivity	8aec146dec	Merge "Remove qctx from repair" from Pavel E " Repair code keeps its history in system keyspace and uses the qctx global thing to update and query it. This set replaces the qctx with the explicit reference on the system_keyspace object. tests: unit(dev), dtest.repair_test(dev) " * 'br-repair-vs-qctx' of https://github.com/xemul/scylla: repair, system_keyspace: Query repair_history with a helper repair: Update loader code to use system_keyspace entry repair, system_keyspace: Update repair_history with a helper repair: Keep system keyspace reference	2022-04-12 17:08:41 +03:00
Avi Kivity	546ee814dd	Merge 'schema_tables, sstables: return instead of throwing' from Piotr Sarna This miniseries rewrites a few unnecessary throws into forwarding the exception directly. It's partially possible thanks to the new `co_await coroutine::return_exception` mechanism which allows returning from a coroutine early, without explicitly calling co_return (`d5843f6e88`). Closes #10360 * github.com:scylladb/scylla: sstables: : remove unnecessary throws schema_tables: remove unnecessary throws	2022-04-12 15:18:14 +03:00
Piotr Sarna	91f130bd9c	schema_tables: remove unnecessary throws Throws are translated to passing the exception directly.	2022-04-12 13:09:27 +02:00
Pavel Emelyanov	05eb9c9416	repair, system_keyspace: Query repair_history with a helper Querying the table is now done with the help of qctx directly. This patch replaces it with a querying helper that calls the consumer function with the entry struct as the argument. After this change repair code can stop including query_context and mess with untyped_result_set. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-12 14:04:21 +03:00
Pavel Emelyanov	9940016e05	repair, system_keyspace: Update repair_history with a helper Current code works directly on the qctx which is not nice. Instead, make it use the system keyspace reference. To make it work, the patch adds a helper method and introduces a helper struct for the table entry. This struct will also be used to query the table (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-12 13:57:57 +03:00


4972 Commits