Not used yet, this patch does all the churn of propagating a permit
to each impl.
In the next patch we will use it to track the memory
consumption of `_buffer`.
The storage service computes gossiper states before it starts the
gossiper. Among them, the node's schema version. There are two problems with that.
The first is that computing the schema version and publishing it is not
atomic, so it is not safe against concurrent schema changes or schema
version recalculations. It does not mutually exclude with
recalculate_schema_version() calls, and we could end up with the old
(and incorrect) schema version being advertised in gossip.
The second problem is that we should not allow the database layer to call
into the gossiper layer before it is fully initialized, as this may
produce undefined behavior.
The solution for both problems is to break the cyclic dependency
between the database layer and the storage_service layer by having the
database layer not use the gossiper at all. The database layer
publishes the schema version inside the database class and allows
installing listeners on changes. The storage_service layer asks the
database layer for the current version when it initializes, and only
after that installs a listener which will update the gossiper.
This also allows us to drop unsafe functions like update_schema_version().
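The listener scheme described above can be sketched roughly as follows. This is a minimal, synchronous stand-in; the class and method names are illustrative, not the real Scylla API:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: the database layer owns the schema version and
// notifies subscribers on every change; the storage_service reads the
// current version first, then subscribes, so no update is lost and the
// gossiper is never called before it is ready.
class database {
public:
    using version_listener = std::function<void(const std::string&)>;

    const std::string& schema_version() const { return _version; }

    void observe_schema_version(version_listener l) {
        _listeners.push_back(std::move(l));
    }

    void set_schema_version(std::string v) {
        _version = std::move(v);
        // Publish the new version to every listener while still inside
        // the update, so publication cannot race with the change.
        for (auto& l : _listeners) {
            l(_version);
        }
    }

private:
    std::string _version;
    std::vector<version_listener> _listeners;
};
```

A storage_service-style consumer would read `schema_version()` at initialization and only then register a listener that forwards changes to the gossiper.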
"
The view_info object, which is attached to the schema object of the
view, contains a data structure called
"base_non_pk_columns_in_view_pk". This data structure contains column
ids of the base table so is valid only for a particular version of the
base table schema. This data structure is used by materialized view
code to interpret mutations of the base table, those coming from base
table writes, or reads of the base table done as part of view updates
or view building.
The base table schema version of that data structure must match the
schema version of the mutation fragments, otherwise we hit undefined
behavior. This may include aborts, exceptions, segfaults, or data
corruption (e.g. writes landing in the wrong column in the view).
Before this patch, we could get a schema version mismatch here after the
base table was altered. That's because the view schema did not change
when the base table was altered.
Another problem was that view building was using the current table's schema
to interpret the fragments and invoke view building. That's incorrect for two
reasons. First, fragments generated by a reader must be accessed only using
the reader's schema. Second, base_non_pk_columns_in_view_pk of the recorded
view ptrs may no longer match the current base table schema, which is used
to generate the view updates.
Part of the fix is to extract base_non_pk_columns_in_view_pk into a
third entity called base_dependent_view_info, which changes both on
base table schema changes and view schema changes.
It is managed by a shared pointer so that we can take immutable
snapshots of it, just like with schema_ptr. When starting the view
update, the base table schema_ptr and the corresponding
base_dependent_view_info have to match. So we must obtain them
atomically, and base_dependent_view_info cannot change during update.
Also, whenever the base table schema changes, we must update
base_dependent_view_infos of all attached views (atomically) so that
it matches the base table schema.
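The snapshot discipline can be sketched as below. This is heavily simplified: a bare version number stands in for the real column-id mapping, and the names are illustrative:

```cpp
#include <cassert>
#include <memory>

// Hypothetical sketch: the base-dependent info is held behind a
// shared_ptr and replaced wholesale on base schema change, so a view
// update that grabbed the pointer keeps a consistent, immutable
// snapshot for its whole duration.
struct base_dependent_view_info {
    int base_schema_version;  // stands in for the base column-id mapping
};

class view {
    std::shared_ptr<const base_dependent_view_info> _base_info;

public:
    explicit view(int base_version)
        : _base_info(std::make_shared<base_dependent_view_info>(
              base_dependent_view_info{base_version})) {}

    // Called whenever the base table schema changes: swap in a fresh
    // snapshot matching the new base schema.
    void on_base_schema_change(int new_base_version) {
        _base_info = std::make_shared<base_dependent_view_info>(
            base_dependent_view_info{new_base_version});
    }

    // A view update obtains the snapshot once, atomically with the base
    // schema_ptr; later swaps do not affect an in-flight update.
    std::shared_ptr<const base_dependent_view_info> base_info() const {
        return _base_info;
    }
};
```

The key point is that nothing ever mutates a published `base_dependent_view_info`; consistency comes from replacing the pointer, exactly as with `schema_ptr`.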
Fixes #7061.
Tests:
- unit (dev)
- [v1] manual (reproduced using scylla binary and cqlsh)
"
* tag 'mv-schema-mismatch-fix-v2' of github.com:tgrabiec/scylla:
db: view: Refactor view_info::initialize_base_dependent_fields()
tests: mv: Test dropping columns from base table
db: view: Fix incorrect schema access during view building after base table schema changes
schema: Call on_internal_error() when out of range id is passed to column_at()
db: views: Fix undefined behavior on base table schema changes
db: views: Introduce has_base_non_pk_columns_in_view_pk()
"
There's one last call to the global storage service left in compaction code; it
comes from cleanup_compaction to get local token ranges for filtering.
The call in question is a pure wrapper over database, so this set just
makes use of the database where it's already available (perform_cleanup)
and adds it where it's needed (perform_sstable_upgrade).
tests: unit(dev), nodetool upgradesstables
"
* 'br-remove-ss-from-compaction-3' of https://github.com/xemul/scylla:
storage_service: Remove get_local_ranges helper
compaction: Use database from options to get local ranges
compaction: Keep database reference on upgrade options
compaction: Keep database reference on cleanup options
db: Factor out get_local_ranges helper
Storage service and repair code have identical helpers to get local
ranges for a keyspace. Move this helper's code onto database; later it
will be reused by one more place.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
No need to modify token_metadata from database code.
Also, get rid of mutable get_token_metadata variant.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The view building process was accessing mutation fragments using
current table's schema. This is not correct, fragments must be
accessed using the schema of the generating reader.
This could lead to undefined behavior when the column set of the base
table changes. out_of_range exceptions could be observed, or data in
the view ending up in the wrong column.
Refs #7061.
The fix has two parts. First, we always use the reader's schema to
access fragments generated by the reader.
Second, when calling populate_views() we upgrade the fragment-wrapping
reader's schema to the base table schema so that it matches the base
table schema of view_and_base snapshots passed to populate_views().
The view_info object, which is attached to the schema object of the
view, contains a data structure called
"base_non_pk_columns_in_view_pk". This data structure contains column
ids of the base table so is valid only for a particular version of the
base table schema. This data structure is used by materialized view
code to interpret mutations of the base table, those coming from base
table writes, or reads of the base table done as part of view updates
or view building.
The base table schema version of that data structure must match the
schema version of the mutation fragments, otherwise we hit undefined
behavior. This may include aborts, exceptions, segfaults, or data
corruption (e.g. writes landing in the wrong column in the view).
Before this patch, we could get a schema version mismatch here after the
base table was altered. That's because the view schema does not change
when the base table is altered.
Part of the fix is to extract base_non_pk_columns_in_view_pk into a
third entity called base_dependent_view_info, which changes both on
base table schema changes and view schema changes.
It is managed by a shared pointer so that we can take immutable
snapshots of it, just like with schema_ptr. When starting the view
update, the base table schema_ptr and the corresponding
base_dependent_view_info have to match. So we must obtain them
atomically, and base_dependent_view_info cannot change during update.
Also, whenever the base table schema changes, we must update
base_dependent_view_infos of all attached views (atomically) so that
it matches the base table schema.
Refs #7061.
After 8014c7124, cleanup can potentially pick a compacting SSTable.
Upgrade and scrub can also pick a compacting SSTable.
The problem is that table::candidates_for_compaction() was badly named.
It misleads the user into thinking that the SSTables returned are perfect
candidates for compaction, but the manager still needs to filter out the
compacting SSTables from the returned set. So it's being renamed.
When the same SSTable is compacted in parallel, strategy invariants
can be broken, e.g. overlap being introduced in LCS, and
deletions can fail as more than one compaction process tries
to delete the same files.
Let's fix scrub, cleanup and upgrade by calling the manager function
which gets the correct candidates for compaction.
Fixes #6938.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200811200135.25421-1-raphaelsc@scylladb.com>
Merged pull request https://github.com/scylladb/scylla/pull/7018
by Piotr Sarna:
This series addresses various issues with metrics and semaphores - it mainly adds missing metrics, which makes it possible to see the length of the queues attached to the semaphores. In the case of view building and view update generation, metrics were not present in these services at all, so a first, basic implementation is added.
More precise semaphore metrics would ease the testing and development of load shedding and admission control.
view_builder: add metrics
db, view: add view update generator metrics
hints: track resource_manager sending queue length
hints: add drain queue length to metrics
table: add metrics for sstable deletion semaphore
database: remove unused semaphore
This is required for test applications that may select a sstable
format different than the default mc format, like perf_fast_forward.
These apps don't use the gossip-based sstables_format_selector
to set the format based on the cluster feature and so they
need to rely on the db config.
Call set_format_by_config in single_node_cql_env::do_with.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is important that all replicas participating in a read use the same
memory limits to avoid artificial differences due to different amount of
results. The coordinator now passes down its own memory limit for reads,
in the form of max_result_size (or max_size). For unpaged or reverse
queries this has to be used now instead of the locally set
max_memory_unlimited_query configuration item.
To avoid the replicas accidentally using the local limit contained in
the `query_class_config` returned from
`database::make_query_class_config()`, we refactor the latter into
`database::get_reader_concurrency_semaphore()`. Most of its callers were
only interested in the semaphore anyway and those that were
interested in the limit as well should get it from the coordinator
instead, so this refactoring is a win-win.
Use the recently added `max_result_size` field of `query::read_command`
to pass the max result size around, including passing it to remote
nodes. This means that the max result size will be sent along each read,
instead of once per connection.
As we want to select the appropriate `max_result_size` based on the type
of the query as well as based on the query class (user or internal) the
previous method won't do anymore. If the remote doesn't fill this
field, the old per-connection value is used.
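The fallback behavior can be sketched like this. The field and helper names are hypothetical simplifications; the real `query::read_command` carries much more:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical sketch: the coordinator fills max_result_size in each
// read command; a replica that receives a command without it (e.g. from
// an older node during a rolling upgrade) falls back to the old
// per-connection value.
struct read_command {
    std::optional<uint64_t> max_result_size;
};

uint64_t effective_max_result_size(const read_command& cmd,
                                   uint64_t per_connection_default) {
    return cmd.max_result_size.value_or(per_connection_default);
}
```

Sending the limit per read rather than per connection lets it vary with the type and class (user or internal) of each query.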
A variant of `make_range_sstable_reader()` that wraps the reader in a
restricting reader, hence making it wait for admission on the read
concurrency semaphore, before starting to actually read.
This patch changes the per table latencies histograms: read, write,
cas_prepare, cas_accept, and cas_learn.
Besides changing the definition type and the insertion method, the API
was changed to support the new metrics.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
get_shards_for_this_sstable() can be called inside table::add_sstable()
because the shards for a sstable are precomputed, and so the call is
completely exception safe. We want a central point for checking that table will
no longer add shared SSTables to its sstable set.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
We no longer need to conditionally track the SSTable metadata,
as table will no longer accept shared SSTables.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Now that table will no longer accept shared SSTables, it no longer
needs to keep track of them.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Now that the snapshot stopping is correctly handled, we may pull the database
reference all the way down to schema::describe().
One tricky place is in table::snapshot() -- the local db reference is pulled
through an smp::submit_to call, but thanks to the shard checks in the place
where it is needed, the db is still "local".
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Before Scylla 3.0, we used to send streaming mutations using
individual RPC requests and flush them together using dedicated
streaming memtables. This mechanism is no longer in use and all
versions that use it have long reached end-of-life.
Remove this code.
Uploading of SSTables is problematic: for historical reasons it takes a
lock that may have to wait for ongoing compactions to finish, then it
disables writes in the table, and then it goes loading SSTables as if it
knew nothing about them.
With the sstable_directory infrastructure we can do much better:
* we can reshard and reshape the SSTables in place, keeping the number
of SSTables in check. Because this is a background process we can be
fairly aggressive and set the reshape mode to strict.
* we can then move the SSTables directly into the main directory.
Because we know they are few in number we can call the more elegant
add_sstable_and_invalidate_cache instead of the open coding currently
done by load_new_sstables.
* we know they are not shared (if they were, we resharded them),
simplifying the load process even further.
The major change after this patch is applied is that all compactions
(resharding and reshape) needed to make the SSTables in-strategy are
done in the streaming class, which reduces the impact of this operation
on the node. When the SSTables are loaded, subsequent reads will not
suffer as we will not be adding shared SSTables in potential high
numbers, nor will we reshard in the compaction class.
There is also no more need for a lock in the upload process so in the
fast path where users are uploading a set of SSTables from a backup this
should essentially be instantaneous. The lock, as well as the code to
disable and enable table writes is removed.
A future improvement is to bypass the staging directory too, in which
case the reshaping compaction would already generate the view updates.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Given that the key is a std::pair, we have to explicitly mark the hash
and eq types as transparent for heterogeneous lookup to work.
With that, pass std::string_view to a few functions that just check if
a value is in the map.
This increases the .text section by 11 KiB (0.03%).
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
This changes the hash map used for _keyspaces. Using a flat_hash_map
allows using std::string_view in has_keyspace thanks to the
heterogeneous lookup support.
This adds 200 KiB to .text, since this is the first use of absl and
brings in files from the .a.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
When we are scanning an sstable directory, we want to filter out the
manifest file in most situations. The table class has a filter for that,
but it is a static filter that doesn't depend on table for anything. We
are better off removing it and putting it in another, independent location.
While it seems wasteful to use a new header just for that, this header
will soon be populated with the sstable_directory class.
Tests: unit (dev)
Signed-off-by: Glauber Costa <glauber@scylladb.com>
After 7f1a215, a sstable is only added to backlog tracker if
sstable::shared() returns true.
sstable::shared() can return true for a sstable that is actually owned
by more than one shard, but it can also incorrectly return true for
a sstable which wasn't made explicitly unshared through set_unshared().
A recent work of mine is getting rid of set_unshared() because a
sstable has the knowledge to determine whether or not it's shared.
The problem starts with a streaming sstable which hasn't had set_unshared()
called for it, so it won't be added to the backlog tracker, but it can
be eventually removed from the tracker when that sstable is compacted.
Also, it could happen that a shared sstable, which was resharded, will
be removed from the tracker even though it wasn't previously added.
When those problems happen, backlog tracker will have an incorrect
account of total bytes, which leads it to producing incorrect
backlogs that can potentially go negative.
These problems are fixed by making every add / removal go through
functions which take into account sstable::shared().
Fixes #6227.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200512220226.134481-2-raphaelsc@scylladb.com>
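The symmetry argument can be sketched as follows. This is heavily simplified; `eligible()` is a hypothetical stand-in for the real rule built around sstable::shared():

```cpp
#include <cassert>
#include <cstdint>

struct sstable {
    uint64_t bytes;
    bool shared;  // owned by more than one shard
};

// One centralized eligibility rule, consulted by both add and remove, so
// a removal can never be unmatched by an addition.
static bool eligible(const sstable& s) {
    return !s.shared;  // simplified stand-in for the real shared() logic
}

class backlog_tracker {
    int64_t _total_bytes = 0;

public:
    void maybe_add(const sstable& s) {
        if (eligible(s)) {
            _total_bytes += s.bytes;
        }
    }
    void maybe_remove(const sstable& s) {
        if (eligible(s)) {
            _total_bytes -= s.bytes;
        }
    }
    int64_t total_bytes() const { return _total_bytes; }
};
```

Because add and remove share one predicate, the total can no longer drift negative from removing an sstable that was never accounted for.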
"
Currently we classify queries as "system" or "user" based on the table
they target. The class of a query determines how the query is treated,
currently: timeout, limits for reverse queries and the concurrency
semaphore. The catch is that users are also allowed to query system
tables and when doing so they will bypass the limits intended for user
queries. This has caused performance problems in the past, yet the
reason we decided to finally address this is that we want to introduce a
memory limit for unpaged queries. Internal (system) queries are all
unpaged and we don't want to impose the same limit on them.
This series uses scheduling groups to distinguish user and system
workloads, based on the assumption that user workloads will run in the
statement scheduling group, while system workloads will run in the main
(or default) scheduling group, or perhaps something else, but in any
case not in the statement one. Currently the scheduling group of reads
and writes is lost when going through the messaging service, so to be
able to use scheduling groups to distinguish user and system reads this
series refactors the messaging service to retain this distinction across
verb calls. Furthermore, we execute some system reads/writes as part of
user reads/writes, such as auth and schema sync. These processes are
tagged to run in the main group.
This series also centralises query classification on the replica and
moves it to a higher level. More specifically, queries are now
classified -- the scheduling group they run in is translated to the
appropriate query class specific configuration -- on the database level
and the configuration is propagated down to the lower layers.
Currently this query class specific configuration consists of the reader
concurrency semaphore and the max memory limit for otherwise unlimited
queries. A corollary of the semaphore being selected on the database
level is that the read permit is now created before the read starts. A
valid permit is now available during all stages of the read, enabling
tracking the memory consumption of e.g. the memtable and cache readers.
This change aligns nicely with the needs of more accurate reader memory
tracking, which also wants a valid permit that is available in every layer.
The series can be divided roughly into the following distinct patch
groups:
* 01-02: Give system read concurrency a boost during startup.
* 03-06: Introduce user/system statement isolation to messaging service.
* 07-13: Various infrastructure changes to prepare for using read
permits in all stages of reads.
* 14-19: Propagate the semaphore and the permit from database to the
various table methods that currently create the permit.
* 20-23: Migrate away from using the reader concurrency semaphore for
waiting for admission, use the permit instead.
* 24: Introduce `database::make_query_config()` and switch the database
methods needing such a config to use it.
* 25-31: Get rid of all uses of `no_reader_permit()`.
* 32-33: Ban empty permits for good.
* 34: querier_cache: use the queriers' permits to obtain the semaphore.
Fixes: #5919
Tests: unit(dev, release, debug),
dtest(bootstrap_test.py:TestBootstrap.start_stop_test_node), manual
testing with a 2 node mixed cluster with extra logging.
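The permit idea at the heart of the series can be sketched like this. It is a synchronous stand-in for the real Seastar-based classes; the names mirror the real ones but the implementation is illustrative:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch: the permit is created before the read starts,
// travels with the read through every layer (memtable, cache, sstable),
// and accounts all memory against the semaphore it was issued from.
class reader_concurrency_semaphore {
    size_t _available_memory;

public:
    explicit reader_concurrency_semaphore(size_t memory)
        : _available_memory(memory) {}
    size_t available_memory() const { return _available_memory; }
    void consume(size_t n) { _available_memory -= n; }
    void release(size_t n) { _available_memory += n; }
};

// RAII permit: remembers what this read consumed and gives it all back
// when the read finishes, whichever layer did the consuming.
class reader_permit {
    reader_concurrency_semaphore& _sem;
    size_t _consumed = 0;

public:
    explicit reader_permit(reader_concurrency_semaphore& sem) : _sem(sem) {}
    reader_permit(const reader_permit&) = delete;
    reader_permit& operator=(const reader_permit&) = delete;
    ~reader_permit() { _sem.release(_consumed); }

    void consume_memory(size_t n) {
        _sem.consume(n);
        _consumed += n;
    }
};
```

Creating the permit at the database level, before the read starts, is what makes it available to every lower layer for accurate memory tracking.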
"
* 'query-class/v6' of https://github.com/denesb/scylla: (34 commits)
querier_cache: get semaphore from querier
reader_permit: forbid empty permits
reader_permit: fix reader_resources::operator bool
treewide: remove all uses of no_reader_permit()
database: make_multishard_streaming_reader: pass valid permit to multi range reader
sstables: pass valid permits to all internal reads
compaction: pass a valid permit to sstable reads
database: add compaction read concurrency semaphore
view: use valid permits for reads from the base table
database: use valid permit for counter read-before-write
database: introduce make_query_class_config()
reader_concurrency_semaphore: remove wait_admission and consume_resources()
test: move away from reader_concurrency_semaphore::wait_admission()
reader_permit: resource_units: introduce add()
mutation_reader: restricted_reader: work in terms of reader_permit
row_cache: pass a valid permit to underlying read
memtable: pass a valid permit to the delegate reader
table: require a valid permit to be passed to most read methods
multishard_mutation_query: pass a valid permit to shard mutation sources
querier: add reader_permit parameter and forward it to the mutation_source
...
Until now, view updates were generated with a bunch of random
time points, because the interface was not adjusted for passing
a single time point. The time points were used to determine
whether cells were alive (e.g. because of TTL), so it's better
to unify the process:
1. when generating view updates from user writes, a single time point
is used for the whole operation
2. when generating view updates via the view building process,
a single time point is used for each build step
NOTE: I don't see any reliable and deterministic way of writing
test scenarios which trigger problems with the old code.
After #6488 is resolved and error injection is integrated
into view.cc, tests can be added.
Fixes #6429
Tests: unit(dev)
Message-Id: <f864e965eb2e27ffc13d50359ad1e228894f7121.1590070130.git.sarna@scylladb.com>
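The single-time-point rule can be sketched as below. The types are hypothetical; the real liveness logic also handles tombstones and per-cell TTL metadata:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical sketch: every cell in the same view-update operation is
// checked for liveness against one "now" captured at the start, instead
// of a fresh clock read per cell, so the operation sees a consistent
// notion of which cells are alive.
using time_point = std::chrono::system_clock::time_point;

struct cell {
    time_point expiry;  // stand-in for TTL expiry metadata
};

bool is_alive(const cell& c, time_point now) {
    return c.expiry > now;
}
```

With per-cell clock reads, two cells expiring between the reads could be classified inconsistently within one update; a single captured `now` removes that window.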
All reads will soon require a valid permit, including those done during
compaction. To allow creating valid permits for these reads, create a
compaction specific semaphore. This semaphore is unlimited as compaction
concurrency is managed by a higher-level layer; we use it just for resource
usage accounting.
View update generation involves reading existing values from the base
table, which will soon require a valid permit to be passed to it, so
make sure we create and pass a valid permit to these reads.
We use `database::make_query_class_config()` to obtain the semaphore for
the read, which selects the appropriate user/system semaphore based on
the scheduling group the base table write is running in.
And use it to obtain any query-class specific configuration that was
obtained from `table::config` before, such as the read concurrency
semaphore and the max memory limit for unlimited queries. As all users
of these items get these from the query class config now, we can remove
them from `table::config`.
Now that the most prevalent users (range scan and single partition
reads) all pass valid permits, we require all users to do so and
propagate the permit down towards `make_sstable_reader()`. The plan is
to use this permit for restricting the sstable readers, instead of the
semaphore the table is configured with. The various
`make_streaming_*reader()` overloads keep using the internal semaphores
as before, but they also create the permit before the read starts and pass it to
`make_sstable_reader()`.
We want to move away from the current practice of selecting the relevant
read concurrency semaphore inside `table` and instead want to pass it
down from `database` so that we can pass down a semaphore that is
appropriate for the class of the query. Use the recently created
`query_class_config` struct for this. This is added as a parameter to
`data_query`, `mutation_query` and propagated down to the point where we
create the `querier` to execute the read. We are already propagating
a parameter down the same route -- max_memory_reverse_query --
which also happens to be part of `query_class_config`, so simply replace
this parameter with a `query_class_config` one. As the lower layers are
not prepared for a semaphore passed from above, make sure this semaphore
is the same that is selected inside `table`. After the lower layers are
prepared for a semaphore arriving from above, we will switch it to be
the appropriate one for the class of the query.
In the next patches we will match reads to the appropriate reader
concurrency semaphore based on the scheduling group they run in. This
will result in a lot of system reads that are executed during startup,
and that were up to now (incorrectly) using the user read semaphore,
switching to the system read semaphore. The latter has a much more
constrained concurrency, which was observed to cause system reads to
saturate and block on the semaphore, slowing down startup.
To solve this, boost the concurrency of the system read semaphore during
startup to match that of the user semaphore. This is ok, as during
startup there are no user reads to compete with. After startup, before
we start serving user reads, the concurrency is reverted to the
normal value.
The format is currently sitting in storage_service, but the
previous set patched all the users not to call it; instead
they use sstables_manager to get the highest supported format.
So this set finalizes this effort and places the format on
sstables_manager(s).
The set introduces the db::sstables_format_selector, that
- starts with the lowest format (ka)
- reads one on start from system tables
- subscribes to sstables-related features and bumps
up the selection if the respective feature is enabled
During its lifetime the selector holds a reference to the
sharded<database> and updates the format on it, the database,
in turn, propagates it further to sstables_managers. The
managers start with the highest known format (mc) which is
done for tests.
* https://github.com/xemul/scylla br-move-sstables-format-4:
storage_service: Get rid of one-line helpers
system_keyspace: Cleanup setup() from storage_service
format_selector: Log which format is being selected
sstables_manager: Keep format on
format_selector: Make it standalone
format_selector: Move the code into db/
format_selector: Select format locally
storage_service: Introduce format_selector
storage_service: Split feature_enabled_listener::on_enabled
storage_service: Tossing bits around
features: Introduce and use masked features
features: Get rid of per-features booleans
Make the database be the format_selector target, so
when the format is selected it's set on the database, which
in turn just forwards the selection to the sstables
managers. All users of the format are already patched
to read it from those managers.
The initial value for the format is the highest, which
is needed by tests. When scylla starts, the format is
updated by format_selector, first after reading from
system tables, then by selecting it from features.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In order to add tracing to places where it can be useful,
e.g. materialized view updates and hinted handoff, tracing state
is propagated to all applicable call sites.
The shutdown process of compaction manager starts with an explicit call
from the database object. However, that can only happen after everything is
already initialized. This works well today, but I am soon to change
the resharding process to operate before the node is fully ready.
One can still stop the database in this case, but reshardings will
have to finish before the abort signal is processed.
This patch passes the existing abort source to the construction of the
compaction_manager and subscribes to it. If the abort source is
triggered, the compaction manager will react to it firing and all
compactions it manages will be stopped.
We still want the database object to be able to wait for the compaction
manager, since the database is the object that owns the lifetime of
the compaction manager. To make that possible we'll use a future
that is returned from stop(): no matter what triggered the abort, either
an early abort during initial resharding or a database-level event like
drain, everything will shut down in the right order.
The abort source is passed to the database, which is responsible for
constructing the compaction manager.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
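The subscription mechanism can be sketched as below. It is a synchronous stand-in for seastar::abort_source with an illustrative subscribe() API; the real one returns a subscription handle:

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: the compaction manager registers a callback on a
// shared abort source at construction time, so triggering the abort
// stops all managed compactions even before the database gets a chance
// to call stop() itself.
class abort_source {
    std::vector<std::function<void()>> _subscribers;

public:
    void subscribe(std::function<void()> cb) {
        _subscribers.push_back(std::move(cb));
    }
    void request_abort() {
        for (auto& cb : _subscribers) {
            cb();
        }
    }
};

class compaction_manager {
    bool _stopped = false;

public:
    explicit compaction_manager(abort_source& as) {
        // React to the abort signal by stopping all managed compactions.
        as.subscribe([this] { stop_all(); });
    }
    void stop_all() { _stopped = true; }
    bool stopped() const { return _stopped; }
};
```

The database still owns the compaction manager's lifetime and can wait on its shutdown; the abort source only adds an earlier, independent trigger.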
"
We inherited from Origin a `caching` table parameter. It's a map of named caching parameters. Before this PR two caching parameters were expected: `keys` and `rows_per_partition`. So far we have been ignoring them. This PR adds a new caching parameter called `enabled` which can be set to `true` or `false` and controls the usage of the cache for the table. By default, it's set to `true` which reflects Scylla behavior before this PR.
This new capability is used to disable caching for CDC Log table. It is desirable because CDC Log entries are not expected to be read often. They also put much more pressure on memory than entries in Base Table. This is caused by the fact that some writes to Base Table can override previous writes. Every write to CDC Log is unique and does not invalidate any previous entry.
Fixes #6098
Fixes #6146
Tests: unit(dev, release), manual
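The semantics of the new parameter can be sketched as follows. The helper is hypothetical; the point is only that default-to-enabled preserves the pre-PR behavior:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: the caching map gains an "enabled" key; when the
// key is absent the cache stays enabled, matching Scylla behavior before
// this PR.
bool cache_enabled(const std::map<std::string, std::string>& caching) {
    auto it = caching.find("enabled");
    return it == caching.end() || it->second == "true";
}
```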
"
* haaawk-dont_cache_cdc:
cdc: Don't cache CDC Log table
table: invalidate disabled cache on memtable flush
table: Add cache_enabled member function
cf_prop_defs: persist caching_options in schema
property_definitions: add get that returns variant
feature: add PER_TABLE_CACHING feature
caching_options: add enabled parameter