scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 04:06:59 +00:00

Author	SHA1	Message	Date
Benny Halevy	ca78a63873	database: truncate: get rid of the unused ks param Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-07 12:53:05 +03:00
Benny Halevy	46e2a7c83b	database: add truncate_table_on_all_shards As a first step to decouple truncate from flush and snapshot. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-07 12:53:05 +03:00
Benny Halevy	5e8c05f1a8	database: drop_table_on_all_shards: do not accept a truncated_at timestamp_func Since in the drop_table case we want to discard ALL sstables in the table, not only those with `max_data_age()` up until drop started. Fixes #11232 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-07 12:52:51 +03:00
Benny Halevy	574909c78f	database: truncate: get optional snapshot_name from caller Before we change drop_table_on_all_shards to always pass db_clock::time_point::max() in the next patch, let it pass a unique snapshot name, otherwise the snapshot name will always be based on the constant, max time_point. Refs #11232 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-07 12:03:19 +03:00
Pavel Emelyanov	527b345079	Merge 'storage_proxy: introduce a `remote` "subservice"' from Kamil Braun Introduce a `remote` class that handles all remote communication in `storage_proxy`: sending and receiving RPCs, checking the state of other nodes by accessing the gossiper, and fetching schema. The `remote` object lives inside `storage_proxy` and right now it's initialized and destroyed together with `storage_proxy`. The long game here is to split the initialization of `storage_proxy` into two steps: - the first step, which constructs `storage_proxy`, initializes it "locally" and does not require references to `messaging_service` and `gossiper`. - the second step will take those references and add the `remote` part to `storage_proxy`. This will allow us to remove some cycles from the service (de)initialization order and in general clean it up a bit. We'll be able to start `storage_proxy` right after the `database` (without messaging/gossiper). Similar refactors are planned for `query_processor`. 
Closes #11088 * github.com:scylladb/scylladb: service: storage_proxy: pass `migration_manager*` to `init_messaging_service` service: storage_proxy: `remote`: make `_gossiper` a const reference gms: gossiper: mark some member functions const db: consistency_level: `filter_for_query`: take `const gossiper&` replica: table: `get_hit_rate`: take `const gossiper&` gms: gossiper: move `endpoint_filter` to `storage_proxy` module service: storage_proxy: pass `shared_ptr<gossiper>` to `start_hints_manager` service: storage_proxy: establish private section in `remote` service: storage_proxy: remove `migration_manager` pointer service: storage_proxy: remove calls to `storage_proxy::remote()` from `remote` service: storage_proxy: remove `_gossiper` field alternator: ttl: pass `gossiper&` to `expiration_service` service: storage_proxy: move `truncate_blocking` implementation to `remote` service: storage_proxy: introduce `is_alive` helper service: storage_proxy: remove `_messaging` reference service: storage_proxy: move `connection_dropped` to `remote` service: storage_proxy: make `encode_replica_exception_for_rpc` a static function service: storage_proxy: move `handle_write` to `remote` service: storage_proxy: move `handle_paxos_prune` to `remote` service: storage_proxy: move `handle_paxos_accept` to `remote` service: storage_proxy: move `handle_paxos_prepare` to `remote` service: storage_proxy: move `handle_truncate` to `remote` service: storage_proxy: move `handle_read_digest` to `remote` service: storage_proxy: move `handle_read_mutation_data` to `remote` service: storage_proxy: move `handle_read_data` to `remote` service: storage_proxy: move `handle_mutation_failed` to `remote` service: storage_proxy: move `handle_mutation_done` to `remote` service: storage_proxy: move `handle_paxos_learn` to `remote` service: storage_proxy: move `receive_mutation_handler` to `remote` service: storage_proxy: move `handle_counter_mutation` to `remote` service: storage_proxy: remove 
`get_local_shared_storage_proxy` service: storage_proxy: (de)register RPC handlers in `remote` service: storage_proxy: introduce `remote`	2022-08-04 17:50:20 +03:00
Kamil Braun	7b4146dd2a	replica: table: `get_hit_rate`: take `const gossiper&` It doesn't use any non-const members.	2022-08-04 12:16:09 +02:00
Benny Halevy	e4e92d44ae	main: start compaction_manager as a sharded service And pass a reference to it to the database rather than having the database construct its own compaction_manager. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 07:50:15 +03:00
Avi Kivity	2c0932cc41	Merge 'Reduce the amount of per-table metrics' from Amnon Heiman This series is the first step in the effort to reduce the number of metrics reported by Scylla. The series focuses on the per-table metrics. The combination of histograms, per-tables, and per shard makes the number of metrics in a cluster explode. The following series uses multiple tools to reduce the number of metrics. 1. Multiple metrics should only be reported for the user tables and the condition that checked it was not updated when more non-user keyspaces were added. 2. Second, instead of a histogram, per table, per shard, it will report a summary per table, per shard, and a single histogram per node. 3. Histograms, summaries, and counters will be reported only if they are used (for example, the cas-related metrics will not be reported for tables that are not using cas). Closes #11058 * github.com:scylladb/scylla: Add summary_test database: Reduce the number of per-table metrics replica/table.cc: Do not register per-table metrics for system histogram_metrics_helper.hh: Add to_metrics_summary function Unified histogram, estimated_histogram, rates, and summaries Split the timed_rate_moving_average into data and timer utils/histogram.hh: should_sample should use a bitmask estimated_histogram: add missing getter method	2022-07-27 22:01:08 +03:00
Amnon Heiman	99a060126d	database: Reduce the number of per-table metrics This patch reduces the number of metrics that is reported per table, when the per-table flag is on. When possible, it moves from time_estimated_histogram and timed_rate_moving_average_and_histogram to use the unified timer. Instead of a histogram per shard, it will now report a summary per shard and a histogram per node. Counters, histograms, and summaries will not be reported if they were never used. The API was updated accordingly so it would not break. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2022-07-27 16:58:52 +03:00
Benny Halevy	f0a597a252	table: try_flush_memtable_to_sstable: propagate errors to seal_active_memtable And let seal_active_memtable decide about how to handle them as now all flush error handling logic is implemented there. In particular, unlike today, sstable write errors will cause internal error rather than loop forever. Also, check for shutdown earlier to ignore errors like semaphore_broken that might happen when the table is stopped. Refs #10498 (The issue will be considered fixed when going into maintenance mode on write errors rather than throwing internal error and potentially retrying forever) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 14:04:55 +03:00
Benny Halevy	d55a2ac762	dirty_memory_manager: flush_when_needed: move error handling to flush_one/seal_active_memtable Currently flush is retried both by dirty_memory_manager::flush_when_needed and table::seal_active_memtable, which may be called by other paths like table::flush. Unify the retry logic into seal_active_memtable so that we have similar error handling semantics on all paths. Refs #4174 Refs #10498 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	d3acd80cf5	memtable_list: mark functions noexcept Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	863e9d9e6a	dirty_memory_manager: flush_when_needed: target error handling at flush_one Now that everything prior to flush_one is noexcept make table::seal_active_memtable and the paths that call it noexcept, making sure that any errors are returned only as exceptional futures, and handle them in flush_when_needed(). The original handle_exception had a broader scope than now needed, so this change is mostly technical, to show that we can narrow down the error handling to the continuation of flush_one - and verify that the unit test is not broken. A later patch moves this error handling logic away to seal_active_memtable. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Benny Halevy	73e50bc97d	database: delete unused seal_delayed_fn_type Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-27 13:43:17 +03:00
Avi Kivity	fbe8ea7727	logalloc, dirty_memory_manager: move region_group and associated code region_group is an abstraction that allows accounting for groups of regions, but the cost/benefit ratio of maintaining the abstraction is poor. Each time we need to change decision algorithm of memtable flushing (admittedly rarely), we need to distill that into an abstraction for region_groups and then use it. An example is virtual regions groups; we wanted to account for the partially flushed memtables and had to invent region groups to stand in their place. Rather than continuing to invest in the abstraction, break it now and move it to the memtable dirty memory manager which is responsible for making those decisions. The relevant code is moved to dirty_memory_manager.hh and dirty_memory_manager.cc (new file), and a new unit test file is added as well. A downside of the change is that unit testing will be more difficult.	2022-07-26 11:12:10 +03:00
Igor Ribeiro Barbosa Duarte	3b19bcf1a1	memtable_flush: Make memtable_flush_static_shares liveupdateable This patch makes memtable_flush_static_shares liveupdateable to avoid having to restart the cluster after updating this config. Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>	2022-07-19 10:10:46 -03:00
Botond Dénes	9afd2dc428	Merge 'Make compaction manager switch to table abstraction ' from Raphael "Raph" Carvalho This work gets us a step closer to compaction groups. Everything in compaction layer but compaction_manager was converted to table_state. After this work, we can start implementing compaction groups, as each group will be represented by its own table_state. User-triggered operations that span the entire table, not only a group, can be done by calling the manager operation on behalf of each group and then merging the results, if any. Closes #11028 * github.com:scylladb/scylla: compaction: remove forward declaration of replica::table compaction_manager: make add() and remove() switch to table_state compaction_manager: make run_custom_job() switch to table_state compaction_manager: major: switch to table_state compaction_manager: scrub: switch to table_state compaction_manager: upgrade: switch to table_state compaction: table_state: add get_sstables_manager() compaction_manager: cleanup: switch to table_state compaction_manager: offstrategy: switch to table_state() compaction_manager: rewrite_sstables(): switch to table_state compaction_manager: make run_with_compaction_disabled() switch to table_state compaction_manager: compaction_reenabler: switch to table_state compaction_manager: make submit(T) switch to table_state compaction_manager: task: switch to table_state compaction: table_state: Add is_auto_compaction_disabled_by_user() compaction: table_state: Add on_compaction_completion() compaction: table_state: Add make_sstable() compaction_manager: make can_proceed switch to table_state compaction_manager: make stop compaction procedures switch to table_state compaction_manager: make get_compactions() switch to table_state compaction_manager: change task::update_history() to use table_state instead compaction_manager: make can_register_compaction() switch to table_state compaction_manager: make get_candidates() switch to table_state compaction_manager: 
make propagate_replacement() switch to table_state compaction: Move table::in_strategy_sstables() and switch to table_state compaction: table_state: Add maintenance sstable set compaction_manager: make has_table_ongoing_compaction() switch to table_state compaction_manager: make compaction_disabled() switch to table_state compaction_manager: switch to table_state for mapping of compaction_state compaction_manager: move task ctor into source	2022-07-18 15:18:29 +03:00
Benny Halevy	d7564b9081	database: make drop_column_family private Now that all users are converted to use the public entry point - drop_table_on_all. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-17 14:33:34 +03:00
Benny Halevy	e005629afb	database: add drop_table_on_all_shards Runs drop_column_family on all database shards. Will be extended later to consider removing the table directory. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-17 14:33:34 +03:00
Raphael S. Carvalho	1deeeff825	compaction: table_state: Add on_compaction_completion() The idea is that we'll have a single on-completion interface for both "in-strategy" and off-strategy compactions, so not to pollute table_state with one interface for each. replica::table::on_compaction_completion is being moved into private namespace. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	cb05142d58	compaction: Move table::in_strategy_sstables() and switch to table_state in_strategy_sstables() doesn't have to be implemented in table, as it's simply about main set with maintenance and staging files filtered out. Also, let's make it switch to table_state as part of ongoing work. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Aleksandra Martyniuk	7871989551	api: list of the user keyspaces contains only user keyspaces storage_service/keyspaces?type=user along with user keyspaces returned the keyspaces that were internal but non-system. The list of the keyspaces for the user option (storage_service/keyspaces?type=user) contains neither system nor internal but only user keyspaces. Fixes: #11042 Closes #11049	2022-07-15 20:42:30 +02:00
Raphael S. Carvalho	d3d9b13d9d	table: remove ref from on_compaction_completion() signature Now update_sstable_lists_on_off_strategy_completion() and on_compaction_completion() can be called from the same unified interface. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-13 11:25:51 -03:00
Raphael S. Carvalho	ca58054485	table: use compaction_completion_desc to describe changes for off-strategy To make it possible to add a single interface in table_state for updating sstable list on behalf of both off-strategy and in-strategy compactions, update_sstable_lists_on_off_strategy_completion() will work with compaction_completion_desc too for describing sstable set changes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-13 11:16:19 -03:00
Botond Dénes	f912f5f373	querier: remove {data,mutation}_querier aliases They now both mean the same thing: querier.	2022-07-12 08:41:51 +03:00
Tomasz Grabiec	6b316f267f	db: Avoid memtable flush latency on schema merge Currently, applying schema mutations involves flushing all schema tables so that on restart commit log replay is performed on top of latest schema (for correctness). The downside is that schema merge is very sensitive to fdatasync latency. Flushing a single memtable involves many syncs, and we flush several of them. It was observed to take as long as 30 seconds on GCE disks under some conditions. This patch changes the schema merge to rely on a separate commit log to replay the mutations on restart. This way it doesn't have to wait for memtables to be flushed. It has to wait for the commitlog to be synced, but this cost is well amortized. We put the mutations into a separate commit log so that schema can be recovered before replaying user mutations. This is necessary because regular writes have a dependency on schema version, and replaying on top of latest schema satisfies all dependencies. Without this, we could get loss of writes if we replay a write which depends on the latest schema on top of old schema. Also, if we have a separate commit log for schema we can delay schema parsing for after the replay and avoid complexity of recognizing schema transactions in the log and invoking the schema merge logic. One complication with this change is that replay_position markers are commitlog-domain specific and cannot cross domains. They are recorded in various places which survive node restart: sstables are annotated with the maximum replay position, and they are present inside truncation records. The former annotation is used by "truncate" operation to drop sstables. To prevent old replay positions from being interpreted in the context in the new schema commitlog domain, the change refuses to boot if there are truncation records, and also prohibits truncation of schema tables. The boot sequence needs to know whether the cluster feature associated with this change was enabled on all nodes. 
Features are stored in system.scylla_local. Because we need to read it before initializing schema tables, the initialization of tables now has to be split into two phases. The first phase initializes all system tables except schema tables, and later we initialize schema tables, after reading stored cluster features. The commitlog domain is switched only when all nodes are upgraded, and only after new node is restarted. This is so that we don't have to add risky code to deal with hot-switching of the commitlog domain. Cold switching is safer. This means that after upgrade there is a need for yet another rolling restart round. Fixes #8272 Fixes #8309 Fixes #1459	2022-07-06 22:08:56 +02:00
Tomasz Grabiec	c5ad05c819	db: Allow splitting initialization of system tables We will need some system tables to be initialized earlier in the boot so that system.scylla_local can be read before schema tables are initialized.	2022-07-06 22:08:56 +02:00
Tomasz Grabiec	6444d959dc	db: Introduce multi-table atomic apply() Will be used to apply schema mutations atomically.	2022-07-06 22:08:56 +02:00
Avi Kivity	419fe65259	Revert "Merge 'Block flush until compaction finishes if sstables accumulate' from Mikołaj Sielużycki" This reverts commit `aa8f135f64`, reversing changes made to `9a88bc260c`. The patch causes hangs during flush. Also reverts parts of `411231da75` that impacted the unit test. Fixes #10897.	2022-07-06 12:19:02 +03:00
Botond Dénes	6c818f8625	Merge 'sstables: generation_type tidy-up' from Michael Livshin - Use `sstables::generation_type` in more places - Enforce conceptual separation of `sstables::generation_type` and `int64_t` - Fix `extremum_tracker` so that `sstables::generation_type` can be non-default-constructible Fixes #10796. Closes #10844 * github.com:scylladb/scylla: sstables: make generation_type an actual separate type sstables: use generation_type more soundly extremum_tracker: do not require default-constructible value types	2022-06-28 08:50:12 +03:00
Benny Halevy	81fa1ce9a1	Revert 'Compact staging sstables' This patch reverts the following patches merged in `78750c2e1a` "Merge 'Compact staging sstables' from Benny Halevy" > `597e415c38` "table: clone staging sstables into table dir" > `ce5bd505dc` "view_update_generator: discover_staging_sstables: reindent" > `59874b2837` "table: add get_staging_sstables" > `7536dd7f00` "distributed_loader: populate table directory first" The feature causes regressions seen with e.g. https://jenkins.scylladb.com/view/master/job/scylla-master/job/dtest-daily-release/41/testReport/materialized_views_test/TestMaterializedViews/Run_Dtest_Parallel_Cloud_Machines___FullDtest___full_split011___test_base_replica_repair/ ``` AssertionError: Expected [[0, 0, 'a', 3.0]] from SELECT * FROM t_by_v WHERE v = 0, but got [] ``` Where views aren't updated properly. Apparently since `table::stream_view_replica_updates` doesn't exclude the staging sstables anymore and since they are cloned to the base table as new sstables it seems to the view builder that no view updates are required since there's no changes comparing to the base table. Reopens #9559 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10890	2022-06-27 12:18:48 +03:00
Botond Dénes	78750c2e1a	Merge 'Compact staging sstables' from Benny Halevy This series decouples the staging sstables from the table's sstable set. The current behavior keeps the sstables in the staging directory until view building is done. They are readable as any other sstable, but fenced off from compaction, so they don't go away in the meanwhile. Currently, when views are built, the sstables are moved into the main table directory where they will then be compacted normally. The problem with this design is that the staging sstables are never compacted, in particular they won't get cleaned up or scrubbed. The cleanup scenario opens a backdoor for data resurrection when the staging sstables are moved after view building while possibly containing stale partitions (#9559) which will not be cleaned up until next time cleanup compaction is performed. With this series, SSTables that are created in or moved to the staging sub-directory are "cloned" into the base table directory by hard-linking the components there and creating a new sstable object which loads the cloned files. The former, in the staging directory is used solely for view building and is not added to the table's sstable set, while the latter, its clone, behaves like any other sstable and is added either to the regular or maintenance set and is read and compacted normally. When view building is done, instead of moving the staging sstable into the table's base directory, it is simply unlinked. If its "clone" wasn't compacted away yet, then it will just remain where it is, exactly like it would be after it was moved there in the present state of things. If it was already compacted and no longer exists, then unlinking will then free its storage. Note that snapshot is based on the sstables listed by the table, which do not include the staging sstables with this change. 
But that shouldn't matter since even today, the sstables in the snapshot have no notion of "staging" directory and it is expected that the MV's are either updated via `nodetool refresh` if restoring sstables from snapshot using the uploads dir, or if restoring the whole table from backup - MV's are effectively expected to be rebuilt from scratch (they are not included in automatic snapshots anyway since we don't have snapshot-coherency across tables). A fundamental infrastructure change was done to achieve that which is to change the sstable_list which was a std::unordered_set<shared_sstable> into a std::unordered_map<generation_type, shared_sstable> that keeps the shared_sstable objects indexed by generation number (that must be unique). With this model, sstables are supposed to be searched by the generation number, not by their pointer, since when the staging sstable is cloned, there will be 2 shared_sstable objects with the same generation (and different `dir()`) and we must distinguish between them. Special care was taken to throw a runtime_error exception if, when looking up a shared sstable, another one with the same generation is found, since they must never exist in the same sstable_map. Fixes #9559 Closes #10657 * github.com:scylladb/scylla: table: clone staging sstables into table dir view_update_generator: discover_staging_sstables: reindent table: add get_staging_sstables view_update_generator: discover_staging_sstables: get shared table ptr earlier distributed_loader: populate table directory first sstables: time_series_sstable_set: insert: make exception safe sstables: move_to_new_dir: fix debug log message	2022-06-24 08:05:38 +03:00
Benny Halevy	597e415c38	table: clone staging sstables into table dir clone staging sstables so their content may be compacted while views are built. When done, the hard-linked copy in the staging subdirectory will be simply unlinked. Fixes #9559 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-23 16:55:27 +03:00
Benny Halevy	59874b2837	table: add get_staging_sstables We don't have to go over all sstables in the table to select the staging sstables out of them, we can get it directly from the _sstables_staging map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-23 16:55:27 +03:00
Piotr Dulikowski	13a5022499	database: add stats for per partition rate limiting Adds statistics which count how many times a replica has decided to reject a write ("total_writes_rate_limited") or a read ("total_reads_rate_limited").	2022-06-22 20:16:49 +02:00
Piotr Dulikowski	76e95e7ae8	storage_proxy: choose the right per partition rate limit info in write handler Now, write response handler calculates the appropriate rate limit info parameter and passes it to the mutation holder.	2022-06-22 20:16:49 +02:00
Piotr Dulikowski	cc9a2ad41f	database: apply per-partition rate limiting for reads/writes Adds the `db::rate_limiter` to the `database` class and modifies the `query` and `apply` methods so that they account the read/write operations in the rate limiter and optionally reject them.	2022-06-22 20:16:48 +02:00
Michael Livshin	ab13127761	sstables: use generation_type more soundly `generation_type` is (supposed to be) conceptually different from `int64_t` (even if physically they are the same), but at present Scylla code still largely treats them interchangeably. In addition to using `generation_type` in more places, we provide (no-op) `generation_value()` and `generation_from_value()` operations to make the smoke-and-mirrors more believable. The churn is considerable, but all mechanical. To avoid even more (way, way more) churn, unit test code is left untreated for now, except where it uses the affected core APIs directly. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-06-20 19:37:31 +03:00
Pavel Emelyanov	997a34bf8c	backlog_controller: Generalize scheduling groups Make struct scheduling_group be sub-class of the backlog controller. Its new meaning is now -- the group under controller maintenance. Both database and compaction manager derive their sched groups from this one. This makes backlog controller construction simpler, prepares the ground for sched groups unification in seastar and facilitates next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	12b2d6400d	database: Keep compound flushing sched group Similar to previous patch that made the same for compaction manager. The newly introduced private scheduling_group class is temporary and will go away in next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Mikołaj Sielużycki	4cd42f97d0	table: Prevent creating unbounded number of sstables If we reach a situation where flush rate exceeds compaction rate, we may end up with arbitrarily large number of sstables on disk. If a read is executed in such case, the amount of memory required is proportional to the number of sstables for the given shard, which in extreme cases can lead to OOM. In the wild, this was observed in 2 scenarios: - A node with >10 shards creates a keyspace with thousands of tables, drops the keyspace and shuts down before compaction finishes. Dropping keyspace drops tables, and each dropped table is smp::count writes to system.local table with flush after write, which creates tens of thousands of sstables. Bootstrap read from system.local will run OOM. - A failure to agree on table schema (due to a code bug) between nodes during repair resulted in excessive flushing of small sstables which compaction couldn't keep up with. In the unit test introduced in this patch series it can be proved that even hard setting maximum shares for compaction and minimum shares for flushing doesn't tilt the balance towards compaction enough to prevent the problem. Since it's a fast producer, slow consumer problem, the remaining solution is to block producer until the consumer catches up. If there are too many table runs originating from memtable, we block the current flush until the number of sstables is reduced (via ongoing compaction or a truncate operation).	2022-06-15 10:57:28 +02:00
Pavel Emelyanov	490bf65e11	table: Move sstables_manager from config onto table itself The manager reference is already available in constructor and thus can be copied to on-table member. The code that chooses the manager (user/system one) should be moved from make_column_family_config() into add_column_family() method. Once this happens, the get_sstables_manager() should be fixed to return the reference from its new location. While at it -- mark the method in question noexcept and add its mutable overload. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-27 16:37:21 +03:00
Pavel Emelyanov	50e6810536	table, db, tests: Pass sstables_manager& into table constructor In core code there's only one place that constructs table -- in database.cc -- and this place currently has the sstables_manager pointer sitting on table config (despite it's a pointer, it's always non-null). All the tests always use the manager from one of _env's out there. For now the new constructor arg is unused. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-27 16:27:44 +03:00
Benny Halevy	4a5842787e	memtable_list: clear_and_add: let caller clear the old memtables As a follow up on `b8263e550a`, make clear_and_add synchronous yet again, and just return the swapped list of memtables so that the caller (table::clear) can clear them gently. Refs https://github.com/scylladb/scylla/pull/10424#discussion_r867455056 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10540	2022-05-11 14:46:30 +02:00
Piotr Sarna	209c2f5d99	sstables: define generation_type for sstables No functional changes intended - this series is quite verbose, but after it's in, it should be considerably easier to change the type of SSTable generations to something else - e.g. a string or timeUUID. Closes #10533	2022-05-11 14:46:30 +02:00
Benny Halevy	9e69089306	table: snapshot: get rid of skip_flush param Now that all callers flush on their own before calling table::snapshot. Refs #10500 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	33bd52921e	table: make snapshot method private Only callable by database. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	e1d58d4422	database: add snapshot_on_all And move the logic from snapshot-ctl down to the replica::database layer. A following patch will move the flush phase from the replica::table::snapshot layer out to the caller. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 10:45:14 +03:00
Benny Halevy	5b4eb44795	database: add flush_on_all variants Use by api layer. Will be used in a later patch to flush on all shards before taking a snapshot. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-10 09:56:44 +03:00
Botond Dénes	fd27fbfe64	Merge "Add user types carrier helper" from Pavel Emelyanov " There's a cql_type_parser::parse() method that needs to get user types for a keyspace by its name. For this it uses the global storage proxy instance as a place to get database from. This set introduces an abstract user_types_storage helper object that's responsible in providing the user types for the caller. This helper, in turn, is provided to the parse() method by the database itself or by the schema_ctxt object that needs parse() to unfreeze schemas and doesn't have database at those times. This removes one more get_storage_proxy() call. " * 'br-user-types-storage' of https://github.com/xemul/scylla: cql_type_parser: Require user_types_storage& in parse() schame_tables: Add db/ctxt args here and there user_types: Carry storage on database and schema_ctxt data_dictionary: Introduce user types storage	2022-05-09 17:38:52 +03:00
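
The message of 78750c2e1a above describes replacing a set of `shared_sstable` with a map keyed by generation, and throwing `runtime_error` when a lookup finds a different sstable object under the same generation. A minimal sketch of that idea follows. Everything in it is an assumption made for illustration: the `generation_type`, `sstable`, `sstable_map` and `insert_unique` definitions are hypothetical stand-ins, not Scylla's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical strong type for an sstable generation (not Scylla's definition).
struct generation_type {
    std::int64_t value;
    bool operator==(const generation_type& o) const noexcept { return value == o.value; }
};

// Hash functor so generation_type can be used as an unordered_map key.
struct generation_hash {
    std::size_t operator()(const generation_type& g) const noexcept {
        return std::hash<std::int64_t>{}(g.value);
    }
};

// Hypothetical, heavily reduced sstable: just a generation and a directory.
struct sstable {
    generation_type gen;
    std::string dir;
};

using shared_sstable = std::shared_ptr<sstable>;
using sstable_map = std::unordered_map<generation_type, shared_sstable, generation_hash>;

// Insert sst keyed by its generation. Re-inserting the same object is a no-op;
// a different object with the same generation is rejected with runtime_error.
void insert_unique(sstable_map& m, const shared_sstable& sst) {
    auto [it, inserted] = m.try_emplace(sst->gen, sst);
    if (!inserted && it->second != sst) {
        throw std::runtime_error(
            "duplicate sstable generation " + std::to_string(sst->gen.value));
    }
}
```

In this sketch a staging sstable and its clone (same generation, different `dir`) cannot both live in one map, which mirrors the invariant stated in the commit message. Note that the series was later reverted by 81fa1ce9a1.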

98 Commits