scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-08 07:53:20 +00:00

Author	SHA1	Message	Date
Benny Halevy	4406a2514e	large_data_handler: maybe_delete_large_data_entries: use sstable large data stats If the sstable has scylla_metadata::large_data_stats use them to determine whether to delete the corresponding large data records. Otherwise, defer to the current method of comparing the sstable data_size to the respective thresholds. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-01 15:19:42 +02:00
Benny Halevy	8cebe7776f	large_data_handler: maybe_delete_large_data_entries: accept shared_sstable Since the actual deletion of the large data entries is done in the background, and we don't capture the shared_sstable, we can safely pass it to maybe_delete_large_data_entries when deleting the sstable in sstable::unlink and it will be released as soon as maybe_delete_large_data_entries returns. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-01 15:19:42 +02:00
Benny Halevy	f7d0ae3d10	large_data_handler: maybe_delete_large_data_entries: move out of line It is called on the cold path, when the sstable is deleted. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-01 15:19:42 +02:00
Benny Halevy	8ab053bd44	large_data_handler: expose methods to get threshold To be used for keeping large_data statistics in sstable. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-01 15:18:14 +02:00
Benny Halevy	dd7422a713	large_data_handler: indicate recording of large data entries Return true from the maybe_{record,log}_* methods if a large data record or log entry was emitted. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-01 15:18:14 +02:00
Benny Halevy	873107821b	large_data_handler: move constructor out of line No need for it to be inlined. Also, add debug logging to the large data handler options. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-12-01 15:18:14 +02:00
Avi Kivity	e8ff77c05f	Merge 'sstables: a bunch of refactors' from Kamil Braun 1. sstables: move `sstable_set` implementations to a separate module All the implementations were kept in sstables/compaction_strategy.cc which is quite large even without them. `sstable_set` already had its own header file, now it gets its own implementation file. The declarations of implementation classes and interfaces (`sstable_set_impl`, `bag_sstable_set`, and so on) were also exposed in a header file, sstable_set_impl.hh, for the purposes of potential unit testing. 2. mutation_reader: move `mutation_reader::forwarding` to flat_mutation_reader.hh Files which need this definition won't have to include mutation_reader.hh, only flat_mutation_reader.hh (so the inclusions are in total smaller; mutation_reader.hh includes flat_mutation_reader.hh). 3. sstables: move sstable reader creation functions to `sstable_set` Lower level functions such as `create_single_key_sstable_reader` were made methods of `sstable_set`. The motivation is that each concrete sstable_set may decide to use a better sstable reading algorithm specific to the data structures used by this sstable_set. For this it needs to access the set's internals. A nice side effect is that we moved some code out of table.cc and database.hh which are huge files. 4. sstables: pass `ring_position` to `create_single_key_sstable_reader` instead of `partition_range`. It would be best to pass `partition_key` or `decorated_key` here. However, the implementation of this function needs a `partition_range` to pass into `sstable_set::select`, and `partition_range` must be constructed from `ring_position`s. We could create the `ring_position` internally from the key but that would involve a copy which we want to avoid. 5. sstable_set: refactor `filter_sstable_for_reader_by_pk` Introduce a `make_pk_filter` function, which given a ring position, returns a boolean function (a filter) that given a sstable, tells whether the sstable may contain rows with the given position. The logic has been extracted from `filter_sstable_for_reader_by_pk`. Split from #7437. Closes #7655 * github.com:scylladb/scylla: sstable_set: refactor filter_sstable_for_reader_by_pk sstables: pass ring_position to create_single_key_sstable_reader sstables: move sstable reader creation functions to `sstable_set` mutation_reader: move mutation_reader::forwarding to flat_mutation_reader.hh sstables: move sstable_set implementations to a separate module	2020-11-24 09:23:57 +02:00
Pavel Emelyanov	fea4a5492f	system-keyspace: Remove dead code Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20201123151453.27341-1-xemul@scylladb.com>	2020-11-23 17:16:15 +02:00
Avi Kivity	1e170ebfc1	Merge 'Changing hints configuration followup' from Piotr Dulikowski Follow-up to https://github.com/scylladb/scylla/pull/6916. - Fixes wrong usage of `resource_manager::prepare_per_device_limits`, - Improves locking in `resource_manager` so that it is more safe to call its methods concurrently, - Adds comments around `resource_manager::register_manager` so that it's more clear what this method does and why. Closes #7660 * github.com:scylladb/scylla: hints/resource_manager: add comments to register_manager hints/resource_manager: fix indentation hints/resource_manager: improve mutual exclusion hints/resource_manager: correct prepare_per_device_limits usage	2020-11-22 15:06:35 +02:00
Piotr Sarna	5a9dc6a3cc	Merge 'Cleanup CDC tests after CDC became GA' from Piotr Jastrzębski Now that CDC is GA, it should be enabled in all the tests by default. To achieve that the PR adds a special db::config::add_cdc_extension() helper which is used in cql_test_env to make sure CDC is usable in all the tests that use cql_test_env. As a result, cdc_tests can be simplified. Finally, some trailing whitespaces are removed from cdc_tests. Tests: unit(dev) Closes #7657 * github.com:scylladb/scylla: cdc: Remove trailing whitespaces from cdc_tests cdc: Remove mk_cdc_test_config from tests config: Add add_cdc_extension function for testing cdc: Add missing includes to cdc_extension.hh	2020-11-20 13:56:29 +01:00
Kamil Braun	40d8bfa394	sstables: move sstable reader creation functions to `sstable_set` Lower level functions such as `create_single_key_sstable_reader` were made methods of `sstable_set`. The motivation is that each concrete sstable_set may decide to use a better sstable reading algorithm specific to the data structures used by this sstable_set. For this it needs to access the set's internals. A nice side effect is that we moved some code out of table.cc and database.hh which are huge files.	2020-11-19 17:52:39 +01:00
Avi Kivity	70689088fd	Merge "Remove reference on database from global qctx" from Pavel E " The qctx is a global object that references the query processor and database to let the rest of the code query the system keyspace. As the first step of de-globalizing it -- remove the database reference from it. After this set, the qctx remains a simple wrapper over the query processor (which is already de-globalized) and the query processor in turn is mostly needed only to parse the query string into a prepared statement. This, in turn, makes it possible to remove the qctx later by parsing the query strings on boot and carrying _them_ around, not the qctx itself. tests: unit(dev), dtest(simple_cluster_driver_test:dev), manual start/stop " * 'br-remove-database-from-qctx' of https://github.com/xemul/scylla: query-context: Remove database from qctx schema-tables: Use query processor reference in save_system(_keyspace)?_schema system-keyspace: Rewrite force_blocking_flush system-keyspace: Use cluster_name string in check_health system-keyspace: Use db::config in setup_version query-context: Kill global helpers test: Use cql_test_env::execute_cql instead of qctx version code: Use qctx::execute_cql methods, not global ones system-keyspace: Do not call minimal_setup for the 2nd time system-keyspace: Fix indentation after previous patch system-keyspace: Do not do invoke_on_all by hands system-keyspace: Remove dead code	2020-11-19 18:31:51 +02:00
Pavel Emelyanov	689fd029a1	query-context: Remove database from qctx No users of qctx::db are left. One global database reference less. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	464c8990d4	schema-tables: Use query processor reference in save_system(_keyspace)?_schema The save_system_schema and save_system_keyspace_schema are both called on start and can get the needed query processor reference from arguments. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	66dcc47571	system-keyspace: Rewrite force_blocking_flush The method is called after query_processor::execute_internal to flush the cf. Encapsulating this flush inside database and getting the database from query_processor allows removing the database reference from the global qctx object. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	6cad18ad33	system-keyspace: Use cluster_name string in check_health The check_health needs global qctx to get db.config.cluster_name, which is already available at the caller side. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	36a3ee6ad4	system-keyspace: Use db::config in setup_version This is the beginning of de-globalizing global qctx thing. The setup_version() needs global qctx to get config from. It's possible to get the config from the caller instead. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	43039a0812	query-context: Kill global helpers Now the db::execute_cql* callers are patched, the global helpers can be removed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	303ebe4a36	code: Use qctx::execute_cql methods, not global ones There are global db::execute_cql() helpers that just forward the args into qctx::execute_cql(). The former are going away, so patch all callers to use qctx themselves. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	8bf6b1298c	system-keyspace: Do not call minimal_setup for the 2nd time The system_keyspace::minimal_setup is already called by main.cc by hand, some steps before the regular ::setup(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	7b82ec2f9e	system-keyspace: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	1773dadc72	system-keyspace: Do not do invoke_on_all by hands The cache_truncation_record needs to run cf.cache_truncation_record on each shard's DB, so the invoke_on_all can be used. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Pavel Emelyanov	fb20d9cd1e	system-keyspace: Remove dead code Not called anywhere. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-11-19 18:39:05 +03:00
Piotr Dulikowski	60ac68b7a2	hints/resource_manager: add comments to register_manager Adds more comments to resource_manager::register_manager in order to better explain what this function is doing.	2020-11-19 16:34:37 +01:00
Piotr Dulikowski	c0c10b918c	hints/resource_manager: fix indentation Fixes indentation in prepare_per_device_limits.	2020-11-19 16:34:37 +01:00
Piotr Dulikowski	ead6a3f036	hints/resource_manager: improve mutual exclusion This commit causes start, stop and register_manager methods of the resource_manager to be serialized with respect to each other using the _operation_lock. Those functions modify internal state, so it's best if they are protected with a semaphore. Additionally, those functions are not going to be used frequently, therefore it's perfectly fine to protect them in such a coarse manner. Now, space_watchdog has a dedicated lock for serializing its on_timer logic with resource_manager::register_manager. The reason for a separate lock is that resource_manager::stop cannot use the same lock as the space_watchdog - otherwise a situation could occur in which space_watchdog waits for semaphore units held by resource_manager::stop(), and resource_manager::stop() waits until the space_watchdog stops its asynchronous event loop.	2020-11-19 16:34:37 +01:00
Piotr Dulikowski	362aebee7b	hints/resource_manager: correct prepare_per_device_limits usage The resource_manager::prepare_per_device_limits function calculates disk quota for registered hints managers, and creates an association map: from a storage device id to those hints managers which store hints on that device (_per_device_limits_map). This function was used with an assumption that it is idempotent - which is a wrong assumption. In resource_manager::register_manager, if the resource_manager is already started, prepare_per_device_limits would be called, and those hints managers which were previously added to the _per_device_limits_map would be added again. This would cause the space used by those managers to be calculated twice, which would artificially lower the limit which we impose on the space hints are allowed to occupy on disk. This patch fixes this problem by changing the prepare_per_device_limits function to operate on a hints manager passed by argument. Now, we make sure that this function is called on each hints manager only once.	2020-11-19 16:34:37 +01:00
Piotr Jastrzebski	9ede193f0a	config: Add add_cdc_extension function for testing and use it in cql_test_env to enable cdc extension for all tests that use it. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-19 16:16:07 +01:00
Piotr Sarna	c0d72b4491	db,view: remove duplicate entries from the list of target endpoints If a list of target endpoints for sending view updates contains duplicates, it results in benign (but annoying) broken promise errors happening due to duplicated write response handlers being instantiated for a single endpoint. In order to avoid such errors, target remote endpoints are deduplicated from the list of pending endpoints. A similar issue (#5459) solved the case for duplicated local endpoints, but that didn't solve the general case. Fixes #7572 Closes #7641	2020-11-18 13:43:49 +02:00
Avi Kivity	d612ca78f3	Merge 'Allow changing hinted handoff configuration in runtime' from Piotr Dulikowski This PR allows changing the hinted_handoff_enabled option in runtime, either by modifying and reloading YAML configuration, or through HTTP API. This PR also introduces an important change in semantics of hinted_handoff_enabled: - Previously, hinted_handoff_enabled controlled whether _both writing and sending_ hints is allowed at all, or to particular DCs, - Now, hinted_handoff_enabled only controls whether _writing hints_ is enabled. Sending hints from disk is now always enabled. Fixes: #5634 Tests: - unit(dev) for each commit of the PR - unit(debug) for the last commit of the PR Closes #6916 * github.com:scylladb/scylla: api: allow changing hinted handoff configuration storage_proxy: fix wrong return type in swagger hints_manager: implement change_host_filter storage_proxy: always create hints manager config: plug in hints::host_filter object into configuration db/hints: introduce host_filter hints/resource_manager: allow registering managers after start hints: introduce db::hints::directory_initializer directories.cc: prepare for use outside main.cc	2020-11-18 13:41:02 +02:00
Piotr Jastrzebski	c0bc6b5795	size_estimates_virtual_reader: Remove std::iterator std::iterator is deprecated since C++17 so define all the required iterator_traits directly. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-17 16:53:20 +01:00
Piotr Dulikowski	0fd36e2579	api: allow changing hinted handoff configuration This commit makes it possible to change hints manager's configuration at runtime through HTTP API. To preserve backwards compatibility, we keep the old behavior of not creating and checking hints directories if they are not enabled at startup. Instead, hint directories are lazily initialized when hints are enabled for the first time through HTTP API.	2020-11-17 10:24:43 +01:00
Piotr Dulikowski	220a2ca800	hints_manager: implement change_host_filter Implements a function which is responsible for changing hints manager configuration while it is running. It first starts new endpoint managers for endpoints which weren't allowed by previous filter but are now, and then stops endpoint managers which are rejected by the new filter. The function is blocking and waits until all relevant ep managers are started or stopped.	2020-11-17 10:24:43 +01:00
Piotr Dulikowski	1302f1b5bf	storage_proxy: always create hints manager Now, the hints manager object for regular hints is always created, even if hints are disabled in configuration. Please note that the behavior of hints will be unchanged - no hints will be sent when they are disabled. The intent of this change is to make enabling and disabling hints in runtime easier to implement.	2020-11-17 10:24:43 +01:00
Piotr Dulikowski	cefe5214ff	config: plug in hints::host_filter object into configuration Uses db::hints::host_filter as the type of hinted_handoff_enabled configuration option. Previously, hinted_handoff_enabled used to be a string option, and it was parsed later in a separate function during startup. The function returned a std::optional<std::unordered_set<sstring>>, whose meaning in the context of hints is rather enigmatic for an observer not familiar with hints. Now, hinted_handoff_enabled has type of db::hints::host_filter, and it is plugged into the config parsing framework, so there is no need for later post-processing.	2020-11-17 10:24:42 +01:00
Piotr Dulikowski	5c3c7c946b	db/hints: introduce host_filter Adds a db::hints::host_filter structure, which determines if generating hints towards a given target is currently allowed. It supports serialization and deserialization between the hinted_handoff_enabled configuration/cli option. This patch only introduces this structure, but does not make other code use it. It will be plugged into the configuration architecture in the following commits.	2020-11-17 10:15:47 +01:00
Piotr Dulikowski	a4f03d72b3	hints/resource_manager: allow registering managers after start This change modifies db::hints::resource_manager so that it is now possible to add hints::managers after it was started. This change will make it possible to register the regular hints manager later in runtime, if it wasn't enabled at boot time.	2020-11-17 10:15:47 +01:00
Piotr Dulikowski	40710677d0	hints: introduce db::hints::directory_initializer Introduces a db::hints::directory_initializer object, which encapsulates the logic of initializing directories for hints (creating/validating directories, segment rebalancing). It will be useful for lazy initialization of hints manager.	2020-11-17 10:15:47 +01:00
Kamil Braun	d74f303406	cdc: ensure that CDC generation write is flushed to commitlog before ack When a node bootstraps or upgrades from a pre-CDC version, it creates a new CDC generation, writes it to a distributed table (system_distributed.cdc_generation_descriptions), and starts gossiping its timestamp. When other nodes see the timestamp being gossiped, they retrieve the generation from the table. The bootstrapping/upgrading node therefore assumes that the generation is made durable and other nodes will be able to retrieve it from the table. This assumption could be invalidated if periodic commitlog mode was used: replicas would acknowledge the write and then immediately crash, losing the write if they were unlucky (i.e. commitlog wasn't synced to disk before the write was acknowledged). This commit enforces all writes to the generations table to be synced to commitlog immediately. It does not matter for performance as these writes are very rare. Fixes https://github.com/scylladb/scylla/issues/7610. Closes #7619	2020-11-17 00:01:13 +02:00
Piotr Jastrzebski	d2897d8f8b	alternator: guard streams with an experimental flag Add new alternator-streams experimental flag for alternator streams control. CDC becomes GA and won't be guarded by an experimental flag any more. Alternator Streams stay experimental so now they need to be controlled by their own experimental flag. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-12 12:36:16 +01:00
Piotr Jastrzebski	e9072542c1	Mark CDC as GA Enable CDC by default. Rename CDC experimental feature to UNUSED_CDC to keep accepting cdc flag. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2020-11-12 12:36:13 +01:00
Piotr Sarna	d43ac783c6	db,view: degrade helper message from error to warn When a missing base column happens to be named `idx_token`, an additional helper message is printed in logs. This additional message does not need to have `error` severity, since the previous, generic message is already marked as `error`. This patch simply makes it easier to write tests, because in case this error is expected, only one message needs to be explicitly ignored instead of two. Closes #7597	2020-11-12 12:28:26 +02:00
Benny Halevy	3fab0f8694	storage_proxy: convert to shared_token_metadata get() the latest token_metadata_ptr from the shared_token_metadata before each use. expose get_token_metadata_ptr() rather than get_token_metadata() so that caller can keep it across continuations. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:23 +02:00
Benny Halevy	6d06853e6c	abstract_replication_strategy: convert to shared_token_metadata To facilitate that, keep a const shared_token_metadata& in class database rather than a const token_metadata& Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2020-11-11 14:20:23 +02:00
Benny Halevy	8bcdf39a18	hints/manager: scan_for_hints_dirs: fix use-after-move This use-after-move was apparently exposed after switching to clang in commit `eb861e68e9`. The directory_entry is required for std::stoi(de.name.c_str()) and later in the catch{} clause. This shows in the node logs as a "Ignore invalid directory" debug log message with an empty name, and caused the hintedhandoff_rebalance_test to fail when hints files aren't rebalanced. Test: unit(dev) DTest: hintedhandoff_additional_test.py:TestHintedHandoff.hintedhandoff_rebalance_test (dev, debug) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20201106172017.823577-1-bhalevy@scylladb.com>	2020-11-09 16:32:54 +01:00
Piotr Wojtczak	72c7f25a29	db: add TransitionalAuthorizer and TransitionalAuthenticator... ... to config descriptions We allow setting the transitional auth as one of the options in scylla.yaml, but don't mention it at all in the field's description. Let's change that. Closes #7565	2020-11-09 10:51:54 +01:00
Avi Kivity	6b4a7fa515	Revert "Revert "config: Do not enable repair based node operations by default"" This reverts commit `71d0d58f8c`. Repair based node operations are still not ready and will be re-enabled after more testing and fixes.	2020-11-08 14:09:50 +02:00
Tomasz Grabiec	6d0d55aa72	Merge "Unglobal query processor instance" from Pavel Emelyanov The query processor is present in the global namespace and is widely accessed with global get(_local)?_query_processor(). There's a long-term task to get rid of this globality and make services and components reference each other and, because of this, start and stop in a specific order. This set does this for the query processor. The remaining users of it are -- alternator, controllers for client services, schema_tables and sys_dist_ks. All of them except for the schema_tables are fixed just by passing the reference on query processor with small patches. The schema tables accessing qp sit deep inside the paxos code, but can be "fixed" with the qctx thing until the qctx itself is de-globalized. * https://github.com/xemul/scylla/tree/br-rip-global-query-processor: code: RIP global query processor instance cql test env: Keep query processor reference on board system distributed keyspace: Start sharded service earlier schema_tables: Use qctx to make internal requests transport: Keep sharded query processor reference on controller thrift: Keep sharded query processor reference on controller alternator: Use local query processor reference to get keys alternator: Keep local query processor reference in server	2020-11-06 14:24:41 +01:00
Piotr Sarna	b61d4bc8d0	db: degrade view building progress loading error to warning When the view builder cannot read view building progress from an internal CQL table it produces an error message, but that only confuses the user and the test suite -- this situation is entirely recoverable, because the builder simply assumes that there is no progress and the view building should start from scratch. Fixes #7527 Closes #7558	2020-11-06 10:19:11 +02:00
Nadav Har'El	7ff72b0ba5	Merge 'secondary_index: fix returned rows token ordering' from Piotr Grabowski Fixes returned rows ordering to proper signed token ordering. Before this change, rows were sorted by token, but using unsigned comparison, meaning that negative tokens appeared after positive tokens. Rename `token_column_computation` to `legacy_token_column_computation` and add some comments describing this computation. Added (new) `token_column_computation` which returns token as `long_type`, which is sorted using signed comparison - the correct ordering of tokens. Add new `correct_idx_token_in_secondary_index` feature, which flags that the whole cluster is able to use new `token_column_computation`. Switch token computation in secondary indexes to (new) `token_column_computation`, which fixes the ordering. This column computation type is only set if cluster supports `correct_idx_token_in_secondary_index` feature to make sure that all nodes will be able to compute new `token_column_computation`. Also old indexes will need to be rebuilt to take advantage of this fix, as new token column computation type is only set for new indexes. Fix tests according to new token ordering and add one new test to validate this aspect explicitly. Fixes #7443 Tested manually a scenario when someone created an index on old version of Scylla and then migrated to new Scylla. Old index continued to work properly (but returning in wrong order). Upon dropping and re-creating the index, it still returned the same data, but now in correct order. Closes #7534 * github.com:scylladb/scylla: tests: add token ordering test of indexed selects tests: fix tests according to new token ordering secondary_index: use new token_column_computation feature: add correct_idx_token_in_secondary_index column_computation: add token_column_computation token_column_computation: rename as legacy	2020-11-05 18:44:49 +01:00
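
The ordering problem described in the last merge above (secondary_index: fix returned rows token ordering) can be illustrated with a small sketch. It is not Scylla code: it assumes, purely for illustration, that the legacy computation compares tokens as big-endian 8-byte values (which behaves like an unsigned comparison), while the new computation compares them as signed 64-bit integers. The helper name `legacy_key` and the sample tokens are hypothetical.

```python
import struct

def legacy_key(token: int) -> bytes:
    # Assumption: the legacy ordering compares the raw big-endian bytes of the
    # signed 64-bit token. Byte-wise comparison of two's complement values is
    # equivalent to comparing them as unsigned integers.
    return struct.pack(">q", token)

# Hypothetical sample tokens; real tokens span the full signed 64-bit range.
tokens = [-5, 3, -1, 0, 7]

# Unsigned-style ordering: negative tokens sort after positive ones.
legacy_order = sorted(tokens, key=legacy_key)

# Signed ordering: the order the commit message describes as correct.
signed_order = sorted(tokens)

print("legacy:", legacy_order)
print("signed:", signed_order)
```

In this sketch the legacy ordering places the negative tokens after the positive ones, which matches the behaviour the merge message reports for indexes created before the fix.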

1903 Commits