scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 18:40:38 +00:00

Author	SHA1	Message	Date
Avi Kivity	3bead8cea0	feature: grandfather PER_TABLE_PARTITIONERS The PER_TABLE_PARTITIONERS feature was added in `90df9a44ce` (2020; 4.0) and can now be assumed to be always present. We also remove the associated schema_feature.	2024-05-18 00:15:07 +03:00
Avi Kivity	6b532fd40b	test: schema_change_test: regenerate digest for PER_TABLE_PARTITIONERS The first digest tested was generated without the PER_TABLE_PARTITIONERS schema feature. We're about to make that feature mandatory, so we won't be able (and won't need) to generate a digest without it. Update the digest to include the feature. Note it wasn't untested before, we have a test with schema_features::full().	2024-05-18 00:14:43 +03:00
Avi Kivity	c4d8b17f4c	test: test_schema_change_digest: drop unneeded reference digests digests[0] was used by the VIEW_VIRTUAL_COLUMNS feature, which no longer exists. digests[1] is the same as digests[2], so drop it.	2024-05-17 20:41:20 +03:00
Avi Kivity	b5f6021a6b	feature: grandfather VIEW_VIRTUAL_COLUMNS The VIEW_VIRTUAL_COLUMNS feature was added in `a108df09f9` (2019; 3.1) and can now be assumed to be always present. The corresponding schema_feature is removed. Note schema_features are not sent over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.	2024-05-17 20:41:19 +03:00
Patryk Jędrzejczak	77342ffb34	test: lib: single_node_cql_env: restart a node in noninitial run_in_thread calls In the following commit, we make the `consistent-topology-changes` experimental feature unused. Then, all unit tests in the boost suite will start using the raft-based topology by default. Unfortunately, tests with multiple `single_node_cql_env::run_in_thread` calls (usually coming from the `do_with_cql_env_thread` calls) would fail. In a noninitial `run_in_thread` call, a node is started as if it booted for the first time. On the other hand, it has its persistent state from previous boots. Hence, the node can behave strangely and unexpectedly. In particular, `SYSTEM.TOPOLOGY` is not empty and the assertion that expects it to be empty when we boot for the first time fails. We fix this issue by making noninitial `run_in_thread` calls behave as normal restarts. After this change, `test_schema_digest_does_not_change_with_disabled_features` starts failing. This test copies the data directory before booting for the first time, so the new `_sys_ks.local().build_bootstrap_info().get();` makes the node incorrectly think it restarts. Then, after noticing it is not a part of group 0, the node would start the raft upgrade procedure if we didn't run it in the raft RECOVERY mode. This procedure would get stuck because it depends on messaging being enabled even if the node communicates only with itself and messaging is disabled in boost tests.	2024-04-25 14:33:21 +02:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Kefu Chai	97587a2ea4	test/boost: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17139	2024-02-06 13:22:16 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Kefu Chai	6c06751640	cdc: not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16725	2024-01-11 09:13:37 +02:00
Patryk Jędrzejczak	5ebfbf42bc	db: config: make consistent_cluster_management mandatory Code that executed only when consistent_cluster_management=false is removed. In particular, after this patch: - raft_group0 and raft_group_registry are always enabled, - raft_group0::status_for_monitoring::disabled becomes unused, - topology tests can only run with consistent_cluster_management.	2023-12-14 16:54:04 +01:00
Patryk Jędrzejczak	7dd7ec8996	test: boost: schema_change_test: replace disable_raft_schema_config In the following commits, we make consistent cluster management mandatory. This will make disable_raft_schema_config unusable, so we need to get rid of it. However, we don't want to remove tests that use it. The idea is to use the Raft RECOVERY mode instead of disabling consistent cluster management directly.	2023-12-14 16:54:04 +01:00
Kamil Braun	7dad31c78f	feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode As promised in earlier commits: Fixes: #7620 Fixes: #13957 Also modify two test cases in `schema_change_test` which depend on the digest calculation method in their checks. Details are explained in the comments.	2023-12-08 17:46:31 +01:00
Pavel Emelyanov	210b01a5ce	config: Make object storage config updateable_value_source Now it's a plain updateable_value, but without the ..._source object the updateable_value is just a no-op value holder. In order for the observers to operate there must be the value source; updating it would update the attached updateable values _and_ notify the observers. In order for the config to be the u.v._source, config entries should be comparable to each other, thus the <=> operator for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-21 16:47:50 +03:00
Avi Kivity	35849fc901	Revert "Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun" This reverts commit `3d4398d1b2`, reversing changes made to `45dfce6632`. The commit causes some schema changes to be lost due to incorrect timestamps in some mutations. More information is available in [1]. Reopens: scylladb/scylladb#7620 Reopens: scylladb/scylladb#13957 Fixes scylladb/scylladb#15530. [1] https://github.com/scylladb/scylladb/pull/15687	2023-10-11 00:32:05 +03:00
Botond Dénes	f6575344df	Merge 'Collect dangling object-store sstables' from Pavel Emelyanov Sstables in transitional states are marked with the respective 'status' in the registry. Currently there are two of such -- 'creating' and 'removing'. And the 'sealed' status for sstables in use. On boot the distributed loader tries to garbage collect the dangling sstables. For filesystem storage it's done with the help of temporary sstables' dirs and pending deletion logs. For s3-backed sstables, the garbage collection means fetching all non-sealed entries and removing the corresponding objects from the storage. Test included (last patch) fixes #13024 Closes scylladb/scylladb#15318 * github.com:scylladb/scylladb: test: Extend object_store test to validate GC works sstable_directory: Garbage collect S3 sstables on reboot sstable_directory: Pass storage to garbage_collect() sstable_directory: Create storage instance too	2023-09-21 09:15:00 +03:00
Kamil Braun	c2beee348a	feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode As promised in earlier commits: Fixes: #7620 Fixes: #13957 Also modify two test cases in `schema_change_test` which depend on the digest calculation method in their checks. Details are explained in the comments.	2023-09-15 17:54:36 +02:00
Pavel Emelyanov	a957e97ab4	sstable_directory: Create storage instance too Right now the directory instance only creates a lister, but the lister is unaware of exact object manipulations. The storage is, so create it too; it's going to be used by the garbage collector in the next patches. This change also needs fixing the way cql_test_env is configured for schema_change_test. There are cases that try to pick up keyspace with S3 storage option from the pre-created sstables, and populating those would need to provide some (even empty) object storage endpoint Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-12 09:29:34 +03:00
Patryk Jędrzejczak	866c9a904d	test: always pass empty description to migration_manager::announce In the next commit, we remove the default value for the description parameter of migration_manager::announce to avoid using it in the future. However, many calls to announce in tests use the default value. We have to change it, but we don't really care about descriptions in the tests, so we pass the empty string everywhere.	2023-08-07 14:38:11 +02:00
Kamil Braun	84bb75ea0a	Merge 'service: migration_manager: change the prepare_ methods to functions' from Patryk Jędrzejczak The `migration_manager` service is responsible for schema convergence in the cluster - pushing schema changes to other nodes and pulling schema when a version mismatch is observed. However, there is also a part of `migration_manager` that doesn't really belong there - creating mutations for schema updates. These are the functions with `prepare_` prefix. They don't modify any state and don't exchange any messages. They only need to read the local database. We take these functions out of `migration_manager` and make them separate functions to reduce the dependency of other modules (especially `query_processor` and CQL statements) on `migration_manager`. Since all of these functions only need access to `storage_proxy` (or even only `replica::database`), doing such a refactor is not complicated. We just have to add one parameter, either `storage_proxy` or `database` and both of them are easily accessible in the places where these functions are called. This refactor makes `migration_manager` unneeded in a few functions: - `alternator::executor::create_keyspace`, - `cql3::statements::alter_type_statement::prepare_announcement_mutations`, - `cql3::statements::schema_altering_statement::prepare_schema_mutations`, - `cql3::query_processor::execute_thrift_schema_command:`, - `thrift::handler::execute_schema_command`. We remove the `migration_manager&` parameter from all these functions. Fixes #14339 Closes #14875 * github.com:scylladb/scylladb: cql3: query_processor::execute_thrift_schema_command: remove an unused parameter cql3: schema_altering_statement::prepare_schema_mutations: remove an unused parameter cql3: alter_type_statement::prepare_announcement_mutations: change parameters alternator: executor::create_keyspace: remove an unused parameter service: migration_manager: change the prepare_ methods to functions	2023-08-01 11:56:56 +02:00
Patryk Jędrzejczak	3468cbd66b	service: migration_manager: change the prepare_ methods to functions The migration_manager service is responsible for schema convergence in the cluster - pushing schema changes to other nodes and pulling schema when a version mismatch is observed. However, there is also a part of migration_manager that doesn't really belong there - creating mutations for schema updates. These are the functions with prepare_ prefix. They don't modify any state and don't exchange any messages. They only need to read the local database. We take these functions out of migration_manager and make them separate functions to reduce the dependency of other modules (especially query_processor and CQL statements) on migration_manager. Since all of these functions only need access to storage_proxy (or even only replica::database), doing such a refactor is not complicated. We just have to add one parameter, either storage_proxy or database and both of them are easily accessible in the places where these functions are called.	2023-07-28 13:55:27 +02:00
Tomasz Grabiec	1ecd3c1a9a	test: schema_change_test: Verify digests also with TABLE_DIGEST_INSENSITIVE_TO_EXPIRY enabled The new test cases are a mirror of old test cases, but with updated digests.	2023-07-12 21:21:55 +02:00
Tomasz Grabiec	f2ed9fcd7e	schema_mutations, migration_manager: Ignore empty partitions in per-table digest Schema digest is calculated by querying for mutations of all schema tables, then compacting them so that all tombstones in them are dropped. However, even if the mutation becomes empty after compaction, we still feed its partition key. If the same mutations were compacted prior to the query, because the tombstones expire, we won't get any mutation at all and won't feed the partition key. So schema digest will change once an empty partition of some schema table is compacted away. Tombstones expire 7 days after schema change which introduces them. If one of the nodes is restarted after that, it will compute a different table schema digest on boot. This may cause performance problems. When sending a request from coordinator to replica, the replica needs schema_ptr of exact schema version requested by the coordinator. If it doesn't know that version, it will request it from the coordinator and perform a full schema merge. This adds latency to every such request. Schema versions which are not referenced are currently kept in cache for only 1 second, so if request flow has low-enough rate, this situation results in perpetual schema pulls. After `ae8d2a550d`, it is more likely to run into this situation, because table creation generates tombstones for all schema tables relevant to the table, even the ones which will be otherwise empty for the new table (e.g. computed_columns). This change introduces a cluster feature which when enabled will change digest calculation to be insensitive to expiry by ignoring empty partitions in digest calculation. When the feature is enabled, schema_ptrs are reloaded so that the window of discrepancy during transition is short and no rolling restart is required. A similar problem was fixed for per-node digest calculation in 18f484cc753d17d1e3658bcb5c73ed8f319d32e8. 
Per-table digest calculation was not fixed at that time because we didn't persist enabled features and they were not enabled early-enough on boot for us to depend on them in digest calculation. Now they are enabled before non-system tables are loaded so digest calculation can rely on cluster features. Fixes #4485.	2023-07-03 23:06:55 +02:00
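The effect described in `f2ed9fcd7e` can be illustrated with a minimal sketch. This is not Scylla's digest code: `partition`, `feed` and `table_digest` are hypothetical names, and FNV-1a stands in for whatever hash the real implementation uses. It only shows why skipping empty partitions makes the digest independent of whether an empty partition has already been compacted away.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a compacted schema-table partition.
struct partition {
    std::string key;
    std::vector<std::string> live_rows; // empty once only tombstones remained
};

// FNV-1a, used here only as a simple deterministic hash (an assumption).
inline void feed(std::uint64_t& h, const std::string& s) {
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    h ^= 0xff; // separator between fed strings
    h *= 1099511628211ULL;
}

// With ignore_empty == false, the key of an empty partition is still fed,
// so the result depends on whether that partition was compacted away.
// With ignore_empty == true, empty partitions contribute nothing.
inline std::uint64_t table_digest(const std::vector<partition>& parts, bool ignore_empty) {
    std::uint64_t h = 14695981039346656037ULL;
    for (const auto& p : parts) {
        if (ignore_empty && p.live_rows.empty()) {
            continue;
        }
        feed(h, p.key);
        for (const auto& r : p.live_rows) {
            feed(h, r);
        }
    }
    return h;
}
```

Calling `table_digest` on the same table before and after its empty partition disappears gives equal results only when `ignore_empty` is set, which is the property the cluster feature is meant to provide.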
Jan Ciolek	d2ef55b12c	test: use NetworkTopologyStrategy in all unit tests As described in https://github.com/scylladb/scylladb/issues/8638, we're moving away from `SimpleStrategy`; in the future it will become deprecated. We should remove all uses of it and replace them with `NetworkTopologyStrategy`. This change replaces `SimpleStrategy` with `NetworkTopologyStrategy` in all unit tests, or at least in the ones where it was reasonable to do so. Some of the tests were written explicitly to test the `SimpleStrategy` strategy, or changing the keyspace from `SimpleStrategy` to `NetworkTopologyStrategy`. These tests were left intact. It's still a feature that is supported, even if it's slowly getting deprecated. The typical way to use `NetworkTopologyStrategy` is to specify a replication factor for each datacenter. This could be a bit cumbersome, we would have to fetch the list of datacenters, set the repfactors, etc. Luckily there is another way - we can just specify a replication factor to use for each existing datacenter, like this: ```cql CREATE KEYSPACE {} WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'replication_factor' : 1}; ``` This makes the change rather straightforward - just replace all instances of `'SimpleStrategy'` with `'NetworkTopologyStrategy'`. Refs: https://github.com/scylladb/scylladb/issues/8638 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #13990	2023-05-23 08:52:56 +03:00
Kamil Braun	30cc07b40d	Merge 'Introduce tablets' from Tomasz Grabiec This PR introduces an experimental feature called "tablets". Tablets are a way to distribute data in the cluster, which is an alternative to the current vnode-based replication. Vnode-based replication strategy tries to evenly distribute the global token space shared by all tables among nodes and shards. With tablets, the aim is to start from a different side. Divide resources of replica-shard into tablets, with a goal of having a fixed target tablet size, and then assign those tablets to serve fragments of tables (also called tablets). This will allow us to balance the load in a more flexible manner, by moving individual tablets around. Also, unlike with vnode ranges, tablet replicas live on a particular shard on a given node, which will allow us to bind raft groups to tablets. Those goals are not yet achieved with this PR, but it lays the ground for this. Things achieved in this PR: - You can start a cluster and create a keyspace whose tables will use tablet-based replication. This is done by setting `initial_tablets` option: ``` CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'initial_tablets': 8}; ``` All tables created in such a keyspace will be tablet-based. Tablet-based replication is a trait, not a separate replication strategy. Tablets don't change the spirit of replication strategy, it just alters the way in which data ownership is managed. In theory, we could use it for other strategies as well like EverywhereReplicationStrategy. Currently, only NetworkTopologyStrategy is augmented to support tablets. 
- You can create and drop tablet-based tables (no DDL language changes) - DML / DQL work with tablet-based tables Replicas for tablet-based tables are chosen from tablet metadata instead of token metadata Things which are not yet implemented: - handling of views, indexes, CDC created on tablet-based tables - sharding is done using the old method, it ignores the shard allocated in tablet metadata - node operations (topology changes, repair, rebuild) are not handling tablet-based tables - not integrated with compaction groups - tablet allocator piggy-backs on tokens to choose replicas. Eventually we want to allocate based on current load, not statically Closes #13387 * github.com:scylladb/scylladb: test: topology: Introduce test_tablets.py raft: Introduce 'raft_server_force_snapshot' error injection locator: network_topology_strategy: Support tablet replication service: Introduce tablet_allocator locator: Introduce tablet_aware_replication_strategy locator: Extract maybe_remove_node_being_replaced() dht: token_metadata: Introduce get_my_id() migration_manager: Send tablet metadata as part of schema pull storage_service: Load tablet metadata when reloading topology state storage_service: Load tablet metadata on boot and from group0 changes db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata() migration_notifier: Introduce before_drop_keyspace() migration_manager: Make prepare_keyspace_drop_announcement() return a future<> test: perf: Introduce perf-tablets test: Introduce tablets_test test: lib: Do not override table id in create_table() utils, tablets: Introduce external_memory_usage() db: tablets: Add printers db: tablets: Add persistence layer dht: Use last_token_of_compaction_group() in split_token_range_msb() locator: Introduce tablet_metadata dht: Introduce first_token() dht: Introduce next_token() storage_proxy: Improve trace-level logging locator: token_metadata: Fix confusing comment on ring_range() 
dht, storage_proxy: Abstract token space splitting Revert "query_ranges_to_vnodes_generator: fix for exclusive boundaries" db: Exclude keyspace with per-table replication in get_non_local_strategy_keyspaces_erms() db: Introduce get_non_local_vnode_based_strategy_keyspaces() service: storage_proxy: Avoid copying keyspace name in write handler locator: Introduce per-table replication strategy treewide: Use replication_strategy_ptr as a shorter name for abstract_replication_strategy::ptr_type locator: Introduce effective_replication_map locator: Rename effective_replication_map to vnode_effective_replication_map locator: effective_replication_map: Abstract get_pending_endpoints() db: Propagate feature_service to abstract_replication_strategy::validate_options() db: config: Introduce experimental "TABLETS" feature db: Log replication strategy for debugging purposes db: Log full exception on error in do_parse_schema_tables() db: keyspace: Remove non-const replication strategy getter config: Reformat	2023-04-27 09:40:18 +02:00
Tomasz Grabiec	41e69836fd	db, migration_manager: Notify about tablet metadata changes via migration_listener::on_update_tablet_metadata()	2023-04-24 10:49:37 +02:00
Tomasz Grabiec	5b046043ea	migration_manager: Make prepare_keyspace_drop_announcement() return a future<> It will be extended with listener notification firing, which is an async operation.	2023-04-24 10:49:37 +02:00
Pavel Emelyanov	f953fb2f52	schema_change_test: Use proxy from cql_test_env There's one place where test case calls for storage proxy and currently does it via global reference. Time to switch it to cql_test_env's one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:18:00 +03:00
Kefu Chai	596ea6d439	test: drop unused captured variables this should silence the warning like: ``` test/boost/multishard_mutation_query_test.cc:493:29: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] do_with_cql_env_thread([this] (cql_test_env& env) -> future<> { ^~~~ test/boost/multishard_mutation_query_test.cc:577:29: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] do_with_cql_env_thread([this] (cql_test_env& env) -> future<> { ^~~~ 2 errors generated. ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-22 21:21:04 +08:00
Nadav Har'El	843a5dfc15	Merge 'Allow setting permissions for user-defined functions' from Wojciech Mitros This series aims to allow users to set permissions on user-defined functions. The implementation is based on Cassandra's documentation and should be fully compatible: https://cassandra.apache.org/doc/latest/cassandra/cql/security.html#cql-permissions Fixes: #5572 Fixes: #10633 Closes #12869 * github.com:scylladb/scylladb: cql3: allow UDTs in permissions on UDFs cql3: add type_parser::parse() method taking user_types_metadata schema_change_test: stop using non-existent keyspace cql3: fix parameter names in function resource constructors cql3: handle complex types as when decoding function permissions cql3: enforce permissions for ALTER FUNCTION cql-pytest: add a (failing) test case for UDT in UDF cql-pytest: add a test case for user-defined aggregate permissions cql-pytest: add tests for function permissions cql3: enforce permissions on function calls selection: add a getter for used functions abstract_function_selector: expose underlying function cql3: enforce permissions on DROP FUNCTION cql3: enforce permissions for CREATE FUNCTION client_state: add functions for checking function permissions cql-pytest: add a case for serializing function permissions cql3: allow specifying function permissions in CQL auth: add functions_resource to resources	2023-03-12 14:04:34 +02:00
Wojciech Mitros	4182a221d6	schema_change_test: stop using non-existent keyspace The current implementation of CQL type parsing worked even when given a string representing a non-existent keyspace, as long as the parsed type was one of the "native" types. This implementation is going to change, so that we won't parse types given an incorrect keyspace name. When using `do_with_cql_env`, a "ks" keyspace is created by default, and "tests" keyspace is not. The tests for reverse schemas in `schema_change_test` were using the "tests" keyspace, so in order to make the tests work after the future changes, they now use the existing "ks" keyspace.	2023-03-10 11:02:32 +01:00
Kefu Chai	3ae11de204	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:53 +08:00
Tomasz Grabiec	c8e2bf1596	db: schema_tables: Optimize schema merge Currently, applying a schema change on a replica works like this: Collect all affected keyspaces from incoming mutations Read current state of schema Apply the mutations Read new state of schema The "Read ... state of schema" step reads all kinds of schema objects. In particular, to read the "table" objects, it does the following: for every affected keyspace k: read all mutations from system_schema.tables for k extract all existing table names from those mutations for every existing table: read mutations from {tables, columns, indexes, view_virtual_columns, ...} for that table As you can see, the number of reads performed is O(nr tables in a keyspace), not O(nr tables in a change). This means that making a sequence of schema changes, like adding a table, is quadratic. Another aspect which magnifies this is that we don't read those tables using a single scan, but issue individual queries for each table separately. This patch optimizes this by considering only affected tables when reading schema for the purpose of diff calculation. When mutations contain multi-table deletions, we still read the set of tables, like before. This could be optimized by looking at the database to get the list, but it's not part of the patch. I tested this using a test case provided by Kamil (kbr-scylla@53fe154) ./test.py --mode debug test_many_schema_changes -s The test bootstraps a cluster and then creates about 40 schema changes. Then a new node is bootstrapped and replays those changes via group0. In debug mode, each change takes roughly 2s to process before the patch, and 0.5s after the patch. The whole replay is reduced to 56% of what was before: Before (1m19s) : INFO 2023-01-20 19:44:35,848 [shard 0] raft_group0 - setup_group0: ensuring that the cluster has fully upgraded to use Raft... INFO 2023-01-20 19:45:54,844 [shard 0] raft_group0 - setup_group0: waiting for peers to synchronize state... 
After (45s): INFO 2023-01-20 22:02:51,869 [shard 0] raft_group0 - setup_group0: ensuring that the cluster has fully upgraded to use Raft... INFO 2023-01-20 22:03:36,834 [shard 0] raft_group0 - setup_group0: waiting for peers to synchronize state... Closes #12592	2023-02-21 17:26:57 +02:00
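The quadratic cost described in `c8e2bf1596` can be seen in a toy cost model. This is only an illustrative count under the assumption that each change adds one table to a single keyspace and that one read is issued per table examined. The function names are hypothetical and nothing here reflects the real schema-merge code.

```cpp
#include <cassert>
#include <cstddef>

// Reads issued over n successive "add one table" changes when every change
// re-reads all tables that exist in the keyspace at that point.
inline std::size_t reads_scanning_keyspace(std::size_t n) {
    std::size_t reads = 0;
    for (std::size_t tables = 1; tables <= n; ++tables) {
        reads += tables; // O(number of tables in the keyspace) per change
    }
    return reads;
}

// Reads issued when each change only reads the tables it affects.
inline std::size_t reads_affected_only(std::size_t n) {
    return n; // O(number of tables in the change) per change
}
```

For 40 such changes the first model issues 820 reads and the second 40. The measured improvement quoted in the commit is smaller because reads are only part of the work per change.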
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Raphael S. Carvalho	3c5afb2d5c	test: Enable Scylla test command line options for boost tests We have enabled the command line options without changing a single line of code, we only had to replace old include with scylla_test_case.hh. Next step is to add x-log-compaction-groups options, which will determine the number of compaction groups to be used by all instantiations of replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Kamil Braun	bed555d1e5	db: system_keyspace: rename 'raft_config' to 'raft_snapshot_config' Make it clear that the table stores the snapshot configuration, which is not necessarily the currently operating configuration (the last one appended to the log). In the future we plan to have a separate virtual table for showing the currently operating configuration, perhaps we will call it `system.raft_config`.	2023-01-12 16:21:26 +01:00
Tomasz Grabiec	ae8d2a550d	db: schema_tables: Make table creation shadow earlier concurrent changes Issuing two CREATE TABLE statements with a different name for one of the partition key columns leads to the following assertion failure on all replicas: scylla: schema.cc:363: schema::schema(const schema::raw_schema&, std::optional<raw_view_info>): Assertion `!def.id \|\| def.id == id - column_offset(def.kind)' failed. The reason is that once the create table mutations are merged, the columns table contains two entries for the same position in the partition key tuple. If the schemas were the same, or not conflicting in a way which leads to abort, the current behavior would be to drop the older table as if the last CREATE TABLE was preceded by a DROP TABLE. The proposed fix is to make CREATE TABLE mutation include a tombstone for all older schema changes of this table, effectively overriding them. The behavior will be the same as if the schemas were not different, older table will be dropped. Fixes #11396	2022-08-29 12:06:02 +02:00
Benny Halevy	2b017ce285	schema, everywhere: define and use table_schema_version as a strong type Define table_schema_version as a distinct tagged_uuid class, So it can be differentiated from other uuid-class types, in particular table_id. Added reversed(table_schema_version) for convenience and uniformity since the same logic is currently open coded in several places. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:45 +03:00
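The strong-typing idea in `2b017ce285` can be sketched as below. This is a simplified illustration, not Scylla's `tagged_uuid`: the `_sketch` names are hypothetical and a 64-bit integer stands in for a UUID. The point is that two identifier types built from the same template with different tags cannot be mixed up silently.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Tag types exist only to make the instantiations distinct.
struct table_id_tag {};
struct table_schema_version_tag {};

// Simplified tagged identifier; a real UUID is replaced by a 64-bit value.
template <typename Tag>
struct tagged_id_sketch {
    std::uint64_t value;
    explicit tagged_id_sketch(std::uint64_t v) : value(v) {}
    bool operator==(const tagged_id_sketch& o) const { return value == o.value; }
};

using table_id_sketch = tagged_id_sketch<table_id_tag>;
using table_schema_version_sketch = tagged_id_sketch<table_schema_version_tag>;

// The two identifier kinds are different types with no implicit conversion,
// so passing one where the other is expected fails to compile.
static_assert(!std::is_same<table_id_sketch, table_schema_version_sketch>::value,
              "distinct tags give distinct types");
static_assert(!std::is_convertible<table_id_sketch, table_schema_version_sketch>::value,
              "no implicit conversion between identifier kinds");
```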
Nadav Har'El	cc69177dcc	config: fix printing of experimental feature list Recently we noticed a regression where with certain versions of the fmt library, SELECT value FROM system.config WHERE name = 'experimental_features' returns string numbers, like "5", instead of feature names like "raft". It turns out that the fmt library keeps changing its overload resolution order when there are several ways to print something. For enum_option<T> we happen to have two conflicting ways to print it: 1. We have an explicit operator<<. 2. We have an implicit convertor to the type held by T. We were hoping that the operator<< always wins. But in fmt 8.1, there is special logic that if the type is convertible to an int, this is used before operator<<()! For experimental_features_t, the type held in it was an old-style enum, so it is indeed convertible to int. The solution I used in this patch is to replace the old-style enum in experimental_features_t by the newer and more recommended "enum class", which does not have an implicit conversion to int. I could have fixed it in other ways, but it wouldn't have been much prettier. For example, dropping the implicit convertor would require us to change a bunch of switch() statements over enum_option (and not just experimental_features_t, but other types of enum_option). Going forward, all uses of enum_option should use "enum class", not "enum". tri_mode_restriction_t was already using an enum class, and now so does experimental_features_t. I changed the examples in the comments to also use "enum class" instead of enum. This patch also adds to the existing experimental_features test a check that the feature names are words that are not numbers. Fixes #11003. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11004	2022-07-11 09:17:30 +02:00
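The language rule that `cc69177dcc` relies on can be checked directly. The sketch below uses hypothetical enum and function names (they are not the ones in `experimental_features_t`) and shows only that an old-style enum converts implicitly to `int` while an `enum class` does not, so a printer cannot fall back to the integer value by accident.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical feature enums, for illustration only.
enum old_style_feature { OLD_UNUSED, OLD_RAFT };
enum class scoped_feature { UNUSED, RAFT };

// An unscoped enum converts implicitly to int; a scoped enum does not.
static_assert(std::is_convertible<old_style_feature, int>::value,
              "old-style enum is implicitly convertible to int");
static_assert(!std::is_convertible<scoped_feature, int>::value,
              "enum class has no implicit conversion to int");

// With a scoped enum, the name has to come from an explicit mapping.
inline std::string feature_name(scoped_feature f) {
    switch (f) {
    case scoped_feature::UNUSED: return "unused";
    case scoped_feature::RAFT:   return "raft";
    }
    return "unknown";
}
```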
Piotr Sarna	151d8f7c58	test: regenerate schema_change_test for storage options case Keyspace storage options series adds a new schema table: system_schema.scylla_keyspaces. The regenerated cases ensure that this new table is taken into account when the schema feature is available.	2022-04-08 09:17:01 +02:00
Piotr Sarna	4705a5fa42	test: improve output of schema_change_test regeneration Schema change test operates on pre-generated sstables, and sometimes this set of sstables needs to be regenerated. In order to make the regeneration process more ergonomic, the output is now directly copyable as valid C++ representation of UUIDs.	2022-04-08 09:17:01 +02:00
Nadav Har'El	7be3129458	cdc: don't need current keyspace to create the log table CDC registers to the table-creation hook (before_create_column_family) to add a second table - the CDC log table - to the same keyspace. The handler function (on_before_update_column_family() in cdc/log.cc) wants to retrieve the keyspace's definition, but that does NOT WORK if we create the keyspace and table in one operation (which is exactly what we intend to do in Alternator to solve issue #9868) - because at the time of the hook, the keyspace does not yet exist in the schema. It turns out that on_before_update_column_family() does not REALLY need the keyspace. It needed it to pass it on to make_create_table_mutations() but that function doesn't use the keyspace parameter passed to it! All it needs is the keyspace's name - which is in the schema anyway and doesn't need to be looked up. So in this patch we fix make_create_table_mutations() to not require the unused keyspace parameter - and fix the CDC code not to look for the keyspace that is no longer needed. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220215162342.622509-1-nyh@scylladb.com>	2022-02-16 08:38:56 +02:00
Kamil Braun	a664ac7ba5	treewide: require `group0_guard` when performing schema changes `announce` now takes a `group0_guard` by value. `group0_guard` can only be obtained through `migration_manager::start_group0_operation` and moved, it cannot be constructed outside `migration_manager`. The guard will be a method of ensuring linearizability for group 0 operations.	2022-01-24 15:20:35 +01:00
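The guard pattern described above (a token that can only be obtained from one owner and then moved into the call that consumes it) might look roughly like the sketch below. All names here (`op_guard`, `manager`, `start_operation`, `announce`) are hypothetical simplifications; the real `group0_guard` and `migration_manager` are asynchronous and carry more state.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class manager;

// Move-only token; only `manager` can create one (private constructor + friend).
class op_guard {
    bool _active = true;
    op_guard() = default;
    friend class manager;
public:
    op_guard(op_guard&& o) noexcept : _active(std::exchange(o._active, false)) {}
    op_guard(const op_guard&) = delete;
    op_guard& operator=(const op_guard&) = delete;
    bool active() const { return _active; }
};

class manager {
    int _announced = 0;
public:
    // The only way to obtain a guard.
    op_guard start_operation() { return op_guard{}; }
    // Takes the guard by value, so the caller must give it up with std::move.
    void announce(op_guard g) { if (g.active()) { ++_announced; } }
    int announced() const { return _announced; }
};

static_assert(!std::is_default_constructible_v<op_guard>);
static_assert(!std::is_copy_constructible_v<op_guard>);
static_assert(std::is_move_constructible_v<op_guard>);
```

Taking the guard by value makes it impossible to call `announce` without having first gone through `start_operation`, which is the property the commit message is after.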
Kamil Braun	283ac7fefe	treewide: pass mutation timestamp from call sites into `migration_manager::prepare_*` functions The functions which prepare schema change mutations (such as `prepare_new_column_family_announcement`) would use internally generated timestamps for these mutations. When schema changes are managed by group 0 we want to ensure that timestamps of mutations applied through Raft are monotonic. We will generate these timestamps at call sites and pass them into the `prepare_` functions. This commit prepares the APIs.	2022-01-24 15:12:50 +01:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Gleb Natapov	f0a41c102a	test: move schema_change_test.cc to new schema announcement api	2022-01-13 23:10:18 +02:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Gleb Natapov	38e1f85959	migration_manager: drop view_ptr array from announce_column_family_update() No users pass it any longer.	2021-12-11 12:31:07 +02:00
Botond Dénes	3f4f408bcf	schema: add get_reversed() A variant of make_reversed() which goes through the schema registry, teaching the schema to the registry if necessary. This effectively caches the result of the reversing and as an added bonus double reversing yields the very same schema C++ object that was the starting point. Closes #9365	2021-09-22 18:55:25 +03:00
Tomasz Grabiec	83113d8661	Merge "raft: new schema for storing raft snapshots" from Pavel Solodovnikov Previously, the layout for storing raft snapshot descriptors contained a `config` field, which had `blob` data type. That means `raft::configuration` for the snapshot was serialized as a whole in binary form. It's convenient to implement and is the most compact form of representing the data, but: 1. Hard to debug due to the need to de-serialize the data. 2. Plants a time bomb wrt. changing data layout and also the documentation in the future. Remove the `config` field from `system.raft_snapshots` and extract it to a separate `system.raft_config` table to store the data in exploded form. Also, modify the schema of `system.raft_snapshots` table in the following way: add a `server_id` field as a part of composite partition key ((group_id, server_id)) to be able to start multiple raft servers belonging to one raft group on the same scylla node. Rename `id` field in `raft_snapshots` to `snapshot_id` so it's self-documenting. Remove `snapshot_id` from the clustering key since a given server can have only one snapshot installed at a time. Note that the `raft::server_address` structure contains an opaque `info` member, which is `bytes`, but in the `raft_config` table we use `ip_addr inet` field, instead. We always know that the corresponding member field is going to contain an IP address (either v4 or v6) of a given raft server. 
So, now the snapshots schema looks like this: CREATE TABLE raft_snapshots ( group_id timeuuid, server_id uuid, snapshot_id uuid, idx int, term int, -- no `config` field here, moved to `raft_config` table PRIMARY KEY ((group_id, server_id)) ) CREATE TABLE raft_config ( group_id timeuuid, my_server_id uuid, server_id uuid, disposition text, -- can be either 'CURRENT' or 'PREVIOUS' can_vote bool, ip_addr inet, PRIMARY KEY ((group_id, my_server_id), server_id, disposition) ); This way it's much easier to extend the schema with new fields, very easy to debug and inspect via CQL, and it's much more descriptive in terms of self-documentation. Tests: unit(dev) * manmanson/raft_snapshots_new_schema_v2: test: adjust `schema_change_test` to include new `system.raft_config` table raft: new schema for storing raft snapshots raft: pass server id to `raft_sys_table_storage` instance	2021-09-10 20:41:59 +02:00
Botond Dénes	f200c8104a	schema: introduce make_reversed() `make_reversed()` creates a schema identical to the schema instance it is called on, with clustering order reversed. To distinguish the reverse schema from the original one, the node-id part of its version UUID is bit-flipped. This ensures that reversing a schema twice will result in the identical schema to the original one (although a different C++ object). This reversed schema will be used in reversed reads, so intermediate layers can be ignorant of the fact that the read happens in reverse.	2021-09-09 11:49:05 +03:00
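Why bit-flipping makes double reversal a no-op can be shown with a small sketch. It assumes a UUID held as two 64-bit halves with the node-id in the low 48 bits of the least-significant half (the RFC 4122 field layout); `uuid_sketch`, `node_mask` and `reversed_version` are hypothetical names, and the real schema code may flip a different set of bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-half UUID representation, for illustration only.
struct uuid_sketch {
    std::uint64_t msb;
    std::uint64_t lsb;
    friend constexpr bool operator==(const uuid_sketch& a, const uuid_sketch& b) {
        return a.msb == b.msb && a.lsb == b.lsb;
    }
};

// Assumption: the node-id occupies the low 48 bits of the least-significant half.
constexpr std::uint64_t node_mask = 0x0000FFFFFFFFFFFFULL;

// XOR with an all-ones mask flips every node-id bit; applying it twice restores the input.
constexpr uuid_sketch reversed_version(uuid_sketch v) {
    return {v.msb, v.lsb ^ node_mask};
}

static_assert(reversed_version(reversed_version(uuid_sketch{1, 2})) == uuid_sketch{1, 2});
```

Because XOR with a fixed mask is its own inverse, the reversed version always differs from the original while a second reversal lands back on the original version.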

75 Commits