scylladb

Author	SHA1	Message	Date
Kamil Braun	283ac7fefe	treewide: pass mutation timestamp from call sites into `migration_manager::prepare_*` functions The functions which prepare schema change mutations (such as `prepare_new_column_family_announcement`) would use internally generated timestamps for these mutations. When schema changes are managed by group 0 we want to ensure that timestamps of mutations applied through Raft are monotonic. We will generate these timestamps at call sites and pass them into the `prepare_` functions. This commit prepares the APIs.	2022-01-24 15:12:50 +01:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Pavel Emelyanov	00de5f4876	validation: Make validate_column_family use data_dictionary::database And instantly convert the validate_keyspace() as it's not called from anywhere but the validate_column_family(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-14 13:00:53 +03:00
Pavel Emelyanov	b6bc7a9b29	client_state: Make has_column_family_access use data_dictionary::database Straightforward replacement. Internals of the has_column_family_access() temporarily get .real_database(), but it will be changed soon. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-14 12:55:15 +03:00
Asias He	a8ad385ecd	repair: Get rid of the gc_grace_seconds The gc_grace_seconds is a very fragile and broken design inherited from Cassandra. Deleted data can be resurrected if cluster wide repair is not performed within gc_grace_seconds. This design pushes the job of keeping the database consistent to the user. In practice, it is very hard to guarantee repair is performed within gc_grace_seconds all the time. For example, repair workload has the lowest priority in the system which can be slowed down by the higher priority workload, so that there is no guarantee when a repair can finish. A gc_grace_seconds value that used to work might not work after data volume grows in a cluster. Users might want to avoid running repair during a specific period where latency is the top priority for their business. To solve this problem, an automatic mechanism to protect against data resurrection is proposed and implemented. The main idea is to remove the tombstone only after the range that covers the tombstone is repaired. In this patch, a new table option tombstone_gc is added. The option is used to configure tombstone gc mode. For example: 1) GC a tombstone after gc_grace_seconds cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'timeout'} ; This is the default mode. If no tombstone_gc option is specified by the user, the old gc_grace_seconds based gc will be used. 2) Never GC a tombstone cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'disabled'}; 3) GC a tombstone immediately cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'immediate'}; 4) GC a tombstone after repair cqlsh> ALTER TABLE ks.cf WITH tombstone_gc = {'mode':'repair'}; In addition to the 'mode' option, another option 'propagation_delay_in_seconds' is added. It defines the max time a write could possibly delay before it eventually arrives at a node. A new gossip feature TOMBSTONE_GC_OPTIONS is added. The new tombstone_gc option can only be used after the whole cluster supports the new feature. A mixed cluster works with no problem. Tests: compaction_test.py, ninja test Fixes #3560 [avi: resolve conflicts vs data_dictionary]	2022-01-04 19:48:14 +02:00
Pavel Emelyanov	70ad1d9933	create_\|alter_table_statement: Make check_restricted_table_properties() accept query_processor Patch check_restricted_table_properties() and its callers Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 10:54:28 +03:00
Pavel Emelyanov	b990ca5550	cql3: Make .validate() and .check_access() accept query_processor This is mostly a sed script that replaces methods' first argument plus fixes of compiler-generated errors. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-12-23 10:53:44 +03:00
Avi Kivity	d768e9fac5	cql3, related: switch to data_dictionary Stop using database (and including database.hh) for schema related purposes and use data_dictionary instead. data_dictionary::database::real_database() is called from several places, for these reasons: - calling yet-to-be-converted code - callers with a legitimate need to access data (e.g. system_keyspace) but with the ::database accessor removed from query_processor. We'll need to find another way to supply system_keyspace with data access. - to gain access to the wasm engine for testing whether user-defined functions compile. We'll have to find another way to do this as well. The change is a straightforward replacement. One case in modification_statement had to change a capture, but everything else was just a search-and-replace. Some files that lost "database.hh" gained "mutation.hh", which they previously had access to through "database.hh".	2021-12-15 13:54:23 +02:00
Gleb Natapov	730171f4df	cql3: drop schema_altering_statement::announce_migration() It is no longer used.	2021-12-11 12:31:07 +02:00
Gleb Natapov	af6b3d985d	cql3: move ALTER TABLE statement to prepare_schema_mutations() api	2021-12-11 12:31:07 +02:00
Gleb Natapov	688efff6b5	cql3: factor out mutation creation code into a separate function for ALTER TABLE The function will be used in the next patch.	2021-12-11 12:31:07 +02:00
Juliusz Stasiewicz	5a8741a1ca	cdc: Throw when ALTERing cdc options without "enabled":"..." The problem was that such a command: ``` alter table ks.cf with cdc={'ttl': 120}; ``` would assume that "enabled" parameter is the default ("false") and, in effect, disable CDC on that table. This commit forces the user to specify that key. Fixes #6475 Closes #9720	2021-12-07 17:37:44 +02:00
Avi Kivity	9424f6e12f	cql3: replace seastar::sprint() with fmt::format() sprint() is obsolete. Note some calls were to helper functions that use sprint(), not to sprint() directly, so both the helpers and the callers were modified.	2021-10-27 17:02:00 +03:00
Nadav Har'El	4d7f55a29f	cql: add configurable restriction of DateTieredCompactionStrategy DateTieredCompactionStrategy (DTCS) has been un-recommended for a long time (users should use TimeWindowCompactionStrategy, TWCS, instead). This patch adds a new configuration option - restrict_dtcs - which can be used to restrict the ability to use DTCS in CREATE TABLE or ALTER TABLE statements. This is part of a "safe mode" effort to allow an installation to restrict operations which are un-recommended or dangerous. The new restrict_dtcs option has three values: "true", "false", and "warn": For the time being, "false" is still the default, and means DTCS is not restricted and can still be used freely. We can easily change this default in a followup patch. Setting a value of "true" means that DTCS is restricted - trying to create a table or alter a table with it will fail with an error. Setting a value of "warn" will allow the create or alter operation, but will warn the user - both with a warning message which will immediately appear in cqlsh (for example), and with a log message. Fixes #8914. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210624122411.435361-1-nyh@scylladb.com>	2021-06-24 20:59:27 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Piotr Sarna	7e6beabf27	migration_manager: allow table updates with timestamp In order to avoid needless schema disagreements, a way of announcing a schema change with fixed timestamp is added. That way, when nodes update schemas of their internal tables (e.g. during updates), it's possible for all nodes to use an identical timestamp for this operation, which in turn makes their digests identical.	2021-05-10 10:10:38 +02:00
Avi Kivity	daeddda7cc	treewide: remove inclusions of storage_proxy.hh from headers storage_proxy.hh is huge and includes many headers itself, so remove its inclusions from headers and re-add smaller headers where needed (and storage_proxy.hh itself in source files that need it). Ref #1.	2021-04-20 21:23:00 +03:00
Pavel Emelyanov	12e4269dce	cql3: Get database directly from query processor After previous patches some places in cql3 code take a long path to get database reference: query processor -> storage proxy -> database The query processor can provide the database reference by itself, so take this chance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:36:04 +03:00
Pavel Emelyanov	464e58abf7	cql3: Use query_processor::get_migration_manager() (trivial cases) Most of the schema altering statements implementations can now stop calling for global migration manager instance and get it from the query processor. Here are the trivial cases when the query processor is just available at the place where it's needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:35:36 +03:00
Pavel Emelyanov	1e8f0963f9	cql3: Pass query processor to announce_migration:s Now that the only call to .announce_migration has the query processor at hand -- pass it to the real statements. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:00:33 +03:00
Gleb Natapov	c9392095ce	cql3: store cf_prop_defs as optional instead of shared_ptr It being a shared_ptr is a remnant of translation from Java. Message-Id: <20210216123931.80280-3-gleb@scylladb.com>	2021-02-16 15:58:38 +02:00
Gleb Natapov	805da054e7	cql3: store cf_name as optional in cf_statement instead of shared_ptr It being a shared_ptr is a remnant of translation from Java. Message-Id: <20210216123931.80280-2-gleb@scylladb.com>	2021-02-16 15:58:37 +02:00
Gleb Natapov	d3aa17591c	migration_manager: drop announce_locally flag It looks like the history of the flag begins in Cassandra's https://issues.apache.org/jira/browse/CASSANDRA-7327 where it is introduced to speed up tests by not needing to start the gossiper. The thing is we always start gossiper in our cql tests, so the flag only introduces noise. And, of course, since we want to move schema to use raft it goes against the nature of the raft to be able to apply modification only locally, so we better get rid of the capability ASAP. Tests: units(dev, debug) Message-Id: <20201230111101.4037543-2-gleb@scylladb.com>	2021-01-03 13:58:09 +02:00
Pavel Emelyanov	b0c4a9087d	client_state: Add database& arg to has_column_family_access It is called from cql3/statements' check_access methods and from thrift handlers. The former have proxy argument from which they can get the database. The latter already have the database itself on board. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-12-11 18:49:16 +03:00
Dejan Mircevski	1beb57ad9d	auth: Permit ALTER options on system_auth tables These alterations cannot break the database irreparably, so allow them. Expand command_desc as required. Add a type (rather than command_desc) parameter to has_column_family_access() to minimize code changes. Fixes #7057 Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2020-11-16 22:32:32 -05:00
Avi Kivity	3daa49f098	Merge "materialized views: Fix undefined behavior on base table schema changes" from Tomasz " The view_info object, which is attached to the schema object of the view, contains a data structure called "base_non_pk_columns_in_view_pk". This data structure contains column ids of the base table so is valid only for a particular version of the base table schema. This data structure is used by materialized view code to interpret mutations of the base table, those coming from base table writes, or reads of the base table done as part of view updates or view building. The base table schema version of that data structure must match the schema version of the mutation fragments, otherwise we hit undefined behavior. This may include aborts, exceptions, segfaults, or data corruption (e.g. writes landing in the wrong column in the view). Before this patch, we could get schema version mismatch here after the base table was altered. That's because the view schema did not change when the base table was altered. Another problem was that view building was using the current table's schema to interpret the fragments and invoke view building. That's incorrect for two reasons. First, fragments generated by a reader must be accessed only using the reader's schema. Second, base_non_pk_columns_in_view_pk of the recorded view ptrs may no longer match the current base table schema, which is used to generate the view updates. Part of the fix is to extract base_non_pk_columns_in_view_pk into a third entity called base_dependent_view_info, which changes both on base table schema changes and view schema changes. It is managed by a shared pointer so that we can take immutable snapshots of it, just like with schema_ptr. When starting the view update, the base table schema_ptr and the corresponding base_dependent_view_info have to match. So we must obtain them atomically, and base_dependent_view_info cannot change during update. Also, whenever the base table schema changes, we must update base_dependent_view_infos of all attached views (atomically) so that it matches the base table schema. Fixes #7061. Tests: - unit (dev) - [v1] manual (reproduced using scylla binary and cqlsh) " * tag 'mv-schema-mismatch-fix-v2' of github.com:tgrabiec/scylla: db: view: Refactor view_info::initialize_base_dependent_fields() tests: mv: Test dropping columns from base table db: view: Fix incorrect schema access during view building after base table schema changes schema: Call on_internal_error() when out of range id is passed to column_at() db: views: Fix undefined behavior on base table schema changes db: views: Introduce has_base_non_pk_columns_in_view_pk()	2020-08-26 17:37:52 +03:00
Raphael S. Carvalho	1c29f0a43d	cql3/statements: verify that counter column cannot be added into non-counter table A check, to validate that counter column cannot be added into non-counter table, is missing for alter table statement. Validation is performed when building new schema, but it's limited to checking that a schema will not contain both counter and non-counter columns. Due to lack of validation, the added counter column could be incorrectly persisted to the schema, but this results in a crash when setting the new schema to its table. On restart, it can be confirmed that the schema change was indeed persisted when describing the table. This problem is fixed by doing proper validation for the alter table statement, which consists of making sure a new counter column cannot be added to a non-counter table. The test cdc_disallow_cdc_for_counters_test is adjusted because one of its tests was built on the assumption that counter column can be added into a non-counter table. Fixes #7065. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20200824155709.34743-1-raphaelsc@scylladb.com>	2020-08-25 10:41:54 +03:00
Tomasz Grabiec	dc18117b82	db: views: Introduce has_base_non_pk_columns_in_view_pk() In preparation for pushing _base_non_pk_columns_in_view_pk deeper.	2020-08-20 14:53:07 +02:00
Piotr Jastrzebski	c001374636	codebase wide: replace count with contains C++20 introduced `contains` member functions for maps and sets for checking whether an element is present in the collection. Previously `count` function was often used in various ways. `contains` not only expresses the intent of the code better but also does it in a more unified way. This commit replaces all the occurrences of the `count` with the `contains`. Tests: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>	2020-08-15 20:26:02 +03:00
Piotr Sarna	a544ca64e2	cql3: refuse to change schema internally for distributed tables Changing the schemas via internal calls to CQL is dangerous, since the changes are not propagated to other nodes. Thus, it should never be used for regular distributed tables. The guarding code was already added for ALTER TABLE statement and it's now expanded to cover all schema altering statements. Tests: unit(dev) Fixes #6700	2020-07-07 09:32:33 +02:00
Piotr Sarna	835734c99d	cql3: disallow altering non-local tables with local queries The database has a mechanism of performing internal CQL queries, mainly to edit its own local tables. Unfortunately, it's easy to use the interface incorrectly - e.g. issuing an `ALTER TABLE` statement on a non-local table will result in not propagating the schema change to other nodes, which in turn leads to inconsistencies. In order to avoid such mistakes (one of them was a root cause of #6513), when an attempt to alter a distributed table via a local interface is performed, it results in an error. Tests: unit(dev) Fixes #6700 Message-Id: <61be3defb57be79f486e6067ceff4f4c965e34cb.1592990796.git.sarna@scylladb.com>	2020-06-24 12:51:40 +03:00
Nadav Har'El	7922b9eb8f	materialized views: reduce recompilation when db/view/view.hh changes. Before this patch, when db/view/view.hh was modified, 89 source files had to be recompiled. After this patch, this number is down to 5. Most of the irrelevant source files got view.hh by including database.hh, which included view.hh just for the definition of statistics. So in this patch we split the view statistics to a separate header file, view_stats.hh, and database.hh only includes that. A few source files which included only database.hh and also needed view.hh (for materialized-view related functions) now need to include view.hh explicitly. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20200319121031.540-1-nyh@scylladb.com>	2020-03-19 15:46:14 +02:00
Rafael Ávila de Espíndola	c0072eab30	everywhere: Be more explicit that we don't want std::make_shared If sstring is made an alias to std::string ADL causes std::make_shared to be found. Explicitly ask for ::make_shared. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2020-03-10 13:13:48 -07:00
Piotr Dulikowski	e98766dd81	alter_table_statement: fix indentation	2020-03-05 16:11:21 +01:00
Piotr Dulikowski	828077be5e	cf_prop_defs: initialize schema extensions externally Moves initialization of schema extensions outside of cf_prop_defs. This allows constructing these extensions once and using them several times in cf_prop_defs' methods without caching or recalculating them.	2020-03-05 16:11:21 +01:00
Piotr Dulikowski	260c47d758	cf_prop_defs: pass database& to ::validate, not db::extensions& Changes cf_prop_defs::validate function to take database& as an argument instead of db::extensions&. This change will allow us to move the check which asserts that the cluster supports CDC from `apply_to_builder` to `validate` method.	2020-03-05 16:11:21 +01:00
Pavel Emelyanov	60bdf0685c	cql3: Clean cql3/ from remaining storage_service mentionings These are several #include-s and the no longer valid comment. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-24 11:17:47 +03:00
Pavel Emelyanov	6892dbdde7	cql3: Add storage_proxy argument to .check_access method Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-24 11:17:19 +03:00
Pavel Solodovnikov	a46f235092	cql3: prefer passing schema as const ref instead of shared_ptr De-pointerize cql3 code APIs further: change some call sites to pass `schema` as const-ref instead of `shared_ptr`. Affected functions known to be expecting always non-null pointer to schema and don't store or pass the pointer somewhere else, assuming it's safe to give them just a reference. Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20200218142338.69824-1-pa.solodovnikov@scylladb.com>	2020-02-18 20:13:10 +02:00
Pavel Solodovnikov	abb3a7e218	cql3: minor sweeps through the cql layer code to reduce shared_ptrs count Convert some more helper functions to accept const reference to column_specification and column_identifier instead of shared_ptr. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2020-02-16 17:24:26 +03:00
Pavel Emelyanov	abe588888d	database: Use feature service Keep local feature_service reference on database. This relaxes the circular storage_service <-> database reference, but does not remove it completely. This needs some args tossing in apply_to_builder, but it's rather straightforward, so comes in the same patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2020-02-03 15:16:23 +03:00
Piotr Sarna	155a47cc55	view: handle multiple regular base columns in view pk Previous assumption was that there can only be one regular base column in the view key. The assumption is still correct for tables created via CQL, but it's internally possible to create a view with multiple such columns - the new assumption is that if there are multiple columns, they share their liveness. This patch is vital for indexing to work properly on alternator, so it would be best to solve the issue upstream. I strived to leave the existing semantics intact as long as only up to one regular column is part of the materialized view primary key, which is the case for Scylla's materialized views. For alternator it may not be true, but all regular columns in alternator share liveness info (since alternator does not support per-column TTL), which is sufficient to compute view updates in a consistent way. Fixes #5006 Tests: unit(dev), alternator(test_gsi_update_second_regular_base_column, tic-tac-toe demo) Message-Id: <c9dec243ce903d3a922ce077dc274f988bcf5d57.1567604945.git.sarna@scylladb.com>	2020-01-07 12:18:39 +01:00
Calle Wilund	cb0117eb44	cdc: Handle schema changes via migration manager callbacks This allows us to create/alter/drop log and desc tables "atomically" with the base, by including these mutations in the original mutation set, i.e. batch create/alter tables. Note that population does not happen until types are actually already put into database (duh), thus there _is_ still a gap between creating cdc and it being truly usable. This may or may not need handling later.	2019-12-09 14:35:04 +00:00
Konstantin Osipov	90346236ac	cql: propagate const property through prepared statement tree. cql_statement is a class representing a prepared statement in Scylla. It is used concurrently during execution, so it is important that its state is not changed by execution. Add const qualifier to the execution methods family, throughout the cql hierarchy. Mark a few places which do mutate prepared statement state during execution as mutable. While these are not affecting production today, as code ages, they may become a source of latent bugs and should be moved out of the prepared state or evaluated at prepare eventually: cf_property_defs::_compaction_strategy_class, list_permissions_statement::_resource, permission_altering_statement::_resource, property_definitions::_properties, select_statement::_opts	2019-11-26 14:18:17 +03:00
Piotr Jastrzebski	96c800ed0b	modification_statement: log in cdc partition key of a change Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-10-17 11:28:23 +02:00
Piotr Jastrzebski	a45c894032	alter_table_statement: handle 'with cdc =' Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-10-17 11:28:23 +02:00
Nadav Har'El	a45b6e41a0	materialized views and secondary index: sometimes allow dropping base columns Until this patch, dropping columns from a table was completely forbidden if this table has any materialized views or secondary indexes. However, this is excessively harsh, and not compatible with Cassandra which does allow dropping columns from a base table which has a secondary index on other columns. This incompatibility was raised in the following Stackoverflow question: https://stackoverflow.com/questions/55757273/error-while-dropping-column-from-a-table-with-secondary-index-scylladb/55776490 In this patch, we allow dropping a base table column if none of its materialized views needs this column. Columns selected by a view (as regular or key columns) are needed by it, of course, but when virtual columns are used (namely, there is a view with same key columns as the base), all columns are needed by the view, so unfortunately none of the columns may be dropped. After this patch, when a base-table column cannot be dropped because one of the materialized views needs it, the error message will look like: exceptions::invalid_request_exception: Cannot drop column a from base table ks.cf: a materialized view cf_a_idx_index needs this column. This patch also includes extensive testing for the cases where dropping columns are now allowed, and not allowed. The secondary-index tests are especially interesting, because they demonstrate that now usually (when a non-key column is being indexed) dropping columns will be allowed, which is what originally bothered the Stackoverflow user. Fixes #4448. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190429214805.2972-1-nyh@scylladb.com>	2019-04-30 12:13:10 +01:00
Rafael Ávila de Espíndola	53ab298957	Turn cql3_type into a trivial wrapper over data_type Both cql3_type and abstract_type are normally used inside shared_ptr. This creates a problem when an abstract_type needs to refer to a cql3_type as that creates a cycle. To avoid warnings from asan, we were using a std::unordered_map to store one of the edges of the cycle. This avoids the warning, but wastes even more memory. Even before this patch cql3_type was a fairly light weight structure. This patch pushes in that direction and now cql3_type is a struct with a single member variable, a data_type. This avoids the reference cycle and is easier to understand IMHO. Tests: unit (dev) Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>	2019-03-20 14:10:28 -07:00
Avi Kivity	c3ef99f84f	schema_tables: remove #include of database.hh Distribute in source files (and one header - table_helper.hh) that need it.	2019-01-05 15:43:07 +02:00
Avi Kivity	d2dae3af86	cql3: reduce dependencies on db/config.hh Instead of accessing extensions via config, access it via database::extensions(). This reduces recompilations when configuration is extended.	2018-12-21 20:15:43 +00:00
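
The four `tombstone_gc` modes described in commit a8ad385ecd can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not ScyllaDB's code: the names `tombstone_gc_mode`, `gc_inputs`, and `may_gc_tombstone` are hypothetical, times are plain seconds, the `repair` rule assumes a tombstone becomes collectable once a repair covering its range started after the deletion, and `propagation_delay_in_seconds` is not modeled.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical names throughout; a sketch of the four modes listed in
// the commit message, not the actual ScyllaDB implementation.
enum class tombstone_gc_mode { timeout, disabled, immediate, repair };

struct gc_inputs {
    std::int64_t deletion_time;     // when the tombstone was written (seconds)
    std::int64_t now;               // current time (seconds)
    std::int64_t gc_grace_seconds;  // the table's gc_grace_seconds setting
    // Start time of the last repair that covered the tombstone's range,
    // or nullopt if the range has never been repaired (assumed input).
    std::optional<std::int64_t> last_repair_time;
};

// Returns true if the tombstone may be garbage-collected under `mode`.
inline bool may_gc_tombstone(tombstone_gc_mode mode, const gc_inputs& in) {
    switch (mode) {
    case tombstone_gc_mode::timeout:
        // Default mode: collect once gc_grace_seconds have elapsed.
        return in.deletion_time + in.gc_grace_seconds <= in.now;
    case tombstone_gc_mode::disabled:
        // Never collect.
        return false;
    case tombstone_gc_mode::immediate:
        // Collect without waiting.
        return true;
    case tombstone_gc_mode::repair:
        // Assumption: collect only if a repair of the covering range
        // started after the tombstone was written.
        return in.last_repair_time.has_value()
            && in.deletion_time < *in.last_repair_time;
    }
    return false;  // unreachable for valid enum values
}
```

In this sketch, a tombstone written at time 100 with no recorded repair is never collectable in `repair` mode, regardless of how much time has passed, which is the property the commit message relies on to prevent data resurrection.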


81 Commits