scylladb

Author	SHA1	Message	Date
Botond Dénes	4281d18c2e	Merge 'schema: Apply `sstable_compression_user_table_options` to CQL aux and Alternator tables' from Nikos Dragazis In PR `5b6570be52` we introduced the config option `sstable_compression_user_table_options` to allow adjusting the default compression settings for user tables. However, the new option was hooked into the CQL layer and applied only to CQL base tables, not to the whole spectrum of user tables: CQL auxiliary tables (materialized views, secondary indexes, CDC log tables), Alternator base tables, Alternator auxiliary tables (GSIs, LSIs, Streams). This gap also led to inconsistent default compression algorithms after we changed the option’s default algorithm from LZ4 to LZ4WithDicts (`adf9c426c2`). This series introduces a general “schema initializer” mechanism in `schema_builder` and uses it to apply the default compression settings uniformly across all user tables. This ensures that all base and aux tables take their default compression settings from config. Fixes #26914. Backport justification: LZ4WithDicts is the new default since 2025.4, but the config option exists since 2025.2. Based on severity, I suggest we backport only to 2025.4 to maintain consistency of the defaults. Closes scylladb/scylladb#27204 * github.com:scylladb/scylladb: db/config: Update sstable_compression_user_table_options description schema: Add initializer for compression defaults schema: Generalize static configurators into schema initializers schema: Initialize static properties eagerly db: config: Add accessor for sstable_compression_user_table_options test: Check that CQL and Alternator tables respect compression config	2026-01-22 06:50:48 +02:00
Botond Dénes	122b7847e5	Merge 'index: Accept view properties in CREATE INDEX' from Dawid Mędrek Problem ------- Secondary indexes are implemented via materialized views under the hood. The way an index behaves is determined by the configuration of the view. Currently, it can be modified by performing the CQL statement `ALTER MATERIALIZED VIEW` on it. However, that raises some concerns. Consider, for instance, the following scenario: 1. The user creates a secondary index on a table. 2. In parallel, the user performs writes to the base table. 3. The user modifies the underlying materialized view, e.g. by setting the `synchronous_updates` to `true` [1]. Some of the writes that happened before step 3 used the default value of the property (which is `false`). That had an actual consequence on what happened later on: the view updates were performed asynchronously. Only after step 3 had finished did it change. Unfortunately, as of now, there is no way to avoid a situation like that. Whenever the user wants to configure a secondary index they're creating, they need to do it in another schema change. Since it's not always possible to control how the database is manipulated in the meantime, it leads to problems like the one described. That's not all, though. The fact that it's not possible to configure secondary indexes is inconsistent with other schema entities. When it comes to tables or materialized views, the user always has a means to set some or even all of the properties during their creation. Solution -------- The solution to this problem is extending the `CREATE INDEX` CQL statement by view properties. The syntax is of form: ``` > CREATE INDEX <index name> > .. ON <keyspace>.<table> (<columns>) > .. WITH <properties> ``` where `<properties>` corresponds to both index-specific and view properties [2, 3]. 
View properties can only be used with indexes implemented with materialized views; for example, it will be impossible to create a vector index when specifying any view property (see examples below). When a view property is provided, it will be applied when creating the underlying materialized view. The behavior should be similar to how other CQL statements responsible for creating schema entities work. High-level implementation strategy ---------------------------------- 1. Make auxiliary changes. 2. Introduce data structures representing the new set of index properties: both index-specific and those corresponding to the underlying view. 3. Extend `CREATE INDEX` to accept view properties. 4. Extend `DESCRIBE INDEX` and other `DESCRIBE` statements to include view properties in their output. User documentation is also updated at the steps to reflect the corresponding changes. Implementation considerations ----------------------------- There are a number of schema properties that are now obsolete. They're accepted by other CQL statements, but they have no effect. They include: * `index_interval` * `replicate_on_write` * `populate_io_cache_on_flush` * `read_repair_chance` * `dclocal_read_repair_chance` If the user tries to create a secondary index specifying any of those keywords, the statement will fail with an appropriate error (see examples below). Unlike materialized views, we forbid specifying the clustering order when creating a secondary index [4]. This limitation may be lifted later on, but it's a detail that may or may not prove troublesome. It's better to postpone covering it to when we have a better perspective on the consequences it would bring. Examples -------- Good examples ``` > CREATE INDEX idx ON ks.t (v); > CREATE INDEX idx ON ks.t (v) WITH comment = 'ok view property'; > CREATE INDEX idx ON ks.t (v) .. WITH comment = 'multiple view properties are ok' .. AND synchronous_updates = true; > CREATE INDEX idx ON ks.t (v) .. 
WITH comment = 'default value ok' .. AND synchronous_updates = false; ``` Bad examples ``` > CREATE INDEX idx ON ks.t (v) WITH replicate_on_write = true; SyntaxException: Unknown property 'replicate_on_write' > CREATE INDEX idx ON ks.t (v) .. WITH OPTIONS = {'option1': 'value1'} .. AND comment = 'some text'; InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot specify options for a non-CUSTOM index" > CREATE CUSTOM INDEX idx ON ks.t (v) .. WITH OPTIONS = {'option1': 'value1'} .. AND comment = 'some text'; InvalidRequest: Error from server: code=2200 [Invalid query] message="CUSTOM index requires specifying the index class" > CREATE CUSTOM INDEX idx ON ks.t (v) .. USING 'vector_index' .. WITH OPTIONS = {'option1': 'value1'} .. AND comment = 'some text'; InvalidRequest: Error from server: code=2200 [Invalid query] message="You cannot use view properties with a vector index" > CREATE INDEX idx ON ks.t (v) WITH CLUSTERING ORDER BY (v ASC); InvalidRequest: Error from server: code=2200 [Invalid query] message="Indexes do not allow for specifying the clustering order" ``` and so on. For more examples, see the relevant tests. References: [1] https://docs.scylladb.com/manual/branch-2025.4/cql/cql-extensions.html#synchronous-materialized-views [2] https://docs.scylladb.com/manual/branch-2025.4/cql/secondary-indexes.html#create-index [3] https://docs.scylladb.com/manual/branch-2025.4/cql/mv.html#mv-options [4] https://docs.scylladb.com/manual/branch-2025.4/cql/dml/select.html#ordering-clause Fixes scylladb/scylladb#16454 Backport: not needed. This is an enhancement. 
Closes scylladb/scylladb#24977 * github.com:scylladb/scylladb: cql3: Extend DESC INDEX by view properties cql3: Forbid using CLUSTERING ORDER BY when creating index cql3: Extend CREATE INDEX by MV properties cql3/statements/create_index_statement: Allow for view options cql3/statements/create_index_statement: Rename member cql3/statements/index_prop_defs: Re-introduce index_prop_defs cql3/statements/property_definitions: Add extract_property() cql3/statements/index_prop_defs.cc: Add namespace cql3/statements/index_prop_defs.hh: Rename type cql3/statements/view_prop_defs.cc: Move validation logic into file cql3/statements: Introduce view_prop_defs.{hh,cc} cql3/statements/create_view_statement.cc: Move validation of ID schema/schema.hh: Do not include index_prop_defs.hh	2026-01-14 09:54:27 +02:00
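The rules from the merge description above can be modelled in a few lines. This is a minimal sketch and not the ScyllaDB implementation: `validate_create_index`, its parameters, the two exception classes and the order of the checks are all hypothetical. Only the obsolete property names and the error messages are taken from the commit message.

```python
# Hypothetical model of the CREATE INDEX checks described in the commit message.
# The function, its parameters and the check order are assumptions for illustration.

OBSOLETE_PROPERTIES = {
    "index_interval",
    "replicate_on_write",
    "populate_io_cache_on_flush",
    "read_repair_chance",
    "dclocal_read_repair_chance",
}


class SyntaxException(Exception):
    """Stand-in for the CQL SyntaxException shown in the bad examples."""


class InvalidRequest(Exception):
    """Stand-in for the CQL InvalidRequest shown in the bad examples."""


def validate_create_index(*, custom=False, index_class=None, options=None,
                          view_properties=None, clustering_order=False):
    """Return the view properties to apply, or raise like the bad examples."""
    view_properties = dict(view_properties or {})

    # Obsolete schema properties are rejected outright.
    for name in view_properties:
        if name in OBSOLETE_PROPERTIES:
            raise SyntaxException(f"Unknown property '{name}'")

    # Unlike materialized views, indexes may not choose a clustering order.
    if clustering_order:
        raise InvalidRequest(
            "Indexes do not allow for specifying the clustering order")

    if options and not custom:
        raise InvalidRequest("Cannot specify options for a non-CUSTOM index")
    if custom and index_class is None:
        raise InvalidRequest("CUSTOM index requires specifying the index class")

    # View properties only make sense for indexes backed by a materialized view.
    if index_class == "vector_index" and view_properties:
        raise InvalidRequest(
            "You cannot use view properties with a vector index")

    return view_properties


# Mirrors one of the "good examples".
props = validate_create_index(
    view_properties={"comment": "ok", "synchronous_updates": True})
```

Returning the accepted properties from the validator reflects the point of the series: the properties are applied when the underlying view is created, so no separate `ALTER MATERIALIZED VIEW` step is needed.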
Nikos Dragazis	1e37781d86	schema: Add initializer for compression defaults In PR `5b6570be52` we introduced the config option `sstable_compression_user_table_options` to allow adjusting the default compression settings for user tables. However, the new option was hooked into the CQL layer and applied only to CQL base tables, not to the whole spectrum of user tables: CQL auxiliary tables (materialized views, secondary indexes, CDC log tables), Alternator base tables, Alternator auxiliary tables (GSIs, LSIs, Streams). Fix this by moving the logic into the `schema_builder` via a schema initializer. This ensures that the default compression settings are applied uniformly regardless of how the table is created, while also keeping the logic in a central place. Register the initializer at startup in all executables where schemas are being used (`scylla_main()`, `scylla_sstable_main()`, `cql_test_env`). Finally, remove the ad-hoc logic from `create_table_statement` (redundant as of this patch), remove the xfail markers from the relevant tests and adjust `test_describe_cdc_log_table_create_statement` to expect LZ4WithDicts as the default compressor. Fixes #26914. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2026-01-13 20:45:59 +02:00
Nikos Dragazis	d5ec66bc0c	schema: Generalize static configurators into schema initializers Extend the `static_configurator` mechanism to support initialization of arbitrary schema properties, not only static ones, by passing a `schema_builder` reference to the configurator interface. As part of this change, rename `static_configurator` to `schema_initializer` to better reflect its broader responsibility. Add a checkpoint/restore mechanism to allow de-registering an initializer (useful for testing; will be used in the next patch). Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2026-01-13 20:45:59 +02:00
Nikos Dragazis	5b4aa4b6a6	schema: Initialize static properties eagerly Schemas maintain a set of so-called "static properties". These are not user-visible schema properties; they are internal values carried by in-memory `schema` objects for convenience (`349bc1a9b6`, https://github.com/scylladb/scylladb/pull/13170#issuecomment-1469848086). Currently, the initialization of these properties happens when a `schema_builder` builds a schema (`schema_builder::build()`), by invoking all registered "static configurators". This patch moves the initialization of static properties into the `schema_builder` constructor. With this change, the builder initializes the properties once, stores them in a data member, and reuses them for all schema objects that it builds. This doesn't affect correctness as the values produced by static configurators are "static" by nature; they do not depend on runtime state. In the next patch, we will replace the "static configurator" pattern with a more general pattern that also supports initialization of regular schema properties, not just static ones. Regular properties cannot be initialized in `build()` because users may have already explicitly set values via setters, and there is no way to distinguish between default values and explicitly assigned ones. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2026-01-13 20:45:55 +02:00
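The reasoning in the three commits above (initialize in the builder constructor so that explicit setters can still override the defaults) can be illustrated with a small model. This is a sketch under stated assumptions: `SchemaBuilder`, `register_schema_initializer` and the property names are hypothetical Python stand-ins, not the C++ `schema_builder` API.

```python
# Hypothetical model of "schema initializers"; none of these names are ScyllaDB APIs.

_initializers = []


def register_schema_initializer(fn):
    """Register a callable that receives every new builder (done at startup)."""
    _initializers.append(fn)


class SchemaBuilder:
    def __init__(self, name):
        self.name = name
        # Assumed hard-coded fallback, used when no initializer is registered.
        self.props = {"compression": "LZ4"}
        # Initializers run eagerly in the constructor, before any setter call,
        # so an explicit setter always wins over a config-derived default.
        for init in _initializers:
            init(self)

    def set_compression(self, algorithm):
        self.props["compression"] = algorithm
        return self

    def build(self):
        # build() only copies state. Applying defaults here would be unsafe:
        # there is no way to tell a default from an explicitly assigned value.
        return {"name": self.name, **self.props}


def compression_defaults_from_config(builder):
    # Stand-in for reading sstable_compression_user_table_options from config.
    builder.set_compression("LZ4WithDicts")


register_schema_initializer(compression_defaults_from_config)

default_schema = SchemaBuilder("ks.t").build()
explicit_schema = SchemaBuilder("ks.t").set_compression("Zstd").build()
```

Because every table type goes through the same builder, the default applies no matter which layer (CQL, Alternator, CDC) creates the table, which is the uniformity the series is after.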
Michael Litvak	3a06c32749	schema_registry: fix learning a schema with cdc schema When learning a schema that has a linked cdc schema, we also need to learn the cdc schema, and at the end the schema should point to the learned cdc schema. This is needed because the linked cdc schema is used for generating cdc mutations, and when we process the mutations later it is assumed in some places that the mutation's schema has a schema registry entry. We fix a scenario where we could end up with a schema that points to a cdc schema that doesn't have a schema registry entry. This could happen for example if the schema is loaded before it is learned, so when we learn it we see that it already has an entry. In that case, we need to set the cdc schema to the learned cdc schema as well, because it could have been loaded previously with a cdc schema that was not learned. Fixes scylladb/scylladb#27610 Closes scylladb/scylladb#27704	2025-12-17 20:01:00 +02:00
Dawid Mędrek	df0830044d	cql3: Extend DESC INDEX by view properties We're extending the logic of DESCRIBE INDEX to include properties of the underlying materialized view. Tests are provided to ensure the implementation works as intended.	2025-12-16 11:43:38 +01:00
Dawid Mędrek	11c109c623	schema/schema.hh: Do not include index_prop_defs.hh One of the upcoming commits will lead to a cyclic dependency of headers because `schema.hh` includes `index_prop_defs.hh`. To prevent that, we remove the include and replace it with a manually added alias. This is not a perfect solution, but doing it properly would require comprehensive changes. We can do that in a separate task.	2025-12-15 13:18:48 +01:00
Radosław Cybulski	d589e68642	Add precompiled headers to CMakeLists.txt Add precompiled header support to CMakeLists.txt and configure.py - it improves compilation time by approximately 10%. New header `stdafx.hh` is added, don't include it manually - the compiler will include it for you. The header contains includes from external libraries used by Scylla - seastar, standard library, linux headers and zlib. The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER` or configure.py --disable-precompiled-header to disable. The feature should be disabled, when trying to check headers - otherwise you might get false negatives on missing includes from seastar / abseil and so on. Note: following configuration needs to be added to ccache.conf: sloppiness = pch_defines,time_macros,include_file_mtime,include_file_ctime Closes scylladb/scylladb#26617	2025-11-21 12:27:41 +02:00
Dawid Mędrek	991c0f6e6d	schema/schema_builder.hh: Add set_properties We add a method used for overwriting the properties of a schema. It will be used to create a new schema based on another.	2025-11-17 11:46:32 +01:00
Dawid Mędrek	76b21d7a5a	schema: Add getter for schema::user_properties The getter will be used later to access the user properties and copy them to a fresh `schema_builder`.	2025-11-17 11:46:24 +01:00
Dawid Mędrek	3856c9d376	schema: Remove underscores in fields of schema::user_properties The fields are public, so according to the style guide, they should not start with an underscore.	2025-11-17 11:46:15 +01:00
Dawid Mędrek	5a0fddc9ee	schema: Extract user properties out of raw_schema The properties can be directly manipulated by the user via statements like `ALTER TABLE`. To better organize the structure of `raw_schema`, we encapsulate that data in the form of a dedicated struct. This change will be later used for applying multiple properties to `schema_builder` in one go.	2025-11-17 11:46:07 +01:00
Piotr Dulikowski	7f482c39eb	Merge '[schema] Speculative retry rounding fix' from Dario Mirovic This patch series re-enables support for speculative retry values `0` and `100`. These values were supported some time ago, before [schema: fix issue 21825: add validation for PERCENTILE values in speculative_retry configuration. #21879 ](https://github.com/scylladb/scylladb/pull/21879). When that PR prevented using invalid `101PERCENTILE` values, valid `100PERCENTILE` and `0PERCENTILE` values were prevented too. Reproduction steps from [[Bug]: drop schema and all tables after apply speculative_retry = '99.99PERCENTILE' #26369](https://github.com/scylladb/scylladb/issues/26369) are unable to reproduce the issue after the fix. A test is added to make sure the inclusive border values `0` and `100` are supported. Documentation is updated to give more information to the users. It now states that these border values are inclusive, and also that the precision, with automatic rounding, is 1 decimal digit. Fixes #26369 This is a bug fix. If at any time a client tries to use value >= 99.5 and < 100, the raft error will happen. Backport is needed. The code which introduced the inconsistency was introduced in 2025.2, so no backporting to 2025.1. Closes scylladb/scylladb#26909 * github.com:scylladb/scylladb: test: cqlpy: add test case for non-numeric PERCENTILE value schema: speculative_retry: update exception type for sstring ops docs: cql: ddl.rst: update speculative-retry-options test: cqlpy: add test for valid speculative_retry values schema: speculative_retry: allow 0 and 100 PERCENTILE values	2025-11-13 15:27:45 +01:00
Dario Mirovic	85f059c148	schema: speculative_retry: update exception type for sstring ops Change speculative_retry::to_sstring and speculative_retry::from_sstring to throw exceptions::configuration_exception instead of std::invalid_argument. These errors can be triggered by CQL, so appropriate CQL exception should be used. Reference: https://github.com/scylladb/scylladb/issues/24748#issuecomment-3025213304 Refs #26369	2025-11-09 13:55:57 +01:00
Dario Mirovic	da2ac90bb6	schema: speculative_retry: allow 0 and 100 PERCENTILE values This patch allows specifying 0 and 100 PERCENTILE values in speculative_retry. It was possible to specify these values before #21825. #21825 prevented specifying invalid values, like -1 and 101, but also prevented using 0 and 100. On top of that, speculative_retry::to_sstring function did rounding when formatting the string, which introduced inconsistency. Fixes #26369	2025-11-09 12:26:27 +01:00
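The inconsistency described above comes from formatting and parsing disagreeing about the border values. The following sketch shows the round trip, assuming one-decimal formatting as the updated documentation describes. `parse_percentile` and `format_percentile` are hypothetical helpers, not the `speculative_retry::from_sstring` and `to_sstring` code.

```python
# Hypothetical round-trip model for PERCENTILE values; not ScyllaDB code.

SUFFIX = "PERCENTILE"


def parse_percentile(text):
    """Parse e.g. '99.9PERCENTILE'; both borders 0 and 100 are accepted."""
    upper = text.strip().upper()
    if not upper.endswith(SUFFIX):
        raise ValueError(f"not a PERCENTILE value: {text!r}")
    try:
        value = float(upper[:-len(SUFFIX)])
    except ValueError:
        raise ValueError(f"non-numeric PERCENTILE value: {text!r}") from None
    # Inclusive range check; this also rejects NaN, which fails both comparisons.
    if not (0.0 <= value <= 100.0):
        raise ValueError(f"PERCENTILE value out of range [0, 100]: {text!r}")
    return value


def format_percentile(value):
    """Format with one decimal digit, so 99.99 is rounded up to 100.0."""
    return f"{value:.1f}{SUFFIX}"


# A value just under 100 is rounded to 100.0 when written out, so the
# parser has to accept 100 for the stored string to be readable again.
stored = format_percentile(parse_percentile("99.99PERCENTILE"))
reloaded = parse_percentile(stored)
```

With an exclusive upper bound in the parser, the `reloaded` step would fail on a string the formatter itself produced, which is the kind of mismatch the fix removes.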
Michael Litvak	ac96e40f13	schema: add pointer to CDC schema Add to the schema object a member that points to the CDC schema object that is compatible with this schema, if any. The compatible CDC schema is created and altered with its base schema in the same group0 operation. When generating CDC log mutations for some base mutation we want them to be created using a compatible schema that has a CDC column corresponding to each base column. This change will allow us to find the right CDC schema given a base mutation. We also update the relevant structures in the schema registry that are related to learning about schemas and transporting schemas across shards or nodes. When transporting a schema as frozen_schema, we need to transport the frozen cdc schema as well, and set it again when unfreezing and reconstructing the schema. When adding a schema to the registry, we need to ensure its CDC schema is added to the registry as well. Currently we always set the CDC schema to nullptr and maintain the previous behavior. We will change it in a later commit. Until then, we mark all places where CDC schema is passed clearly so we don't forget it.	2025-10-21 14:13:43 +02:00
Michael Litvak	60f5c93249	schema_registry: remove base_info from global_schema_ptr Remove the _base_info member from global_schema_ptr, and use the base_info we have stored in the schema registry entry instead. Currently when constructing a global_schema_ptr from a schema_ptr it extracts and stores the base_info from the schema_ptr. Later it uses it to reconstruct the schema_ptr, together with the frozen schema from the schema registry entry. But we can use the base_info that is already stored in the schema registry entry.	2025-10-21 14:13:43 +02:00
Michael Litvak	085abef05d	schema_registry: use extended_frozen_schema in schema load Change the schema loader type in the schema_registry to return an extended_frozen_schema instead of view_schema_and_base_info, and remove view_schema_and_base_info which is not used anymore. The casting between them is trivial.	2025-10-21 14:13:43 +02:00
Michael Litvak	8c7c1db14b	schema_registry: replace frozen_schema+base_info with extended_frozen_schema The schema_registry_entry holds a frozen_schema and a base_info. The base_info is extracted from the schema_ptr on load of a schema_ptr, and it is used when unfreezing the schema. But this is exactly what extended_frozen_schema is doing, so we can just store an object of this type in the schema_registry_entry. This makes the code simpler because the schema registry doesn't need to be aware of the base_info.	2025-10-21 14:13:43 +02:00
Michael Litvak	278801b2a6	frozen_schema: extract info from schema_ptr in the constructor Currently we construct a frozen schema with base info in a few places, and the caller is responsible for constructing the frozen schema and extracting the base info if it's a view table. We change it to make it simpler and remove the burden from the caller. The caller can simply pass the schema_ptr, and the constructor for extended_frozen_schema will construct the frozen schema and extract the additional info it needs. This will make it easier to add additional fields, and reduces code duplication. We also make temporary castings between extended_frozen_schema and view_schema_and_base_info for the transition, which are trivial, until they are combined to a single type.	2025-10-21 14:13:42 +02:00
Michael Litvak	154d5c40c8	frozen_schema: rename frozen_schema_with_base_info to extended_frozen_schema This commit starts a series of refactoring commits of the frozen_schema to reduce duplication and make it easier to extend. Currently there are two essentially identical types, frozen_schema_with_base_info and view_schema_and_base_info in the schema_registry that hold a frozen_schema together with a base_info for view schemas. Their role is to pass around a frozen schema together with additional info that is extracted from the schema and passed around with it when transporting it across shards or nodes, and is needed for reconstructing it, and it is not part of the schema mutations. Our goal is to combine them to a single type that we will call extended_frozen_schema.	2025-10-21 14:13:42 +02:00
Tomasz Grabiec	b6df186e54	schema: Use definition from the header instead of open-coding it	2025-10-01 16:06:52 +02:00
Botond Dénes	86ed627fc4	compaction: move code to namespace compaction The namespace usage in this directory is very inconsistent, with files and classes scattered in: * global namespace * namespace compaction * namespace sstables There are cases where all three are used in the same file. This code used to live in sstables/ and some of it still retains namespace sstables as a heritage of that time. The mismatch between the dir (future module) and the namespace used is confusing, so finish the migration and move all code in compaction/ to namespace compaction too. This patch, although large, is mechanical and only the following kinds of changes are made: * replace namespace sstable {} with namespace compaction {} * add namespace compaction {} * drop/add sstables:: * drop/add compaction:: * move around forward-declarations so they are in the correct namespace context This refactoring revealed some awkward leftover coupling between sstables and compaction, in sstables/sstable_set.cc, where the make_sstable_set() methods of compaction strategies are implemented.	2025-09-25 15:03:56 +03:00
Ernest Zaslavsky	5ba5aec1f8	treewide: Move mutation related files to a `mutation` directory As requested in #22104, moved the files and fixed other includes and build system. Moved files: - combine.hh - collection_mutation.hh - collection_mutation.cc - converting_mutation_partition_applier.hh - converting_mutation_partition_applier.cc - counters.hh - counters.cc - timestamp.hh Fixes: #22104 This is a cleanup, no need to backport Closes scylladb/scylladb#25085	2025-09-24 13:23:38 +03:00
Ernest Zaslavsky	a1f18a8883	treewide: Move schema related files to a `schema` directory As requested in #22111 , moved the files and fixed other includes and build system. Moved files: - frozen_schema.hh - frozen_schema.cc - schema_mutations.hh - schema_mutations.cc - column_computation.hh Fixes: #22111 Closes scylladb/scylladb#25089	2025-09-17 17:31:05 +03:00
Radosław Cybulski	c242234552	Revert "build: add precompiled headers to CMakeLists.txt" This reverts commit `01bb7b629a`. Closes scylladb/scylladb#25735	2025-09-03 09:46:00 +03:00
Radosław Cybulski	01bb7b629a	build: add precompiled headers to CMakeLists.txt Add precompiled header support to CMakeLists.txt and configure.py - it improves compilation time by approximately 10%. New header `stdafx.hh` is added, don't include it manually - the compiler will include it for you. The header contains includes from external libraries used by Scylla - seastar, standard library, linux headers and zlib. The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER` or configure.py --disable-precompiled-header to disable. The feature should be disabled, when trying to check headers - otherwise you might get false negatives on missing includes from seastar / abseil and so on. Note: following configuration needs to be added to ccache.conf: sloppiness = pch_defines,time_macros Closes #25182	2025-08-27 21:37:54 +03:00
Calle Wilund	43f7eecf9e	compress: move compress.cc/hh to sstables/compressor Fixes #22106 Moves the shared compress components to sstables, and rename to match class type. Adjust includes, removing redundant/unneeded ones where possible. Closes scylladb/scylladb#25103	2025-07-31 13:10:41 +03:00
Ernest Zaslavsky	d2c5765a6b	treewide: Move keys related files to a new keys directory As requested in #22102, #22103 and #22105 moved the files and fixed other includes and build system. Moved files: - clustering_bounds_comparator.hh - keys.cc - keys.hh - clustering_interval_set.hh - clustering_key_filter.hh - clustering_ranges_walker.hh - compound_compat.hh - compound.hh - full_position.hh Fixes: #22102 Fixes: #22103 Fixes: #22105 Closes scylladb/scylladb#25082	2025-07-25 10:45:32 +03:00
Marcin Maliszkiewicz	19bc6ffcb0	replica: make truncate_table_on_all_shards get whole schema from table_shards Previously, for views and indexes, it fetched the base schema from db (and a couple of other properties). This is a problem once we introduce atomic tables and views deletion (in the following commit), because once we delete a table it can no longer be fetched from the db object, and truncation is performed after atomically deleting all relevant tables/views/indexes. Now the whole relevant schema will be fetched via global_table_ptr (table_shards) object.	2025-07-10 10:40:43 +02:00
Dawid Mędrek	ac9062644f	cql3: Represent create_statement using managed_string When describing a table, we need to do it carefully: if some columns were dropped, we must specify that explicitly by ``` ALTER TABLE {table} DROP {column} USING TIMESTAMP ... ``` in the result of the DESCRIBE statement. Failing to do so could lead to data resurrection. However, if a table has been altered many, many times, we might end up with a huge create statement. Constructing it could, in turn, trigger an oversized allocation. Some tests ran into that very problem in fact. In this commit, we want to mitigate the problem: instead of allocating a contiguous chunk of memory for the create statement, we use `fragmented_ostringstream` and `managed_string` to possibly keep data scattered in memory. It makes handling `cql3::description` less convenient in the code, but since the struct is pretty much immediately serialized after creating it, it's a very good trade-off. We provide a reproducer. It consistently passes with this commit, while having about 50% chance of failure before it (based on my own experiments). Playing with the parameters of the test doesn't seem to improve that chance, so let's keep it as-is. Fixes scylladb/scylladb#24018	2025-07-01 12:58:02 +02:00
Karol Nowacki	4577c66a04	cql, schema: Extend name length limit from 48 to 192 bytes This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes. The previous 48-byte limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389) and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint. This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases. The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data. When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID. For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name. The directory name for this log table becomes the longest possible representation. Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas. To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows: 255 bytes (common filesystem limit for a path component) - 32 bytes (for the 32-character UUID string) - 1 byte (for the '-' separator) - 15 bytes (for the '_scylla_cdc_log' suffix) - 15 bytes (reserved for future use) ---------- = 192 bytes (Maximum allowed name length) This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038). This patch also updates/adds all associated tests to validate the new 192-byte limit. The documentation has been updated accordingly.	2025-06-18 14:08:38 +02:00
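The arithmetic behind the 192-byte limit can be checked directly. The constants below restate the figures from the commit message. The helper names and the all-zero placeholder UUID are hypothetical and only serve to show that the longest directory name still fits in one path component.

```python
# Worked check of the name-length budget quoted in the commit message.
# Helper names and the placeholder UUID are assumptions for illustration.

PATH_COMPONENT_LIMIT = 255      # common filesystem limit for a path component
UUID_LENGTH = 32                # 32-character UUID string
SEPARATOR = "-"                 # between table name and UUID
CDC_LOG_SUFFIX = "_scylla_cdc_log"
RESERVED = 15                   # kept free for future use

MAX_NAME_LENGTH = (PATH_COMPONENT_LIMIT - UUID_LENGTH - len(SEPARATOR)
                   - len(CDC_LOG_SUFFIX) - RESERVED)


def is_valid_name(name):
    """The limit is expressed in bytes, so measure the encoded length."""
    return 0 < len(name.encode("utf-8")) <= MAX_NAME_LENGTH


def longest_directory_name(table_name):
    """Directory name of the CDC log table: the worst case for a given name."""
    placeholder_uuid = "0" * UUID_LENGTH
    return f"{table_name}{CDC_LOG_SUFFIX}{SEPARATOR}{placeholder_uuid}"


worst_case = longest_directory_name("t" * MAX_NAME_LENGTH)
spare = PATH_COMPONENT_LIMIT - len(worst_case)
```

For a name at the limit, the worst-case directory name is 240 bytes long, which leaves exactly the 15 reserved bytes below the 255-byte component limit.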
Avi Kivity	cd79a8fc25	Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz" This reverts commit `0b516da95b`, reversing changes made to `30199552ac`. It breaks cluster.random_failures.test_random_failures.test_random_failures in debug mode (at least). Fixes #24513	2025-06-16 22:38:12 +03:00
Marcin Maliszkiewicz	a27776b4ff	replica: make truncate_table_on_all_shards get whole schema from table_shards Previously, for views and indexes, it fetched the base schema from db (and a couple of other properties). This is a problem once we introduce atomic tables and views deletion (in the following commit), because once we delete a table it can no longer be fetched from the db object, and truncation is performed after atomically deleting all relevant tables/views/indexes. Now the whole relevant schema will be fetched via global_table_ptr (table_shards) object.	2025-06-06 08:50:33 +02:00
Nadav Har'El	d2844055ad	Merge 'index: implement schema management layer for vector search indexes' from null This pull request adds support for creating custom indexes (at a metadata level) as long as a supported custom class is provided (currently only vector search). The patch contains: - a change in CREATE INDEX statement that allows for the USING keyword to be present as long as one of the supported classes is used - support for describing custom indexes in the DESCRIBE statement - unit tests Co-authored by: @Balwancia Closes scylladb/scylladb#23720 * github.com:scylladb/scylladb: test/cqlpy: add custom index tests index: support storing metadata for custom indices	2025-05-22 12:19:36 +03:00
Michał Hudobski	05daa8dded	index: support storing metadata for custom indices Added function returning custom index class name. Added printing custom index class name when using DESCRIBE. Changed validation to reflect current support of indices.	2025-05-14 09:32:00 +02:00
Wojciech Mitros	d77f11d436	base_info: remove the lw_shared_ptr variant The base_dependent_view_info is no longer needed to be shared or modified in the view_info, so we no longer need to keep it as a shared pointer.	2025-04-24 01:08:40 +02:00
Wojciech Mitros	d7bd86591e	view_info: don't re-set base_info after construction In the previous commits we made sure that the base info is not dependent on the base schema version, and the info dependent on the base schema version is calculated when it's needed. In this patch we remove the unnecessary re-setting of the base_info. The set_base_info method isn't removed completely, because it also has a secondary function - zeroing the view_info fields other than base_info. Because of this, in this patch we rename it accordingly and limit its use to the updates caused by a base schema change.	2025-04-24 01:08:40 +02:00
Wojciech Mitros	ad55935411	base_info: remove base schema from the base_info The base info now only contains values which are not reliant on the base schema version. We remove the base schema from the base info to make it immutable regardless of base schema version. At the point of this patch it's also not needed anywhere: the new base info can replace the base schema in most places, and in the few (view_updates) where we need it, we pull the most recent base schema version from the database. After this change, the base info no longer changes in a view schema after creation, so we'll no longer get errors when we try generating view updates with a base_info that's incompatible with a specific base schema version. Fixes #9059 Fixes #21292 Fixes #22410	2025-04-24 01:08:39 +02:00
Wojciech Mitros	05fce91945	schema_registry: store base info instead of base schema for view entries In the following patch we plan to remove the base schema from the base_info to make the base_info immutable. To do that, we first prepare the schema registry for the change; we need to be able to create view schemas from frozen schemas there and frozen schemas have no information about the base table. Unless we do this change, after base schemas are removed from the base info, we'll no longer be able to load a view schema to the schema registry without looking up the base schema in the database. This change also required some updates to schema building: * we add a method for unfreezing a view schema with base info instead of a base schema * we make it possible to use schema_builder with a base info instead of a base schema * we add a method for creating a view schema from mutations with a base info instead of a base schema * we add a view_info constructor with a base info instead of a base schema * we update the naming in schema_registry to reflect the usage of base info instead of base schema	2025-04-24 01:08:39 +02:00
Wojciech Mitros	900687c818	view_info: set base info on construction Currently, the base_info may or may not be set in view schemas. Even when it's set, it may be modified. This necessitates extra checks when handling view schemas, as well as potentially causing errors when we forget to set it at some point. Instead, we want to make the base info an immutable member of view schemas (inside view_info). The first step towards that is making sure that all newly created schemas have the base info set. We achieve that by requiring a base schema when constructing a view schema. Unfortunately, this adds complexity each time we're making a view schema - we need to get the base schema as well. In most cases, the base schema is already available. The most problematic scenario is when we create a schema from mutations: - when parsing system tables we can get the schema from the database, as regular tables are parsed before views - when loading a view schema using the schema loader tool, we need to load the base additionally to the view schema, effectively doubling the work - when pulling the schema from another node - in this case we can only get the current version of the base schema from the local database Additionally, we need to consider the base schema version - when we generate view updates the version of the base schema used for reads should match the version of the base schema in view's base info. This is achieved by selecting the correct (old or new) schema in `db::schema_tables::merge_tables_and_views` and using the stored base schema in the schema_registry.	2025-04-24 01:08:39 +02:00
Avi Kivity	a62ab824e6	schema: deprecate schema_extension schema_extension allows making invisible changes to system_schema that evade upgrade rollback tests. They appear in system_schema as an encoded blob which reduces serviceability, as they cannot be read. Deprecate it and point users to adding explicit columns in scylla_tables. We could probably make use of the data structure, after we teach it to encode its payload into proper named and typed columns instead of using IDL. Closes scylladb/scylladb#23151	2025-03-19 20:36:16 +02:00
Pavel Emelyanov	529ff3efa5	Merge 'Alternator: implement UpdateTable operation to add or delete GSI' from Nadav Har'El In this series we implement the UpdateTable operation to add a GSI to an existing table, or remove a GSI from a table. As the individual commit messages will explain, this required changing how Alternator stores materialized view keys - instead of insisting that these keys must be real columns (that is not the case when adding a GSI to an existing table), the materialized view can now take as its key any Alternator attribute serialized inside the ":attrs" map holding all non-key attributes. Fixes #11567. We also fix the IndexStatus and Backfilling attributes returned by DescribeTable - as DynamoDB API users use this API to discover when a newly added GSI completed its "backfilling" (what we call "view building") stage. Fixes #11471. This series should not be backported lightly - it's a new feature and required fairly large and intrusive changes that can introduce bugs to use cases that don't even use Alternator or its UpdateTable operations - every user of CQL materialized views or secondary indexes, as well as Alternator GSI or LSI, will use modified code. It should be backported to 2025.1, though - this version was actually branched long after this PR was sent, and it provides a feature that was promised for 2025.1. 
Closes scylladb/scylladb#21989 * github.com:scylladb/scylladb: alternator: fix view build on oversized GSI key attribute mv: clean up do_delete_old_entry test/alternator: unflake test for IndexStatus test/alternator: work around unrelated bug causing test flakiness docs/alternator: adding a GSI is no longer an unimplemented feature test/alternator: remove xfail from all tests for issue 11567 alternator: overhaul implementation of GSIs and support UpdateTable mv: support regular_column_transformation key columns in view alternator: add new materialized-view computed column for item in map build: in cmake build, schema needs alternator build: build tests with Alternator alternator: add function serialized_value_if_type() mv: introduce regular_column_transformation, a new type of computed column alternator: add IndexStatus/Backfilling in DescribeTable alternator: add "LimitExceededException" error type docs/alternator: document two more unimplemented Alternator features	2025-02-11 10:02:01 +03:00
Nikita Kurashkin	025bb379a4	cql: remove expansion of "SELECT *" in DESC MATERIALIZED VIEW This patch removes expansion of "SELECT *" in DESC MATERIALIZED VIEW. Instead of explicitly printing each column, the DESC command will now just use SELECT * if the view was created with it. Also adds a corresponding test. Fixes #21154 Closes scylladb/scylladb#21962	2025-02-10 15:01:23 +02:00
Nadav Har'El	ea87b9fff0	alternator: add new materialized-view computed column for item in map This patch adds a new computed column class for materialized views, extract_from_attrs_column_computation which is Alternator-specific and knows how to extract a value (of a known type) from an attribute stored in Alternator's map-of-all-nonkey-attributes ":attrs". We'll use this new computed column in the next patch to reimplement GSI. The new computed-column class is based on regular_column_transformation introduced in the previous patch. It is not yet wired to anything: The MV code cannot handle any regular_column_transformation yet, and Alternator will not yet use it to create a GSI. We'll do those things in the following patches. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:48 +01:00
Nadav Har'El	e8d1e8a515	build: in cmake build, schema needs alternator This patch is to cmake what the previous patch was to configure.py. In the next patch we want to make schema/schema.o depend on alternator/executor.o - because when the schema has an Alternator computed column, the schema code needs to construct the computed column object (extract_from_attrs_column_computation) and that lives in alternator/executor.o. In the cmake-based build, all the schema/* objects are put into one library "libschema.a". But code that uses this library (e.g., tests) can't just use that library alone, because it depends on other code not in schema/. So CMakeLists.txt lists other "libraries" that libschema.a depends on - including for example "cql3". We now need to add "alternator" to this dependency list. The dependency is marked "PRIVATE" - schema needs alternator for its own internal uses, but doesn't need to export alternator's APIs to its own users. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-02-06 09:59:48 +01:00
Benny Halevy	c5668d99c9	schema: add per-table tablet options Unlike with vnodes, each tablet is served only by a single shard, and it is associated with a memtable that, when flushed, creates sstables whose token range is confined to the tablet owning them. On one hand, this allows for far better agility and elasticity since migration of tablets between nodes or shards does not require rewriting most if not all of the sstables, as required with vnodes (at the cleanup phase). On the other hand, having too few tablets might limit performance due to not being served by all shards or due to imbalance between shards caused by quantization. The number of tablets per table has to be a power of 2 with the current design, and when divided by the number of shards, some shards will serve N tablets, while others may serve N+1, and when N is small (N+1)/N may be significantly larger than 1. For example, with N=1, some shards will serve 2 tablet replicas and some will serve only 1, causing an imbalance of 100%. Now, simply allocating a lot more tablets for each table may theoretically address this problem, but practically: a. Each tablet has memory overhead and having too many tablets in the system with many tables and many tablets for each of them may overwhelm the system and cause out-of-memory errors. b. Too-small tablets cause a proliferation of small sstables that are less efficient to access, have higher metadata overhead (due to per-sstable overhead), and might exhaust the system's open file-descriptor limit. The options introduced in this change can help the user tune the system in two ways: 1. Sizing the table to prevent unnecessary tablet splits and migrations. This can be done when the table is created, or later on, using ALTER TABLE. 2. Controlling min_per_shard_tablet_count to improve tablet balancing, for hot tables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:55:51 +02:00
aberry-21	69a0431cce	schema: add validation for PERCENTILE values in `speculative_retry` configuration This commit addresses issue #21825, where invalid PERCENTILE values for the `speculative_retry` setting were not properly handled, causing potential server crashes. The valid range for PERCENTILE is between 0 and 100, as defined in the documentation for speculative retry options, where values above 100 or below 0 are invalid and should be rejected. The added validation ensures that such invalid values are rejected with a clear error message, improving system stability and user experience. Fixes #21825 Closes scylladb/scylladb#21879	2025-01-30 11:34:46 +02:00
Avi Kivity	a23a3110b5	utils: config_file: forward_declare boost::program_options classes Avoid pulling in boost dependencies when all we need is the class name. Closes scylladb/scylladb#22453	2025-01-27 10:45:43 +03:00

136 Commits