scylladb

Author	SHA1	Message	Date
Botond Dénes	122b7847e5	Merge 'index: Accept view properties in CREATE INDEX' from Dawid Mędrek Problem ------- Secondary indexes are implemented via materialized views under the hood. The way an index behaves is determined by the configuration of the view. Currently, it can be modified by performing the CQL statement `ALTER MATERIALIZED VIEW` on it. However, that raises some concerns. Consider, for instance, the following scenario: 1. The user creates a secondary index on a table. 2. In parallel, the user performs writes to the base table. 3. The user modifies the underlying materialized view, e.g. by setting the `synchronous_updates` to `true` [1]. Some of the writes that happened before step 3 used the default value of the property (which is `false`). That had an actual consequence on what happened later on: the view updates were performed asynchronously. Only after step 3 had finished did it change. Unfortunately, as of now, there is no way to avoid a situation like that. Whenever the user wants to configure a secondary index they're creating, they need to do it in another schema change. Since it's not always possible to control how the database is manipulated in the meantime, it leads to problems like the one described. That's not all, though. The fact that it's not possible to configure secondary indexes is inconsistent with other schema entities. When it comes to tables or materialized views, the user always has a means to set some or even all of the properties during their creation. Solution -------- The solution to this problem is extending the `CREATE INDEX` CQL statement by view properties. The syntax has the form: ``` > CREATE INDEX <index name> > .. ON <keyspace>.<table> (<columns>) > .. WITH <properties> ``` where `<properties>` corresponds to both index-specific and view properties [2, 3]. 
View properties can only be used with indexes implemented with materialized views; for example, it will be impossible to create a vector index when specifying any view property (see examples below). When a view property is provided, it will be applied when creating the underlying materialized view. The behavior should be similar to how other CQL statements responsible for creating schema entities work. High-level implementation strategy ---------------------------------- 1. Make auxiliary changes. 2. Introduce data structures representing the new set of index properties: both index-specific and those corresponding to the underlying view. 3. Extend `CREATE INDEX` to accept view properties. 4. Extend `DESCRIBE INDEX` and other `DESCRIBE` statements to include view properties in their output. User documentation is also updated at the steps to reflect the corresponding changes. Implementation considerations ----------------------------- There are a number of schema properties that are now obsolete. They're accepted by other CQL statements, but they have no effect. They include: * `index_interval` * `replicate_on_write` * `populate_io_cache_on_flush` * `read_repair_chance` * `dclocal_read_repair_chance` If the user tries to create a secondary index specifying any of those keywords, the statement will fail with an appropriate error (see examples below). Unlike materialized views, we forbid specifying the clustering order when creating a secondary index [4]. This limitation may be lifted later on, but it's a detail that may or may not prove troublesome. It's better to postpone covering it to when we have a better perspective on the consequences it would bring. Examples -------- Good examples ``` > CREATE INDEX idx ON ks.t (v); > CREATE INDEX idx ON ks.t (v) WITH comment = 'ok view property'; > CREATE INDEX idx ON ks.t (v) .. WITH comment = 'multiple view properties are ok' .. AND synchronous_updates = true; > CREATE INDEX idx ON ks.t (v) .. 
WITH comment = 'default value ok' .. AND synchronous_updates = false; ``` Bad examples ``` > CREATE INDEX idx ON ks.t (v) WITH replicate_on_write = true; SyntaxException: Unknown property 'replicate_on_write' > CREATE INDEX idx ON ks.t (v) .. WITH OPTIONS = {'option1': 'value1'} .. AND comment = 'some text'; InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot specify options for a non-CUSTOM index" > CREATE CUSTOM INDEX idx ON ks.t (v) .. WITH OPTIONS = {'option1': 'value1'} .. AND comment = 'some text'; InvalidRequest: Error from server: code=2200 [Invalid query] message="CUSTOM index requires specifying the index class" > CREATE CUSTOM INDEX idx ON ks.t (v) .. USING 'vector_index' .. WITH OPTIONS = {'option1': 'value1'} .. AND comment = 'some text'; InvalidRequest: Error from server: code=2200 [Invalid query] message="You cannot use view properties with a vector index" > CREATE INDEX idx ON ks.t (v) WITH CLUSTERING ORDER BY (v ASC); InvalidRequest: Error from server: code=2200 [Invalid query] message="Indexes do not allow for specifying the clustering order" ``` and so on. For more examples, see the relevant tests. References: [1] https://docs.scylladb.com/manual/branch-2025.4/cql/cql-extensions.html#synchronous-materialized-views [2] https://docs.scylladb.com/manual/branch-2025.4/cql/secondary-indexes.html#create-index [3] https://docs.scylladb.com/manual/branch-2025.4/cql/mv.html#mv-options [4] https://docs.scylladb.com/manual/branch-2025.4/cql/dml/select.html#ordering-clause Fixes scylladb/scylladb#16454 Backport: not needed. This is an enhancement. 
Closes scylladb/scylladb#24977 * github.com:scylladb/scylladb: cql3: Extend DESC INDEX by view properties cql3: Forbid using CLUSTERING ORDER BY when creating index cql3: Extend CREATE INDEX by MV properties cql3/statements/create_index_statement: Allow for view options cql3/statements/create_index_statement: Rename member cql3/statements/index_prop_defs: Re-introduce index_prop_defs cql3/statements/property_definitions: Add extract_property() cql3/statements/index_prop_defs.cc: Add namespace cql3/statements/index_prop_defs.hh: Rename type cql3/statements/view_prop_defs.cc: Move validation logic into file cql3/statements: Introduce view_prop_defs.{hh,cc} cql3/statements/create_view_statement.cc: Move validation of ID schema/schema.hh: Do not include index_prop_defs.hh	2026-01-14 09:54:27 +02:00
tomek7667	19313d67e3	docs/cql/ddl.rst: fix formatting of deprecated initial sub-option Closes scylladb/scylladb#26852	2026-01-13 08:55:24 +02:00
Botond Dénes	60570d7114	Merge 'topology coordinator: restrict node join/remove to preserve RF-rack validity' from Michael Litvak Allow creating materialized views and secondary indexes in a tablets keyspace only if it's RF-rack-valid, and enforce RF-rack-validity while the keyspace has views by restricting some operations: * Altering a keyspace's RF if it would make the keyspace RF-rack-invalid * Adding a node in a new rack * Removing / Decommissioning the last node in a rack Previously the config option `rf_rack_valid_keyspaces` was required for creating views. We now remove this restriction - it's not needed because we always maintain RF-rack-validity for keyspaces with views. The restrictions are relevant only for keyspaces with numerical RF. Keyspaces with rack-list-based RF are always RF-rack-valid. Fixes scylladb/scylladb#23345 Fixes https://github.com/scylladb/scylladb/issues/26820 backport to relevant versions for materialized views with tablets since it depends on rf-rack validity Closes scylladb/scylladb#26354 * github.com:scylladb/scylladb: docs: update RF-rack restrictions cql3: don't apply RF-rack restrictions on vector indexes cql3: add warning when creating mv/index with tablets about rf-rack service/tablet_allocator: always allow tablet merge of tables with views locator: extend rf-rack validation for rack lists test: test rf-rack validity when creating keyspace during node ops locator: fix rf-rack validation during node join/remove test: test topology restrictions for views with tablets test: add test_topology_ops_with_rf_rack_valid topology coordinator: restrict node join/remove to preserve RF-rack validity topology coordinator: add validation to node remove locator: extend rf-rack validation functions view: change validate_view_keyspace to allow MVs if RF=Racks db: enforce rf-rack-validity for keyspaces with views replica/db: add enforce_rf_rack_validity_for_keyspace helper db: remove enforce parameter from check_rf_rack_validity test: adjust test 
to not break rf-rack validity	2026-01-09 10:01:23 +02:00
Nadav Har'El	384e394ff0	Merge 'Add similarity functions to calculate similarity of given vectors' from Dawid Pawlik It should be possible to return the similarity of vectors in CQL statements following the [Cassandra compatible syntax](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql): ``` SELECT comment, similarity_cosine(comment_vector, [0.1, 0.15, 0.3, 0.12, 0.05]) FROM cycling.comments_vs; ``` Although the calculations are slow, and we already have calculated results returned via Vector Store API, we need the functionality as it allows us to calculate similarity of vectors not stored in vector indexes. It will be needed for [quantization and rescoring](https://scylladb.atlassian.net/wiki/spaces/RND/pages/195985800/Quantization+and+Rescoring). The feature is also a nice-to-have in testing as requested many times by testing and CX teams. The optimized version utilizing already calculated distances from Vector Store without a need of rescoring will be coming soon after via https://github.com/scylladb/scylladb/pull/27991. --- The patch adds functions: - `similarity_cosine(<vector>, <vector>)`, - `similarity_euclidean(<vector>, <vector>)`, - `similarity_dot_product(<vector>, <vector>)` Where `<vector>` is either a column of type `VECTOR<FLOAT, N>` or a vector of floats literal. These functions can be called with every `SELECT` query, not only ANN vector queries as opposed to https://github.com/scylladb/scylladb/pull/25993. The similarity calculations are implemented inspired by [USearch's implementation]( `a2f1759910/include/usearch/index_plugins.hpp (L1304-L1385)`) and made compatible with [Cassandra's documentation](https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/functions.html#vector-similarity-functions). That would guarantee the results in ScyllaDB are calculated using the exact same algorithms as used in Vector Store indexes. 
--- Fixes: SCYLLADB-88 Fixes: SCYLLADB-89 New feature, should land into 2026.1 Closes scylladb/scylladb#27524 * github.com:scylladb/scylladb: docs: add vector similarity functions documentation test/cqlpy: add similarity functions correctness tests test/cqlpy: add similarity functions invalid call tests cql3: introduce similarity functions syntax vector_similarity_fcts: introduce similarity functions vector_similarity_fcts: retrieve similarity function argument types vector_similarity_fcts: add calculating similarity between vectors	2026-01-05 18:28:10 +02:00
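The three similarity functions named in the merge above can be illustrated with a minimal Python sketch. This is not ScyllaDB's implementation: the mapping of each raw metric to a score (`(1 + cos) / 2`, `1 / (1 + d^2)`, `(1 + dot) / 2`) is an assumption based on the conventions the Cassandra documentation is understood to describe, and the Python function names merely mirror the CQL names.

```python
import math


def _dot(a, b):
    # Both arguments must have the same dimension, like VECTOR<FLOAT, N> values.
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def similarity_cosine(a, b):
    # Assumed normalization: cosine in [-1, 1] mapped to a score in [0, 1].
    norm = math.sqrt(_dot(a, a)) * math.sqrt(_dot(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return (1 + _dot(a, b) / norm) / 2


def similarity_euclidean(a, b):
    # Assumed normalization: 1 / (1 + squared Euclidean distance).
    diff = [x - y for x, y in zip(a, b)]
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return 1 / (1 + _dot(diff, diff))


def similarity_dot_product(a, b):
    # Assumed normalization: (1 + dot product) / 2, intended for unit vectors.
    return (1 + _dot(a, b)) / 2
```

Under these assumptions, identical directions score 1 for the cosine variant, orthogonal vectors score 0.5, and opposite directions score 0.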
Dawid Pawlik	c0b06a7fc6	docs: add vector similarity functions documentation Add documentation in `functions.rst` as the CQL reference for the vector similarity functions. This includes the syntax, example usage, and prerequisites for the parameters.	2026-01-02 13:02:59 +01:00
Nikos Dragazis	20ff2fcc18	docs: Amend limitations for keyspace RF changes The doc about DDL statements claims that an `ALTER KEYSPACE` will fail in the presence of an ongoing global topology operation. This limitation was specifically referring to RF changes, which Scylla implements as global topology requests (`keyspace_rf_change`), and it was true when it was first introduced (`1b913dd880`) because there was no global topology request queue at that time, so only one ongoing global request was allowed in the cluster. This limitation was lifted with the introduction of the global topology request queue (`6489308ebc`), and it was re-introduced very recently (`2e7ba1f8ce`) in a slightly different form; it now applies only to RF changes (not to any request type) and only those that affect the same keyspace. Neither of these two changes was ever reflected in the doc. Synchronize the doc with the current state. Fixes #27776. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> Closes scylladb/scylladb#27786	2025-12-22 20:02:40 +02:00
Michael Litvak	9f8aea21e3	docs: update RF-rack restrictions Update the documentation about restrictions to tablets keyspaces related to RF-rack. * MV/SI require the keyspace to be RF-rack-valid * topology operations are restricted if a keyspace has views to preserve RF-rack-validity	2025-12-22 09:21:07 +01:00
Michael Litvak	33f7bc28da	docs: document restrictions of colocated tables Currently some things are not supported for colocated tables: it's not possible to repair a colocated table, and due to this it's also not possible to use the tombstone_gc=repair mode on a colocated table. Extend the documentation to explain what colocated tables are and document these restrictions. Fixes scylladb/scylladb#27261 Closes scylladb/scylladb#27516	2025-12-18 15:38:29 +01:00
Dawid Mędrek	d0a42852c0	cql3: Forbid using CLUSTERING ORDER BY when creating index This is a temporary solution as handling this property may require a bit more attention or at least a bit more focus. For now, let's forbid using it so it's clear it won't get applied. A simple test is provided to cover it. We document the restriction.	2025-12-16 11:43:38 +01:00
Dawid Mędrek	6541362a0b	cql3: Extend CREATE INDEX by MV properties After the previous patch that extended the grammar and provided basic functionalities to accommodate properties of materialized views in indexes, this commit takes another step and actually applies them to the underlying view when it's being created. We're providing validation tests for each property, with the single exception of CLUSTERING ORDER BY. That one will be handled separately in an upcoming commit. We also update the user documentation.	2025-12-16 11:43:38 +01:00
Botond Dénes	83f46fa7f5	doc: add video link for TTL Closes: #26210	2025-12-16 10:43:03 +02:00
Botond Dénes	9d2f7c3f52	Merge 'mv: allow setting concurrency in PRUNE MATERIALIZED VIEW' from Wojciech Mitros The PRUNE MATERIALIZED VIEW statement is performed as follows: 1. Perform a range scan of the view table from the view replicas based on the ranges specified in the statement. 2. While reading the paged scan above, for each view row perform a read from all base replicas at the corresponding primary key. If a discrepancy is detected, delete the row in the view table. When reading multiple rows, this is very slow because for each view row we need to perform a single-row query on multiple replicas. In this patch we add an option to speed this up by performing many of the single base row reads concurrently, at the concurrency specified in the USING CONCURRENCY clause. Aside from the unit test, I checked manually on a 3-node cluster with 10M rows, using vnodes. There were actually no ghost rows in the test, but we still had to iterate over all view rows and read the corresponding base rows. And actual ghost rows, if there are any, should be a tiny fraction of all rows. I compared concurrencies 1, 2, 10, 100 and the results were: * Pruning with concurrency 1 took total 1416 seconds * Pruning with concurrency 2 took total 731 seconds * Pruning with concurrency 10 took total 234 seconds * Pruning with concurrency 100 took total 171 seconds So after a concurrency of 10 or so we're hitting diminishing returns (at least in this setup). At that point we may be no longer bottlenecked by the reads, but by CPU on the shard that's handling the PRUNE. Fixes https://github.com/scylladb/scylladb/issues/27070 Closes scylladb/scylladb#27097 * github.com:scylladb/scylladb: mv: allow setting concurrency in PRUNE MATERIALIZED VIEW cql: add CONCURRENCY to the USING clause	2025-12-04 11:47:41 +02:00
Wojciech Mitros	323e5cd171	mv: allow setting concurrency in PRUNE MATERIALIZED VIEW The PRUNE MATERIALIZED VIEW statement is performed as follows: 1. Perform a range scan of the view table from the view replicas based on the ranges specified in the statement. 2. While reading the paged scan above, for each view row perform a read from all base replicas at the corresponding primary key. If a discrepancy is detected, delete the row in the view table. When reading multiple rows, this is very slow because for each view row we need to perform a single-row query on multiple replicas. In this patch we add an option to speed this up by performing many of the single base row reads concurrently, at the concurrency specified in the USING CONCURRENCY clause. Fixes https://github.com/scylladb/scylladb/issues/27070	2025-11-27 00:02:28 +01:00
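The idea of bounding the number of in-flight base-row reads can be sketched in Python with an `asyncio.Semaphore`. This is only an illustration of the technique, not ScyllaDB's code: the `BASE` and `VIEW` in-memory tables, `read_base_row`, and `prune` are hypothetical stand-ins, and the sketch schedules every row at once instead of consuming a paged scan.

```python
import asyncio

# Hypothetical in-memory stand-ins: base table (primary key -> indexed value)
# and view rows stored as (indexed value, base primary key).
BASE = {1: "red", 2: "blue"}
VIEW = [("red", 1), ("blue", 2), ("green", 3), ("red", 2)]


async def read_base_row(pk):
    # Stands in for a single-row read from the base replicas.
    await asyncio.sleep(0)
    return BASE.get(pk)


async def prune(view_rows, concurrency):
    # At most `concurrency` base reads are in flight at any time.
    limit = asyncio.Semaphore(concurrency)
    ghosts = []

    async def check(view_row):
        value, pk = view_row
        async with limit:
            base_value = await read_base_row(pk)
        # A view row that disagrees with the base table is a ghost row.
        if base_value != value:
            ghosts.append(view_row)

    await asyncio.gather(*(check(row) for row in view_rows))
    return sorted(ghosts)


ghosts = asyncio.run(prune(VIEW, concurrency=2))
```

With `concurrency=1` the checks run one after another, which corresponds to the behavior before the patch; larger values let more single-row reads overlap.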
tomek7667	9bbdd487b4	docs: insert.rst: Update insert example by removing 'year' column Closes scylladb/scylladb#26862	2025-11-26 06:55:28 +02:00
tomek7667	2138ab6b0e	docs: insert.rst: fix INSERT statement for NerdMovies example Closes scylladb/scylladb#26863	2025-11-26 06:53:45 +02:00
tomek7667	90a6aa8057	docs: ddl.rst: Fix formatting of null value note Closes scylladb/scylladb#26853	2025-11-26 06:52:18 +02:00
Anna Stuchlik	724dc1e582	doc: fix the info about object storage This commit fixes the information about object storage: - Object storage configuration is no longer marked as experimental. - Redundant information has been removed from the description. - Information related to object storage for SSTables has been removed as the feature is not working. Fixes https://github.com/scylladb/scylladb/issues/26985 Closes scylladb/scylladb#26987	2025-11-24 17:16:33 +03:00
Szymon Wasik	f714876eaf	Add documentation about lack of returning similarity distances This patch adds the missing warning about the lack of possibility to return the similarity distance. This will be added in the next iteration. Fixes #27086 It has to be backported to 2025.4 as this is the limitation in 2025.4. Closes scylladb/scylladb#27096	2025-11-18 13:50:36 +01:00
Piotr Dulikowski	7f482c39eb	Merge '[schema] Speculative retry rounding fix' from Dario Mirovic This patch series re-enables support for speculative retry values `0` and `100`. These values were supported some time ago, before [schema: fix issue 21825: add validation for PERCENTILE values in speculative_retry configuration. #21879 ](https://github.com/scylladb/scylladb/pull/21879). When that PR prevented using invalid `101PERCENTILE` values, valid `100PERCENTILE` and `0PERCENTILE` values were prevented too. Reproduction steps from [[Bug]: drop schema and all tables after apply speculative_retry = '99.99PERCENTILE' #26369](https://github.com/scylladb/scylladb/issues/26369) are unable to reproduce the issue after the fix. A test is added to make sure the inclusive border values `0` and `100` are supported. Documentation is updated to give more information to the users. It now states that these border values are inclusive, and also that the precision, with automatic rounding, is 1 decimal digit. Fixes #26369 This is a bug fix. If at any time a client tries to use value >= 99.5 and < 100, the raft error will happen. Backport is needed. The code which introduced inconsistency is introduced in 2025.2, so no backporting to 2025.1. Closes scylladb/scylladb#26909 * github.com:scylladb/scylladb: test: cqlpy: add test case for non-numeric PERCENTILE value schema: speculative_retry: update exception type for sstring ops docs: cql: ddl.rst: update speculative-retry-options test: cqlpy: add test for valid speculative_retry values schema: speculative_retry: allow 0 and 100 PERCENTILE values	2025-11-13 15:27:45 +01:00
Dario Mirovic	aba4c006ba	docs: cql: ddl.rst: update speculative-retry-options Clarify how the value of `XPERCENTILE` is handled: - Values 0 and 100 are supported - The percentile value is rounded to the nearest 0.1 (1 decimal place) Refs #26369	2025-11-09 13:23:29 +01:00
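The documented handling of `XPERCENTILE` (inclusive bounds 0 and 100, rounding to one decimal place) can be sketched as follows. This is a hedged illustration rather than ScyllaDB's parser: `parse_percentile` is a hypothetical helper, and Python's `round` stands in for whatever rounding mode the server actually uses.

```python
def parse_percentile(option: str) -> float:
    """Parse a value such as '99.9PERCENTILE' (hypothetical helper)."""
    suffix = "PERCENTILE"
    text = option.strip().upper()
    if not text.endswith(suffix):
        raise ValueError(f"expected a value ending in {suffix}: {option!r}")
    try:
        value = float(text[: -len(suffix)])
    except ValueError:
        raise ValueError(f"non-numeric PERCENTILE value: {option!r}") from None
    # Both border values are inclusive; NaN and infinities fail this check.
    if not 0 <= value <= 100:
        raise ValueError(f"PERCENTILE must be within [0, 100]: {option!r}")
    # Assumption: precision is one decimal digit, as the documentation states.
    return round(value, 1)
```

Under this sketch, `'99.99PERCENTILE'` rounds up to `100.0`, which is accepted because the upper bound is inclusive, while `'101PERCENTILE'` is rejected.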
Tomasz Grabiec	28f6bdc99b	cql3: ks_prop_defs: Expand numeric RF to rack list Auto-expands numeric RF in CREATE/ALTER KEYSPACE statements for new DCs specified in the statement. Doesn't auto-expand existing options, as the rack choice may not be in line with current replica placement. This requires co-locating tablet replicas, and tracking of co-location state, which is not implemented yet. Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>	2025-10-29 23:32:59 +01:00
Gleb Natapov	c255740989	schema: Allow configuring consistency setting for a keyspace We want to add strongly consistent tables as an option. We will have two kinds of strongly consistent tables: globally consistent and locally consistent. The former means that requests from all DCs will be globally linearisable, while the latter means that only requests to the same DC will be linearisable. To allow configuring all the possibilities, the patch adds a new parameter to a keyspace definition, "consistency", that can be configured to be `eventual`, `global` or `local`. Non-eventual settings are supported for tablets-enabled keyspaces only. Since we want to start with implementing local consistency, configuring global consistency will result in an error for now.	2025-10-16 13:34:49 +03:00
Tomasz Grabiec	1d34614421	doc: Document rack-list replication factor Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>	2025-10-02 19:45:00 +02:00
Szymon Wasik	0194a53659	docs: Add CQL documentation for vector queries using SELECT ANN This patch adds the missing documentation for the SELECT ... ANN statement that allows performing vector queries. This is just the basic explanation of the grammar and how to use it. More comprehensive documentation about vector search will be added separately in Scylla Cloud documentation and features description. Links to this additional documentation will be added as part of VECTOR-244. Fixes: VECTOR-247.	2025-09-26 15:07:00 +02:00
Szymon Wasik	7c4ef9aae7	docs: Add documentation for creating vector search indexes This patch adds CQL documentation about creating vector search indexes. It includes the syntax and description of parameters. It does not cover VECTOR type that is already supported and documented and it does not cover querying vectors which will be covered by a separate PR. Fixes: VECTOR-217 Closes scylladb/scylladb#26233	2025-09-25 14:49:50 +02:00
Dario Mirovic	ef83d6b970	docs: cql: default create keyspace syntax This patch updates the create keyspace statement docs. It explains how the `replication` option in the create keyspace statement is now optional, and behaves the same as if we specified an empty set as follows: `WITH replication = {}`. An example with no `replication` option specified has also been added. Refs #25145	2025-09-08 15:25:30 +02:00
Dario Mirovic	587a877718	docs/cql: update documentation for default replication factor Update create-keyspace-statement section of ddl.rst since replication factor is no longer mandatory. Add an example for keyspace creation without specifying replication factor. Add an example for keyspace creation without specifying both `class` and replication factor. Refs: #16028	2025-08-28 01:42:34 +02:00
Dario Mirovic	2ac37b4fde	docs/cql: update documentation for default replication strategy Update create-keyspace-statement section of ddl.rst since `class` is no longer mandatory. Add an example for keyspace creation without specifying `class`. Refs: #16029	2025-08-13 01:52:00 +02:00
Botond Dénes	37ef9efb4e	docs: cql/types.rst: remove reference to frozen-only UDTs ScyllaDB supports non-frozen UDTs since 3.2, no need to keep referencing this limitation in the current docs. Replace the description of the limitation with general description of frozen semantics for UDTs. Fixes: #22929 Closes scylladb/scylladb#24763	2025-07-01 16:19:18 +03:00
Anna Stuchlik	b7683d0eba	doc: remove duplicated content This commit removes the Non-Reserved CQL Keywords and Reserved CQL Keywords pages-keyword as that content is already covered on the Appendices page. Redirections are added to avoid 404s for the removed pages. In addition, the Appendices page title is extended with "Reserved CQL Keywords and Types" to help users understand what those appendices are about. Fixes https://github.com/scylladb/scylladb/issues/24319 Closes scylladb/scylladb#24320	2025-06-30 10:30:13 +03:00
Karol Nowacki	4577c66a04	cql, schema: Extend name length limit from 48 to 192 bytes This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes. The previous 48-bytes limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389) and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint. This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases. The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data. When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID. For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name. The directory name for this log table becomes the longest possible representation. Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas. To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows: 255 bytes (common filesystem limit for a path component) - 32 bytes (for the 32-character UUID string) - 1 byte (for the '-' separator) - 15 bytes (for the '_scylla_cdc_log' suffix) - 15 bytes (reserved for future use) ---------- = 192 bytes (Maximum allowed name length) This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038). This patch also updates/adds all associated tests to validate the new 192-byte limit. The documentation has been updated accordingly.	2025-06-18 14:08:38 +02:00
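The arithmetic behind the 192-byte limit can be checked with a short Python sketch. The constants come from the commit message above; `longest_directory_name` and the placeholder UUID string are hypothetical and only illustrate the layout the message describes (table name, CDC log suffix, dash, 32-character UUID).

```python
FILESYSTEM_NAME_LIMIT = 255        # common limit for one path component, in bytes
UUID_LENGTH = 32                   # 32-character UUID string
SEPARATOR = "-"                    # separates the name from the UUID
CDC_LOG_SUFFIX = "_scylla_cdc_log" # appended to the name of a CDC log table
RESERVED_BYTES = 15                # kept free for future extensions

MAX_NAME_LENGTH = (
    FILESYSTEM_NAME_LIMIT
    - UUID_LENGTH
    - len(SEPARATOR)
    - len(CDC_LOG_SUFFIX)
    - RESERVED_BYTES
)


def longest_directory_name(table_name: str) -> str:
    # Hypothetical helper: the longest directory name derived from a table,
    # i.e. the one for its CDC log table, with a placeholder UUID.
    placeholder_uuid = "0" * UUID_LENGTH
    return f"{table_name}{CDC_LOG_SUFFIX}{SEPARATOR}{placeholder_uuid}"


worst_case = len(longest_directory_name("t" * MAX_NAME_LENGTH).encode("utf-8"))
```

With a name of the maximum length, the directory name takes 240 bytes, which leaves exactly the 15 reserved bytes below the 255-byte limit.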
Andrzej Jackowski	2d4acb623e	docs: cql: add explicit explanation how mixing IFs works in LWT There is a difference in how ScyllaDB and Cassandra handle conditional batches with different IF statements (such as "IF EXISTS" and "IF NOT EXISTS"). This commit explicitly documents the differences in the behavior. Refs: #13011	2025-05-26 15:13:01 +02:00
Avi Kivity	5e1cf90a51	build: replace tools/java submodule with packaged cassandra-stress We no longer use tools/java (scylladb/scylla-tools-java.git) for nodetool or cqlsh; only cassandra-stress. Since that is available in package form install that and excise the tools/java submodule from the source tree. pgo/ is adjusted to use the packaged cassandra-stress (and the cqlsh submodule). A few jmx references are dropped as well. Frozen toolchain regenerated. Optimized clang from https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-aarch64.tar.gz https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-x86_64.tar.gz Closes scylladb/scylladb#23698	2025-04-15 10:11:28 +03:00
Karol Baryła	df64985a4e	Docs: Describe driver issue with tablet RF increase The current protocol extension that sends tablet info to drivers only does that if the driver selects a non-replica coordinator for a routable request. It works well if some node on the replica list is replaced by another node, or if some replicas are removed from the list. The driver will at some point send a request to a stale replica, and receive the new list in response. The issue is with extending the list with new replicas. In that case the old replicas are all still correct, so the driver will not select any wrong replica, and will not receive the new list. As far as I know, the only scenario where this could happen is an RF increase. It could be to some degree worked around in the drivers, but it would add significant complexity (definitely more than any other invalidations we introduced) while still not being an ideal solution. This scenario should be rare enough, and the consequences of not handling it minor enough (new replicas not being used as coordinators) that it does not warrant a driver-side solution. Instead this commit adds info about this to the documentation, advising users to restart applications after replica lists are extended. It is worth noting that if the new tablet feedback protocol extension is implemented then this problem goes away. See issue #21664. Closes scylladb/scylladb#23447	2025-04-11 13:48:40 +02:00
Dawid Mędrek	32879ec0d5	db/config: Introduce RF-rack-valid keyspaces We introduce a new term in the glossary: RF-rack-valid keyspace. We also highlight in our user documentation that all keyspaces must remain RF-rack-valid throughout their lifetime, and failing to guarantee that may result in data inconsistencies or other issues. We base that information on our experience with materialized views in keyspaces using tablets, even though they remain an experimental feature. Along with the new term, we introduce a new configuration option called `rf_rack_valid_keyspaces`, which, when enabled, will enforce preserving all keyspaces RF-rack-valid. That functionality will be implemented in upcoming commits. For now, we materialize the restriction in form of a named requirement: a function verifying that the passed keyspace is RF-rack-valid. The option is disabled by default. That will change once we adjust the existing tests to the new semantics. Once that is done, the option will first be enabled by default, and then it will be removed. Fixes scylladb/scylladb#20356	2025-03-19 14:46:35 +01:00
Paweł Zakrzewski	d483051e44	cql3/select_statement: reject aggregate functions when PER PARTITION LIMIT is present Before this patch we silently allowed and ignored PER PARTITION LIMIT. While using aggregate functions in conjunction with PER PARTITION LIMIT can make sense, we want to disable it until we can offer proper implementation, see #9879 for discussion. We want to match Cassandra, and for queries with aggregate functions it behaves as follows: - it silently ignores PER PARTITION LIMIT if GROUP BY is present, which matches our previous implementation. - rejects PER PARTITION LIMIT when GROUP BY is not present. This patch adds rejection of the second group. Fixes #9879 Closes scylladb/scylladb#23086	2025-03-13 10:29:53 +02:00
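The rule described in this commit can be written down as a small decision function. This is a sketch of the stated behavior only, not the code in `cql3/select_statement`; the function name and the returned labels are hypothetical.

```python
def per_partition_limit_outcome(has_per_partition_limit: bool,
                                has_aggregates: bool,
                                has_group_by: bool) -> str:
    """Return how PER PARTITION LIMIT is treated (hypothetical labels)."""
    if not has_per_partition_limit:
        return "none"     # nothing to decide
    if not has_aggregates:
        return "apply"    # ordinary query: the limit is applied
    if has_group_by:
        return "ignore"   # aggregates with GROUP BY: silently ignored
    return "reject"       # aggregates without GROUP BY: query is rejected
```

Only the last branch is new in this patch; the `ignore` branch matches the previous behavior described in the message.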
Anna Stuchlik	562b5db5b8	doc: Remove "experimental" from ALTER KEYSPACE with Tablets Altering a keyspace with tablets is no longer experimental. This commit removes the "Experimental" label from the feature. Fixes https://github.com/scylladb/scylladb/issues/23166 Closes scylladb/scylladb#23183	2025-03-12 17:41:36 +02:00
Anna Stuchlik	0999fad279	doc: add information about tablets limitation to the CQL page This commit adds a link to the Limitations section on the Tablets page to the CQL page, the tablets option. This is actually the place where the user will need the information: when creating a keyspace. In addition, I've reorganized the section for better readability (otherwise, the section about limitations was easy to miss) and moved the section up on the page. Note that I've removed the updated content from the `_common` folder (which I deleted) to the .rst page - we no longer split OSS and Enterprise, so there's no need to keep using the `scylladb_include_flag` directive to include OSS- and Ent-specific content. Fixes https://github.com/scylladb/scylladb/issues/22892 Fixes https://github.com/scylladb/scylladb/issues/22940 Closes scylladb/scylladb#22939	2025-02-27 15:11:19 +03:00
Kefu Chai	7bf7817e8a	docs/cql: s/wasm32-wasi/wasm32-wasip1/ Rust's WASI target of wasm32-wasi was renamed to wasm32-wasip1, see https://blog.rust-lang.org/2024/04/09/updates-to-rusts-wasi-targets.html, and our build system has been adapted to this change. Let's update the document to reflect this change. Fixes scylladb/scylladb#20878 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21184	2025-02-24 11:06:46 +01:00
Anna Stuchlik	a28bbc22bd	doc: remove references to Enterprise This commit removes the redundant references to Enterprise, which are no longer valid. Fixes https://github.com/scylladb/scylladb/issues/22927 Closes scylladb/scylladb#22930	2025-02-20 11:24:34 +02:00
Avi Kivity	9712390336	Merge 'Add per-table tablet options in schema' from Benny Halevy This series extends the table schema with per-table tablet options. The options are used as hints for initial tablet allocation on table creation and later for resize (split or merge) decisions, when the table size changes. * New feature, no backport required Closes scylladb/scylladb#22090 * github.com:scylladb/scylladb: tablets: resize_decision: get rid of initial_decision tablet_allocator: consider tablet options for resize decision tablet_allocator: load_balancer: table_size_desc: keep target_tablet_size as member network_topology_strategy: allocate_tablets_for_new_table: consider tablet options network_topology_strategy: calculate_initial_tablets_from_topology: precalculate shards per dc using for_each_token_owner network_topology_strategy: calculate_initial_tablets_from_topology: set default rf to 0 cql3: data_dictionary: format keyspace_metadata: print "enabled":true when initial_tablets=0 cql3/create_keyspace_statement: add deprecation warning for initial tablets test: cqlpy: test_tablets: add tests for per-table tablet options schema: add per-table tablet options feature_service: add TABLET_OPTIONS cluster schema feature	2025-02-08 20:32:19 +02:00
Avi Kivity	861fb58e14	Merge 'vector: add support for vector type' from Dawid Pawlik This pull request is an implementation of vector data type similar to one used by Apache Cassandra. The patch contains: - implementation of vector_type_impl class - necessary functionalities similar to other data types - support for serialization and deserialization of vectors - support for Lua and JSON format - valid CQL syntax for `vector<>` type - `type_parser` support for vectors - expression adjustments such as: - add `collection_constructor::style_type::vector` - rename `collection_constructor::style_type::list` to `collection_constructor::style_type::list_or_vector` - vector type encoding (for drivers) - unit tests - cassandra compatibility tests - necessary documentation Co-authored-by: @janpiotrlakomy Fixes https://github.com/scylladb/scylladb/issues/19455 Closes scylladb/scylladb#22488 * github.com:scylladb/scylladb: docs: add vector type documentation cassandra_tests: translate tests covering the vector type type_codec: add vector type encoding boost/expr_test: add vector expression tests expression: adjust collection constructor list style expression: add vector style type test/boost: add vector type cql_env boost tests test/boost: add vector type_parser tests type_parser: support vector type cql3: add vector type syntax types: implement vector_type_impl	2025-02-06 20:36:50 +02:00
Benny Halevy	32c2f7579f	network_topology_strategy: allocate_tablets_for_new_table: consider tablet options Use the keyspace initial_tablets for min_tablet_count if the latter isn't set, then take the maximum of the option-based tablet counts: - min_tablet_count - expected_data_size_in_gb / target_tablet_size - min_per_shard_tablet_count (via calculate_initial_tablets_from_topology) If none of the hints produces a positive tablet_count, fall back to calculate_initial_tablets_from_topology * initial_scale. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:59:32 +02:00
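
The selection rule described in the commit above can be sketched roughly as follows. This is only an illustration, not the ScyllaDB implementation: the function name, the parameter names, the use of gigabytes for the target tablet size, and the rounding up of the size-based count are all assumptions, and any later rounding of the result (for example to a power of two) is left out.

```python
import math
from typing import Optional


def choose_initial_tablet_count(
    keyspace_initial_tablets: int,
    min_tablet_count: Optional[int],
    expected_data_size_in_gb: Optional[float],
    target_tablet_size_in_gb: float,
    per_shard_based_count: int,
    topology_based_count: int,
    initial_scale: int,
) -> int:
    """Hypothetical sketch of picking a tablet count from per-table hints."""
    # The keyspace-level initial_tablets stands in for min_tablet_count
    # when the per-table option is not set.
    if min_tablet_count is None:
        min_tablet_count = keyspace_initial_tablets

    # Size-based hint: expected data size divided by the target tablet size.
    # Rounding up is an assumption made for this sketch.
    size_based = 0
    if expected_data_size_in_gb:
        size_based = math.ceil(expected_data_size_in_gb / target_tablet_size_in_gb)

    # Take the maximum of the option-based counts.
    best = max(min_tablet_count, size_based, per_shard_based_count)
    if best > 0:
        return best

    # No hint produced a positive count: fall back to the topology-based
    # estimate multiplied by the initial scale.
    return topology_based_count * initial_scale
```

With no hints at all, the sketch returns the topology-based estimate times the scale; as soon as any hint yields a positive count, the largest hint wins.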
Benny Halevy	1054e05491	cql3/create_keyspace_statement: add deprecation warning for initial tablets Per-table hints should be used instead. Note: the warning is produced by check_against_restricted_replication_strategies which is called also from alter_keyspace_statement. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:55:51 +02:00
Benny Halevy	c5668d99c9	schema: add per-table tablet options Unlike with vnodes, each tablet is served only by a single shard, and it is associated with a memtable that, when flushed, creates sstables whose token range is confined to the tablet owning them. On one hand, this allows for far better agility and elasticity since migration of tablets between nodes or shards does not require rewriting most if not all of the sstables, as required with vnodes (at the cleanup phase). On the other hand, having too few tablets might limit performance due to not being served by all shards or due to imbalance between shards caused by quantization. The number of tablets per table has to be a power of 2 with the current design, and when divided by the number of shards, some shards will serve N tablets, while others may serve N+1, and when N is small (N+1)/N may be significantly larger than 1. For example, with N=1, some shards will serve 2 tablet replicas and some will serve only 1, causing an imbalance of 100%. Now, simply allocating a lot more tablets for each table may theoretically address this problem, but practically: a. Each tablet has memory overhead and having too many tablets in the system with many tables and many tablets for each of them may overwhelm the system's memory and cause out-of-memory errors. b. Too-small tablets cause a proliferation of small sstables that are less efficient to access, have higher metadata overhead (due to per-sstable overhead), and might exhaust the system's open file-descriptor limits. The options introduced in this change can help the user tune the system in two ways: 1. Sizing the table to prevent unnecessary tablet splits and migrations. This can be done when the table is created, or later on, using ALTER TABLE. 2. Controlling min_per_shard_tablet_count to improve tablet balancing, for hot tables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-02-06 08:55:51 +02:00
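
The quantization imbalance mentioned in the commit above can be worked through with a small calculation. This is a minimal sketch under the assumption that tablet replicas are spread as evenly as possible over the shards; the function name is hypothetical and does not correspond to anything in the ScyllaDB code base.

```python
import math


def shard_imbalance(tablet_replicas: int, shards: int) -> float:
    """Relative extra load on the busiest shards versus the least busy ones.

    Assumes an even spread: every shard serves N replicas and the
    remainder get N+1, so the imbalance is (N+1)/N - 1 = 1/N.
    """
    n, remainder = divmod(tablet_replicas, shards)
    if remainder == 0:
        return 0.0       # every shard serves exactly N replicas
    if n == 0:
        return math.inf  # some shards serve nothing at all
    return 1.0 / n


# 12 replicas on 8 shards: four shards serve 2, four serve 1 (N=1).
print(shard_imbalance(12, 8))
# 68 replicas on 8 shards: N=8, so the extra replica matters far less.
print(shard_imbalance(68, 8))
```

The imbalance shrinks as 1/N, which is why raising the per-shard tablet count helps hot tables, while the memory and small-sstable costs listed in the commit message limit how far it can be raised.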
Dawid Pawlik	a68bf6dcc1	docs: add vector type documentation Add missing vector type documentation including: definition of vector, adjustment of term definition, JSON encoding, Lua and cql3 type mapping, vector dimension limit, and keyword specification.	2025-01-28 21:14:49 +01:00
Anna Stuchlik	b2a718547f	doc: remove Enterprise labels and directives This PR removes the now redundant Enterprise labels and directives from the ScyllaDB documentation. Fixes https://github.com/scylladb/scylladb/issues/22432 Closes scylladb/scylladb#22434	2025-01-27 16:01:48 +02:00
Benny Halevy	5c77956205	docs: ddl: document the deprecation of compact tables Add a paragraph documenting the decision to deprecate the COMPACT STORAGE feature, and instruct the user how to enable the feature despite that. Note that we don't have an official migration strategy for users like `DROP COMPACT STORAGE`, which is not implemented at this time (See #3882). Fixes #16375 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-01-20 08:14:39 +02:00
Nadav Har'El	15c252fd8f	Merge 'docs: Update documentation on CREATE ROLE WITH HASHED PASSWORD' from Dawid Mędrek As part of #18750, we added a CQL statement CREATE ROLE WITH SALTED HASH that prevented hashing a password when creating a role, effectively leading to inserting a hash given by the user directly into the database. In #21350, we noticed that Cassandra had implemented a CQL statement of similar semantics but different syntax. We decided to rename Scylla's statement to be compatible with Cassandra. Unfortunately, we didn't notice one more difference between what we had in Scylla and what was part of Cassandra. Scylla's statement was originally supposed to only be used when restoring the schema and the user didn't need to be aware of its existence at all: the database produced a sequence of CQL statements that the user saved to a file and when a need to restore the schema arose, they would execute the contents of the file. That's why, although we documented the feature, it was only done in the necessary places. Those that weren't related to the backup & restore procedure were deliberately skipped. Cassandra, on the other hand, added the statement for a different purpose (for details, see the relevant issue) and it was supposed to be used by the user by design. The statement is also documented as such. Since we want to preserve compatibility with Cassandra, we document the statement and its semantics in the user documentation, explicitly implying that it can be used by the user. We also add a test verifying that logging in works correctly. Fixes scylladb/scylladb#21691 Backport: not needed. The relevant code didn't make it to 6.2 or any previous version of OSS. Closes scylladb/scylladb#21752 * github.com:scylladb/scylladb: docs: Update documentation on CREATE ROLE WITH HASHED PASSWORD test/boost: Add test for creating roles with hashed passwords	2025-01-14 15:33:30 +02:00
Kefu Chai	23729beeb5	docs: remove "ScyllaDB Enterprise" labels remove the "ScyllaDB Enterprise" labels in document. because there is no need to differentiate ScyllaDB Enterprise from its OSS variant, let's stop adding the "ScyllaDB Enterprise" labels to enterprise-only features. this helps to reduce the confusion. as we are still in the process of porting the enterprise features to this repo, this change does not fix scylladb/scylladb#22175. we will review the document again when completing the migration. we also take this opportunity to stop referencing "Enterprise" in the changed paragraph. Refs scylladb/scylladb#22175 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22177	2025-01-08 09:02:52 +02:00


156 Commits